Mats

@mats_cgo

Joined November 2009

8K Posts, 230 Followers, 2K Following

Mats reposted

Quankai Gao

@UUUUUsher

10 h

🚀 Introducing InstantSfM: Fully Sparse and Parallel Structure-from-Motion. ✅ Python + GPU-optimized implementation, no C++ anymore! ✅ 40× faster than COLMAP with 5K images on single GPU! ✅ Scales beyond 100 images (more than VGGT/VGGSfM can consume)! ✅ Support metric scale.

Mats reposted

Surya Dantuluri

@sdand

8 h

To solve AGI, we must first solve Geoguessr. For that I built vlm-gym, a simple RL gym written from scratch in JAX for Qwen3VL-4B (released yesterday), and added Geospot, an RL environment for geolocation, and learned VLMs can learn how to geoguess. More:

Mats reposted

Alex Albert

@alexalbert__

8 h

Today we're introducing Skills in claude dot ai, Claude Code, and the API. Skills let you package specialized knowledge into reusable capabilities that Claude loads on demand as agents tackle more complex tasks. Here's how they work and why they matter for the future of agents:

Mats reposted

Will Eastcott

@willeastcott

12 h

Google Maps v2.0? 🌍 Here's @PlayCanvas streaming a scene with 2 BILLION Gaussians! 💪 Coming sooooon!

Mats reposted

PaddlePaddle

@PaddlePaddle

12 h

🚀 PaddleOCR-VL is here! Introducing PaddleOCR-VL (0.9B) — the ultra-compact Vision-Language model that reaches SOTA accuracy across text, tables, formulas, charts & handwriting. Breaking the limits of document parsing!🌍 Powered by: • NaViT dynamic vision encoder • ERNIE…

Mats reposted

Tengfei Wang

@DylanTFWang

20 h

⚡️Generating 3DGS scenes in 5 seconds on a single GPU⚡️ #FlashWorld enables ⚡️*fast*⚡️ (10~100x faster than previous methods) and 🔥*high-quality*🔥 3D world generation, from a single image or text prompt. Code: github.com/imlixinyang/Fl… Page: imlixinyang.github.io/FlashWorld-Pro…

Mats reposted

MrNeRF

@janusch_patas

19 h

InstantSfM: Fully Sparse and Parallel Structure-from-Motion. TLDR: InstantSfM is a fully sparse and parallel Structure-from-Motion pipeline. It leverages GPU acceleration to achieve up to 40× speedup over traditional methods like COLMAP, while maintaining or improving…

Mats reposted

rajan agarwal

@_rajanagarwal

Oct 15

make nanochat multimodal for < $10! this evening, i trained nanochatVL: via a projection model (llava-style) between SigLIP ViT and @karpathy nanochat to extend its understanding to images it's a huge wip rn, but have a few promising results! now i can finally sleep

Andrej Karpathy

@karpathy

Oct 13

Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…

Mats reposted

Sundar Pichai

@sundarpichai

Oct 15

The model + resources are now on HuggingFace and GitHub so researchers can keep building and experimenting. More details here: blog.google/technology/ai/…

We’re launching a new 27 billion parameter foundation model for single-cell analysis built on the Gemma family of open models.

How a Gemma model helped discover a new potential cancer therapy pathway

Source: blog.google

Mats reposted

MrNeRF

@janusch_patas

18 h

FlashWorld: High-quality 3D Scene Generation within Seconds. Contributions: • We introduce a dual-mode pretraining strategy built on a video diffusion model to train a multi-view diffusion model. This model is capable of operating in both MV-oriented and 3D-oriented modes. •…

Mats reposted

merve

@mervenoyann

Oct 14

the team at @Alibaba_Qwen literally cooks 🤠 they released a VLM cookbook that shows you how to do various tasks from OCR to object grounding using Qwen3-VL 👏

Mats reposted

Francis Engelmann

@FrancisEngelman

Oct 14

Spatial representations are central to world models🌍 SuperDec is an extremely compact 3D scene representation (replacing millions of Gaussians with just a few hundred primitives) ideal for abstract reasoning and planning in 3D ➡️super-dec.github.io ✨Oral @ICCVConference

Elisabetta Fedele

@efedele16

Oct 14

Are photorealistic representations all we need? In SuperDec, we turn millions of points into compact and modular abstractions made of just a few superquadrics!🧩 Try our code and get a compact representation of your favorite scene!🚀 👾: github.com/elisabettafede…

Mats reposted

Alex Danilowicz

@alexdanilowicz

Oct 15

Excited about this one. Direct result of customer feedback. We saw hundreds of chat apps being created, and now you can actually hook it up to OpenAI. x.com/magicpatterns/…

Magic Patterns

@magicpatterns

Oct 15

You can now connect your Magic Patterns design to Open AI in our new integrations tab. Building a chatbot or chat app is one of the most popular use cases, so we made it real.

Mats reposted

Zed

@zeddotdev

Oct 15

🥁🥁🥁

Mats reposted

Rohan Paul

@rohanpaul_ai

Oct 14

This paper shows a 2-brain design so a voice model can think and speak with near 0 delay. It scores 92.8% on a math speech test at 0 latency, and 82.5 on a dialogue test. Waiting for a full chain of thought slows replies, and mixing thinking and speaking in one model causes…

Mats reposted

Rohan Paul

@rohanpaul_ai

Oct 14

🇨🇳 There Are More Robots Working in China Than the Rest of the World Combined. They recorded a world record of 2mn+ industrial robots working in factories.

Rohan Paul

@rohanpaul_ai

Oct 13

🤖🇨🇳 China’s added 295,000 industrial robots in 2024 and a stock above 2mn. While US added about 34,000, Germany 27,000, and the UK 2,500. “If we lose this, we do not have a future at Ford,” says Jim Farley, CEO at Ford. Robot intensity is also way higher in China, with 567…

Mats reposted

Maxime Labonne

@maximelabonne

Oct 13

New LFM2 release 🥳 It's a Japanese PII extractor with only 350M parameters. It's extremely fast and on par with GPT-5 (!) in terms of quality. Check it out, it's available today on @huggingface!

Mats reposted

Andrej Karpathy

@karpathy

Oct 13

Mats reposted

Igor Cotruta

@igocrite

Oct 13

Search every version of your Revenue DAX definitions across all PBIX files in sub-second time with DuckDB and the new pbix2vpax() scalar function

Mats reposted

Surya Dantuluri

@sdand

Oct 13

I made an RL policy that guesses where a picture was taken without GPS data. It continuously learns, updating its weights with every use in realtime -- over the weekend it improved 13.9% with <100 images. Best of all, it does this without ever storing any image data, link below

Surya Dantuluri

@sdand

Jul 11, 2024

made an app that guesses where you are in the world with just a picture using image embeddings trained on street view data. first time using swiftui, consumer apps in general, TestFlight below