Tired of going back to the original papers again and again? Our monograph is a systematic, foundational recipe you can rely on! 📘 We're excited to release "The Principles of Diffusion Models" — with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core…
Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training models for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other…
In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: arxiv.org/abs/2506.10892 CADD: arxiv.org/abs/2510.01329 CCDD: arxiv.org/abs/2510.03206
New survey on diffusion language models: arxiv.org/abs/2508.10875 (via @NicolasPerezNi1). Covers pre/post-training, inference and multimodality, with very nice illustrations. I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023🥲
Have you given Mojo a try for this? It has a bunch of infra and existing basic support for NEON matmuls - I bet you could make it significantly faster!
Jeremy builds on his years of AI and teaching experience, embracing AI coding by using it the right way: increasing productivity and understanding of code, rather than replacing programmers with "vibe code". solveit is an innovative platform for learning and building apps. Check it out! 👇
It's a strange time to be a programmer—easier than ever to get started, but easier to let AI steer you into frustration. We've got an antidote that we've been using ourselves with 1000 preview users for the last year: "solveit" Now you can join us.🧵 answer.ai/posts/2025-10-…
Makes sense. Mojo gives you the full power of the hardware, it doesn't "abstract" it like some other systems, so it is perfect for doing this sort of work. It provides helper libraries that you can optionally use to make some things (incl tiling etc) more declarative, and…
Let's gooooooo Modular 🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀
torch.compile is PyTorch's Achilles' heel!
Compiling large #PyTorch models at Meta could take an hour or more. Engineers cut PT2 compile time by 80% with parallel Triton compilation, dynamic shape marking, autotuning config pruning, and cache improvements, now integrated into the stack. 🔗 hubs.la/Q03J-6P20
You're a simple person who is so shy and doesn't know anything about these sorts of things :-)
It did occur to me that they're starting to understand why Mojo is actually needed...
Did you tell Jeremy that Mojo runs on all the NV and AMD consumer GPUs and is starting to work on Apple GPUs too? :-)
It was wonderful to spend the day with you in Austin today Kyle. Very excited about the collaboration and our path ahead 🦾!
One more thing to be skeptical about. It's only a matter of time before that DSL dies too. Mojo 🔥 ftw. Stay tuned for part 4: modular.com/blog/matrix-mu…
For more context, see github.com/triton-lang/tr… and youtube.com/watch?v=5e1YKq…
(YouTube: Triton Community Meetup 2025-07-09, meeting recording)
If you'd like to learn more about Mojo + Blackwell, please read: P1: modular.com/blog/matrix-mu… P2: modular.com/blog/matrix-mu… P3: modular.com/blog/matrix-mu… P4: Coming really soon. 🚀
Triton is nice if you want to get something onto a GPU but don't need full performance/TCO. However, if you want peak perf or other HW, then Mojo🔥 could be a better fit. I'm glad OpenAI folk are acknowledging this publicly, but I wrote about it here: modular.com/blog/democrati…
TIL. RIP Triton, killed by its inability to deliver good Blackwell performance.
I’d say Hell froze over, but that might just be because I’m old enough to remember when Mike Hara, VP of Investor Relations at NVIDIA, got in trouble for saying (in 2002) that NVIDIA would be bigger than Intel. wired.com/2002/07/nvidia/
(Wired: profile of Nvidia CEO Jen-Hsun Huang, "the man who plans to make the CPU obsolete")
Huge deal between $NVDA and $INTC. NVIDIA and Intel announced a multi-generation collaboration across PC and datacenter, and NVIDIA will invest $5B in Intel at $23.28 per share. The joint solution will be a tight coupling of Intel x86 CPUs and NVIDIA RTX GPUs over NVLink for PCs…