
Mojonized🔥

@mojonized

Mojonized🔥 reposted

Tired of going back to the original papers again and again? Our monograph: a systematic and fundamental recipe you can rely on! 📘 We’re excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core…

JCJesseLai's tweet image.

Mojonized🔥 reposted

Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other…

thinkymachines's tweet image.
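
For readers who want the mechanics behind that one-liner, below is a minimal, hypothetical sketch of the on-policy-distillation idea: the student samples its own tokens (the on-policy, error-correcting part, as in RL), and a frozen teacher grades every sampled position with a reverse-KL penalty (the dense, per-token supervision, as in SFT). The toy models, shapes, and hyperparameters are invented for illustration; this is not the Thinking Machines implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, STEPS, SEQ_LEN = 100, 32, 3, 16

class TinyLM(nn.Module):
    """Toy autoregressive LM: embedding -> GRU -> next-token logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):              # tokens: (batch, seq)
        h, _ = self.rnn(self.emb(tokens))
        return self.head(h)                 # (batch, seq, vocab) logits

student, teacher = TinyLM(), TinyLM()
teacher.requires_grad_(False)               # the teacher stays frozen
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(STEPS):
    # 1) On-policy rollout: the student samples its *own* continuation,
    #    so training sees the student's actual mistakes (the RL-like part).
    tokens = torch.zeros(4, 1, dtype=torch.long)
    with torch.no_grad():
        for _ in range(SEQ_LEN):
            probs = F.softmax(student(tokens)[:, -1], dim=-1)
            tokens = torch.cat([tokens, torch.multinomial(probs, 1)], dim=1)

    # 2) Dense per-token grading: reverse KL between the student's and the
    #    frozen teacher's next-token distributions at every sampled position
    #    (the SFT-like dense supervision, instead of one sparse reward).
    s_logp = F.log_softmax(student(tokens[:, :-1]), dim=-1)
    with torch.no_grad():
        t_logp = F.log_softmax(teacher(tokens[:, :-1]), dim=-1)
    per_token_kl = (s_logp.exp() * (s_logp - t_logp)).sum(-1)   # (batch, seq)

    # 3) Minimise the mean per-token reverse KL; gradients flow through the
    #    student's logits (the sampling step itself is treated as fixed).
    loss = per_token_kl.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: mean per-token reverse KL = {loss.item():.4f}")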

Mojonized🔥 reposted

In diffusion LMs, discrete methods have all but displaced continuous ones (🥲). Interesting new trend: why not both? Use continuous methods to make discrete diffusion better. Diffusion duality: arxiv.org/abs/2506.10892 CADD: arxiv.org/abs/2510.01329 CCDD: arxiv.org/abs/2510.03206

New survey on diffusion language models: arxiv.org/abs/2508.10875 (via @NicolasPerezNi1). Covers pre/post-training, inference and multimodality, with very nice illustrations. I can't help but feel a bit wistful about the apparent extinction of the continuous approach after 2023🥲

sedielem's tweet image.
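
For anyone not steeped in this sub-field, the continuous-vs-discrete split above comes down to two different forward (noising) processes. The standard textbook forms, in common notation, are recalled below; the papers linked above build variants on top of these, so details there may differ.

% Standard forward (noising) processes; notation is generic, not taken from
% the specific papers linked above.
\begin{align*}
  \text{Continuous (Gaussian) diffusion:} \quad
    q(\mathbf{x}_t \mid \mathbf{x}_0)
      &= \mathcal{N}\!\left(\mathbf{x}_t;\; \sqrt{\bar\alpha_t}\,\mathbf{x}_0,\; (1-\bar\alpha_t)\,\mathbf{I}\right), \\
  \text{Discrete (absorbing/masked) diffusion:} \quad
    q(x_t \mid x_0)
      &= \mathrm{Cat}\!\left(x_t;\; \alpha_t\,\delta_{x_0} + (1-\alpha_t)\,\delta_{\text{[MASK]}}\right).
\end{align*}

Roughly speaking, the "why not both?" papers above inject continuous, Gaussian-style structure into the discrete process rather than picking one side.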


Mojonized🔥 reposted

Have you given Mojo a try for this? It has a bunch of infra and existing basic support for neon matmuls - I bet you could make it significantly faster!


Mojonized🔥 reposted

Jeremy builds on his years of AI and teaching experience, embracing AI coding by using it the right way: to increase productivity and understanding of code rather than replacing programmers with "vibe code". solveit is an innovative platform for learning and building apps. Check it out! 👇

It's a strange time to be a programmer—easier than ever to get started, but easier to let AI steer you into frustration. We've got an antidote that we've been using ourselves with 1000 preview users for the last year: "solveit" Now you can join us.🧵 answer.ai/posts/2025-10-…



Mojonized🔥 reposted

Makes sense. Mojo gives you the full power of the hardware, it doesn't "abstract" it like some other systems, so it is perfect for doing this sort of work. It provides helper libraries that you can optionally use to make some things (incl tiling etc) more declarative, and…
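
To make "tiling" concrete for anyone skimming: the idea is to block the matmul loops so each sub-block of the operands stays resident in fast memory while it is reused. The sketch below is plain NumPy pseudocode for that concept only; it is not Mojo code and not Modular's helper API, which (per the tweet) exposes this more declaratively.

# Rough illustration of loop tiling for matmul in plain NumPy -- the concept
# referred to above, NOT Mojo code and NOT Modular's helper API.
import numpy as np

def tiled_matmul(A: np.ndarray, B: np.ndarray, tile: int = 64) -> np.ndarray:
    """Blocked C = A @ B: work on tile x tile sub-blocks so each block of
    A, B and C stays hot in cache while it is being reused."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            for k0 in range(0, K, tile):
                # Accumulate the contribution of one K-tile into the C-tile.
                C[i0:i0 + tile, j0:j0 + tile] += (
                    A[i0:i0 + tile, k0:k0 + tile] @ B[k0:k0 + tile, j0:j0 + tile]
                )
    return C

A = np.random.rand(256, 256).astype(np.float32)
B = np.random.rand(256, 256).astype(np.float32)
assert np.allclose(tiled_matmul(A, B), A @ B, atol=1e-3)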


Mojonized🔥 reposted

Let's gooooooo Modular 🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀

We raised $250M to accelerate building AI's unified compute layer! 🔥 We’re now powering trillions of tokens, making AI workloads 4x faster 🚀 and 2.5x cheaper ⬇️ for our customers, and welcomed tens of thousands of new developers 👩🏼‍💻. We're excited for the future!

Modular's tweet image.


Mojonized🔥 reposted

torch.compile is PT's Achilles heel!

Compiling large #PyTorch models at Meta could take an hour+. Engineers cut PT2 compile time by 80% with parallel Triton compilation, dynamic shape marking, autotuning config pruning, and cache improvements now integrated into the stack. 🔗 hubs.la/Q03J-6P20

PyTorch's tweet image.
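
Two of the levers mentioned (dynamic shape marking and compile caching) are exposed as user-facing knobs in recent PyTorch releases; the sketch below shows my understanding of how they are typically used. The exact flag and env-var names are an assumption to check against the docs for your build.

# Hedged sketch of two compile-time levers from the post above, as exposed in
# recent PyTorch releases (names may differ by version): marking a dimension
# dynamic so changing batch sizes don't trigger recompiles, and enabling the
# FX graph cache so warm starts reuse earlier compilation work.
import os
import torch
import torch.nn as nn

# Cache compiled artifacts across runs (on by default in newer releases;
# the env-var spelling here is an assumption to verify).
os.environ.setdefault("TORCHINDUCTOR_FX_GRAPH_CACHE", "1")

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))
compiled = torch.compile(model)

for batch in (8, 16, 32):
    x = torch.randn(batch, 1024)
    # Tell Dynamo that dim 0 (the batch) is dynamic, so the first compile
    # produces a shape-generic kernel instead of recompiling per batch size.
    torch._dynamo.mark_dynamic(x, 0)
    out = compiled(x)
    print(batch, out.shape)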


Mojonized🔥 reposted

You're a simple person who is so shy and doesn't know anything about these sorts of things :-)


Mojonized🔥 reposted

It did occur to me that they're starting to understand why Mojo is actually needed...


Mojonized🔥 reposted

Did you tell Jeremy that Mojo runs on all the NV and AMD consumer GPUs and is starting to work on Apple GPUs too? :-)


Mojonized🔥 reposted

It was wonderful to spend the day with you in Austin today Kyle. Very excited about the collaboration and our path ahead 🦾!


Mojonized🔥 reposted

One more thing to be skeptical about. It's only a matter of time before that DSL dies too. Mojo 🔥 ftw. Stay tuned for part 4: modular.com/blog/matrix-mu…


Mojonized🔥 reposted

Good day today @TensorWaveCloud!

clattner_llvm's tweet image.

Mojonized🔥 reposted

ssshh... 🤫 @AMD Mi355X... now available in nightlies.

Modular's tweet image.

Mojonized🔥 reposted

Triton is nice if you want to get something onto a GPU but don't need full performance/TCO. However, if you want peak perf or other HW, then Mojo🔥 could be a better fit. I'm glad OpenAI folk are acknowledging this publicly, but I wrote about it here: modular.com/blog/democrati…

TIL, RIP Triton, killed by inability to have good Blackwell performance



Mojonized🔥 reposted

I’d say Hell froze over, but that might just be because I’m old enough to remember when Mike Hara, VP of Investor Relations at NVIDIA, got in trouble for saying (in 2002) that NVIDIA would be bigger than Intel. wired.com/2002/07/nvidia/

wired.com link card: Nvidia. Meet Nvidia CEO Jen-Hsun Huang, the man who plans to make the CPU obsolete.

Huge deal between $NVDA and $INTC. NVIDIA and Intel announced a multi-generation collaboration across PC and datacenter, and NVIDIA will invest $5B in Intel at $23.28 per share. The joint solution will be a tight coupling of Intel x86 CPUs and NVIDIA RTX GPUs over NVLink for PCs…

PatrickMoorhead's tweet image.

