
𝔈𝔥𝔰𝔞𝔫

@ehsanmok

Mojo🔥 maximalist @Modular. Teacher at heart. MLE. Used to know some Math. Into powerlifting. Opinions are mine.

𝔈𝔥𝔰𝔞𝔫 reposted

pytorch consulting the math library heuristics which say transposing the matrix will yield a 30% speedup over the non-transposed case, deciding to transpose the matrix, and immediately running out of memory


𝔈𝔥𝔰𝔞𝔫 reposted

GPT-5 Pro found a counterexample to the NICD-with-erasures majority optimality (Simons list, p.25). simons.berkeley.edu/sites/default/… At p=0.4, n=5, f(x) = sign(x_1-3x_2+x_3-x_4+3x_5) gives E|f(x)|=0.43024 vs best majority 0.42904.

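The two numbers in that tweet can be checked by brute force. A minimal sketch, assuming the quantity in question is E_y|E[f(x) | y]|, where y reveals each coordinate of x independently with probability p (and erases it with probability 1−p) — the convention that reproduces the quoted figures; check the Simons list for the exact statement of the problem:

```python
from itertools import product

def e_abs_g(weights, p):
    """E_y |E[f(x) | y]| for f(x) = sign(sum_i w_i * x_i), where y reveals
    each coordinate of x with probability p and erases it otherwise.
    Brute force over all 2^n erasure patterns and 2^n inputs (fine for n=5).
    Ties (s == 0) never occur here since the weights sum to an odd number."""
    n = len(weights)
    total = 0.0
    for keep in product([False, True], repeat=n):
        prob = 1.0
        for k in keep:
            prob *= p if k else 1 - p
        shown = [i for i in range(n) if keep[i]]
        hidden = [i for i in range(n) if not keep[i]]
        # For each assignment of the revealed bits, average f over hidden bits.
        for z in product([-1, 1], repeat=len(shown)):
            base = sum(weights[i] * v for i, v in zip(shown, z))
            g = 0.0
            for u in product([-1, 1], repeat=len(hidden)):
                s = base + sum(weights[i] * v for i, v in zip(hidden, u))
                g += 1.0 if s > 0 else -1.0
            g /= 2 ** len(hidden)
            total += prob * abs(g) / 2 ** len(shown)
    return total

print(round(e_abs_g([1, -3, 1, -1, 3], 0.4), 5))  # 0.43024 (weighted sign)
print(round(e_abs_g([1, 1, 1, 1, 1], 0.4), 5))    # 0.42904 (Maj_5)
```

Under this convention the weighted function does beat 5-bit majority at p = 0.4, matching the claimed counterexample.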

𝔈𝔥𝔰𝔞𝔫 reposted

I encourage you to read this article, in which we describe the current situation and the directions in which, in our view, mathematics is heading. Many thanks to Ken Ono for including me in this extraordinary project. I look forward to a wide-ranging discussion and will be…


𝔈𝔥𝔰𝔞𝔫 reposted

Makes sense. Mojo gives you the full power of the hardware; it doesn't "abstract" it away like some other systems, so it is perfect for doing this sort of work. It provides helper libraries that you can optionally use to make some things (incl. tiling etc.) more declarative, and…


Hoping to see a light in one of the darkest corners of math, i.e. the ABC conjecture/dispute, at some point in my lifetime!

Great @SimonsFdn talk by Kevin Buzzard on math's future; he notes mathematical ideas have grown so complex that traditional writing methods struggle to cope. He demonstrates how combining LLMs with proof assistants like Lean could eliminate hallucinations—the LLM proposes ideas…



AI as a multiplier has created a lot of fake talent. Take it away, and they collapse. Real talent might just become less productive!


𝔈𝔥𝔰𝔞𝔫 reposted

Mathematics. The beauty of September 27, 2025. By Jerome White, @talljerome. Used with permission.


𝔈𝔥𝔰𝔞𝔫 reposted

AI workloads need to run across a growing number of hardware architectures, in the datacenter & at the edge. This ‘industrialization of inference’ demands making every FLOP count through increased resilience & lower costs. USIT believes @Modular is best positioned to become…

We are very thankful to all our new and existing investors who backed us. This round was led by Thomas Tull's @usitfund, with @dfjgrowth & all existing investors @gvteam, @generalcatalyst, and @greylockvc investing. Read more ⬇️ modular.com/blog/modular-r…



Let's gooooooo Modular 🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀🚀

We raised $250M to accelerate building AI's unified compute layer! 🔥 We’re now powering trillions of tokens, making AI workloads 4x faster 🚀 and 2.5x cheaper ⬇️ for our customers, and welcomed 10K’s of new developers 👩🏼‍💻. We're excited for the future!




𝔈𝔥𝔰𝔞𝔫 reposted

We beat Nvidia’s cuBLAS kernels on B200s in 170 LOC. Using zero CUDA. Just pure Mojo. Here’s exactly how we went from 1% to 106% of Nvidia benchmark perf from scratch (with code) 👇🧵


We're going to devour llama.cpp, mlx, etc. one step at a time!

We're thrilled to announce Modular Platform 25.6 🚀 Our biggest step yet toward a unified compute layer for AI. • Peak performance on NVIDIA Blackwell & AMD MI355X • Mojo on Apple, AMD & NVIDIA consumer GPUs modular.com/blog/modular-2…



𝔈𝔥𝔰𝔞𝔫 reposted

We know that one of the biggest barriers to programming GPUs is access to hardware: "Code you’ve written for NVIDIA or AMD GPUs should now mostly just work on an Apple🍎 Silicon GPU, assuming no device-specific features were being used." Preview here:👇 forum.modular.com/t/apple-silico…


Also, the code for parts 1-4 is here: github.com/modular/modula… Will add it to the blog posts too.

Part 4 of "Matrix Multiplication on Blackwell" is here! It continues our epic journey of describing how Modular implemented the fastest B200 matmul in the industry, revealing the techniques to achieve 1772 TFLOPs, exceeding that of the current SOTA. modular.com/blog/matrix-mu…





torch.compile is PT's Achilles' heel!

Compiling large #PyTorch models at Meta could take an hour+. Engineers cut PT2 compile time by 80% with parallel Triton compilation, dynamic shape marking, autotuning config pruning, and cache improvements now integrated into the stack. 🔗 hubs.la/Q03J-6P20



𝔈𝔥𝔰𝔞𝔫 reposted

Rich guy open source passion projects like Ghostty and Omarchy are the internet equivalent of building a library or school


𝔈𝔥𝔰𝔞𝔫 reposted

Triton is nice if you want to get something onto a GPU but don't need full performance/TCO. However, if you want peak perf or other HW, then Mojo🔥 could be a better fit. I'm glad OpenAI folk are acknowledging this publicly, but I wrote about it here: modular.com/blog/democrati…

TIL, RIP Triton, killed by inability to have good Blackwell performance



𝔈𝔥𝔰𝔞𝔫 reposted

ssshh... 🤫 @AMD MI355X... now available in nightlies.

