Manoj Rao

@manojrajarao

prev: AI infra @perplexity_ai, @tesla_ai, @aws_ai

Manoj Rao reposted

Training massive Mixture-of-Experts (MoE) models like DeepSeek-V3 and Llama 4-Scout efficiently is one of the challenges in modern AI. These models push GPUs, networks, and compilers to their limits. To tackle this, AMD and Meta’s PyTorch teams joined forces to tune TorchTitan…

I enjoyed this one.

Out now @elonmusk



They (finally) shipped the obvious feature. Might bring order to our mess, though parts still feel oddly manual.

Manoj Rao reposted

BREAKING🚨: NASA will reveal the latest images of Interstellar visitor, 3I/ATLAS, on Nov. 19.

Manoj Rao reposted

I heard Gemini 3 answers questions before you ask them. And that it can talk to your cat.


🔥🔥

AI has been built on one vendor’s stack for too long. AMD’s GPUs now offer state-of-the-art peak compute and memory bandwidth — but the lack of mature software / the “CUDA moat” keeps that power locked away. Time to break it and ride into our multi-silicon future. 🌊 It's been a…


Manoj Rao reposted

AI has been built on one vendor’s stack for too long. AMD’s GPUs now offer state-of-the-art peak compute and memory bandwidth — but the lack of mature software / the “CUDA moat” keeps that power locked away. Time to break it and ride into our multi-silicon future. 🌊 It's been a…

This was a fun event. gpumode.com/v2/news
Thanks @marksaroufim @caseyaylward for organizing
@danielhanchen @cHHillee for feedback
PR soon!

A week ago I went to my first @gpu_mode hackathon, and, together with @manojrajarao, @Ameen_ml and Emily Shen, placed fourth with HelionEvolve, an OpenEvolve-based autotuner for (Helion) GPU kernels.


Mandatory for free-tier, opt-in for plus++ and Stargate would be paid for.

I want ads in chat gpt so badly. Please tell me what to buy my wife, where to take my mother on vacation, what to think, what to wear, what to read.



Manoj Rao reposted

We open-sourced QeRL: Quantization-enhanced Reinforcement Learning!
🧠 4-bit quantized RL training
💪 Train a 32B LLM on a single H100 GPU
⚙️ 1.7× faster overall training
🎯 Accuracy on par with bfloat16
🔥 Supports NVFP4 quantization format
Moreover, we show…


Just UX: why does @windsurf feel way faster than @cursor_ai or vscode + copilot (all running Sonnet 4.5)? As an ex-DoomEmacs user, I’d been missing that snappiness in AI IDEs.


Today, I overheard the legendary @jeremyphoward dissuading another man from trying to change Physics with this whole AI thing. (thankfully, he sounded convinced!) Also, I realized I blew the chance for an epic selfie with Jeremy @iScienceLuvr & @johnowhitaker :(


The "...thethethethe..." explanation of process-based provisioning was eye-opening. Many other great ones in this. 👏 @dwarkesh_sp @karpathy

The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self…



An absolute Rockstar!

Recharging with a Javelin in the Swiss Alps. 🔋🎯



w00t!!

Today we are launching InferenceMAX! We have support from Nvidia, AMD, OpenAI, Microsoft, PyTorch, SGLang, vLLM, Oracle, CoreWeave, TogetherAI, Nebius, Crusoe, HPE, SuperMicro, Dell
It runs every day on the latest software (vLLM, SGLang, etc) across hundreds of GPUs, $10Ms of…



Coming Soon...



Hitting play on this soon...
All you Mech interp VCs holding the bag hear me out: it's a massive psyop @NeelNanda5

