I like organizing matrices

r0

@romitjain_

I love @askalphaxiv, but please fix this. 🙏🙏 All of my highlights and notes went away. I am ready to pay for a subscription.

lol

"ah, it's 19 bits then, right"
"well not the storage, that's still 32 bit"
"oh, so then the accumulation is done in 19 bits then, right"
"no, the accumulation is still done in full fp32"
"oh, so then there's basically no precision loss from fp32 then"
"well, also no"



Only works for DDP, but amazing speedups. Good for inference setups, not for training workloads (yet)

🚨 Introducing FlashPack: Lightning-fast model loading package for PyTorch! ⚡ 3-6x faster model loading than current methods 📦 Convert existing checkpoints in one command 🔧 Works on any system Read our blogpost for more details!👇️ blog.fal.ai/introducing-fl…



every morning, I wake up, and get ready to face my battle.. bangalore traffic


r0 reposted

Title: Advice for a young investigator in the first and last days of the Anthropocene Abstract: Within just a few years, it is likely that we will create AI systems that outperform the best humans on all intellectual tasks. This will have implications for your research and…

r0 reposted

FlashInfer redefines how attention kernels, kv-cache layouts and dynamic runtimes are compiled and scheduled for efficient LLM serving. Check out my latest blog "Dissecting FlashInfer - A Systems Perspective on High-Performance LLM Inference".

r0 reposted

Missed our latest vLLM office hours? We covered hybrid models as first-class citizens in @vllm_project. ✅ Hybrid model support in v1 ✅ Mamba, Mamba2, linear attention ✅ Performance from v0 → v1 ▶️ Recording: youtube.com/live/uWQ489ONv… 📑 Slides: docs.google.com/presentation/d…


Michael Jordan comes close, but Novak is truly the GOAT. His mental toughness is beyond imagination.

Messi never suffered war
LeBron wasn’t treated unfairly.
Phelps didn’t grow up in poverty
Brady can’t talk 12 languages
Woods never had 2 Goat rivals
Sachin didn’t face this much hate.
Bolt didn’t come from a war torn country.
But Novak Djokovic faced it all and became The GOAT

r0 reposted

ok final post for today
did y'all know there is this golden blogpost on github in markdown format, buried with zero visibility because it's not github pages or readme
but it's so good?

If you procrastinate, try scheduling your procrastination time earlier in the day so it doesn’t interfere with your productivity.

If you have anxiety, try scheduling your worry time earlier in the day so it doesn’t interfere with your sleep.

r0 reposted

announcing the @GPU_MODE x @scaleml summer speaker series happening next week, a 5⃣-day series where top researchers will teach about the algorithmic and systems-level advances that underpin `gpt-oss`! all content will be live-streamed & recorded for FREE on GPU MODE's YouTube!

too real

Everything is computer

Sometimes a couple of bad weeks can do wonders. Motivates you to be better and you come back stronger.


What an amazing course. I did a few lectures on optimization and kernels. The lectures are good for high-level understanding; for low-level detail, the assignments are worth it.

Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team @tatsu_hashimoto @marcelroed @neilbband @rckpudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:



These two statements, "The hottest new programming language is English" by Karpathy and "gpus go brrrr" by Horace He, perfectly sum up the current LLM era.


r0 reposted

everybody wants to do fun experiments
nobody wants to write core infrastructure code


Have been tinkering with @vllm_project since its release. It's a beautiful library. I hope it remains the same. One of my earlier long-form articles was around understanding vLLM's behaviour - cmeraki.github.io/throughput-is-…

the "Design" pages of vLLM are actually incredible. found just the other day and bingeread them all over the weekend
how delightful to know there are still AI researchers doing Real Computer Science
custom hashing, careful memory management... they even use linked lists...

This is the year I switch to Android for good. I have been with iOS for 16 years. Can’t fathom this anymore.

every day we stray farther off steve’s light.. #ios26
