
Saurabh Shah

@saurabh_shah2

training olmos @allen_ai prev @Apple @Penn 🎤dabbler of things🎸 🐈‍⬛enjoyer of cats 🐈 and mountains🏔️ he/him

Finbarr is the goat and it’s a little crazy that stale KV cache is just not a problem. Is there a better explanation than “each step doesn’t rly move the policy that much” (due to, idk, the KL term?)

thanks to the excellent work from the @vllm_project team, it was easy to implement! it's egregious that PipelineRL was rejected from NeurIPS. When I describe how inflight updates work to many people, they insist it's broken and can't work. it is quite novel.
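
(For anyone confused by "inflight updates": here's a minimal numpy toy of the idea — nothing like vLLM's or PipelineRL's actual code, all names made up. The trainer pushes new weights mid-generation; the generator keeps its KV cache as-is, so earlier cache entries were computed under a slightly stale policy.)

```python
import numpy as np

# Toy sketch of inflight weight updates with a stale KV cache.
# Illustration only -- NOT the vLLM / PipelineRL implementation.
rng = np.random.default_rng(0)
D = 8  # hidden size

# single-head attention "policy" weights that RL training keeps updating
Wq, Wk, Wv = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(3))

k_cache: list[np.ndarray] = []
v_cache: list[np.ndarray] = []

def decode_step(x, Wq, Wk, Wv):
    """Cache this token's K/V under the *current* weights, then attend
    over the whole cache -- including entries from older weights."""
    k_cache.append(x @ Wk)
    v_cache.append(x @ Wv)
    q = x @ Wq
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = K @ q / np.sqrt(D)
    att = np.exp(scores - scores.max())
    att /= att.sum()
    return att @ V

for step in range(16):
    x = rng.normal(size=D)          # stand-in for the next token's embedding
    _ = decode_step(x, Wq, Wk, Wv)
    if step == 7:
        # trainer pushes a small (e.g. KL-constrained) update mid-generation;
        # the KV cache is NOT recomputed, so tokens 0..7 stay "stale"
        Wq = Wq + 0.01 * rng.normal(size=(D, D))
        Wk = Wk + 0.01 * rng.normal(size=(D, D))
        Wv = Wv + 0.01 * rng.normal(size=(D, D))
```

The 0.01 step size is the hand-wavy stand-in for "each step doesn't rly move the policy that much": the smaller the update, the less the pre-update cache entries disagree with the fresh ones.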



you could do this or just do whatever kimi team did ig

idk what the fuss is, scaling RL to 1T+ params is simple, all you need is:
1. a few thousand gpus
2. the og goat of opensource RL @vwxyzjn
3. the guy who invented the attention mechanism @DBahdanau



Yayyy come intern with us we’re cool and you can publish and we got some p good amt of GPUs and p good snacks too

why intern at Ai2?

🐟interns own major parts of our model development, sometimes even leading whole projects
🐡we're committed to open science & actively help our interns publish their work

reach out if u wanna build open language models together 🤝

links👇



Said something cool to hamish


Sarah is one of the og interp goats, and is also rly good at naming papers. Would be so fun to be advised by her -> go apply dummies!!

I am recruiting 2 PhD students to work on LM interpretability at UMD @umdcs starting in fall 2026!

We are #3 in AI and #4 in NLP research on @CSrankings.
Come join us in our lovely building just a few miles from Washington, D.C. Details in 🧵



Ohhhhh I can just pay for getting impressions to get money


Pls engage w my tweeets so I can get money Pls Ty




Ohhhh so that’s why everyone gets kinda annoying on here once they get the check mark


Fuck it. Blue check mark




how it feels spending 30 mins going through some data after having ~ full intellectual freedom to do and learn whatever I wanted for 8 months


LMs maybe have totally destroyed the way we do bottom-up learning (i.e. school), and top-down learning is easier and more accessible than ever

This is perfect timing for me, but I do worry about, like, everyone who's in school

"Abandon curricula" isn't the right solution here


There’s two ways I use coding models:

1) agent: fire off a claude, switch repos, and repeat. Lots of context switches, needs mental agility

2) flow state: deep work on 1 thing. Needs a rly fast model used sparingly

Excited to see if composer can raise the ceiling for (2)!

Our benchmark comparisons of the model.



> try reasonable thing
> doesn't push evals up
> model has good vibes though
> but I lose 10 points on LiveCodeBench
> surely there's a mistake in our impl of LCB
> who can I blame for this
> oh it's me. it's always me


average day in the Ai2 slack


Demis and DeepMind reaching new levels of based.

Cure cancer
Solve fusion
Hire goodside

I’m with it. DeepMind for sure has the Mandate of Heaven rn

Welcome Riley!! Great to have you on the team!



Saurabh Shah reposted

OpenAI: ships a browser
Anthropic: ships a blogpost
DeepMind: solves Navier-Stokes
Meta: ...fuck it, let's do a layoff


Entire timeline for Dwarkesh <> Karpathy interview


“What the fuck does this mean” - @baneepbanana

karpathy speaks like someone who’s running a mental compiler in real time with minimal interpretive latency & almost zero runtime garbage. he’s not verbose. he just threads complexity into compressed lossless statements. most smart people can be dense, but they lose clarity.…


