Saurabh Shah
@saurabh_shah2
training olmos @allen_ai prev @Apple @Penn 🎤dabbler of things🎸 🐈⬛enjoyer of cats 🐈 and mountains🏔️he/him
Finbarr is the goat and it’s a little crazy that stale KV cache is just not a problem. Is there a better explanation than “each step doesn’t rly move the policy that much” (due to idk KL term?)
thanks to the excellent work from the @vllm_project team, it was easy to implement! it's egregious that PipelineRL was rejected from NeurIPS. When I describe how inflight updates works to many people, they insist it's broken and can't work. it is quite novel.
you could do this or just do whatever kimi team did ig
idk what the fuss is, scaling RL to 1T+ params is simple
all you need is:
1. a few thousand gpus
2. the og goat of opensource RL @vwxyzjn
3. the guy who invented the attention mechanism @DBahdanau
Yayyy come intern with us we’re cool and you can publish and we got some p good amt of GPUs and p good snacks too
why intern at Ai2?
🐟interns own major parts of our model development, sometimes even leading whole projects
🐡we're committed to open science & actively help our interns publish their work
reach out if u wanna build open language models together 🤝
links👇
Sarah is one of the og interp goats, and is also rly good at naming papers. Would be so fun to be advised by her -> go apply dummies!!
I am recruiting 2 PhD students to work on LM interpretability at UMD @umdcs starting in fall 2026! We are #3 in AI and #4 in NLP research on @CSrankings. Come join us in our lovely building just a few miles from Washington, D.C. Details in 🧵
Ohhhhh I can just pay for getting impressions to get money
Ohhhh so that’s why everyone gets kinda annoying on here once they get the check mark
how it feels spending 30 mins going through some data after having ~ full intellectual freedom to do and learn whatever I wanted for 8 months
LM's maybe have totally destroyed the way we do bottom-up learning (i.e. school) and top-down learning is easier and more accessible than ever
This is perfect timing for me but I do worry about like, everyone who's in school
"Abandon curricula" isn't the right solution here
There’s two ways I use coding models:
1) agent: fire off a claude, switch repos, and repeat. Lots of context switches, needs mental agility
2) flow state: deep work on 1 thing. Needs a rly fast model used sparingly
Excited to see if composer can raise the ceiling for (2)!
> try reasonable thing
> doesn't push evals up
> model has good vibes though
> but I lose 10 points on LiveCodeBench
> surely there's a mistake in our impl of LCB
> who can I blame for this
> oh it's me. it's always me
Demis and Deepmind reaching new levels of based.
Cure cancer
Solve fusion
Hire goodside
I’m with it. Deepmind for sure has the Mandate of Heaven rn
Welcome Riley!! Great to have you on the team!
OpenAI: ships a browser
Anthropic: ships a blogpost
Deepmind: solves Navier Stokes
Meta: ...fuck it, let's do a layoff
“What the fuck does this mean” - @baneepbanana
karpathy speaks like someone who’s running a mental compiler in real time with minimal interpretive latency & almost zero runtime garbage. he’s not verbose. he just threads complexity into compressed lossless statements. most smart people can be dense, but they lose clarity.…