
Saurabh Shah

@saurabh_shah2

training olmos @allen_ai prev @Apple @Penn 🎤dabbler of things🎸 🐈‍⬛enjoyer of cats 🐈 and mountains🏔️ he/him

Finbarr is the goat and it’s a little crazy that stale KV cache is just not a problem. Is there a better explanation than “each step doesn’t rly move the policy that much” (due to, idk, the KL term?)

thanks to the excellent work from the @vllm_project team, it was easy to implement! it's egregious that PipelineRL was rejected from NeurIPS. When I describe how inflight updates work to many people, they insist it's broken and can't work. it is quite novel.
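
(For anyone confused by "inflight updates": here's a minimal numpy toy of the idea — nothing like vLLM's or PipelineRL's actual code, all names made up. The trainer pushes new weights mid-generation; the generator keeps its KV cache as-is, so earlier cache entries were computed under a slightly stale policy.)

```python
import numpy as np

# Toy sketch of inflight weight updates with a stale KV cache.
# Illustration only -- NOT the vLLM / PipelineRL implementation.
rng = np.random.default_rng(0)
D = 8  # hidden size

# single-head attention "policy" weights that RL training keeps updating
Wq, Wk, Wv = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(3))

k_cache: list[np.ndarray] = []
v_cache: list[np.ndarray] = []

def decode_step(x, Wq, Wk, Wv):
    """Cache this token's K/V under the *current* weights, then attend
    over the whole cache -- including entries from older weights."""
    k_cache.append(x @ Wk)
    v_cache.append(x @ Wv)
    q = x @ Wq
    K, V = np.stack(k_cache), np.stack(v_cache)
    scores = K @ q / np.sqrt(D)
    att = np.exp(scores - scores.max())
    att /= att.sum()
    return att @ V

for step in range(16):
    x = rng.normal(size=D)          # stand-in for the next token's embedding
    _ = decode_step(x, Wq, Wk, Wv)
    if step == 7:
        # trainer pushes a small (e.g. KL-constrained) update mid-generation;
        # the KV cache is NOT recomputed, so tokens 0..7 stay "stale"
        Wq = Wq + 0.01 * rng.normal(size=(D, D))
        Wk = Wk + 0.01 * rng.normal(size=(D, D))
        Wv = Wv + 0.01 * rng.normal(size=(D, D))
```

The 0.01 step size is the hand-wavy stand-in for "each step doesn't rly move the policy that much": the smaller the update, the less the pre-update cache entries disagree with the fresh ones.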



you could do this or just do whatever kimi team did ig

idk what the fuss is, scaling RL to 1T+ params is simple, all you need is:
1. a few thousand gpus
2. the og goat of opensource RL @vwxyzjn
3. the guy who invented the attention mechanism @DBahdanau



Yayyy come intern with us we’re cool and you can publish and we got some p good amt of GPUs and p good snacks too

why intern at Ai2?

🐟interns own major parts of our model development, sometimes even leading whole projects
🐡we're committed to open science & actively help our interns publish their work

reach out if u wanna build open language models together 🤝

links👇



Said something cool to hamish


Sarah is one of the og interp goats, and is also rly good at naming papers. Would be so fun to be advised by her -> go apply dummies!!

I am recruiting 2 PhD students to work on LM interpretability at UMD @umdcs starting in fall 2026!

We are #3 in AI and #4 in NLP research on @CSrankings.
Come join us in our lovely building just a few miles from Washington, D.C. Details in 🧵



Ohhhhh I can just pay for getting impressions to get money


Pls engage w my tweeets so I can get money Pls Ty




Ohhhh so that’s why everyone gets kinda annoying on here once they get the check mark


Fuck it. Blue check mark




how it feels spending 30 mins going through some data after having ~ full intellectual freedom to do and learn whatever I wanted for 8 months


LMs maybe have totally destroyed the way we do bottom-up learning (i.e. school), and top-down learning is easier and more accessible than ever

This is perfect timing for me, but I do worry about, like, everyone who's in school

"Abandon curricula" isn't the right solution here


There’s two ways I use coding models:

1) agent: fire off a claude, switch repos, and repeat. Lots of context switches, needs mental agility

2) flow state: deep work on 1 thing. Needs a rly fast model used sparingly

Excited to see if composer can raise the ceiling for (2)!

Our benchmark comparisons of the model.



> try reasonable thing
> doesn't push evals up
> model has good vibes though
> but I lose 10 points on LiveCodeBench
> surely there's a mistake in our impl of LCB
> who can I blame for this
> oh it's me. it's always me


average day in the Ai2 slack


Demis and DeepMind reaching new levels of based.

Cure cancer
Solve fusion
Hire goodside

I’m with it. DeepMind for sure has the Mandate of Heaven rn

Welcome Riley!! Great to have you on the team!



Saurabh Shah reposted

OpenAI: ships a browser
Anthropic: ships a blogpost
DeepMind: solves Navier-Stokes
Meta: ...fuck it, let's do a layoff


Entire timeline for Dwarkesh <> Karpathy interview


“What the fuck does this mean” - @baneepbanana

karpathy speaks like someone who’s running a mental compiler in real time with minimal interpretive latency & almost zero runtime garbage. he’s not verbose. he just threads complexity into compressed lossless statements. most smart people can be dense, but they lose clarity.…


