
Saurabh Shah
@saurabh_shah2
training olmos @allen_ai prev @Apple @Penn 🎤 dabbler of things 🎸 🐈‍⬛ enjoyer of cats 🐈 and mountains 🏔️ he/him
yo has anyone heard of this Olmo model, loss looks good
why does sonnet 4.5 make like different READMEs for every script it writes lmao this sucks

Ohhhh it's called periodic cuz like the periodic table cuz they're doing chemistry and stuff. That's cool
today you will be presented 2 visions of humanity's future with AI. if you don't want to build the infinite AI tiktok slop machine, but want to develop AI that accelerates fundamental science, raising civilization to Kardashev 1 and beyond, come join us at @periodiclabs
wayyy cooler than Sora lmao. If you have the privilege to be picky, you should work on things like this, not the infinite slop machine. Scale simulation, scale learning from experience, and solve our hardest problems by training systems that can think in unhuman-like ways
Today, @ekindogus and I are excited to introduce @periodiclabs. Our goal is to create an AI scientist. Science works by conjecturing how the world might be, running experiments, and learning from the results. Intelligence is necessary, but not sufficient. New knowledge is…

It's simple I think: Sonnet 4.5 for most stuff spanning easy to pretty challenging tasks. Especially good for quick scripts. GPT-5-codex high for the most challenging issues that I don't mind waiting a while for. Very surgical! No other model rly matters for coding rn IMO
First Bob now Kevin 🙄
SakanaAI presents Robust Agentic CUDA Kernel Optimization
• Fuses ops, boosts forward/backward passes, outperforms torch baselines
• Agentic LLM pipeline: PyTorch → CUDA → evolutionary runtime optimization
• Soft-verification: LLMs flag incorrect kernels (↑30% verification…
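For flavor, a loose sketch of what that loop shape could look like. Everything here is a hypothetical stand-in (the `llm` and `benchmark` callables especially), not SakanaAI's actual pipeline:

```python
import random

def optimize_kernel(pytorch_op: str, llm, benchmark, generations=10, pop=8):
    """Hypothetical sketch: evolve CUDA kernels with an LLM in the loop."""
    # 1. PyTorch -> CUDA: ask the LLM for an initial population of kernels.
    population = [llm(f"Write a CUDA kernel for:\n{pytorch_op}") for _ in range(pop)]
    best = None
    for _ in range(generations):
        scored = []
        for kernel in population:
            # 2. Soft-verification: the LLM flags likely-incorrect kernels
            # before we pay for compilation and benchmarking.
            verdict = llm(f"Is this CUDA kernel correct? Answer yes or no:\n{kernel}")
            if "yes" not in verdict.lower():
                continue
            scored.append((benchmark(kernel), kernel))  # lower runtime is better
        if not scored:
            break
        scored.sort(key=lambda t: t[0])
        best = scored[0][1]
        # 3. Evolutionary runtime optimization: mutate the fastest survivors.
        survivors = [k for _, k in scored[: max(1, pop // 2)]]
        population = survivors + [
            llm(f"Make this CUDA kernel faster:\n{random.choice(survivors)}")
            for _ in range(pop - len(survivors))
        ]
    return best
```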

Figured it out nw

If you visit me in Seattle like @aryaman2020 and @michaelryan207 I will show you what life’s all about

Or at least take you to the Ai2 office where the snacks are pretty good

I’ll be in nyc next week! Would love to grab a coffee or a drink with folks. I’m interested in language models, especially how they relate to:
- reinforcement learning
- code gen and agents
- accelerating science, especially biology and protein design
Happy to say Ai2 is now on the frontier of blueberry size. DeepSeek moment for big blueberries

Holy shit they’re doing on-policy RL by just deploying the model to prod lmao that’s so baller. also 2 hrs for a training step makes our 10 minute steps feel lightning fast @hamishivi … they probably have a bigger batch size though 😅

We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.
RL env is a thing that runs LM generation and produces a score (we call this the reward)

eval is a thing that runs LM generation and produces a score

good abstraction by @willccbb to unify these in verifiers
wait "environments" are just evals? did i misread something...? i thought there would be various app mockups, website clones, games, etc. to help simulate things that folks are looking to automate. (unless this is some meta point about evals == envs?)
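The unification is easy to picture as code. A minimal sketch of the idea (hypothetical interface, not the actual verifiers API): an env and an eval are the same shape, something that runs generation and returns a score:

```python
from typing import Callable

def rollout(generate: Callable[[str], str], prompt: str, gold: str) -> float:
    # Runs LM generation and produces a score. Call the score "reward"
    # and this is an RL env; average it over a dataset and it's an eval.
    completion = generate(prompt)
    return 1.0 if gold in completion else 0.0

# As an eval: mean score over a benchmark.
# As an RL env: the same scalar is the reward for the policy update.
```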
go birds
like being part of the 2024 Eagles - total championship mentality. Excited to be investing more in @cognition and joining the team with @ScottWu46 and @russelljkaplan - Amazing to see the power law at work
Me when I’ve written a singleton class called PlasticBottle and I’ve already created an instance of it
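(For the uninitiated, a minimal sketch of the pattern; PlasticBottle is just the joke's own name. The point of a singleton is that only one instance ever exists, which is exactly the wrong property for a plastic bottle.)

```python
class PlasticBottle:
    _instance = None

    def __new__(cls):
        # Singleton: every "new" bottle is the same old bottle.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

assert PlasticBottle() is PlasticBottle()  # single-use, reused forever
```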

Yeah. A “bitter lesson” I’ve been coming around to is that products used by 1 billion people are not just helpful, but maybe necessary to push the frontier of what models can do…
The bitter lesson here is you can't experiment with continual learning, unless you have continual interactions, and A/B testing at scale
If you’re interp-pilled you should also be olmo-pilled FYI
GitHub losing to both hugging face and cursor needs to be studied

Why does GitHub LFS suck so much! Mostly genuine question: what is hard about this (or: why is hf/xet impressive?)
rubric-based rewards coming soon to an olmo near you 🫡
for the first time i am aware of, there is an entirely private subfield of AI research

every company that actually trains models is doing RL with rubrics and LLM-judged rewards, but academic work is stuck on RL with automated rewards (math problems and code). much cleaner for…
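To make the distinction concrete, a hedged sketch of the two reward styles (the rubric text and `llm_judge` are hypothetical stand-ins, not any lab's actual recipe):

```python
def automated_reward(completion: str, gold_answer: str) -> float:
    # "Automated rewards": an exact, programmatically verifiable check,
    # the math-problems-and-code setting academic work is stuck on.
    return 1.0 if completion.strip() == gold_answer.strip() else 0.0

RUBRIC = """Score the response from 0 to 10:
- factually accurate (0-4)
- directly answers the question (0-3)
- clear and well organized (0-3)
Return only the number."""

def rubric_reward(prompt: str, completion: str, llm_judge) -> float:
    # "RL with rubrics": a judge LM grades the completion against a rubric
    # and the normalized grade becomes the scalar reward.
    grade = llm_judge(f"{RUBRIC}\n\nQuestion: {prompt}\nResponse: {completion}")
    return float(grade) / 10.0
```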