Sasha Rush

@srush_nlp

Researcher at Cursor https://www.youtube.com/@srush_nlp

New York, NY

rush-nlp.com

Joined December 2015

8K Posts · 74K Followers · 498 Following

Sasha Rush reposted

Kat @ NeurIPS

@baneepbanana

Nov 27

Hello 24 followers I will be at neurips!! Although I will be but a drop of water in the ocean of attendees I hope to see you there :-)

Sasha Rush reposted

Hardik

@TheHardikVala

Nov 25

Got addicted to @srush_nlp 's Tensor Puzzles, so I wrote a sequel with more puzzles: github.com/hardik-vala/Te…. Example:

Sasha Rush reposted

@kylelostat

SO lucky to have Alex intern with us through Olmo 3 development & see his massive contributions to our pretrain data 🐟Alex's created WebOrganizer (ICML 2025) which moved us beyond "quality? ✅❌" towards "what type of document?" We use WebOrganizer in Olmo 3 to partition both…

Alex Wettig

@_awettig

Nov 20

Olmo 3 has some neat pre-training data curation: - @MayeeChen found much better ways to mix WebOrganizer domains - We use quality signals not as a filter (0/1) but for setting # epochs per sample (0-7x), but any duplicates would distort this ➡️ Run global dedup across 39B docs 🤯

Sasha Rush reposted

Torsten Scholak

@tscholak

Nov 20

🚀 Introducing Apriel-H1: a family of seven 15B hybrid model (Transformer + Mamba) distilled directly from Apriel-Nemotron-15B-Thinker reasoner. ✅ Navigating throughput performance tradeoff with up to 3.4x speedup ✅ 2x speedup without performance loss ✅ Efficient distillation…

Sasha Rush reposted

Ai2

@allen_ai

Nov 20

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵

Sasha Rush reposted

Sarah Catanzaro

@sarahcat21

Nov 18

I'll be among dozens (hundreds?) of VCs attending NeurIPS this year, but among the few who might be more interested in topics like managing episodic memory with RL, avoiding model collapse when training with synthetic data, and more effectively using base models to guide…

Sasha Rush reposted

Ben Lang

@benln

Nov 18

Who would be in for Cursor Conf?

Sasha Rush reposted

Siva Reddy

@sivareddyg

Nov 17

Jacob Andreas (@jacobandreas) on "the specification problem" Can we build interactive systems for task specification? LM as an interviewer about the task Use the interview transcript as the task prompt This outperforms or is competitive to active learning or user-designed…

Siva Reddy

@sivareddyg

Nov 17

Checkout the IVADO workshop on Deploying Autonomous Agents: Lessons, Risks and Real-World Impact happening today until Wednesday in Montreal with an exciting line up of speakers #Agents #LLMs ivado.ca/en/events/2nd-…

2nd Workshop: Deploying Autonomous Agents: Lessons, Risks, and Real-World Impact | IVADO

Source: ivado.ca

Sasha Rush reposted

eric zakariasson

@ericzakariasson

Nov 16

how we trained composer-1 by @srush_nlp youtube.com/watch?v=md8D8e…

Sasha Rush reposted

sankalp

@dejavucoder

Nov 15

some points from the talk - for the agent RL, the RL rollouts try to mimic how cursor works in production at scale including cursor as environment - try to keep training/inference similar so they use same tool call formats in prod infra architecture - trainer server (pytorch…

Sasha Rush

@srush_nlp

Nov 9

Talk at Ray Summit on "Building Cursor Composer." Overview of the work from our research team. youtube.com/watch?v=md8D8e…

Ray Summit 2025 Keynote: Building Cursor Composer with Sasha Rush

Source: youtube.com

Sasha Rush reposted

Federico Cassano

@ellev3n11

Nov 14

Interesting to hear this six-month-old podcast where we discuss ideas that later evolved into what's now Online Tab RL and Composer.

Cursor

@cursor_ai

May 29

A conversation on the optimal reward for coding agents, infinite context models, and real-time RL

Sasha Rush

@srush_nlp

Nov 14

This paper is really cool! Big fan of this style of interpretability, nice to see it scaled up a bit.

Leo Gao

@nabla_theta

Nov 13

Excited to share our latest work on untangling language models by training them with extremely sparse weights! We can isolate tiny circuits inside the model responsible for various simple behaviors and understand them unprecedentedly well. openai.com/index/understa…

Understanding neural networks through sparse circuits

Source: openai.com

Sasha Rush reposted

Leo Gao

@nabla_theta

Nov 13

Understanding neural networks through sparse circuits

Source: openai.com

Sasha Rush reposted

Siva Reddy

@sivareddyg

Nov 14

Honored to receive the Computer Science Canada Outstanding Early Career Researcher award 🏅. It is a recognition of the work carried out by my students for their courage to push fundamental ideas in natural language processing even in the era of LLMs. Thanks to my mentors and…

Mila - Institut québécois d'IA

@Mila_Quebec

Nov 14

Congratulations to Siva Reddy (@sivareddyg), Core Academic Member at Mila, who has received the prestigious Outstanding Early Career Computer Science Researcher Award from @CSCan_InfoCan , the leading organization for the computer science community in Canada.…

Sasha Rush reposted

Conference on Language Modeling

@COLM_conf

Nov 11

COLM is going to San Francisco for 2026! 🗓️Dates: October 6-9, 2026 🏨Venue: Hilton San Francisco Union Square Website and CFPs for papers and workshops coming up soon!
