Sasha Rush

@srush_nlp

Researcher at Cursor https://www.youtube.com/@srush_nlp

New York, NY

rush-nlp.com

Joined December 2015

8K posts, 73K followers, 498 following

You might like

@SchmidhuberAI

@soumithchintala

@chrmanning

@sleepinyourhat

@seb_ruder

@Thom_Wolf

@percyliang

@kchonyc

@arankomatsuzaki

@lilianweng

@ericjang11

@gneubig

@ai2_allennlp

@_rockt

@svlevine

Sasha Rush reposted

Hardik

@TheHardikVala

Nov 25

Got addicted to @srush_nlp 's Tensor Puzzles, so I wrote a sequel with more puzzles: github.com/hardik-vala/Te…. Example:

Sasha Rush reposted

@kylelostat

SO lucky to have Alex intern with us through Olmo 3 development & see his massive contributions to our pretrain data

🐟Alex's created WebOrganizer (ICML 2025) which moved us beyond "quality? ✅❌" towards "what type of document?" We use WebOrganizer in Olmo 3 to partition both…

Alex Wettig

@_awettig

Nov 20

Olmo 3 has some neat pre-training data curation:
- @MayeeChen found much better ways to mix WebOrganizer domains
- We use quality signals not as a filter (0/1) but for setting # epochs per sample (0-7x), but any duplicates would distort this ➡️ Run global dedup across 39B docs 🤯

Sasha Rush reposted

Torsten Scholak

@tscholak

Nov 20

🚀 Introducing Apriel-H1: a family of seven 15B hybrid model (Transformer + Mamba) distilled directly from Apriel-Nemotron-15B-Thinker reasoner.

✅ Navigating throughput performance tradeoff with up to 3.4x speedup
✅ 2x speedup without performance loss
✅ Efficient distillation…

Sasha Rush reposted

Ai2

@allen_ai

Nov 20

Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵

Sasha Rush reposted

Sarah Catanzaro

@sarahcat21

Nov 18

I'll be among dozens (hundreds?) of VCs attending NeurIPS this year, but among the few who might be more interested in topics like managing episodic memory with RL, avoiding model collapse when training with synthetic data, and more effectively using base models to guide…

Sasha Rush reposted

Ben Lang

@benln

Nov 18

Who would be in for Cursor Conf?

Sasha Rush reposted

Siva Reddy

@sivareddyg

Nov 17

Jacob Andreas (@jacobandreas) on "the specification problem"

Can we build interactive systems for task specification?

LM as an interviewer about the task
Use the interview transcript as the task prompt
This outperforms or is competitive to active learning or user-designed…

Siva Reddy

@sivareddyg

Nov 17

Checkout the IVADO workshop on Deploying Autonomous Agents: Lessons, Risks and Real-World Impact happening today until Wednesday in Montreal with an exciting line up of speakers #Agents #LLMs ivado.ca/en/events/2nd-…

2nd Workshop: Deploying Autonomous Agents: Lessons, Risks, and Real-World Impact | IVADO

Source: ivado.ca

Sasha Rush reposted

eric zakariasson

@ericzakariasson

Nov 16

how we trained composer-1 by @srush_nlp youtube.com/watch?v=md8D8e…

Sasha Rush reposted

sankalp

@dejavucoder

Nov 15

some points from the talk
- for the agent RL, the RL rollouts try to mimic how cursor works in production at scale including cursor as environment
- try to keep training/inference similar so they use same tool call formats in prod

infra architecture
- trainer server (pytorch…

Sasha Rush

@srush_nlp

Nov 9

Talk at Ray Summit on "Building Cursor Composer." Overview of the work from our research team. youtube.com/watch?v=md8D8e…

Ray Summit 2025 Keynote: Building Cursor Composer with Sasha Rush

Source: youtube.com

Sasha Rush reposted

Federico Cassano

@ellev3n11

Nov 14

Interesting to hear this six-month-old podcast where we discuss ideas that later evolved into what's now Online Tab RL and Composer.

Cursor

@cursor_ai

May 29

A conversation on the optimal reward for coding agents, infinite context models, and real-time RL

Sasha Rush

@srush_nlp

Nov 14

This paper is really cool! Big fan of this style of interpretability, nice to see it scaled up a bit.

Leo Gao

@nabla_theta

Nov 13

Excited to share our latest work on untangling language models by training them with extremely sparse weights! We can isolate tiny circuits inside the model responsible for various simple behaviors and understand them unprecedentedly well. openai.com/index/understa…

We trained models to think in simpler, more traceable steps, so we can better understand how they work.

Understanding neural networks through sparse circuits

Source: openai.com

Sasha Rush reposted

Leo Gao

@nabla_theta

Nov 13

Understanding neural networks through sparse circuits

Source: openai.com

Sasha Rush reposted

Siva Reddy

@sivareddyg

Nov 14

Honored to receive the Computer Science Canada Outstanding Early Career Researcher award 🏅. It is a recognition of the work carried out by my students for their courage to push fundamental ideas in natural language processing even in the era of LLMs. Thanks to my mentors and…

Mila - Institut québécois d'IA

@Mila_Quebec

Nov 14

Congratulations to Siva Reddy (@sivareddyg), Core Academic Member at Mila, who has received the prestigious Outstanding Early Career Computer Science Researcher Award from @CSCan_InfoCan , the leading organization for the computer science community in Canada.…

Sasha Rush reposted

Conference on Language Modeling

@COLM_conf

Nov 11

COLM is going to San Francisco for 2026!

🗓️Dates: October 6-9, 2026
🏨Venue: Hilton San Francisco Union Square

Website and CFPs for papers and workshops coming up soon!

Sasha Rush reposted

Conference on Language Modeling

@COLM_conf

Nov 10

Sasha Rush reposted

will brown

@willccbb

Nov 10

i have mostly stopped using coding models other than composer-1 and tab

droyd

@n_dof_droyd

Nov 9

I think cursor might just have the mandate of heaven now. this composer 1 model is incredible and its been getting better (vibes). I think raw iq is no longer the bottleneck. its just reliability of tool use and harnessing
