
Sarath Chandar
@apsarathchandar
Associate Professor @polymtl and @Mila_Quebec; Canada CIFAR AI Chair; Machine Learning Researcher. Pro-bono office hours: https://t.co/tK69DKRf9N?amp=1

Thanks for sharing, @TheTuringPost! We propose Markovian Thinking as a new paradigm, and Delethink as a simple, concrete instantiation enabling constant-memory, linear-compute reasoning that keeps improving beyond training limits.
Markovian Thinking by @Mila_Quebec & @Microsoft lets LLMs reason with a fixed-size state – compute stays the same no matter how long the reasoning chain gets. This makes RL linear-cost and memory-constant. The team’s Delethink RL setup trains models to be Markovian Thinkers,…

Nice paper! Make the context for reasoning local and train the model with RL under such truncation. This way the model "Markovifies" and makes use of its context efficiently!
Introducing linear scaling of reasoning: The Markovian Thinker. Reformulate RL so thinking scales with O(n) compute, not O(n^2), and O(1) memory, architecture-agnostic. Train R1-1.5B into a Markovian thinker with a 96K thought budget, ~2× accuracy 🧵
It’s clear next-gen reasoning LLMs will run for millions of tokens. RL at 1M tokens needs ~100× the compute of RL at 128K. Our Markovian Thinking keeps compute scaling linear instead. Check out Milad’s thread; some of my perspectives below:
Long reasoning without the quadratic tax: The Markovian Thinker makes LLMs reason in chunks with a bounded state → linear compute, constant memory and it keeps scaling beyond the training limit. 1/6
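For intuition, here is a minimal sketch of what chunked, bounded-state reasoning could look like in code. It is an illustration only, not the Delethink implementation: the chunk and carryover sizes are made-up values, and generate_chunk is a hypothetical helper that continues generation from a prompt.

```python
# Minimal sketch of chunked, bounded-state ("Markovian") reasoning.
# `generate_chunk(prompt, max_tokens)` is a hypothetical helper that continues
# generation from `prompt`; the sizes below are assumptions, not the paper's settings.

CHUNK_TOKENS = 8_000     # fixed thinking budget per chunk (assumed value)
CARRYOVER_CHARS = 2_000  # crude character-based stand-in for a small token budget
MAX_CHUNKS = 12          # total thinking budget = MAX_CHUNKS chunks

def markovian_think(question: str) -> str:
    carryover = ""  # the bounded state: only the tail of the last chunk survives
    for _ in range(MAX_CHUNKS):
        # Prompt size stays O(1): question + bounded carryover, never the full trace.
        prompt = f"{question}\n\nProgress so far:\n{carryover}\n\nContinue reasoning:"
        chunk = generate_chunk(prompt, max_tokens=CHUNK_TOKENS)
        if "FINAL ANSWER:" in chunk:            # model signals it is done
            return chunk.split("FINAL ANSWER:")[-1].strip()
        carryover = chunk[-CARRYOVER_CHARS:]    # delete the rest of the chunk
    return carryover  # fall back to the last state if no final answer appeared
```

Because each chunk only ever sees the question plus a fixed-size carryover, per-chunk attention cost is constant, so total compute grows linearly with the number of chunks instead of quadratically with the full trace length.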
I moved out of my home in 2008 and I still call my mom every day! If you read this and you don't call your parents every day, call them NOW! The love parents have for their kids is priceless! I realized it only after I became a father myself!
Story time: After I moved out of my home in 2014 to live in Japan, initially I used to call my parents daily. But as I became more "internationalized" I started calling my parents less and less. I used to tell my mom "I have nothing to talk about so we don't need to call daily".…
I’ve been working with Mathieu for more than 1.5 years and I can vouch that he’s an excellent researcher and a great mentor. I’m also attending @COLM_conf this time, and would love to chat with anyone about LLM post-training, reinforcement learning, and AI for Science!
If you are attending @COLM_conf and looking to hire a research scientist, I highly recommend you talk to my postdoc, Mathieu Reymond, who is on the job market and at the conference! Mathieu is an expert in multi-objective RL, multi-agent RL, RL for scientific discovery, and RL for…

✨ What if we could tune frontier LLM agents without touching any weights? Meet JEF-Hinter, an agent capable of analyzing multiple offline trajectories to extract auditable and timely hints 💡 In our new paper 📄, we show significant performance gains on downstream tasks ⚡ with…
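My reading of the idea, as a very rough sketch (the prompt format and the llm / run_agent helpers are assumptions for illustration, not the paper's actual method):

```python
# Sketch: distill offline trajectories into textual hints, then inject them into
# the agent's prompt; no weights are updated. `llm(prompt)` and
# `run_agent(task, system_prompt)` are hypothetical helpers, not the paper's API.

def extract_hints(trajectories: list[str], max_hints: int = 5) -> list[str]:
    prompt = (
        f"Below are logged agent trajectories (actions, observations, outcomes).\n"
        f"Write at most {max_hints} short, general hints that would have improved "
        f"the outcomes, one per line.\n\n" + "\n\n---\n\n".join(trajectories)
    )
    return [line.strip("- ").strip() for line in llm(prompt).splitlines() if line.strip()]

def run_with_hints(task: str, hints: list[str]) -> str:
    system_prompt = "Follow these hints gathered from past runs:\n" + "\n".join(
        f"- {h}" for h in hints
    )
    return run_agent(task, system_prompt=system_prompt)  # weights stay untouched
```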

I am not at @COLM_conf, but most of my students are! If you are attending CoLM and are interested in PhD and postdoc positions @ChandarLab, please talk to my students and postdocs who are at the conference this week! We have 2 open postdoc positions and multiple PhD positions!…
Looking forward to talking about explainable AI at @Mila_Quebec’s first Community of Practice event! Registrations open now!
We are pleased to invite you to the very first gathering of the Mila Community of Practice on October 23. Prof. @apsarathchandar, Maryam Molamohammadi from Mila’s AI Studios, and an industry expert will share their insights and discuss the challenges surrounding explainable AI.…

I am attending @COLM_conf to present our paper: “Do Biased Models Have Biased Thoughts?”: arxiv.org/pdf/2508.06671 Let me know if you want to chat! You can find me at the @amazon booth on Tuesday, Oct 7, 1:30-2:00 pm; or at poster #42 in room 710 on Wednesday, Oct 8, 4:30-6:30 pm
At @ChandarLab, we are happy to announce the third edition of our assistance program to provide feedback for members of communities underrepresented in AI who want to apply to high-profile graduate programs. Want feedback? Details: chandar-lab.github.io/grad_app/. Deadline: Nov 01! cc:…

"LSTM is like a ResNet 90 degree rotated" by @ilyasut But LSTM pre-date ResNets by 25 years. Residual connections since 1990!!! Who invented deep residual learning? Paper by @SchmidhuberAI : P: arxiv.org/abs/2509.24732
3. LSTM was the network used in the "Sequence to Sequence Learning with Neural Networks" study; however, its disadvantage is that it is a more complex system than Transformers. As Ilya Sutskever said, it's like a ResNet (Residual Network) rotated 90 degrees.
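For readers unfamiliar with the analogy, the additive updates are what make it work. These are the standard textbook forms, not equations taken from the linked paper:

```latex
% LSTM cell state: an additive update through time
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
% ResNet block: an additive update through depth
x_{l+1} = x_l + F(x_l)
```

With the forget gate $f_t$ close to 1, the cell state is carried forward plus a learned increment; that is the residual pattern applied across time rather than depth, hence "rotated 90 degrees".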

ELLIS ML4Molecules workshop 2025 **Call for papers** is out: moleculediscovery.github.io/workshop2025/

I think many AI safety folks do real science, but it is an important trap to be aware of. No matter if it's AI safety, climate change, or public health, when scientists become activists, the result is not better policies but less trust in science. See nytimes.com/2025/05/02/opi…
When models start to become genuinely misaligned, it's important that we raise the alarm. However, most people working on this are trying to make scary demos for policymakers, not do real science. Interpretability researchers should fill this gap, and my team is exploring this.
Check out this cool notebook from our lab! 🚀 Perfect for anyone curious about playing with generative models in chemistry!
We just made NovoMolGen easy to play with: Transformers-native checkpoints on the Hub and small notebooks that let you load, sample, and fine-tune in minutes. A few lines of code load the model, plug in a reward, run a short RL fine-tune, and plot the curve.
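A hedged sketch of that flow using standard Hugging Face transformers calls; the Hub checkpoint id below is a placeholder and the reward is a dummy, so consult the lab's actual notebooks for the real names and the RL fine-tuning loop.

```python
# Rough sketch of the load / sample / score flow with standard transformers calls.
# The checkpoint id and reward below are placeholders, not the project's real ones.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "chandar-lab/NovoMolGen"  # hypothetical Hub id; check the lab's actual repo
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)

# Sample a batch of candidate molecules (SMILES strings).
seed = tok.bos_token or "C"      # seed the generation; the real notebooks may differ
inputs = tok(seed, return_tensors="pt")
out = model.generate(**inputs, do_sample=True, max_new_tokens=64, num_return_sequences=8)
smiles = [tok.decode(o, skip_special_tokens=True) for o in out]

# Plug in a reward: score each sample, then feed the scores to a short RL
# fine-tune (e.g., a policy-gradient step); only the scoring is shown here.
def reward_fn(s: str) -> float:  # dummy reward; replace with a real property scorer
    return len(set(s)) / max(len(s), 1)

rewards = torch.tensor([reward_fn(s) for s in smiles])
print(list(zip(smiles, rewards.tolist())))
```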

Exciting new release from our lab 💪 NovoMolGen just dropped notebooks + ready models for quick RL fine-tuning. Check it out!
Check out the code for NovoMolGen!