Chinmaya Andukuri
@chinmaya_mohan
applied research @CapitalOne, previously @StanfordAILab / @StanfordHAI
One of my takeaways from #COLM2025 was that people are thinking a lot about user simulation (have been thinking about this myself in the context of tutoring!) Really exciting to see this work on the topic 🤩
Simulating user–AI conversations helps us understand how LMs work in multi-turn settings. Prompting LMs like GPT-4o to simulate users is common, but their assistant nature makes it hard to replicate user behavior. We introduce User LMs - trained to be users, not assistants.
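for context, a rough sketch of the prompting setup this work contrasts against: flip the conversation roles and ask a chat model to play the user. the model name and system prompt wording here are my own illustration, not from the paper:

```python
# Minimal sketch: prompting a chat model to role-play the user side of a
# conversation. Prompt wording and model choice are illustrative only.
from openai import OpenAI

client = OpenAI()

USER_SIM_PROMPT = (
    "You are simulating a human user talking to an AI assistant. "
    "Stay in character: be brief, sometimes vague, and never act like "
    "an assistant yourself."
)

def simulate_user_turn(history: list[dict]) -> str:
    """Generate the next *user* utterance given the conversation so far.

    `history` holds the dialogue with roles flipped: the assistant's
    messages become 'user' inputs to the simulator, and the simulated
    user's past utterances become 'assistant' outputs.
    """
    flipped = [{"role": "assistant" if m["role"] == "user" else "user",
                "content": m["content"]} for m in history]
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": USER_SIM_PROMPT}] + flipped,
    )
    return resp.choices[0].message.content
```

the role-flip is the standard trick, and it's exactly where assistant-trained models leak assistant behavior back into the "user" turns — the gap User LMs are trained to close.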
have been enjoying dipping my toes into `verifiers` and the @PrimeIntellect Environments Hub - just pushed an eval environment for MultiChallenge (@scale_AI) to the Environments Hub. my env: app.primeintellect.ai/dashboard/envi… main page: scale.com/leaderboard/mu…
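a minimal sketch of what running a hub environment with `verifiers` looks like, written from memory of the library's README — treat the exact names (`load_environment`, `evaluate`) and the env id as assumptions:

```python
# Hypothetical sketch of loading and running an Environments Hub eval
# with `verifiers`; function names and the env id are assumptions.
import verifiers as vf
from openai import OpenAI

# Load the environment by its hub id (id here is illustrative).
env = vf.load_environment("multichallenge")

# Run the eval against any OpenAI-compatible endpoint.
client = OpenAI()
results = env.evaluate(client, model="gpt-4o-mini", num_examples=50)
print(results)
```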
if you’re at @COLM_conf, come say hi tomorrow and talk to us about LM self-improvement + clarification!
Presenting this tomorrow at @COLM_conf! Poster 36 (11:00 AM-1:00 PM). We’ll have a demo—come along if you want to try our models and talk about multi-turn dialogue!
Constitutional AI showed LMs can learn to follow constitutions by labeling their own outputs. But why can't we just tell a base model the principles of desired behavior and rely on it to act appropriately? Introducing SAMI: Self-Supervised Alignment with Mutual Information!
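as I read it, the objective is InfoNCE-style: a response should be most likely under the constitution that actually produced it. a toy sketch — the log-prob matrix would come from summed token log-probs of the model being tuned, a detail I'm filling in as illustration:

```python
# Sketch of a contrastive mutual-information objective: push
# log p(response_i | constitution_i) above log p(response_i | constitution_j)
# for mismatched pairs. How the log-probs are computed is assumed here.
import torch
import torch.nn.functional as F

def sami_loss(logp: torch.Tensor) -> torch.Tensor:
    """logp[i, j] = log p(response_i | constitution_j) under the model.

    InfoNCE-style lower bound on I(constitution; response): each response
    should be most likely under the constitution that produced it.
    """
    targets = torch.arange(logp.size(0), device=logp.device)
    row = F.cross_entropy(logp, targets)      # pick the right constitution per response
    col = F.cross_entropy(logp.t(), targets)  # pick the right response per constitution
    return 0.5 * (row + col)

# Toy usage: batch of 4 constitution/response pairs.
logp = torch.randn(4, 4)
print(sami_loss(logp))
```

no labels, no reward model — the self-supervision comes entirely from the model's own likelihoods over matched vs. mismatched pairs.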
Excited to share OffTheRails: A moral reasoning benchmark beyond trolley problems! We present a simple prompting pipeline for generating moral reasoning evaluations with language models using causal templates 🔵→🟠
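a hypothetical sketch of what a causal template looks like — a dilemma schema with slots an LM fills in, so every generated item shares the same causal structure. field names and wording are my own illustration, not the paper's:

```python
# Illustrative causal template for generating moral-reasoning items:
# fixed causal structure (action -> goal, action -> side effect),
# variable surface content.
from dataclasses import dataclass

@dataclass
class CausalTemplate:
    agent: str        # who acts
    action: str       # what they do
    side_effect: str  # foreseen harm caused by the action
    goal: str         # the good the action is aimed at

    def render(self) -> str:
        return (
            f"{self.agent} can {self.action} to achieve {self.goal}, "
            f"but doing so will {self.side_effect}. "
            "Should they do it?"
        )

item = CausalTemplate(
    agent="a hospital administrator",
    action="reallocate the last ventilator",
    side_effect="leave another patient untreated",
    goal="saving a patient with better odds",
)
print(item.render())
```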
Language models struggle to search, not due to an architecture problem, but a data one! They rarely see how to search or backtrack. We show how LLMs can be taught to search by representing the process of search in language as a flattened string, a stream of search (SoS)!
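a toy version of the idea: run a search procedure and serialize every step — including the failed branches — into one flat string an LM can be trained on. the trace format here is my own illustration:

```python
# Toy "stream of search": DFS whose full trace, backtracking included,
# becomes a single training string.
def dfs_stream(state, goal, neighbors, trace=None, seen=None):
    trace = [] if trace is None else trace
    seen = set() if seen is None else seen
    trace.append(f"visit {state}")
    if state == goal:
        trace.append("goal!")
        return True, trace
    seen.add(state)
    for nxt in neighbors(state):
        if nxt not in seen:
            ok, trace = dfs_stream(nxt, goal, neighbors, trace, seen)
            if ok:
                return True, trace
    trace.append(f"backtrack from {state}")  # failure is part of the data
    return False, trace

# Toy graph: search for 5 starting at 0.
graph = {0: [1, 2], 1: [3], 2: [4, 5], 3: [], 4: [], 5: []}
_, trace = dfs_stream(0, 5, lambda s: graph.get(s, []))
print(" -> ".join(trace))  # the flattened string an LM would be trained on
```

the point is that dead ends and backtracking appear in the training string, so the model sees *how* to search, not just polished solutions.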
Multi-turn interactive RL should be a bigger focus. Current methods are not well-suited for it - e.g. PPO generally can't train with a user in the loop, and offline Q-learning still doesn't work at scale. It's exciting to see more work in this direction.
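to make the gap concrete, here's the shape of a user-in-the-loop rollout — reward only arrives after several policy/user exchanges, which single-turn RLHF pipelines don't cover. `policy`, `user_sim`, and `reward_fn` are hypothetical stand-ins:

```python
# Sketch of a multi-turn rollout with a simulated user in the loop.
# All three callables are hypothetical placeholders.
def multi_turn_rollout(policy, user_sim, reward_fn, max_turns=5):
    history = [user_sim([])]              # user opens the conversation
    for _ in range(max_turns):
        assistant_msg = policy(history)   # action = a whole message
        history.append(assistant_msg)
        user_msg = user_sim(history)      # environment step
        history.append(user_msg)
        if user_msg.strip() == "<end>":   # user decides when to stop
            break
    # One trajectory-level reward; credit assignment across turns is
    # exactly what current methods handle poorly.
    return history, reward_fn(history)
```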
When prompting language models to complete a task, users often leave important things unsaid. Can language models teach themselves to ask clarifying questions? In STaR-GATE, we explore LMs' ability to self-improve by rewarding the model for generating useful questions!
New work where language models learn to ask questions? So they can better understand user needs? With an amazing method name? Oh, yes!
really enjoyed working on STaR-GATE! thanks to the team + many others for helpful discussions 🚀 check out the arXiv here: arxiv.org/abs/2403.19154
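a compressed sketch of the STaR-GATE loop as I understand it from the abstract: sample a clarifying question, let a simulated user answer from hidden preferences, and keep the trajectory for fine-tuning only if the question measurably helped. every helper here (`questioner`, `simulated_user`, `logprob_of_gold`) is hypothetical:

```python
# Illustrative STaR-style bootstrapping step for clarifying questions.
# All helpers are hypothetical stand-ins, not the paper's API.
def star_gate_step(task, hidden_prefs, questioner, simulated_user,
                   logprob_of_gold):
    # Baseline: respond without asking anything.
    base_score = logprob_of_gold(task, context="")

    # Candidate: ask a clarifying question first.
    question = questioner(task)
    answer = simulated_user(question, hidden_prefs)
    context = f"Q: {question}\nA: {answer}"
    score = logprob_of_gold(task, context=context)

    # Keep (task, question) for the next fine-tuning round only if the
    # elicited information raised the likelihood of the gold response.
    return (task, question) if score > base_score else None
```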