
Stanford NLP Group
@stanfordnlp
Computational Linguists—Natural Language—Machine Learning @chrmanning @jurafsky @percyliang @ChrisGPotts @tatsu_hashimoto @MonicaSLam @Diyi_Yang @StanfordAILab
Many inconsistencies in Wikipedia discovered with the help of LLMs!
Excited to share our EMNLP 2025 (Main) paper: "Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with LLMs." How consistent is English Wikipedia? With the help of LLMs, we estimate 80M+ internally inconsistent facts (~3.3%). Small in percentage, large at corpus scale.
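A quick back-of-the-envelope check of the scale claim (my arithmetic, not figures from the paper): if 80M+ inconsistent facts correspond to ~3.3% of the corpus, the implied total is roughly 2.4 billion facts.

```python
# Implied corpus size from the tweet's two numbers (rough estimate only)
inconsistent_facts = 80e6   # "80M+ internally inconsistent facts"
rate = 0.033                # "~3.3%"

total_facts = inconsistent_facts / rate  # roughly 2.4 billion facts
print(f"Implied corpus size: {total_facts:.2e} facts")
```

Which is why "small in percentage" still means tens of millions of inconsistencies in absolute terms.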

I'm feeling serious FOMO over COLM this week 😭 BUT the upside is that I'll be giving a guest lecture on pluralistic alignment in @Diyi_Yang's human-centered LLMs class at Stanford today🌲! Please reach out if you're in the area and want details :) web.stanford.edu/class/cs329x/
Throwback Thursday! Weaviate Podcast #85 with Omar Khattab (@lateinteraction) and Connor Shorten (@CShorten30)! This podcast covers: • What is the state of AI? • DSPy • LLM Pipelines • Prompt Tuning and Optimization • Models for Specific Tasks • LLM Compiler • Colbert or…

Nicholas Carlini man. That guy knows how to give a talk.
I suspect biases against prompt optimization derive from the community elevating RL post-training to a mythical status. The truth is that RL post-training is hard, and never effective without outstanding prompts. Prompt optimizers are cheaper and more effective in most scenarios.
🔥Introducing #AgentFlow, a new trainable agentic system where a team of agents learns to plan and use tools in the flow of a task. 🌐agentflow.stanford.edu 📄huggingface.co/papers/2510.05… AgentFlow unlocks full potential of LLMs w/ tool-use. (And yes, our 3/7B model beats GPT-4o)👇…



“we find that interaction with sycophantic AI models significantly reduced participants’ willingness to take actions to repair interpersonal conflict, while increasing their conviction of being in the right.” Great work from @chengmyra1 and @stanfordnlp

We fought an uphill battle for 3 years. Glad to hear from OpenAI: "People are realizing that prompt optimization, which they thought 2 years ago would be dead, is further entrenched." "Really cool time in prompt optimizers, like GEPA." "To improve an entire agent over time."
ColBERT micro-models that “perform well with 250K parameters”. That’s 0.00025B parameters for the uninitiated 😂
✨ We're proud to release the ColBERT Nano series of models. All 3 of these models come in at less than 1 million parameters (250K, 450K, 950K)! Late interaction models perform shockingly well with small models. Collection: huggingface.co/collections/Ne… Model: huggingface.co/NeuML/colbert-…
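For anyone wondering how a sub-1M-parameter retriever can work at all: late interaction scores a query against a document over token embeddings, not a single pooled vector, so the model itself can stay tiny. A minimal sketch of ColBERT-style MaxSim scoring (toy random embeddings, not the NeuML models' actual API):

```python
import numpy as np

def maxsim_score(query_emb, doc_emb):
    """Late-interaction (MaxSim) relevance score, ColBERT-style:
    for each query token embedding, take the max similarity over all
    document token embeddings, then sum across query tokens."""
    sims = query_emb @ doc_emb.T      # (num_q_tokens, num_d_tokens)
    return sims.max(axis=1).sum()     # best doc-token match per query token

# Toy example with random unit-normalized token embeddings
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 64))      # 4 query tokens, dim 64
d = rng.standard_normal((30, 64))     # 30 document tokens, dim 64
q /= np.linalg.norm(q, axis=1, keepdims=True)
d /= np.linalg.norm(d, axis=1, keepdims=True)

score = maxsim_score(q, d)
```

The heavy lifting lives in the interaction between stored token embeddings rather than in model capacity, which is the intuition behind "shockingly well with small models".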

Congratulations to @chrmanning on his official induction into the National Academy of Engineering (@theNAEng) this past weekend for his work on the development and dissemination of natural language processing methods! #NAEMember

Hi everyone! This Thursday, we will host the second NLP Seminar of the year! For this week's seminar, we are excited to host Tianyu Gao (@gaotianyu1350) from OpenAI and UC San Diego (UCSD)! If you are interested in attending remotely, here is the Zoom link:…

1/ I will be presenting our paper titled “The Unlearning Mirage: A Dynamic Framework for Evaluating LLM Unlearning” on Tuesday, Oct 7 (poster session 2). This is work done with Jing Huang, @keerthi166, @NathalieBaraca1, @Diyi_Yang.
Fun fact, @Diyi_Yang has great taste in fine dining at conferences as well 🍱🧑🍳#COLM2025 #professorgossip
📅 Just 4 days until LM4Sci #COLM2025! 🤖🤝🔬 🔥 The countdown continues! Today's spotlight: Diyi Yang (Stanford) @Diyi_Yang, on a Human-Centered Perspective on Automating Research 🧵
🚨 New Top Open Model Update! A relative newcomer to the Arena, @zai_org's GLM-4.6 takes the clear, undisputed #1 spot for Top Open Model. 🏆 It also ranks #4 overall, which is not an easy feat! The next top open model, DeepSeek R1 0528, has been the standing champion for…

Introducing GLM-4.6: Advanced Agentic, Reasoning and Coding Capabilities As our new flagship model, GLM-4.6 brings significant advancements across real-world coding, long-context processing (up to 200K tokens), reasoning, search, writing, and agentic applications. API:…

Papers from @stanfordnlp at #COLM2025 @COLM_conf – looking forward to seeing people there! 👋 • Synthetic Data Generation and Multi-Step Reinforcement Learning for Reasoning and Tool Use openreview.net/forum?id=oN9ST… • Bayesian scaling laws for in-context learning…




In the latest Kempner Seminar Series talk, @tatsu_hashimoto of @stanfordnlp discusses synthetic data and algorithmic approaches to data efficiency. Watch the talk: youtu.be/pookfyF5Vu4 #KempnerInstitute #AI #scaling
YouTube: Back to the Future – Data Efficient Language Modeling with Tatsunori...
AI always calling your ideas “fantastic” can feel inauthentic, but what are sycophancy’s deeper harms? We find that in the common use case of seeking AI advice on interpersonal situations—specifically conflicts—sycophancy makes people feel more right & less willing to apologize.

🚨🚨New Paper: Training LLMs to Discover Abstractions for Solving Reasoning Problems Introducing RLAD, a two-player RL framework for LLMs to discover 'reasoning abstractions'—natural language hints that encode procedural knowledge for structured exploration in reasoning.🧵⬇️
