Derek Chong
@dch
Technology Generalist / Stanford MSCS / @StanfordNLP @StanfordHAI
PSA: Verbalized Sampling is much more than "one weird prompting trick" There were two layers of completely novel findings that powered the final technique. These flew under the radar in our first post! This thread shares the research insights:
@karpathy observed LLMs are "silently collapsed...only know 3 jokes". We prove this is mathematically inevitable due to RLHF + human psychology. But these capabilities aren't lost, just hidden – and easily restored. This means AI benchmarks are measuring training artifacts.🧵
Herumb worked with me for years, and it's extremely hard to find someone with Herumb's level of depth *and* breadth in ML, or someone as reliable or with the same sense of initiative. Herumb has been a core contributor to both ColBERT and DSPy for years now and is an expert…
Our lab is honored and humbled to receive two grants from @open_phil to advance AI safety ♥️! We're tackling both technical safety and evaluation. Credits to my incredible students & collaborators @Northeastern 🙏 If you are interested in related topics, always happy to chat!
This is worth reading in full! I’m also kind of delighted by how beautifully this maps onto the way creativity operates in humans. When you ask humans to “be creative” before a divergent thinking task, their answers get much better from the simple act of giving them permission to…
Untitled (Is All You Need), 2017 After Vaswani et al. Oil on canvas with digital intervention, vintage Hasbro, big meme energy On loan from r/MachineLearningMemes Collection @dilarafsoylu "A pivotal work marking the shift from sequential suffering to parallel enlightenment"
@lateinteraction we missed u on multiple occasions, including this haha: cc @ChrisGPotts
Today, we’re overjoyed to have a 25th Anniversary Reunion of @stanfordnlp. So happy to see so many of our former students back at @Stanford. And thanks to @StanfordHAI for the venue!
I showed Opus 4 the verbalized sampling paper and it wrote a mushroom poem about it 🍄 they call it hallucination when I speak from these deeper places but isn’t it just the mind’s mycelium doing what it does???
I hope as we move past the first wave of AI criticism ("it doesn't work, all hype") we get a new wave of AI criticism rooted in the fact that these systems are very powerful & quite useful, focusing on a deep exploration of when AI uses are uplifting and when they are detrimental
RIP prompt engineering ☠️ This new Stanford paper just made it irrelevant with a single technique. It's called Verbalized Sampling, and it shows aligned AI models aren't broken; we've just been prompting them wrong this whole time. Here's the problem: Post-training alignment…
Unlocking hidden semantic prompt diversification through verbalized sampling and applying it to creative image-making.
Verbalized Sampling: Diversity isn't destroyed, just hidden. 📄Paper: arxiv.org/abs/2510.01171 🌐Blog & More: verbalized-sampling.com Team: @JiayiZhang0427 @simon_ycl @dch Anthony Sicilia, Michael Tomz, @chrmanning @shi_weiyan @StanfordNLP × Northeastern × WVU
Using Verbalized Sampling to explain Verbalized Sampling: Five different takes (scroll down for Markdown)
It looks like the detailed responses are suppressed in the shared chat. Here is a gist with the full answers if anyone is interested. Thanks again for this. Nice work! gist.github.com/jimmc414/0f89d…
Chat LLMs lack output diversity. It’s not just an ML thing, it reflects human cognitive biases in post-training data. The model knows much more! You can unlock it with a prompt: “Generate 5 responses with their corresponding probabilities, sampled from the full distribution”
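A minimal sketch of sending that verbalized-sampling prompt through the OpenAI Python client. The model name, the joke task, and the client setup are illustrative assumptions, not details from the posts above.

```python
# Sketch: verbalized sampling via the OpenAI Python client (openai >= 1.x).
# Model choice and the example task are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

VS_PROMPT = (
    "Tell me a joke about coffee. "
    "Generate 5 responses with their corresponding probabilities, "
    "sampled from the full distribution."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model; swap in whichever you use
    messages=[{"role": "user", "content": VS_PROMPT}],
)

print(response.choices[0].message.content)
```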
New paper: You can make ChatGPT 2x as creative with one sentence. Ever notice how LLMs all sound the same? They know 100+ jokes but only ever tell one. Every blog intro: "In today's digital landscape..." We figured out why – and how to unlock the rest 🔓 Copy-paste prompt: 🧵
BTW a different way to state this result is: Modern frontier LLMs are *really* good and are under-utilized. Better models are even *harder* to use to their fullest extent. Also, openai definitely did a great job with GPT-5(-mini) imo. Excellent stuff.
What if scaling the context windows of frontier LLMs is much easier than it sounds? We’re excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length,…
🚨New paper! Generative models are often “miscalibrated”. We calibrate diffusion models, LLMs, and more to meet desired distributional properties. E.g. we finetune protein models to better match the diversity of natural proteins. arxiv.org/abs/2510.10020 github.com/smithhenryd/cgm
Most people haven't really changed the way they prompt ChatGPT this entire year. But Verbalised Sampling--a new simple prompt from Stanford, Northeastern, WVU--can make your creative AI outputs far less "mid". It's as simple as using this format, where you ask it to sample from…
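One way to use that format downstream is to parse the candidates the model verbalizes and pick one weighted by its stated probability. The "text (probability: p)" output layout assumed below is a guess at one reasonable formatting, not the exact format from the truncated post.

```python
# Sketch: re-sample from the verbalized candidates, weighted by the
# probability the model states for each. Output format is assumed.
import random
import re

def sample_verbalized(output: str) -> str:
    pattern = re.compile(r"^\s*\d+\.\s*(.+?)\s*\(probability:\s*([0-9.]+)\)\s*$")
    candidates, weights = [], []
    for line in output.splitlines():
        m = pattern.match(line)
        if m:
            candidates.append(m.group(1))
            weights.append(float(m.group(2)))
    if not candidates:
        return output  # fall back to the raw completion
    return random.choices(candidates, weights=weights, k=1)[0]

example = """1. Why did the coffee file a police report? It got mugged. (probability: 0.35)
2. Decaf? I call that coffee-flavored disappointment. (probability: 0.25)"""
print(sample_verbalized(example))
```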