
Joshua Batson
@thebasepoint
Trying to understand evolved systems (🖥 and 🧬). Interpretability research @anthropicai; formerly @czbiohub, @mit math.
3→5, 4→6, 9→11, 7→? LLMs solve this via In-Context Learning (ICL); but how is ICL represented and transmitted in LLMs? We build new tools identifying “extractor” and “aggregator” subspaces for ICL, and use them to understand ICL addition tasks like the one above. Come to…
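The addition task in the tweet can be sketched as a tiny pattern-inference check. The number pairs come from the tweet itself; the code is purely illustrative and is not from the referenced work:

```python
# In-context examples from the tweet: 3→5, 4→6, 9→11, query: 7→?
examples = [(3, 5), (4, 6), (9, 11)]

# Every pair differs by the same offset, so the implied rule is "add 2".
offsets = {y - x for x, y in examples}
assert offsets == {2}

def complete(query: int, offset: int = 2) -> int:
    """Apply the inferred add-2 rule to a new query input."""
    return query + offset

print(complete(7))  # → 9
```

An LLM given the three example pairs in its prompt infers this offset rule from context alone, with no weight updates, which is what makes the internal representation of the rule interesting to probe.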

This was so cool to be a part of. Jack led an incredible effort to quickly analyze the internals of a new model, as versions were coming in, to assess alignment. Research at the speed of model development.
Prior to the release of Claude Sonnet 4.5, we conducted a white-box audit of the model, applying interpretability techniques to “read the model’s mind” in order to validate its reliability and alignment. This was the first such audit on a frontier LLM, to our knowledge. (1/15)

We asked every version of Claude to make a clone of Claude.ai, including today’s Sonnet 4.5… see what happened in the video
We’re hiring someone to run the Anthropic Fellows Program! Our research collaborations have led to some of our best safety research and hires. We’re looking for an exceptional ops generalist, TPM, or research/eng manager to help us significantly scale and improve our collabs 🧵
Arc Institute trained their foundation model Evo 2 on DNA from all domains of life. What has it learned about the natural world? Our new research finds that it represents the tree of life, spanning thousands of species, as a curved manifold in its neuronal activations. (1/8)

Join Anthropic interpretability researchers @thebasepoint, @mlpowered, and @Jack_W_Lindsey as they discuss looking into the mind of an AI model - and why it matters:
We’re running another round of the Anthropic Fellows program. If you're an engineer or researcher with a strong coding or technical background, you can apply to receive funding, compute, and mentorship from Anthropic, beginning this October. There'll be around 32 places.

New research with coauthors at @Anthropic, @GoogleDeepMind, @AiEleuther, and @decode_research! We expand on and open-source Anthropic’s foundational circuit-tracing work. Brief highlights in thread: (1/7)