weakly typed
@weakly_typed
learning {ML, PL, maths} // CS pre-grad // DMs open :)
while this is an impressive demonstration of the capabilities of large language models to synthesise natural-language problem statements into formal / executable versions, we're still a long way off from 'true' system 2 mathematical reasoning (1/3)
Exciting: mechanistic interpretability now has a dedicated lecture in the syllabus of a Cambridge CS masters course! The field has come so far in the past few years ❤️
The slowly-unfolding premise of the Good Place is that everyone is damned. They are damned because they participate in the modern world; they buy from sweatshops, they eat chocolate, they fly in airplanes while the poorest people in the world see their harvests fail thanks to…
Take a break from arxiv/LW/AF. Sit in the woods with a random textbook and mull new ideas away from interp community lockstep. Diverge. Don’t compete with a saturated subtopic, maybe you’ll get to take weekends off. Premature overinvestment comes from monoculture.
So what should the community do? I'd guess we're over-invested in fundamental SAE research, but shouldn't abandon it completely. And SAEs remain a valuable tool, esp for exploration and debugging. I'm most keen on applied work, and on making targeted fixes for fundamental issues.
I've recently learned about Algebraic Positional Encoding from @bgavran3 and isn't this the coolest breakthrough in mathematical approaches to transformers in the last few years? arxiv.org/abs/2312.16045
LLMs are dramatically worse at ARC tasks the bigger the tasks get. Humans have no such issue - ARC task difficulty is independent of grid size. Most ARC tasks contain around 512-2048 pixels, and o3 is the first model capable of operating on these text grids reliably.
This is a really creative and well-executed paper on using "black-box interpretability" methods to understand and control model cognition. Especially impressed by the many applications explored. IMO this is an important direction; this paper sets the field on an excellent path!
The tragic suicide of Sewell Setzer III shows our generation has become unwitting test subjects in a vast, unregulated AI experiment. That's why we're launching @youthandai with our Generation AI Survey in @TIME. A thread: (1/10)
American teenagers believe addressing the potential risks of AI should be a top priority for lawmakers, according to a new poll time.com/7098524/teenag…
Announcing Transluce, a nonprofit research lab building open source, scalable technology for understanding AI systems and steering them in the public interest. Read a letter from the co-founders Jacob Steinhardt and Sarah Schwettmann: transluce.org/introducing-tr…
SHA-256: 218cebed21f2e8514df2ea1e4caca39750349cf30804995d5d577f08afc5855a
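A bare SHA-256 digest posted like this usually serves as a hash commitment: the author can later reveal the original message and anyone can check it hashes to the committed digest. A minimal sketch of that verification step (the revealed message here is hypothetical; the actual preimage is not public):

```python
import hashlib

def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as lowercase hex."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

# The digest committed to in the tweet above.
committed = "218cebed21f2e8514df2ea1e4caca39750349cf30804995d5d577f08afc5855a"

# Hypothetical reveal: hash the claimed message and compare digests.
revealed = "some prediction written in advance"
print(sha256_hex(revealed) == committed)
```

The commitment is binding because finding a second preimage for SHA-256 is computationally infeasible, and hiding because the digest reveals nothing about the message.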
in slight defense of mathiness / mathematical notation in ML research papers: a thread (twessay?)
in slight defense of mathiness: there’s a flavour of research that looks like “finding the right abstractions through which to think about things” — either to make it easier to build tools to manipulate the things, or to inspire researchers to import ideas from other fields
Who should I meet in Cambridge? (You?)
On Reddit's statistics forum, the most common question is "What test should I use?" My answer, from 2011, is "There is only one test" allendowney.blogspot.com/2011/05/there-…
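The linked post's thesis is that nearly every classical hypothesis test is an instance of one recipe: pick a test statistic, simulate its distribution under the null hypothesis, and report the fraction of simulated values at least as extreme as the observed one. A minimal sketch using a permutation test on a difference in means (the data is made up for illustration):

```python
import random

def permutation_test(group_a, group_b, n_iters=10_000, seed=0):
    """The 'one test': observed statistic vs. a null distribution
    simulated by shuffling group labels."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    count = 0
    for _ in range(n_iters):
        rng.shuffle(pooled)  # under the null, labels are exchangeable
        a, b = pooled[: len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            count += 1
    return count / n_iters  # p-value

# Hypothetical samples from two conditions.
print(permutation_test([2.1, 2.5, 2.3, 2.7], [3.0, 3.4, 3.1, 3.6]))
```

Swapping in a different statistic (median difference, correlation, chi-squared-style counts) changes the question being asked without changing the framework, which is the post's point.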
Mechanistic interpretability gives us rich explanations of models. But can we convert these explanations into formal proofs? Surprisingly, yes! Mech interp helps write short proofs of generalization bounds — and, shorter proofs provide more mechanistic understanding. 🧵
perhaps growing up is realising that 'growing up' was a comforting lie
maybe the most exciting interp result I’ve seen all year (if it ends up being true for interesting reasons): a meaningful step towards uncovering the type of the residual stream
Fundamentally, high-level concepts group into categorical variables (mammal, reptile, fish, bird) with a semantic hierarchy (a poodle is a dog is a mammal is an animal). How do LLMs internally represent this structure? arxiv.org/abs/2406.01506
fyi the real reason i've been ignoring you is:
- i want to reply
- i want to be able to give you the attention and focus you deserve
- i never feel like i have enough energy to properly do that
fuck, did i just cut off every single one of my autistic friends (all of my friends) who can't read jokes??
mechinterp people: does anyone have a good (formal?) definition of 'feature' that doesn't assume the linear representation hypothesis? like, if I have some points in high-dim space, what makes them "the composition of several features" as opposed to "some random points"
very interesting that every frontier lab interp team is working on sparse autoencoders (SAEs) and ~ no one in academia is
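For context on what those teams are building: a sparse autoencoder here learns an overcomplete dictionary over a model's internal activations, trading reconstruction error against an L1 sparsity penalty on the code. A minimal sketch of the standard architecture (dimensions, init scale, and the L1 coefficient are illustrative, not drawn from any particular paper):

```python
import numpy as np

class SparseAutoencoder:
    """Minimal ReLU SAE: encode activations into a wider non-negative
    code, decode back, and penalise the code's L1 norm."""

    def __init__(self, d_model=8, d_hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0.0, 0.1, (d_model, d_hidden))
        self.b_enc = np.zeros(d_hidden)
        self.W_dec = rng.normal(0.0, 0.1, (d_hidden, d_model))
        self.b_dec = np.zeros(d_model)

    def encode(self, x):
        # ReLU keeps codes non-negative and (with the L1 penalty) sparse.
        return np.maximum(0.0, x @ self.W_enc + self.b_enc)

    def loss(self, x, l1_coeff=1e-3):
        f = self.encode(x)
        recon = f @ self.W_dec + self.b_dec
        mse = np.mean((x - recon) ** 2)                 # reconstruction term
        sparsity = l1_coeff * np.abs(f).sum(-1).mean()  # sparsity term
        return mse + sparsity

# Hypothetical residual-stream activations for a batch of 4 tokens.
x = np.random.default_rng(1).normal(size=(4, 8))
sae = SparseAutoencoder()
print(sae.loss(x))
```

The hope is that the learned dictionary directions correspond to interpretable features; the debate referenced above is about how much fundamental work on this objective the field should fund versus applied uses of it.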