
weakly typed

@weakly_typed

learning {ML, PL, maths} // CS pre-grad // DMs open :)

Pinned

while this is an impressive demonstration of the capabilities of large language models to synthesise natural-language problem statements into formal / executable versions, we're still a long way off from 'true' system 2 mathematical reasoning (1/3)


weakly typed reposted

Exciting, mechanistic interpretability has a dedicated lecture in the syllabus of a Cambridge CS masters course! The field has come so far in the past few years ❤️

[@NeelNanda5's tweet image]

weakly typed reposted

The slowly-unfolding premise of the Good Place is that everyone is damned. They are damned because they participate in the modern world; they buy from sweatshops, they eat chocolate, they fly in airplanes while the poorest people in the world see their harvests fail thanks to…


weakly typed reposted

Take a break from arxiv/LW/AF. Sit in the woods with a random textbook and mull new ideas away from interp community lockstep. Diverge. Don’t compete with a saturated subtopic, maybe you’ll get to take weekends off. Premature overinvestment comes from monoculture.

[quoted tweet from @NeelNanda5]

So what should the community do?

I'd guess we're over-invested in fundamental SAE research, but shouldn't abandon it completely. And SAEs remain a valuable tool, esp for exploration and debugging.

I'm most keen on applied work, and making targeted fixes for fundamental issues.


weakly typed reposted

I've recently learned about Algebraic Positional Encoding from @bgavran3 and isn't this the coolest breakthrough in mathematical approaches to transformers in the last few years? arxiv.org/abs/2312.16045


weakly typed reposted

LLMs are dramatically worse at ARC tasks the bigger they get. However, humans have no such issues - ARC task difficulty is independent of size. Most ARC tasks contain around 512-2048 pixels, and o3 is the first model capable of operating on these text grids reliably.

[@mikb0b's tweet image]

weakly typed reposted

This is a really creative and well-executed paper on using "black-box interpretability" methods to understand and control model cognition. Especially impressed by the many applications explored. IMO this is an important direction; this paper sets the field on an excellent path!

LLMs have behaviors, beliefs, and reasoning hidden in their activations. What if we could decode them into natural language? We introduce LatentQA: a new way to interact with the inner workings of AI systems. 🧵

[@aypan_17's tweet image]


weakly typed reposted
[@voooooogel's tweet image]

weakly typed reposted

The tragic suicide of Sewell Setzer III shows our generation has become unwitting test subjects in a vast, unregulated AI experiment. That's why we're launching @youthandai with our Generation AI Survey in @TIME. A thread: (1/10)

American teenagers believe addressing the potential risks of AI should be a top priority for lawmakers, according to a new poll time.com/7098524/teenag…



weakly typed reposted

Announcing Transluce, a nonprofit research lab building open source, scalable technology for understanding AI systems and steering them in the public interest. Read a letter from the co-founders Jacob Steinhardt and Sarah Schwettmann: transluce.org/introducing-tr…


SHA-256: 218cebed21f2e8514df2ea1e4caca39750349cf30804995d5d577f08afc5855a


in slight defense of mathiness / mathematical notation in ML research papers: a thread (twessay?)

in slight defense of mathiness: there’s a flavour of research that looks like “finding the right abstractions through which to think about things” — either to make it easier to build tools to manipulate the things, or to inspire researchers to import ideas from other fields



weakly typed reposted

Who should I meet in Cambridge? (You?)


weakly typed reposted

On Reddit's statistics forum, the most common question is "What test should I use?" My answer, from 2011, is "There is only one test" allendowney.blogspot.com/2011/05/there-…

[@AllenDowney's tweet image]
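The linked post's point is that most classical tests are instances of one simulation recipe: pick a test statistic, simulate data under the null hypothesis, and count how often the simulated statistic is at least as extreme as the observed one. A minimal permutation-test sketch of that recipe (the function name and the difference-in-means statistic are illustrative choices, not taken from the post):

```python
import random

def permutation_test(group_a, group_b, n_iters=10_000, seed=0):
    """Estimate a p-value for the difference in means between two samples
    by simulating the null hypothesis: pool the data, shuffle, re-split."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_iters):
        rng.shuffle(pooled)  # under the null, group labels are exchangeable
        a, b = pooled[:n_a], pooled[n_a:]
        stat = abs(sum(a) / len(a) - sum(b) / len(b))
        if stat >= observed:
            count += 1
    return count / n_iters  # fraction of null worlds at least as extreme
```

Swapping in a different statistic (median difference, correlation, whatever the question demands) gives a different "test" from the same single recipe.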

weakly typed reposted

Mechanistic interpretability gives us rich explanations of models. But can we convert these explanations into formal proofs? Surprisingly, yes! Mech interp helps write short proofs of generalization bounds — and, shorter proofs provide more mechanistic understanding. 🧵

[@diagram_chaser's tweet image]

perhaps growing up is realising that 'growing up' was a comforting lie


on reading ml papers:

[@weakly_typed's tweet image]

maybe the most exciting interp result I’ve seen all year (if it ends up being true for interesting reasons): a meaningful step towards uncovering the type of the residual stream

Fundamentally, high-level concepts group into categorical variables (mammal, reptile, fish, bird) with a semantic hierarchy: poodle is a dog is a mammal is an animal. How do LLMs internally represent this structure? arxiv.org/abs/2406.01506



weakly typed reposted

fyi the real reason i've been ignoring you is:
- i want to reply
- i want to be able to give you the attention and focus you deserve
- i never feel like i have enough energy to properly do that

fuck, did i just cut off every single one of my autistic friends (all of my friends) who can't read jokes??

[@arithmoquine's tweet image]


mechinterp people: does anyone have a good (formal?) definition of 'feature' that doesn't assume the linear representation hypothesis? like, if I have some points in high-dim space, what makes them "the composition of several features" as opposed to "some random points"?


weakly typed reposted

very interesting that every frontier lab interp team is working on sparse autoencoders (SAEs) and ~ no one in academia is

