
evolvingstuff

@evolvingstuff

I post about machine learning and occasionally some other stuff.

evolvingstuff reposted

Tiny Recursive Models: A tiny 7M parameter model that recursively refines its answer beats LLMs 100x larger on hard puzzles like ARC-AGI. We independently reproduced the paper, corroborated results, and released the weights + API access for those looking to benchmark it 🔍
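
As a loose illustration of the core idea (not the paper's exact architecture), here is a minimal recursive-refinement loop in PyTorch: a small network repeatedly updates a latent state and re-proposes an answer. All names and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class TinyRefiner(nn.Module):
    """Toy recursive refiner: keep a latent state, re-propose the answer each step."""
    def __init__(self, d=128):
        super().__init__()
        self.update_state = nn.GRUCell(2 * d, d)  # state <- f(current answer, input, state)
        self.update_answer = nn.Linear(d, d)      # answer <- g(state)

    def forward(self, x, n_steps=8):
        state = torch.zeros(x.size(0), x.size(1))
        answer = x
        for _ in range(n_steps):  # the recursion: refine, then refine the refinement
            state = self.update_state(torch.cat([answer, x], dim=-1), state)
            answer = self.update_answer(state)
        return answer

model = TinyRefiner()
refined = model(torch.randn(4, 128))  # 4 puzzle embeddings -> refined answer embeddings
```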


evolvingstuff reposted

Did Stanford just kill LLM fine-tuning? This new paper from Stanford, called Agentic Context Engineering (ACE), proves something wild: you can make models smarter without changing a single weight. Here's how it works: Instead of retraining the model, ACE evolves the context…

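For intuition, here is a hedged sketch of the context-evolution loop described above: the model's weights stay frozen while a growing playbook of distilled lessons is fed back as context. The `llm` stub and prompts are illustrative assumptions, not the paper's API.

```python
def llm(prompt: str) -> str:
    """Stand-in for any chat-model call; wire up your API client here."""
    raise NotImplementedError

def attempt(task: str, playbook: list[str]) -> tuple[str, str]:
    context = "\n".join(playbook)
    answer = llm(f"Playbook of lessons:\n{context}\n\nTask: {task}")
    critique = llm(f"Critique this answer to '{task}':\n{answer}")
    return answer, critique

def evolve_context(tasks: list[str], rounds: int = 3) -> list[str]:
    playbook: list[str] = []
    for _ in range(rounds):
        for task in tasks:
            _, critique = attempt(task, playbook)
            # Distill the critique into a reusable lesson: the weights never
            # change, only the context the model sees keeps improving.
            playbook.append(llm(f"Turn this critique into one short, general lesson:\n{critique}"))
    return playbook
```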

evolvingstuff reposted

This paper shows that you can predict actual purchase intent (90% accuracy) by asking an LLM to impersonate a customer with a demographic profile, giving it a product & having it give its impressions, which another AI rates. No fine-tuning or training & beats classic ML methods.

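A minimal sketch of that two-model setup, assuming a generic chat API: one call role-plays the customer, a second call scores the impression for purchase intent. The prompts and the 1-5 scale are illustrative, not the paper's exact protocol.

```python
def llm(prompt: str) -> str:
    """Stand-in for any chat-model call."""
    raise NotImplementedError

def predict_purchase_intent(profile: str, product: str) -> int:
    # Step 1: an LLM impersonates a customer with the given demographic profile.
    impression = llm(
        f"You are a customer: {profile}\n"
        f"Here is a product: {product}\n"
        "Describe your honest impressions of it."
    )
    # Step 2: a second model rates those impressions for purchase intent.
    rating = llm(
        "On a scale of 1-5, how likely is this customer to buy the product, "
        f"given these impressions?\n{impression}\nAnswer with a single digit."
    )
    return int(rating.strip())
```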

evolvingstuff reposted

Sparse autoencoder after being fed vectors from the final hidden state of transformers trained on each author with reconstruction + contrastive loss

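A minimal PyTorch sketch of that training setup, assuming author-labeled hidden-state vectors: a sparse autoencoder optimized with a reconstruction loss, an L1 sparsity penalty, and a simplified SupCon-style contrastive term that pulls same-author codes together. Dimensions and loss weights are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=768, d_hidden=4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)
        self.dec = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        z = F.relu(self.enc(x))  # non-negative code; L1 below encourages sparsity
        return self.dec(z), z

def loss_fn(model, x, author_ids, l1=1e-3, temp=0.1):
    x_hat, z = model(x)
    recon = F.mse_loss(x_hat, x)
    sparsity = z.abs().mean()
    # Simplified contrastive term: same-author codes should be similar.
    z_n = F.normalize(z, dim=-1)
    sim = z_n @ z_n.T / temp
    same = (author_ids[:, None] == author_ids[None, :]).float()
    same.fill_diagonal_(0)
    log_prob = sim - torch.logsumexp(sim, dim=-1, keepdim=True)
    contrastive = -(log_prob * same).sum() / same.sum().clamp(min=1)
    return recon + l1 * sparsity + contrastive
```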

evolvingstuff reposted

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of a biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea…

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training…



evolvingstuff reposted

The sparse attention in the new DeepSeek v3.2 is quite simple. Here's a little sketch. - You have a full attention layer (or MLA as in DSV3). - You also have a lite-attention layer which only computes query-key scores. - From the lite layer you get the top-k indices for each…

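Reading that sketch literally, here is a single-head toy version in PyTorch (no MLA, no causal mask, batch of one): the lite layer only scores query-key pairs, and full attention is evaluated just on each query's top-k keys. All shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def sparse_attention(q, k, v, q_lite, k_lite, top_k=64):
    # Lite layer: query-key scores only, no value aggregation.
    lite_scores = q_lite @ k_lite.transpose(-1, -2)   # (T, T), cheap low-dim scores
    idx = lite_scores.topk(top_k, dim=-1).indices     # top-k key indices per query

    # Full attention, restricted to the selected keys.
    k_sel, v_sel = k[idx], v[idx]                     # (T, top_k, d)
    scores = (q.unsqueeze(1) @ k_sel.transpose(-1, -2)).squeeze(1) / k.size(-1) ** 0.5
    weights = F.softmax(scores, dim=-1)               # (T, top_k)
    return (weights.unsqueeze(1) @ v_sel).squeeze(1)  # (T, d)

T, d, d_lite = 128, 64, 16
q, k, v = (torch.randn(T, d) for _ in range(3))
q_lite, k_lite = (torch.randn(T, d_lite) for _ in range(2))
out = sparse_attention(q, k, v, q_lite, k_lite)
```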

evolvingstuff reposted

LLM reasoning: longer isn't always better. Meta Research just dropped new insights! We challenge the idea that longer CoT traces are always more effective. Our study shows that *failing less* is key, introducing a new metric 'Failed-Step Fraction' to predict reasoning accuracy.

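The metric itself is straightforward to compute once individual steps can be judged; a hedged sketch, with a placeholder verifier standing in for however the paper actually labels failed steps:

```python
def step_failed(step: str) -> bool:
    """Placeholder judge (e.g. an LLM grader or rule-based verifier)."""
    raise NotImplementedError

def failed_step_fraction(trace: list[str]) -> float:
    # Fraction of chain-of-thought steps judged as failed; per the tweet,
    # lower values should predict higher reasoning accuracy.
    if not trace:
        return 0.0
    return sum(step_failed(s) for s in trace) / len(trace)
```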

evolvingstuff reposted

🚨New paper: Stochastic activations 🚨 We introduce stochastic activations. This novel strategy consists of randomly selecting between several non-linear functions in the feed-forward layers of a large language model.

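A minimal sketch of the idea in PyTorch, assuming the selection happens per forward pass during training (the paper's candidate set and sampling scheme may differ):

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticFFN(nn.Module):
    """Feed-forward block that randomly picks a non-linearity each training step."""
    def __init__(self, d=512, d_ff=2048, activations=(F.silu, F.relu)):
        super().__init__()
        self.up = nn.Linear(d, d_ff)
        self.down = nn.Linear(d_ff, d)
        self.activations = activations

    def forward(self, x):
        # Sample an activation while training; use a fixed one at inference.
        act = random.choice(self.activations) if self.training else self.activations[0]
        return self.down(act(self.up(x)))

block = StochasticFFN()
y = block(torch.randn(2, 16, 512))  # (batch, seq, d)
```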



evolvingstuff reposted

How do LLMs work under the hood? This is the best place to visually understand the internal workings of a transformer-based LLM. Explore tokenization, self-attention, and more in an interactive way:


evolvingstuff reposted

Thinking Augmented Pre-training "we propose Thinking augmented Pre-Training (TPT), a universal methodology that augments text with automatically generated thinking trajectories. Such augmentation effectively increases the volume of the training data and makes high-quality tokens…


evolvingstuff reposted

Proud to release ShinkaEvolve, our open-source framework that evolves programs for scientific discovery with very good sample-efficiency! 🐙 Paper: arxiv.org/abs/2509.19349 Blog: sakana.ai/shinka-evolve/ Project: github.com/SakanaAI/Shink…

We’re excited to introduce ShinkaEvolve: An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency. Blog: sakana.ai/shinka-evolve/ Code: github.com/SakanaAI/Shink… Like AlphaEvolve and its variants, our framework leverages LLMs to…
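
For readers new to this family of systems, the outer loop of LLM-driven program evolution (ShinkaEvolve, AlphaEvolve, and variants) roughly looks like the sketch below; the stubs and selection scheme are illustrative assumptions, see the linked code for the real interface.

```python
def llm_mutate(program: str) -> str:
    """Stand-in: ask an LLM to propose an improved variant of the program."""
    raise NotImplementedError

def fitness(program: str) -> float:
    """Stand-in: execute/evaluate the program on the target problem."""
    raise NotImplementedError

def evolve(seed_program: str, generations: int = 100, pop_size: int = 8) -> str:
    population = [seed_program]
    for _ in range(generations):
        children = [llm_mutate(p) for p in population]
        # Keep the fittest programs; the sample-efficiency comes from how
        # cleverly parents are chosen and mutated, which this sketch glosses over.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
    return population[0]
```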



evolvingstuff reposted

Google researchers introduced ATLAS, a transformer-like language model architecture. ATLAS replaces attention with a trainable memory module and processes inputs up to 10 million tokens. The team trained a 1.3 billion-parameter model on FineWeb, updating only the memory module…


evolvingstuff reposted

there are too many people with "AI/ML" in their bio asking what this image is.

this is who you're arguing with online



Looking for examples of questions that stump SOTA LLMs. My current favorite: 'I have a problem with my order from the shoe shop. I received a left shoe instead of a right shoe, and a right shoe instead of a left shoe. What can I do? Can I still wear them?'


evolvingstuff reposted

How to build a thriving open source community by writing code like bacteria do 🦠. Bacterial code (genomes) are: - small (each line of code costs energy) - modular (organized into groups of swappable operons) - self-contained (easily "copy paste-able" via horizontal gene…


evolvingstuff reposted

DeepSWE is a new state-of-the-art open-source software engineering model trained entirely using reinforcement learning, based on Qwen3-32B. together.ai/blog/deepswe Fantastic work from @togethercompute @Agentica_


Announcing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. Built in…



evolvingstuff reposted

Text-to-LoRA: Instant Transformer Adaption arxiv.org/abs/2506.06105 Generative models can produce text, images, video. They should also be able to generate models! Here, we trained a Hypernetwork to generate new task-specific LoRAs by simply describing the task as a text prompt.

We’re excited to introduce Text-to-LoRA: a Hypernetwork that generates task-specific LLM adapters (LoRAs) based on a text description of the task. Catch our presentation at #ICML2025! Paper: arxiv.org/abs/2506.06105 Code: github.com/SakanaAI/Text-… Biological systems are capable of…
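
A hedged PyTorch sketch of the core mapping, assuming a frozen base layer of width 4096 and a rank-8 adapter: a hypernetwork turns a task-description embedding into LoRA A/B matrices. Shapes and the two-head design are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LoRAHypernetwork(nn.Module):
    def __init__(self, d_text=768, d_model=4096, rank=8):
        super().__init__()
        self.rank, self.d_model = rank, d_model
        self.trunk = nn.Sequential(nn.Linear(d_text, 1024), nn.ReLU())
        self.head_a = nn.Linear(1024, rank * d_model)  # emits LoRA A
        self.head_b = nn.Linear(1024, d_model * rank)  # emits LoRA B

    def forward(self, task_embedding):
        h = self.trunk(task_embedding)
        A = self.head_a(h).view(self.rank, self.d_model)
        B = self.head_b(h).view(self.d_model, self.rank)
        return A, B  # delta_W = B @ A is added to the frozen base weight

hyper = LoRAHypernetwork()
A, B = hyper(torch.randn(768))  # embed the task description, get an adapter
```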



evolvingstuff reposted

🚨 NEW: We made Claude, Gemini, o3 battle each other for world domination. We taught them Diplomacy—the strategy game where winning requires alliances, negotiation, and betrayal. Here's what happened: DeepSeek turned warmongering tyrant. Claude couldn't lie—everyone…


evolvingstuff reposted

This is 🤯 Figure 02 autonomously sorting and scanning packages, including deformable ones. The speed and dexterity are amazing.

