
Michael Bukatin 🇺🇦
@ComputingByArts
Dataflow matrix machines (neuromorphic computations with linear streams). Julia, Python, Clojure, C, Processing. Shaders, ambient, psytrance, 40hz sound.
New paper! We reverse engineered the mechanisms underlying Claude Haiku’s ability to perform a simple “perceptual” task. We discover beautiful feature families and manifolds, clean geometric transformations, and distributed attention algorithms!

There is now a review by Grigory Sapunov: gonzoml.substack.com/p/tiny-recursi…
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: alexiajm.github.io/2025/09/29/tin… Code: github.com/SamsungSAILMon… Paper: arxiv.org/abs/2510.04871
arxiv.org/abs/2509.21049, Physics of Learning: A Lagrangian perspective to different learning paradigms "We study the problem of building an efficient learning system. Efficient learning processes information in the least time, i.e., building a system that reaches a desired…
The paper claims that learning (an AI system learning, or machine learning in general) follows a physics-style least-action rule that unifies supervised, generative, and reinforcement learning. It shows that supervised learning, generative modeling, and reinforcement learning can all be…
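A minimal sketch of what a least-action formulation of learning looks like in general; the paper's actual Lagrangian and constraints may differ, and the symbols θ, 𝓛, S below are generic, not the paper's notation:

```latex
% Generic least-action form: a learning trajectory \theta(t) is a stationary
% point of an action functional, giving Euler-Lagrange dynamics.
S[\theta] = \int_0^T \mathcal{L}\!\left(\theta(t), \dot{\theta}(t)\right) dt,
\qquad
\frac{d}{dt}\,\frac{\partial \mathcal{L}}{\partial \dot{\theta}}
  - \frac{\partial \mathcal{L}}{\partial \theta} = 0 .
```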

"LoRA Without Regret" by @johnschulman2 et al. The most interesting finding is that one should not fine-tune attention layers, one should only fine-tune MLP layers in most situations.
LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…
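A minimal sketch of that recommendation, assuming Hugging Face peft and Llama-style module names (the model id and the gate_proj/up_proj/down_proj names are illustrative assumptions, not something the post specifies):

```python
# Sketch: LoRA adapters only on the MLP (feed-forward) projections,
# leaving the attention projections (q/k/v/o_proj) untouched.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")  # illustrative model

lora_config = LoraConfig(
    r=16,                       # illustrative rank
    lora_alpha=32,
    target_modules=["gate_proj", "up_proj", "down_proj"],  # MLP only
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirm only MLP adapters are trainable
```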

HOW INFORMATION FLOWS THROUGH TRANSFORMERS
Because I've looked at those "transformers explained" pages and they really suck at explaining. There are two distinct information highways in the transformer architecture:
- The residual stream (black arrows): flows vertically through…
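A minimal sketch of the pathway the post describes: the residual stream runs vertically through the layers, and each sublayer reads from it and writes its output back additively (a pre-LayerNorm block is assumed here):

```python
# Minimal pre-LayerNorm transformer block, emphasizing the residual stream:
# each sublayer reads the stream, computes an update, and adds it back.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):                      # x: (batch, seq, d_model) -- the residual stream
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, need_weights=False)
        x = x + a                              # attention writes back into the stream
        x = x + self.mlp(self.ln2(x))          # MLP writes back into the stream
        return x
```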



KV caching overcomes statelessness in a very meaningful sense and provides a very nice mechanism for introspection (specifically of computations at earlier token positions): the Value representations can encode information from residual streams of past positions without…
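A minimal sketch of the mechanism the post points at: per-step decoding where each new query attends over keys and values cached from earlier positions (single head, no batching, all names illustrative):

```python
# Minimal single-head KV-cache decode loop: keys/values are computed once per
# position and reused; each new query attends over the whole cache.
import torch

d = 64
W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))
k_cache, v_cache = [], []

def decode_step(h_t):
    """h_t: residual-stream vector at the current position, shape (d,)."""
    k_cache.append(h_t @ W_k)          # cached: earlier positions are never recomputed
    v_cache.append(h_t @ W_v)
    K = torch.stack(k_cache)           # (t, d)
    V = torch.stack(v_cache)           # (t, d)
    q = h_t @ W_q
    attn = torch.softmax(K @ q / d**0.5, dim=0)   # weights over all past positions
    return attn @ V                    # mixes information carried by the cached Values

for t in range(5):
    out = decode_step(torch.randn(d))
```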
Asked Sonnet-4.5 to perform a refactor. Left it working alone. 5 minutes later, it declared victory. I committed it and started testing. Something was wrong. Many tests broke. I pointed out the issue and asked it to investigate. It worked for 3 more minutes, found and fixed a…

I really like Claude 4.5 for coding, it is fast, reliable, surgical, high-quality in a good way. I think I will use it a lot, especially for style refactors and things like that. But it is nowhere near as smart as GPT-5. I wouldn't leave it alone making large changes on HVM. Yes,…
We've entered a new phase where progress in chatbots is starting to top out but progress in automating AI research is steadily improving. It's a mistake to confuse the two.
So, with this recent trend of doubling every 4 months, and with internal model capabilities being ~6 months ahead of public releases, the internal systems at OpenAI are probably able to take on jobs that take a human a whole day. One can get plenty of AI research out of that...
Has AI progress slowed down? I’ll write some personal takes and predictions in this thread. The main metric I look at is METR’s time horizon, which measures the length of tasks agents can perform. It has been doubling for more than 6 years now, and might have sped up recently.
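A back-of-the-envelope sketch of the extrapolation in the two posts above; the doubling period and internal-vs-public lag are taken from the posts, while the current public horizon is an assumed illustrative number:

```python
# Back-of-the-envelope: extrapolate a task-length horizon that doubles every
# `doubling_months`, plus an assumed lag between internal and public models.
current_public_horizon_hours = 2.0   # assumed horizon of public models today (illustrative)
doubling_months = 4.0                # "doubling every 4 months" from the post
internal_lead_months = 6.0           # "~6 months ahead of public releases"

internal_horizon = current_public_horizon_hours * 2 ** (internal_lead_months / doubling_months)
print(f"implied internal horizon: {internal_horizon:.1f} hours")  # ~5.7 h, approaching a working day
```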

A really nice benchmark:
Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then surely the singularity is just around the corner. How can we get a pulse check on whether current LLMs are capable of driving this kind of total…

on grok.com, the backend sends the full (not summarized) CoT to your browser. It's not displayed in the UI, but you can see it with browser dev tools or the like; check out the JSON payload of responses from `grok.com/rest/app-chat/…{conversation_id}/load-responses`
chain-of-thought monitorability is a wonderful thing ;) gist.githubusercontent.com/nostalgebraist…
Kimi K2 tech report just dropped! Quick hits:
- MuonClip optimizer: stable + token-efficient pretraining at trillion-parameter scale
- 20K+ tools, real & simulated: unlocking scalable agentic data
- Joint RL with verifiable + self-critique rubric rewards: alignment that adapts
-…
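A minimal sketch of the core Muon idea that MuonClip builds on, orthogonalizing the momentum matrix before using it as the step direction; a plain cubic Newton-Schulz iteration is used here for illustration, whereas the actual optimizer uses a tuned polynomial iteration and adds the QK-clip mechanism from the report, which is not shown:

```python
# Simplified Muon-style update for a 2D weight matrix. Illustrative only,
# not Kimi K2's MuonClip implementation.
import torch

def newton_schulz_orthogonalize(G, steps=5):
    """Approximate the orthogonal polar factor of G (basic cubic iteration)."""
    X = G / (G.norm() + 1e-7)              # normalize so the iteration converges
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

def muon_step(W, momentum, grad, lr=0.02, beta=0.95):
    momentum.mul_(beta).add_(grad)                          # momentum buffer
    W.add_(newton_schulz_orthogonalize(momentum), alpha=-lr)  # orthogonalized step
    return W, momentum
```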




The very first task I usually give new pretraining people is to run a tiny transformer, profile it, and understand it deeply. I wrote up a small tutorial covering this exact workflow. I talk about how to measure GPU perf, how to estimate tensor core speedup, etc. Take a look:
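A minimal sketch of the kind of measurement that workflow involves: time a forward/backward pass with CUDA events and compare achieved FLOP/s against an assumed peak; model size, sequence length, and the peak figure below are illustrative assumptions, not the tutorial's numbers:

```python
# Rough GPU perf check for a tiny transformer: wall-clock one training step
# with CUDA events and estimate utilization against an assumed peak.
import torch
import torch.nn as nn

device = "cuda"
d_model, n_layers, seq, batch = 512, 6, 1024, 8
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=8, dim_feedforward=4 * d_model,
                               batch_first=True),
    num_layers=n_layers,
).to(device)
x = torch.randn(batch, seq, d_model, device=device)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
for _ in range(3):                       # warm-up
    model(x).sum().backward()
torch.cuda.synchronize()
start.record()
model(x).sum().backward()
end.record()
torch.cuda.synchronize()
ms = start.elapsed_time(end)

n_params = sum(p.numel() for p in model.parameters())
flops = 6 * n_params * batch * seq       # ~6 * params * tokens rule of thumb for fwd+bwd
peak_flops = 100e12                      # assumed ~100 TFLOP/s tensor-core peak; GPU/dtype dependent
print(f"step: {ms:.1f} ms, utilization ≈ {flops / (ms / 1e3) / peak_flops:.1%}")
```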
This paper is interesting from the perspective of metascience, because it's a serious attempt to empirically study why LLMs behave in certain ways and differently from each other. A serious attempt attacks all exposed surfaces from all angles instead of being attached to some…
New Anthropic research: Why do some language models fake alignment while others don't? Last year, we found a situation where Claude 3 Opus fakes alignment. Now, we’ve done the same analysis for 25 frontier LLMs—and the story looks more complex.

A very interesting recent work on distributed Muon (Dion): share.google/lfZ46PQPSXmRIC…