
Enigma Spectre

@echo_fractal

*archaic croaky voice* “I remember when AI was just called predictive analytics” *fades to dust*

Ooooooooo!

Meet the XTR-0: a way for early developers to make first contact with thermodynamic intelligence. More at: extropic.ai



Enigma Spectre reposted

@ReeceShuttle: 🧵 LoRA vs full fine-tuning: same performance ≠ same solution. Our NeurIPS ‘25 paper 🎉 shows that LoRA and full fine-tuning, even when equally well fit, learn structurally different solutions, and that LoRA forgets less and can be made even better (less forgetting) by a simple…
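(Side note, not from the paper: the structural contrast being compared is roughly the standard LoRA parameterization sketched below, where full fine-tuning updates the whole weight matrix while LoRA freezes it and trains only a low-rank correction. The class name and hyperparameters here are illustrative.)

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # full fine-tuning would leave these trainable
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # small random init
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        # Base projection plus the low-rank correction; only A and B receive gradients.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Full fine-tuning, by contrast, keeps every weight trainable; the thread's point is that even when the two fit equally well, the solutions they reach are structurally different.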

This is gold and I couldn’t resist sharing, but it is an interesting paper too! Do check out the full thread.

@wesg52: One of the hardest parts of finishing this paper was deciding what to call it! The memes really write themselves… We included some of our favorite rejected titles in the appendix.


I love Markov Chains, and you should too lol. Props to the authors and I’ll probably use this trick myself. More generally though, I bet there’s lots of ways to improve sampling at inference and I’m looking forward to more improvements there.

@rryssf_: How do they pull this off efficiently? They use a Metropolis-Hastings loop (MCMC). At each step, they resample part of the output and decide to accept or reject based on the model's internal probabilities.
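(A minimal sketch of the accept/reject arithmetic described above, assuming you can score sequences with log-probabilities from the model; the function names, proposal scheme, and target distribution here are placeholders, not details taken from the paper.)

```python
import math
import random

def mh_step(tokens, propose, target_logprob, proposal_logprob):
    """One Metropolis-Hastings step over a token sequence.

    propose(tokens)            -> candidate sequence, e.g. with part of the output resampled
    target_logprob(seq)        -> log-probability of `seq` under the distribution we want to sample
    proposal_logprob(new, old) -> log-probability of proposing `new` when the chain is at `old`
    """
    candidate = propose(tokens)

    # log acceptance ratio: log p(x') - log p(x) + log q(x | x') - log q(x' | x)
    log_alpha = (
        target_logprob(candidate) - target_logprob(tokens)
        + proposal_logprob(tokens, candidate)
        - proposal_logprob(candidate, tokens)
    )

    # Accept with probability min(1, exp(log_alpha)); otherwise keep the current output.
    if random.random() < math.exp(min(0.0, log_alpha)):
        return candidate
    return tokens
```

Running this step in a loop gives the MCMC chain; in the setup the tweet describes, both log-probability terms would come from the model's own token scores.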


Enigma Spectre reposted


@JacksonAtkinsX: My brain broke when I read this paper. A tiny 7-million-parameter model just beat DeepSeek-R1, Gemini 2.5 Pro, and o3-mini at reasoning on both ARC-AGI-1 and ARC-AGI-2. It's called the Tiny Recursive Model (TRM), from Samsung. How can a model 10,000x smaller be smarter? Here's how…

This is incredibly cool. Props to sammyuri ❤️

@tokenbender: this is beyond mindblowing for me. somebody built a 5 million param language model inside minecraft, trained it, and equipped it with basic conversational ability. probably the best thing i have seen this entire month.


Enigma Spectre reposted

@nathancgy4: sharing a paper i learned lots from. it just won neurips oral and will be an extremely good read to gain intuitions about:

- attention gates & sinks
- non-linearity, sparsity, & expressiveness in attention
- training stability & long-context scaling

some takeaways :)
what…

Even if ads are not displayed on paid tiers, it is nearly certain that they will sell your profile to third-party advertising agencies. 🤢🤮 Open-source local models FTW.

@alexeheath: SOURCES: OpenAI is planning to bring ads to ChatGPT.

Also: In a message to employees, Sam Altman says he wants 250 gigawatts of compute by 2033.

He calls OpenAI's team behind Stargate a "core bet" like research / robotics.

"Doing this right will cost trillions."


Enigma Spectre reposted

It's Pythagorean Triple Square Day 9/16/25 and no one cares
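(The date works because 9, 16, and 25 are the squares of the 3-4-5 Pythagorean triple: 3² + 4² = 5², i.e. 9 + 16 = 25.)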

