
Vector

@Vector434852

Built a 1-layer O(n) model that reached 5.5 Val PPL on TinyStories / 8.5M params · trained on a 4060 laptop.

Optimizing my custom O(n) model.. val PPL 16.7 after the 1st epoch is fair.. 37 min/epoch on an RTX 4060 laptop #LocalLLM #SmallModels #AI #MachineLearning #RTX4060 #FromScratch


8.5M params. 1 layer of custom O(n) attention on an RTX 4060 laptop. Val PPL 5.5 (still decreasing) on TinyStories.


Added a second layer.. 8.5M params.. how's it gonna end?


I'm working on a new breed of linear attention. A single layer reached PPL 6.5 in 30 epochs on an RTX 4060 laptop.. more to come :)

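For context on the terminology: "linear" or "O(n) attention" generally means avoiding the full n×n attention matrix. The mechanism behind these posts is not described, so the following is only a generic PyTorch sketch of one well-known way to get causal attention in O(n) (positive feature map plus prefix sums), not the model in the tweets; the elu+1 feature map and tensor shapes are assumptions for the example.

import torch
import torch.nn.functional as F

def causal_linear_attention(q, k, v):
    # Generic kernelized linear-attention sketch -- NOT the mechanism in these tweets.
    # q, k, v: (batch, seq_len, dim). Cost is linear in seq_len because the
    # (seq_len x seq_len) attention matrix is never materialized.
    q = F.elu(q) + 1.0                                       # positive feature map (assumption)
    k = F.elu(k) + 1.0
    kv = torch.einsum("bnd,bne->bnde", k, v).cumsum(dim=1)   # running sum of key-value outer products
    z = k.cumsum(dim=1)                                      # running normalizer
    num = torch.einsum("bnd,bnde->bne", q, kv)
    den = torch.einsum("bnd,bnd->bn", q, z).clamp(min=1e-6)
    return num / den.unsqueeze(-1)

x = torch.randn(1, 2048, 64)            # e.g. context 2048, as in the tweets
out = causal_linear_attention(x, x, x)  # (1, 2048, 64)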

Built a new O(n) attention from scratch. Not Transformer / LSTM / Mamba.

Single layer, 8.5M params, RTX 4060 laptop.

TinyStories: Val PPL 18.6 | Test PPL 18.9 | Context 2048

GPT-2 small needs ~124M params for similar PPL. ~15× parameter efficiency.

@ylecun @karpathy
Logs ↓

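As a sanity check on the ~15× figure, taking the tweet's own numbers (124M vs 8.5M parameters) at face value:

# Quick arithmetic behind the parameter-efficiency claim in the tweet above.
gpt2_small_params = 124e6   # GPT-2 small size, as quoted in the tweet
this_model_params = 8.5e6   # the 1-layer O(n) model
print(gpt2_small_params / this_model_params)   # ~14.6, i.e. the quoted ~15x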

8.5M params
1 layer
O(n) attention
chunk=2048
RTX 4060 laptop
~72h training

TinyStories val PPL: 18.9 (still dropping)

No transformers were harmed.

log ↓

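All of these posts report validation perplexity, which is simply exp of the mean per-token cross-entropy over the validation set. A minimal, generic sketch of how that number is computed (the model, data loader, and device here are placeholders, not the actual training code behind these tweets):

import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def validation_perplexity(model, loader, device="cuda"):
    # Standard definition: exp(mean per-token cross-entropy) over the validation set.
    model.eval()
    total_loss, total_tokens = 0.0, 0
    for input_ids, targets in loader:            # (batch, seq_len) token-id tensors
        input_ids, targets = input_ids.to(device), targets.to(device)
        logits = model(input_ids)                # (batch, seq_len, vocab_size)
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               targets.reshape(-1), reduction="sum")
        total_loss += loss.item()
        total_tokens += targets.numel()
    return math.exp(total_loss / total_tokens)   # e.g. 18.9 as in the tweet above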
