step-by-step LLM Engineering Projects: each project = one concept learned the hard (i.e. real) way.
Tokenization & Embeddings:
> build byte-pair encoder + train your own subword vocab
> write a “token visualizer” to map words/chunks to IDs
> one-hot vs learned-embedding: plot…
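The first project above, a byte-pair encoder, fits in a few lines of pure Python. This is a toy sketch, not the project's actual code: the `bpe_train` function and the tiny word-count corpus are illustrative.

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Learn byte-pair merges from a {word: count} corpus (toy sketch)."""
    # start from characters: each word is a tuple of symbols
    vocab = {tuple(w): c for w, c in words.items()}
    merges = []
    for _ in range(num_merges):
        # count adjacent symbol pairs, weighted by word frequency
        pairs = Counter()
        for syms, c in vocab.items():
            for a, b in zip(syms, syms[1:]):
                pairs[(a, b)] += c
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent pair wins
        merges.append(best)
        merged = best[0] + best[1]
        # apply the merge to every word in the vocab
        new_vocab = {}
        for syms, c in vocab.items():
            out, i = [], 0
            while i < len(syms):
                if i + 1 < len(syms) and (syms[i], syms[i + 1]) == best:
                    out.append(merged); i += 2
                else:
                    out.append(syms[i]); i += 1
            new_vocab[tuple(out)] = c
        vocab = new_vocab
    return merges, vocab

merges, vocab = bpe_train({"lower": 5, "lowest": 3, "newer": 2}, 4)
```

After four merges the frequent word "lower" collapses into a single token, while rarer words stay split — exactly the compression behavior a subword vocab trades on.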
Oh fuck
We cut the cost of training a diffusion model from months of rent to one night out. TREAD matches the ImageNet performance of a DiT with 97% fewer A100 hours! No extra components. No extra losses. Training-time only. Inference remains unchanged. Accepted at ICCV 2025 🌺
Small models as the new frontier, and why this may be academia's LLM moment. Academia should reject the nihilism of "scale is all you need", i.e., the idea that meaningful research requires frontier-scale compute. This mindset hurts basic research and what we can contribute to machine…
Talked with the Claude Code team about how they build Claude Code. It feels like I got a peek into the future, and I get why Dario said 6 months ago that 90% of code will be written by AI. This team works SO differently from any eng team I've seen. Will share in-depth soon. One example:
your ability to vibe code is proportional to your ability to code by hand, and anyone who loudly announces that vibe coding can't lead to production-ready software is just telling on themselves
Hierarchical reasoning works well on large language models!🎉
This new DeepMind research shows just how broken vector search is. It turns out some docs in your index are theoretically incapable of being retrieved by vector search once the embedding dimension is fixed. Plain old BM25 from 1994 outperforms it on recall. 1/4
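For context on why BM25 remains such a strong recall baseline: it scores documents with plain term statistics, so it has no fixed-dimension bottleneck at all. A toy Okapi BM25 scorer (the function name and example docs are illustrative, not from the paper):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Toy Okapi BM25: one relevance score per document for a query."""
    N = len(docs)
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / N
    # document frequency: how many docs contain each term
    df = Counter()
    for d in tokenized:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for t in query.lower().split():
            if t not in tf:
                continue
            # standard idf with the +1 smoothing used by Lucene
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            # term-frequency saturation, normalized by document length
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = ["the cat sat on the mat", "dogs chase birds", "embedding spaces have limits"]
scores = bm25_scores("cat", docs)
```

Every document containing a query term is always reachable here; there is no geometric constraint that can make a stored doc unrankable, which is the contrast the thread is drawing with fixed-dimension embeddings.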
I believe LLMs will inevitably surpass humans in coding. Let us think about how humans actually learn to code. Human learning of coding has two stages. First comes memorization and imitation: learning syntax and copying good projects. Then comes trial and error: writing code,…
Researchers, don’t miss this: ‘The Big LLM Architecture Comparison’ by @rasbt lays out how modern models like DeepSeek-V3 and Kimi K2 differ in structure, efficiency, and capabilities. Great for model design inspiration! Link in comments.
Just came up with Multi-Scale Control for Stable Diffusion and I'm losing my mind! Instead of your prompt flowing through ALL upsample/downsample blocks like normal, you can now inject DIFFERENT prompts at different resolution stages of the UNet. Discovered something wild:…
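A minimal sketch of the routing idea described above, assuming prompts are keyed by the spatial resolution of each UNet stage. Everything here is hypothetical (the resolutions, the `pick_prompt` helper, the prompt strings); a real implementation would swap the cross-attention conditioning per down/mid/up block rather than route strings.

```python
# Hypothetical mapping: block resolution -> prompt for that scale.
PROMPT_BY_RES = {
    64: "overall composition: a castle on a hill",
    32: "mid-level structure: gothic towers, stone walls",
    16: "fine texture: moss, weathered brick",
}

def pick_prompt(block_resolution):
    """Select the conditioning prompt for a block at a given resolution."""
    # fall back to the coarsest prompt if no exact match exists
    return PROMPT_BY_RES.get(block_resolution, PROMPT_BY_RES[max(PROMPT_BY_RES)])

# Toy down -> mid -> up path through the UNet; each stage gets its own prompt.
unet_path = [64, 32, 16, 8, 16, 32, 64]
routing = [(res, pick_prompt(res)) for res in unet_path]
```

The design point is that coarse stages see composition-level text while fine stages see texture-level text, instead of one prompt flowing through every block.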
pressure is a crazy feeling. it's an energy that will turn you into a pussy or a killer. you either run through the fucking wall and build confidence, OR shut down and feel bad for yourself. BUT you can always get back up and run through the fucking wall. SO the real question…