difficultyang

@difficultyang

More social alt of @ezyang

🍼🤖

oh and nobody is allowed to say this part loudly but, ignoring a certain txt file



Earlier this year Gemini was incredibly mid for coding and it will be extremely funny if they still haven't fixed it


when people ask me questions about code I'm not directly familiar with but have good priors about, I love being able to say "point claude code at $FILE and ask it questions"


claude gets so excited when analyzing pytorch profiler traces


New benchmark just dropped

Tracing is a hell of a drug (4yo)

So I am teaching the 4yo how to read and she keeps sounding out night while forgetting -igh is an I sound, to unfortunate results 😂


4yo 35yo pair drawing

After the Tahoe upgrade many of the application icons in my dock are blurry and it's driving me nuts


getting a picture in my head of what nn.Module 2.0 would look like, but tbh it's probably never gonna happen


I did bedtime using an ebook on the iPad for the 4yo yesterday... And it was great! So much selection now and no fretting about eink colors sucking.


Unfortunately, it seems increasingly the answer is "figure out how to make it not dynamic"

why is triton’s kernel launch cpu overhead so freaking high? the actual kernel takes 10x less execution time than to launch it and i can’t use cuda graphs because the shapes are dynamic.
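One common reading of "make it not dynamic" is to round every dynamic length up to a small set of bucket sizes and pad, so the number of distinct shapes stays bounded and each shape can be captured once. A minimal sketch in plain Python, assuming power-of-two buckets; the names `bucket_size` and `pad_to_bucket` and the default minimum of 16 are hypothetical and not from the thread:

```python
def bucket_size(n: int, min_size: int = 16) -> int:
    """Round n up to the next power-of-two multiple of min_size (hypothetical policy)."""
    if n < 0:
        raise ValueError("length must be non-negative")
    size = min_size
    while size < n:
        size *= 2
    return size


def pad_to_bucket(values, pad_value=0):
    """Pad a sequence to its bucket size; also return the true length for masking."""
    valid = len(values)
    target = bucket_size(valid)
    padded = list(values) + [pad_value] * (target - valid)
    return padded, valid


# Lengths 1..16 all map to one shape, 17..32 to the next, and so on.
padded, valid = pad_to_bucket([3, 1, 4])
print(len(padded), valid)  # 16 3
```

The cost is wasted compute on the padding plus a mask to ignore it; whether that beats the launch overhead depends on the workload.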



Terrific work. We know how good computers are when the objective function can be cleanly expressed but it still gives me chills to see it done well in reality

From Terry Tao on Mathstodon: “ A new paper with Bogdan Georgiev, Javier Gomez-Serrano, and Adam Zsolt Wagner: "Mathematical exploration and discovery at scale" arxiv.org/abs/2511.02864 , in which we record our experiments using the LLM-powered optimization tool #AlphaEvolve to…



Grace 🤖

Now that I'm training models again, it's actually crazy how far behind software support for Blackwell is. For example torchao has to be built from source for sm120, and I'm also having issues with inference frameworks.



Now do it in Helion

PyTorch quiz! You have a 1D long tensor containing contiguous, increasing blocks of ids, e.g. [ 0, 0, 1, 1, 1, 2, 3, 3, 3, 3 ]. How would you create the tensor of per-block reversed indexes, e.g. here [ 1, 0, 4, 3, 2, 5, 9, 8, 7, 6 ]? I have a solution in 228 chars.
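For reference, one possible eager PyTorch answer to the quiz above (not the 228-char solution, and not Helion). It is a sketch that assumes equal ids are always adjacent; the function name `per_block_reversed_indices` is made up here. The idea: an element at position i in a block spanning [start, end) maps to start + end - 1 - i.

```python
import torch


def per_block_reversed_indices(ids: torch.Tensor) -> torch.Tensor:
    # Length of each run of equal ids, in order of appearance.
    _, counts = torch.unique_consecutive(ids, return_counts=True)
    ends = counts.cumsum(0)   # exclusive end position of each block
    starts = ends - counts    # start position of each block
    # Broadcast each block's start/end to every element of that block.
    start_per_elem = torch.repeat_interleave(starts, counts)
    end_per_elem = torch.repeat_interleave(ends, counts)
    pos = torch.arange(ids.numel(), device=ids.device)
    # Mirror each position within its own block.
    return start_per_elem + end_per_elem - 1 - pos


ids = torch.tensor([0, 0, 1, 1, 1, 2, 3, 3, 3, 3])
print(per_block_reversed_indices(ids).tolist())
# [1, 0, 4, 3, 2, 5, 9, 8, 7, 6]
```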


