Program Counter

@program_counter

all things toward agi

Program Counter reposted

I quit my job so I can have enough time to read this book btw

Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably huggingface.co/spaces/Hugging…



Program Counter reposted

I've been brainstorming episodes for the next season of PyTorch Developer Podcast: DTensor StridedShard, FSDP-TP order; Redistributing a DTensor; Prefetching vs Bucketing; History of FSDP in PyTorch; Multiprocessing: DataParallel versus DistributedDataParallel; Monarch Parallelism…


Program Counter reposted

Kaggle is still extremely underrated to learn these basics. On Kaggle you need to produce (i) strong models, (ii) that are robust with strong validation setups, and that are (iii) optimized for inference speed. So you actually learn about many parts of the DS pipeline.

my honest advice to people who this resonated with: spend less time reading shiny papers & more time working on the "boring" things, focus on the basics. by basics, i mean like deduplication (on the data side), understanding dp/tp/pp abstractions (on the training side), etc



Program Counter reposted

Recently we released a 3000+ word book chapter written by @KeremTurgutlu, based on @karpathy's marvelous "Let's build the GPT Tokenizer" video. It's got pics, links, code, diagrams, … Kerem has now written a detailed walk-through of how he made it: answer.ai/posts/2025-10-…


Program Counter reposted

Beautiful technical debugging detective longread that starts with a suspicious loss curve and ends all the way in the Objective-C++ depths of PyTorch MPS backend of addcmul_ that silently fails on non-contiguous output tensors. I wonder how long before an LLM can do all of this.

New blog post: The bug that taught me more about PyTorch than years of using it started with a simple training loss plateau... ended up digging through optimizer states, memory layouts, kernel dispatch, and finally understanding how PyTorch works!


Program Counter reposted

24 years old, still holds up

Program Counter reposted

RL is a pain sometimes, and torchforge could handle a lot of the messiness! Thanks to the @PyTorch and @CoreWeave teams for letting us test it out!

Today Meta announced torchforge, a brand-new PyTorch-native library that makes it easy to use reinforcement learning (RL) to train AI agents. Forge provides high-performance building blocks and ready-to-use examples, so you can focus on what’s novel about your use case rather…


Program Counter reposted

One of the best SIMD programmers I’ve had the pleasure of interacting with is becoming available. Real work, with real code, that you almost certainly interact with every single day.

I am looking for a job starting May 2026. I am an expert in SIMD programming, in particular for non-numeric applications such as text processing or database programming. Please have a look at my website for the sort of work I do. I am located in Berlin, Germany.



Program Counter reposted

At PyTorch 2025, where NVIDIA decided to unveil more details about cuTile and TileIR.

Program Counter reposted

The OG PyTorch blog, explaining the mechanics and concepts of the framework's internals. It basically equips you to explore the complete codebase, enabling better contributions. Definitely worth a read, then another! Blog by @ezyang

Program Counter reposted

A year and a half after starting the first draft of the first chapter, look what arrived in the mail!

Program Counter reposted

Good point. My first paper was on time travel in the Gödel universe. ML was easy to pick up after that :) journals.aps.org/prd/abstract/1…

We’ve found a ton of value hiring folks with strong theory backgrounds with little to no production ML experience. One of our members of technical staff got his phd in pure math/the geometry of black holes and had no prior ML experience. Within days of hiring him we released our…



Program Counter reposted

it is indeed blog post catch-up day (i'm behind by 6 weeks)

I guess it's blog post catch-up day instead of paper catch-up day x.com/omouamoua/stat…



Program Counter reposted

I've left NVIDIA Research and joined AIRoA Tokyo as Team Lead, VLA Dev. We're pushing VLA and building a Japan-wide real-world data ecosystem with major partners in retail/logistics/construction to deploy hundreds of humanoids. 🔥 We're hiring researchers; DM me if interested!


Program Counter reposted

bro @karpathy literally re-implemented the entire lm-eval-harness in 2 Python files. It's been very useful for my own repo and easy to adapt for the SuperBPE case.

Program Counter reposted

I'll be presenting Formalized Kernel Derivation to @GPU_MODE w/ @GioeleZardini; discord dot gg/gpumode at noon PST today! Will be uploaded to the GPU Mode YT afterward. Somewhere at the intersection of art and science. Come for the diagrams, stay for the math.

Program Counter reposted

RIP. Markov processes and Yang-Mills: tinyurl.com/3huay75x

Prof. Chen Ning Yang, a world-renowned physicist, Nobel Laureate in Physics, Academician of the Chinese Academy of Sciences, Professor at Tsinghua University, and Honorary Director of the Institute for Advanced Study at Tsinghua University, passed away in Beijing due to illness…


Program Counter reposted

Check out our latest work on Gaussian Splatting for LiDAR with 3DGUT!

[1/N] Excited to introduce "SimULi: Real-Time LiDAR and Camera Simulation with Unscented Transforms." We extend 3DGUT with LiDAR support and render a wide range of sensors 10-20x faster than ray tracing and 1.5-10x faster than prior rasterization work. research.nvidia.com/labs/sil/proje…


