Program Counter

@program_counter

all things toward agi

Valhalla

Joined December 2022

1K Posts · 479 Followers · 8K Following

Program Counter reposted

Edward Z. Yang

@ezyang

Dec 2

TPU question: suppose I want to do a point wise operation on a buffer that is 90% padding, but the padding boundary is only known on device. How do I avoid wasting compute cycles for the padded regions?

Program Counter reposted

Ethan Lipnik

@EthanLipnik

Dec 2

vector.ethanlipnik.com

Vector - Intelligent Search for macOS

Lightning-fast search powered by on-device machine learning. Find apps, files, messages, and more — all without compromising your privacy.

Source: vector.ethanlipnik.com

Program Counter reposted

Ziqian Zhong ✈️ NeurIPS '25

@fjzzq2002

Dec 2

I ran gpt-oss over #NeurIPS2025 papers and tagged the mech interp / AI safety-ish ones. In case it's useful: docs.google.com/spreadsheets/d…

Program Counter reposted

Irwan Bello

@IrwanBello

Dec 2

Headed to #NeurIPS with the Reflection team this week! 👋 Keen to chat about LLMs, RL, agents, open research & science. We have a few open roles: jobs.ashbyhq.com/reflectionai

Program Counter reposted

raymond ma @ neurips ✈️

@rayhascode

Dec 2

im excited to be at NeurIPS in San Diego this week!! shoot me a DM if ur here or wanna hang! :D

Program Counter reposted

Jingfeng Wu

@uuujingfeng

Dec 1

Together with @yuxiangw_cs and Maryam Fazel, we are excited to present our tutorial "Theoretical Insights on Training Instability in Deep Learning" tomorrow at #NeurIPS2025! Link: uuujf.github.io/instability/ *picture generated by Gemini

Program Counter reposted

Lianmin Zheng

@lm_zheng

Dec 2

Very interesting optimizations to push the limit of GPUs. The team will test and try to integrate it.

SzymonOzog (NeurIPS)

@SzymonOzog_

Dec 1

Releasing Alpha-MoE: Megakernel for fast Tensor Parallel Inference! Up to 200% faster execution of MoE layer in SGLang, with 17% higher average throughput on Qwen3-Next-80B, and 10% higher average throughput on DeepSeek Proud to showcase my recent work at @Aleph__Alpha🧵

Program Counter reposted

Saman Habibi Esfahani

@Saman_Habibi_E

Dec 1

For those interested in the intersection of Geometry and machine learning: Charles Fefferman (Fields Medalist) recently gave a great online talk at Harvard CMSA on "extrinsic and intrinsic manifold learning, old and new". Very interesting talk. Link: youtube.com/watch?v=XUv6re…

Program Counter reposted

Xueyan Zou @ NeurIPS

@xyz2maureen

Dec 1

I will join Tsinghua University, College of AI, as an Assistant Professor in the coming month. I am actively looking for 2026 spring interns and future PhDs (ping me if you are in #NeurIPS). It has been an incredible journey of 10 years since I attended an activity organized by…

Program Counter reposted

Jonathan Whitaker

@johnowhitaker

Dec 1

This video by @jbhuang0604 manages to cram in all the core pieces of modern attention variants, a perfect refresher if you (like me) need a reminder of the differences between MHA, GQA, MLA, DSA etc :) youtube.com/watch?v=Y-o545…

Program Counter reposted

Aleksander Holynski

@holynski_

Dec 1

We’ve opened some more full-time & intern roles at Google DeepMind. Come work with us! DM if interested, or come find me at NeurIPS this week! ☀️

Program Counter reposted

Been Kim

@_beenkim

Dec 1

Coming to Neurips from 5-7th! Speaking at the mechanistic interpretability workshop mechinterpworkshop.com on the 7th about unmechanistic interpretability (as requested by the organizers) 🙃🙂 While I’ll miss this, our work will be demoed at Google booth eg veo zeroshot…

Program Counter reposted

Luke Darlow

@LearningLukeD

Dec 1

I'll be at NeurIPS 2025 in San Diego. I will be presenting our poster on Continuous Thought Machines (pub.sakana.ai/ctm) on Thursday afternoon: neurips.cc/virtual/2025/l… See you there?

Program Counter reposted

Luke Darlow

@LearningLukeD

Dec 1

I just came across this really good summary of @YesThisIsLion and my @MLStreetTalk interview (youtu.be/DtePicx_kFY): theneuron.ai/explainer-arti… Noice.

The Guy Who Invented the Transformer Just Said We Should Stop Using It; This Is What He Created Instead.

Continuous Thought Machine, Explained

Source: theneuron.ai

Program Counter reposted

Ravi Theja

@ravithejads

Dec 1

Super excited to share that I have joined the Applied AI, US team at @MistralAI. Grateful to @aviTwit3 and @sophiamyang for the opportunity, and thankful to the HR team, Charles, Alexandre, and Brian for making the transition seamless. A heartfelt thank you to @NirantK, Naveen…

Program Counter reposted

DurstewitzLab

@DurstewitzLab

Nov 30

Unlike current AI systems, brains can quickly & flexibly adapt to changing environments. This is the topic of our perspective in Nature MI (rdcu.be/eSeif), where we relate dynamical & plasticity mechanisms in the brain to in-context & continual learning in AI. #NeuroAI

Program Counter reposted

Andreas Köpf

@neurosp1ke

Nov 29

Lumine - 64 H100 are all you need. 2026 is going to be a crazy year for robotics.

Igor Kotenkov

@stalkermustang

Nov 15

I'm not sure why this new ByteDance Seed paper is not all over my feed. Am I missing something? - trained Qwen2VL-7B to play genshin - SFT only, no RL - 2424 hours of human gameplay + 15k short reasoning traces to decompose the tasks - sub 20k H100 hours (3 epochs) - heaps of…

Program Counter reposted

Aaron Lou

@aaron_lou

Nov 30

The Strategic Explorations team @OpenAI is looking to recruit researchers interested in working on the next frontier of language modeling! Feel free to reach out to me by email. @darkproger and I will also be at NeurIPS to connect and discuss in person.

Program Counter reposted

echo.hive

@hive_echo

Nov 30

train 8 neuron NN, expand it to 256 neuron NN and continue training from same accuracy and even gain a jump 1) we copy the small network’s 8 hidden units into the larger layer and duplicate the rest so the larger model computes exactly the same function as the small one at the…
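
The tweet above is cut off, so the exact recipe is not visible. As a rough sketch of the general idea it describes (widening a hidden layer without changing the function the network computes), here is one common way to do it, assuming a one-hidden-layer MLP with tanh units. The helper names `forward` and `widen`, the layer sizes, and the choice to split outgoing weights evenly across copies are illustrative assumptions, not the author's code.

```python
import math
import random

def forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP with tanh activation; weights are plain lists."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

def widen(W1, b1, W2, new_width):
    """Return (W1', b1', W2') with new_width hidden units and the same outputs."""
    old_width = len(W1)
    # Assumption: wide unit j is a copy of small unit j % old_width.
    source = [j % old_width for j in range(new_width)]
    copies = [source.count(i) for i in range(old_width)]
    # Incoming weights and biases are copied unchanged.
    W1_wide = [list(W1[s]) for s in source]
    b1_wide = [b1[s] for s in source]
    # Outgoing weights are divided by the number of copies, so the copies
    # together contribute exactly what the original unit contributed.
    W2_wide = [[row[s] / copies[s] for s in source] for row in W2]
    return W1_wide, b1_wide, W2_wide

# Hypothetical sizes: 4 inputs, 8 hidden units, 3 outputs.
rng = random.Random(0)
n_in, n_hidden, n_out = 4, 8, 3
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [rng.uniform(-1, 1) for _ in range(n_out)]

# Expand 8 hidden units to 256 and compare outputs on a random input.
W1_wide, b1_wide, W2_wide = widen(W1, b1, W2, 256)
x = [rng.uniform(-1, 1) for _ in range(n_in)]
small_out = forward(x, W1, b1, W2, b2)
wide_out = forward(x, W1_wide, b1_wide, W2_wide, b2)
max_diff = max(abs(a - b) for a, b in zip(small_out, wide_out))
print("largest output difference:", max_diff)
```

Exact copies receive identical gradients, so in practice a small amount of noise is usually added to the duplicated units before training continues; whether the tweet's script does this is not stated.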

echo.hive

@hive_echo

Nov 30

Evo strategy and Hebbian-learning work together to train a network to 80% accuracy! NO backprop The script alternates between these two learning paradigms, allowing ES to explore the weight space broadly while Hebbian learning fine-tunes the solutions. -Code in comment- trained…

Program Counter reposted

Matej Sirovatka

@m_sirovatka

Nov 30

After 3 weeks, we have concluded our first problem of the @GPU_MODE x @nvidia competition, NVFP4 GEMV. Thanks to everyone who has participated, we have collected over 40k submissions from >200 users. Congrats to the winners and good luck with the next problem, NVFP4 GEMM 🔥