
Program Counter

@program_counter

all things toward agi

Program Counter reposted

Interestingly, it was still hard to tell when AI models gain better reasoning – during pre-training, mid-training, or RL.

Researchers at @CarnegieMellon found that each of them plays distinct roles:

- RL truly improves reasoning only in specific conditions
- Generalizing across…

Program Counter reposted

My machine learning education has progressed to the point where I lose sleep tossing around a “brilliant” idea, only to find the next day that it doesn’t actually work. This is great! I talked about this a few years ago: amasad.me/carmack


Program Counter reposted

Transformers v5 redesigns tokenization.

In this blog post we talk about:
> tokenization crash course
> tokenizers and transformers - the bridge
> v5 tokenizers backend

Major shoutout to @karpathy for his BPE video that got me interested in tokenization in the first place.
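
For context, a minimal sketch of the tokenizers/transformers bridge, assuming the standard AutoTokenizer API; the "gpt2" checkpoint is just an example, not one named in the post.

```python
# Minimal sketch of the tokenizers <-> transformers bridge via AutoTokenizer.
# "gpt2" is an example checkpoint, not taken from the blog post.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")   # loads a fast (Rust-backed) tokenizer

ids = tok.encode("all things toward agi")     # text -> BPE token ids
print(ids)                                    # list of ints
print(tok.convert_ids_to_tokens(ids))         # the BPE pieces behind those ids
print(tok.decode(ids))                        # ids -> text round-trip
```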

Program Counter reposted

Excited to announce Seed-Prover 1.5, which is trained via large-scale agentic RL with Lean. It proved 580/660 Putnam problems and proved 11/12 in Putnam 2025 within 9 hours. Check details at github.com/ByteDance-Seed…. We will work on autoformalization towards contributing to real math!

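To make "trained with Lean" concrete: the prover's output is a machine-checkable proof script. A toy Lean 4 example of that target format (my own illustration, not Seed-Prover output):

```lean
import Mathlib

-- Toy Lean 4 theorem of the kind such a prover emits and the kernel checks.
-- Real Putnam proofs are far longer but have the same machine-checkable shape.
theorem sq_nonneg_example (a : ℤ) : 0 ≤ a * a :=
  mul_self_nonneg a
```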

Program Counter reposted

Excited to share that our paper 🌊🤺 “CFC: Simulating Character–Fluid Coupling using a Two-Level World Model” has been accepted to #SIGGRAPHASIA2025! In this work, we build a two-level world model (neural physics) for rigid-body–fluid interaction and use it to train…


Program Counter reposted

Weak AVL trees are replacements for AVL trees and red-black trees. The insertion and deletion operations are inspired by the FreeBSD implementation (sys/tree.h), with the insertion further optimized. maskray.me/blog/2025-12-1…
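
For reference, the rank rule that defines weak AVL trees, as a hypothetical Python checker (field names are mine, not from the post): missing nodes have rank -1, leaves have rank 0, and every parent-child rank difference is 1 or 2. AVL trees are exactly the WAVL trees with no 2,2-nodes, which is why WAVL can stand in for both AVL and red-black trees.

```python
# Hypothetical WAVL rank-rule checker; names are illustrative, not from the post.
class Node:
    def __init__(self, key, rank=0, left=None, right=None):
        self.key, self.rank = key, rank
        self.left, self.right = left, right

def rank(n):
    return -1 if n is None else n.rank  # missing nodes have rank -1

def is_wavl(n):
    if n is None:
        return True
    if n.left is None and n.right is None and n.rank != 0:
        return False  # every leaf must have rank 0
    # every parent-child rank difference must be 1 or 2
    return all(
        n.rank - rank(c) in (1, 2) and is_wavl(c)
        for c in (n.left, n.right)
    )
```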


Program Counter reposted

Really grateful to @GPU_MODE for the opportunity to talk about my recent Tiny TPU project: 🧵youtube.com/watch?v=kccs9x….

Lecture 88: TinyTPU (youtube.com)


Program Counter reposted

Great new video! Should we make a question based on the titans paper?


Program Counter reposted

New NanoGPT WR from @ChrisJMcCormick at 130.2s, a 1.4s improvement! He has somehow found a way to make Muon even faster, along with several other optimizations to pre-multiply lambdas, update Normuon axis on gates, and reshape matrices. github.com/KellerJordan/m….

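For context on why a faster Muon matters here: Muon's per-step cost is dominated by a Newton-Schulz orthogonalization of each gradient matrix. A sketch of that core loop, with coefficients as in the public reference implementation (the actual record-setting code lives in the linked repo):

```python
import torch

def newton_schulz(G: torch.Tensor, steps: int = 5, eps: float = 1e-7):
    # Quintic Newton-Schulz iteration at the heart of Muon: pushes the
    # gradient matrix toward the nearest (semi-)orthogonal matrix.
    # Coefficients follow the public reference implementation.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G.bfloat16()
    if G.size(0) > G.size(1):
        X = X.T                      # iterate on the wide orientation
    X = X / (X.norm() + eps)         # scale so the spectral norm is <= 1
    for _ in range(steps):
        A = X @ X.T
        B = b * A + c * (A @ A)
        X = a * X + B @ X
    if G.size(0) > G.size(1):
        X = X.T
    return X
```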

Program Counter reposted

My old causal inference reading lists:

Intro to Causal Inference (undergrad stats, Yale, 2021): stat.berkeley.edu/~winston/causa…
Causal Inference & Research Design (grad political science seminar, Yale, 2019): stat.berkeley.edu/~winston/causa…


Program Counter reposted

Check out the "Learning JAX" video series if you're interested in learning JAX!

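If you want a taste before watching: JAX's core idea is composable function transformations (grad to differentiate, jit to compile with XLA, vmap to vectorize). A minimal self-contained example:

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # mean-squared error of a linear model
    return jnp.mean((x @ w - y) ** 2)

grad_fn = jax.jit(jax.grad(loss))   # compiled gradient w.r.t. w

w = jnp.zeros(3)
x = jnp.ones((4, 3))
y = jnp.ones(4)
print(grad_fn(w, x, y))             # dL/dw, shape (3,)
```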

Program Counter reposted

I made it into Terry Tao’s blog! terrytao.wordpress.com/2025/12/08/the… One cool part of this experience is that I *would not have made the Claude Deep Research query resulting in the connection to Erdos 106 if not for Aristotle’s exact implementation*. i.e. Aristotle, an AI tool, contributed…


Program Counter reposted

An interesting research problem that might be solvable at big labs / @thinkymachines / @wandb: with enough data on training runs, can we make universal recommendations of good hyperparams, using stats of the dataset, loss fn, activations, size, etc.? It would save so much time and compute.


Program Counter reposted

A Recipe for Transformer+++:
GQA/TPA (arxiv.org/abs/2501.06425)
+ QKRMSNorm + Output Gate (arxiv.org/abs/2505.06708)
+ GRAPE/Alibi (github.com/model-architec…)
+ KV Shifting (Shortconv/canon layer)

Enjoy it!
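
As one concrete ingredient of the recipe, a toy PyTorch sketch of QKRMSNorm (my own minimal rendering, not the papers' code): normalizing queries and keys before the dot product keeps attention logits bounded, which stabilizes training at scale.

```python
import torch
import torch.nn.functional as F

def rmsnorm(x, eps=1e-6):
    # RMSNorm without a learned scale, applied over the head dimension
    return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

def qk_rmsnorm_attention(q, k, v):
    # QKRMSNorm: normalize q and k before computing attention scores
    q, k = rmsnorm(q), rmsnorm(k)
    return F.scaled_dot_product_attention(q, k, v)
```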


Program Counter reposted

If the #NeurIPS2025 app is crashing for you (like it is for me) to the point that it's unusable, here's a website with all the content/sessions: ml.ink


Program Counter reposted

I made another NeurIPS 2025 hiring list. More teams hiring research engineers, MLEs, SWEs:
@julianibarz at Tesla Optimus
@nlpmattg at @ScaledCognition
@msalbergo at Kempner Institute at Harvard
@chinwei_h at MSFT Research
@apsarathchandar at Chandar Lab
@stash_pomichter at…


Program Counter reposted

my first blogpost related to GPUs! this one looks at pyutils, a small but important part of the ThunderKittens library that allows kernels to be launched with PyTorch. enbao.me/posts/tk
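
For a sense of the mechanism such bindings wrap: PyTorch can compile and expose native functions to Python via its C++ extension API. A bare-bones hypothetical example (ThunderKittens' real kernels are CUDA and far more involved):

```python
# Hypothetical minimal binding: expose a native function to Python so
# PyTorch tensors can flow into it. A stand-in for a real kernel launch.
import torch
from torch.utils.cpp_extension import load_inline

cpp_source = """
torch::Tensor scale_by_two(torch::Tensor x) {
    return x * 2;  // stand-in for launching a real GPU kernel
}
"""

mod = load_inline(name="tk_demo", cpp_sources=cpp_source,
                  functions=["scale_by_two"])
print(mod.scale_by_two(torch.ones(3)))  # tensor([2., 2., 2.])
```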


Program Counter reposted

Even large VLAs can play ping-pong in real time! 🏓⚡️ In practice, VLAs struggle with fast, dynamic tasks: • slow reactions, jittery actions. • demos often shown at 5-10× speed to look “smooth”. We introduce VLASH: • future-state-aware asynchronous inference with >30Hz…


Program Counter reposted

Imagine the world if hardware folks settle on MXINT4 instead

Training LLMs with NVFP4 is hard because FP4 has so few values that I can fit them all in this post: ±{0, 0.5, 1, 1.5, 2, 3, 4, 6}. But what if I told you that reducing this range even further could actually unlock better training + quantization performance? Introducing Four…

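To make that tiny value set concrete, a toy round-to-nearest quantizer onto the FP4 (E2M1) grid quoted above (illustration only, not the paper's method; real NVFP4 training also uses per-block scaling so tensors fit this range):

```python
import torch

# The 16 representable FP4 (E2M1) values quoted in the tweet.
FP4_POS = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = torch.cat([-FP4_POS.flip(0), FP4_POS])

def quantize_fp4(x: torch.Tensor) -> torch.Tensor:
    # Toy round-to-nearest onto the FP4 grid.
    idx = (x.unsqueeze(-1) - FP4_GRID).abs().argmin(dim=-1)
    return FP4_GRID[idx]

print(quantize_fp4(torch.tensor([0.7, -2.4, 5.1])))
# tensor([ 0.5000, -2.0000,  6.0000])
```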


Program Counter reposted

Excited to announce that @Azaliamirh and I are launching @RicursiveAI, a frontier AI lab creating a recursive self-improving loop between AI and the hardware that fuels it. Today, chip design takes 2-3 years and requires thousands of human experts. We will reduce that to weeks.…

Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at ricursive.com


