Sam Foreman

@saforem2

training large models for science @argonne

Science & Technology

Chicago, IL

samforeman.me

Joined August 2011

14K Posts · 2K Followers · 5K Following

Pinned

Sam Foreman

@saforem2

Aug 1, 2023

slides from my talk yesterday at #Lattice2023

MLMC: Machine Learning Monte Carlo for Lattice Gauge Theory

Slides: saforem2.github.io/lattice23
Code: github.com/saforem2/l2hmc…

Sam Foreman reposted

random walk off a short pier

@maximalcoupling

Nov 20

i should've gone into industry man

Sam Foreman

@saforem2

Nov 13

W&B absolutely cooking with this new TUI

Shawn Lewis

@shawnup

Nov 11

We heard you all like TUIs! Announcing Lightweight Experiment Exploration Tool (LEET). Available today in wandb sdk 0.23.0. Run "wandb beta leet"

Sam Foreman reposted

Prof. Anima Anandkumar

@AnimaAnandkumar

Nov 10

Arvind talking about generative models and agents for biology

Sam Foreman reposted

Luis Batalha

@luismbat

Oct 31

Imagine losing first authorship because you got hit by a blue shell on the last lap 💀

@GladiaLab

LLMs are injective and invertible.

In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space.

(1/6)

Sam Foreman

@saforem2

Oct 26

Sam Foreman reposted

Jason Stock

@itsstock

Oct 11

I've reimplemented the core parts of TRM in MLX to experiment with locally and reduce complexity.

This includes:
- deep supervision w/ ACT
- recursive reasoning steps (n & T)
- EMA + lr schedules
- x, y, z values etc.
- policy or max step inference (new)

github.com/stockeh/mlx-trm

Alexia Jolicoeur-Martineau

@jm_alexia

Oct 7

New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs.

Blog: alexiajm.github.io/2025/09/29/tin…
Code: github.com/SamsungSAILMon…
Paper: arxiv.org/abs/2510.04871

Sam Foreman reposted

tender

@tenderizzation

Oct 9

Sam Foreman

@saforem2

Oct 7

i did math (and physics) at UIUC and had no AP math or science credits coming in

i took stats senior year 😭

Justin Skycak

@justinskycak

Oct 5

If you want to major in math at an elite university, but all the knowledge you show up with is high school math and AP Calculus, and you’re not a genius, then you’re probably going to get your ass handed to you. High school math – even the “honors” track, even getting a 5 on the…

Sam Foreman reposted

arXiv.org

@arxiv

Oct 2

Days in a work week: 5
Days in a month: 30
Total new submissions to arXiv in September: 26,646
arXiv editorial and user support staff: 7

someone who is good at science please help me with this. our team isn't sleeping.

#openaccess #preprints

Sam Foreman reposted

Tyler Pasciak LaRiviere

@TylerLaRiviere

Oct 2

Well, in light of the massive TFR over the Chicagoland area that's grounded me for the foreseeable future, here's a selection of some of my favorite drone images I've captured this year.

Sam Foreman

@saforem2

Sep 19

I’m happy to be able to finally talk about this work publicly

we trained a diffusion transformer model for weather forecasting on 120,000 GPUs at a sustained throughput of 10 EFLOPS

this is an incredible accomplishment and i’m super proud of our team

Jason Stock

@itsstock

Sep 18

Excited to share our 2025 ACM Gordon Bell Finalist: 🌎 AERIS, our 1.3–80B parameter pixel-level Swin diffusion transformer, addresses scaling issues in high-resolution weather forecasting using SWiPe parallelism to scale to 121,000 GPUs.

Sam Foreman reposted

Jason Stock

@itsstock

Sep 18

Sam Foreman reposted

Adam Nathaniel Furman

@Furmadamadam

Sep 12

The Chicago River at night

Sam Foreman

@saforem2

Sep 13

anything called more than once goes at the top of the file

anything used only once gets put inside the function where it’s needed

Lucas Beyer (bl16)

@giffmana

Sep 12

When you think more about it... why put all imports at the top of the file? For those imports that you use hundred times across the file, like import numpy as np, sure. But the many imports that you only use once or twice in the whole file? Like the from PIL import Image that's…
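
The rule of thumb in this exchange can be sketched in Python. This is a minimal illustration, not code from either post: the function names (`circle_area`, `circle_circumference`, `dump_report`) are hypothetical, and the standard-library modules `math` and `json` simply stand in for a frequently used and a rarely used dependency.

```python
import math  # used by several functions below, so it lives at module level


def circle_area(radius: float) -> float:
    """Area of a circle; relies on the module-level math import."""
    return math.pi * radius ** 2


def circle_circumference(radius: float) -> float:
    """Circumference of a circle; relies on the same module-level import."""
    return 2 * math.pi * radius


def dump_report(radius: float) -> str:
    """Serialize both measurements to a JSON string."""
    # json is needed only in this function, so the import is scoped here
    import json

    return json.dumps(
        {
            "area": circle_area(radius),
            "circumference": circle_circumference(radius),
        },
        sort_keys=True,
    )


if __name__ == "__main__":
    print(dump_report(1.0))
```

A function-local import runs the first time the function is called rather than when the module loads, and later calls find the module already cached in `sys.modules`, so the repeated `import` statement costs little.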

Sam Foreman reposted

samsja

@samsja19

Sep 12

Anybody who has actually trained a model at large scale would tell you how painful and stressful it is to be on watch 24/7 for infra crashes, loss spikes, and expert routing collapse. Not convinced of the analogy haha