
Mechanical Dirk

@mechanicaldirk

Principal Engineer at @allen_ai. Engineering Lead of the OLMo project.

Love you too Cody 😃

People sleep on OLMo 2, but it's one of the best-trained OSS models per training FLOP out there. The @allen_ai data team is cracked AF.



The goal of AI coding tools doesn't need to be to write the code for me. That part is easy. AI needs to save me from having to look up documentation every two lines.


Mechanical Dirk reposted

Thanks for all the feedback from the OSS community! Our olmOCR lead @jakepoznanski shipped a new model fixing a lot of issues, plus some more optimization for better throughput. Have fun converting PDFs!

📝 olmOCR v0.2.1 has arrived with new models! Our open-source OCR engine now reads tougher docs with greater precision—and it's still 100% open. 👇



Are we just alternating conferences between Vancouver and Vienna now? Because honestly, I'm down.

I'll be in Vienna for ACL next week! Email and DMs open: 1. Always excited to talk about FLaNN theory and pre-pretraining on formal languages. 2. Open pretraining (e.g., OLMo). 3. Advice for junior faculty. 4. I'm recruiting PhD students this fall.



Mechanical Dirk reposted

6m later "Nobel Prize is actually a poor measure of intelligence. In this paper we show that ..."


Mechanical Dirk reposted

Product idea: Notion except every keystroke doesn't feel like I'm SSH'd into a server on Mars.


Mechanical Dirk reposted

Our new ICML 2025 oral paper proposes a new unified theory of both Double Descent and Grokking, revealing that both of these deep learning phenomena can be understood as being caused by prime numbers in the network parameters 🤯🤯 🧵[1/8]


Mechanical Dirk reposted

The bottleneck in AI isn't just compute - it's access to diverse, high-quality data, much of which is locked away due to privacy, legal, or competitive concerns. What if there was a way to train better models collaboratively, without actually sharing your data? Introducing…


Introducing FlexOlmo, a new paradigm for language model training that enables the co-development of AI through data collaboration. 🧵



Mechanical Dirk reposted

🚨 Just announced: OLMo, Molmo & Tülu are now LIVE on the Cirrascale Inference Platform! It's official: Cirrascale is the first to offer commercial inference endpoints for @Ai2's OLMo, Molmo & Tülu models on our Inference Platform. Our Inference Platform provides a fully open,…


Mechanical Dirk reposted

My latest post: The American DeepSeek Project. Build fully open models in the US in the next two years to enable a flourishing, global scientific AI ecosystem, balancing China's surge in open source and offering an alternative to building products on top of leading closed models.


This project is a perfect model of an OLMo contribution. Well scoped, practical, with sound theoretical underpinnings, and @lambdaviking submitted the paper 24h before the deadline 😍. Integrated into the code here: github.com/allenai/OLMo-c…

As we’ve been working towards training a new version of OLMo, we wanted to improve our methods for measuring the Critical Batch Size (CBS) of a training run, to unlock greater efficiency, but we found gaps between the methods in the literature and our practical needs for training…

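For readers curious what measuring a Critical Batch Size can look like in practice, here is a minimal sketch of one classic approach, the gradient noise scale of McCandlish et al. (2018). This is not necessarily the method in the work above; the model, batch sizes, and gradient-norm measurements are hypothetical placeholders.

```python
# A minimal sketch, assuming the gradient-noise-scale estimator of
# McCandlish et al. (2018) -- not necessarily the paper's own method.
import torch

def grad_norm_sq(model: torch.nn.Module) -> float:
    """Squared L2 norm of the current gradients, summed over all parameters."""
    return sum(p.grad.pow(2).sum().item()
               for p in model.parameters() if p.grad is not None)

def critical_batch_size(g2_small: float, g2_big: float,
                        b_small: int, b_big: int) -> float:
    """Estimate B_simple = S / |G|^2 from gradient norms at two batch sizes.

    g2_small, g2_big: averages of |g_B|^2 over many batches of size b_small < b_big.
    """
    # Unbiased estimate of the true (full-batch) gradient norm squared ...
    true_g2 = (b_big * g2_big - b_small * g2_small) / (b_big - b_small)
    # ... and of the per-example gradient noise S.
    noise_s = (g2_small - g2_big) / (1.0 / b_small - 1.0 / b_big)
    return noise_s / true_g2  # roughly the critical batch size
```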


The #1 question we get is: when will we have an OLMo 1B? We finally do!

We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.



Mechanical Dirk reposted

In Singapore @iclr_conf - feel free to come by our OLMoE Oral! Meta recently switched from Dense to MoEs for Llama 4 but hasn't released many details on this yet --- We'll explore MoEs vs Dense & other MoE insights!


Mechanical Dirk reposted

🔭 Science relies on shared artifacts collected for the common good. 🛰 So we asked: what's missing in open language modeling? 🪐 DataDecide 🌌 charts the cosmos of pretraining—across scales and corpora—at a resolution beyond any public suite of models that has come before.

Ever wonder how LLM developers choose their pretraining data? It’s not guesswork— all AI labs create small-scale models as experiments, but the models and their data are rarely shared. DataDecide opens up the process: 1,050 models, 30k checkpoints, 25 datasets & 10 benchmarks 🧵



Mechanical Dirk reposted

Reinforcement learning has shown success in eliciting reflection from LLMs, but what if this capability actually manifests earlier in pre-training? We investigated this question and our results are surprising 👇 [1/4]


Mechanical Dirk reposted

Today we're unveiling OLMoTrace, a tool that enables everyone to understand the outputs of LLMs by connecting to their training data. We do this on unprecedented scale and in real time: finding matching text between model outputs and 4 trillion training tokens within seconds. ✨

For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦
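As a toy illustration of the underlying idea only (not Ai2's actual implementation, which matches against trillions of training tokens in seconds), here is a sketch of finding exact n-gram spans shared between a model output and a small corpus. The tokenization, n-gram length, and example strings are all made up for the example.

```python
# Toy n-gram span matching between a model output and a tiny "training corpus".
# Purely illustrative; real systems use far more scalable indexes.
from collections import defaultdict

def build_index(corpus_tokens, n=5):
    """Map every n-gram in the corpus to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(corpus_tokens) - n + 1):
        index[tuple(corpus_tokens[i:i + n])].append(i)
    return index

def matching_spans(output_tokens, index, n=5):
    """Yield (start_in_output, positions_in_corpus) for every exact n-gram match."""
    for i in range(len(output_tokens) - n + 1):
        hits = index.get(tuple(output_tokens[i:i + n]))
        if hits:
            yield i, hits

# Usage with whitespace "tokens" (hypothetical strings):
corpus = "the quick brown fox jumps over the lazy dog".split()
output = "a very quick brown fox jumps over the lazy cat".split()
print(list(matching_spans(output, build_index(corpus))))
```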



Mechanical Dirk reposted

From ‘black box to glass box’: Ai2 (@allen_ai) links AI outputs to training data in breakthrough for transparency geekwire.com/2025/from-blac… via @GeekWire


Biggest one yet! Scroll to the bottom of the blog post (allenai.org/blog/olmo2-32B) for a few fun training stories.

Announcing OLMo 2 32B: the first fully open model to beat GPT 3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks. Comparable to best open-weight models, but a fraction of training compute. When you have a good recipe, ✨ magical things happen when you scale it up!



Mechanical Dirk reposted

people are talking about whether scaling laws are broken or pretraining is saturating. so what does that even mean? consider the loss curves from our recent gemstones paper. as we add larger models, the convex hull doesn’t flatten out on this log-log plot. that's good!
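A quick way to see what "the convex hull doesn't flatten out" means: plot each model size's loss-vs-compute curve on log-log axes and take the lower envelope across sizes. The sketch below uses entirely synthetic, Chinchilla-style numbers, not the Gemstones data; the constants, exponents, and model sizes are arbitrary assumptions.

```python
# Toy illustration (synthetic numbers, NOT the Gemstones results): the
# compute-optimal frontier is roughly the lower envelope / convex hull of
# per-model-size loss curves on a log-log plot. If adding larger models keeps
# pushing that envelope down, pretraining isn't saturating.
import numpy as np
import matplotlib.pyplot as plt

flops = np.logspace(18, 23, 200)          # training compute in FLOPs
model_sizes = [1e8, 1e9, 1e10, 1e11]      # hypothetical parameter counts N

losses = []
for n in model_sizes:
    tokens = flops / (6 * n)              # rule of thumb: FLOPs ~ 6 * N * D
    loss = 1.69 + 406.4 / n**0.34 + 410.7 / tokens**0.28   # toy scaling form
    losses.append(loss)
    plt.loglog(flops, loss, label=f"N = {n:.0e}")

# Lower envelope across model sizes approximates the compute-optimal frontier.
plt.loglog(flops, np.min(np.stack(losses), axis=0), "k--", label="frontier")
plt.xlabel("training FLOPs"); plt.ylabel("loss"); plt.legend(); plt.show()
```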


Mechanical Dirk reposted

Introducing olmOCR, our open-source tool to extract clean plain text from PDFs! Built for scale, olmOCR handles many document types with high throughput. Run it on your own GPU for free—at over 3000 token/s, equivalent to $190 per million pages, or 1/32 the cost of GPT-4o!
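As a back-of-the-envelope check on how a throughput number turns into a per-page price: the only figure below taken from the tweet is the ~3,000 tokens/s; the GPU rental price and tokens-per-page are my own rough assumptions, so the result only approximately reproduces the quoted $190 per million pages.

```python
# Rough cost arithmetic under assumed inputs (GPU price and tokens/page are guesses).
tokens_per_sec = 3000          # throughput quoted in the tweet
gpu_dollars_per_hour = 2.00    # assumption: typical on-demand price for one GPU
tokens_per_page = 1000         # assumption: rough output length per PDF page

cost_per_token = gpu_dollars_per_hour / (tokens_per_sec * 3600)
cost_per_million_pages = cost_per_token * tokens_per_page * 1_000_000
print(f"${cost_per_million_pages:,.0f} per million pages")  # ~ $185 with these assumptions
```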

