array
@arrayailabs
Model Behavior Design & Engineering http://arrayailabs.com
AI Engineering 101 youtube.com/watch?v=qbvY0d…
AI Engineering 101 with Chip Huyen (Nvidia, Stanford, Netflix)
Microsoft did it again! Building with AI agents almost never works on the first try. You spend days tweaking prompts, adding examples, hoping it gets better. Nothing systematic, just guesswork. This is exactly what Microsoft's Agent Lightning solves. It's an open-source…
Now in private beta: Aardvark, an agent that finds and fixes security bugs using GPT-5. openai.com/index/introduc…
After ~4 years building SOTA models & datasets, we're sharing everything we learned in ⚡The Smol Training Playbook We cover the full LLM cycle: designing ablations, choosing an architecture, curating data, post-training, and building solid infrastructure. We'll help you…
New Anthropic research: Signs of introspection in LLMs. Can language models recognize their own internal thoughts? Or do they just make up plausible answers when asked about them? We found evidence for genuine—though limited—introspective capabilities in Claude.
Crawling isn't innate (unlike walking). Every baby must *invent* crawling, from scratch, using extremely little data, and no reference to imitate. Which is why different babies end up with different ways of crawling. Sometimes people tell me, "you say AI isn't intelligent until…
Adaptable Intelligence. Multiple possible paths to an objective.
The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self…
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: alexiajm.github.io/2025/09/29/tin… Code: github.com/SamsungSAILMon… Paper: arxiv.org/abs/2510.04871
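The links above have the real details; as rough intuition, here's a hedged PyTorch sketch of a recursive-refinement loop in the TRM spirit: one tiny network alternates between updating a latent scratchpad and the current answer. All names (TinyNet, recursive_reason) and loop counts are illustrative, not the authors' code.

```python
# Minimal sketch of the recursive-refinement idea behind TRM (illustrative,
# not the paper's implementation).
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, x, y, z):
        # One refinement step conditioned on question x, answer y, latent z.
        return self.mlp(torch.cat([x, y, z], dim=-1))

def recursive_reason(net, x, n_cycles=3, n_latent_steps=6):
    y = torch.zeros_like(x)  # current answer embedding
    z = torch.zeros_like(x)  # latent "scratchpad" state
    for _ in range(n_cycles):
        for _ in range(n_latent_steps):
            z = net(x, y, z)  # refine the latent reasoning state
        y = net(x, y, z)      # refine the answer from the latent
    return y

net = TinyNet(dim=128)
x = torch.randn(2, 128)       # toy "question" embeddings
answer = recursive_reason(net, x)
```

The point of the recursion is that the same small set of weights is reused many times, trading parameters for iterated computation.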
nanochat d32, i.e. the depth 32 version that I specced for $1000, up from $100 has finished training after ~33 hours, and looks good. All the metrics go up quite a bit across pretraining, SFT and RL. CORE score of 0.31 is now well above GPT-2 at ~0.26. GSM8K went ~8% -> ~20%,…
✍️
New paper: You can make ChatGPT 2x as creative with one sentence. Ever notice how LLMs all sound the same? They know 100+ jokes but only ever tell one. Every blog intro: "In today's digital landscape..." We figured out why – and how to unlock the rest 🔓 Copy-paste prompt: 🧵
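The actual copy-paste prompt is in the linked thread; as a stand-in, here's an illustrative prompt in the same spirit (my wording, not the paper's): ask for a distribution of answers rather than the single most typical one.

```python
# Illustrative only: the general "ask for several responses with
# probabilities" trick the tweet alludes to. NOT the paper's exact prompt.
prompt = (
    "Generate 5 different responses to the task below, each with a "
    "probability reflecting how likely you'd be to produce it. "
    "Sample across your full distribution, not just the most typical answer.\n\n"
    "Task: write a one-line blog intro about LLM creativity."
)
print(prompt)  # paste as the user message to the model
```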
What's your take on model behavior?
We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have…
Introducing Veo 3.1 and Veo 3.1 Fast, our latest state-of-the-art video models with:
- richer native audio
- better cinematic styles
- reference to video
- transitions between frames
- video extensions
very cool!
Introducing NotebookLM for arXiv papers 🚀 Transform dense AI research into an engaging conversation With context across thousands of related papers, it captures motivations, draws connections to SOTA, and explains key insights like a professor who's read the entire field
Very excited to share @theworldlabs's latest research work, RTFM!! It's a real-time, persistent, and 3D-consistent generative World Model running on *a single* H100 GPU! Blog and live demo are available below! 🤩
Generative World Models will inevitably be computationally demanding, potentially scaling beyond even the requirements of today’s LLMs. But we believe they are a crucial research direction to explore in the future of rendering and spatial intelligence. worldlabs.ai/blog/rtfm
A big part of our mission at Thinking Machines is to improve people’s scientific understanding of AI and work with the broader research community. Introducing Connectionism today to share some of our scientific insights.
Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to…
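To make the kernel-numerics angle concrete, here's a toy sketch of the usual root cause (my illustration, not code from the post): floating-point addition isn't associative, so reductions that run in a different order can round to different results.

```python
# Toy demo: float addition is order-dependent, so a kernel that reduces
# in a different order (e.g., at a different batch size) can return
# different bits for the "same" computation.
import torch

print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False: rounding depends on order

torch.manual_seed(0)
x = torch.randn(100_000, dtype=torch.float32)
# Same numbers, reversed accumulation order; usually a small nonzero gap.
print(x.sum().item() - x.flip(0).sum().item())
```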
Research research research research research research research research research research research research research research research research research research research research research research research research research research research research research research research…
JSON prompting for LLMs, clearly explained:
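The explanation itself lives in the thread; here's a minimal sketch of what JSON prompting usually means in practice. The field names (task, constraints, output_format) are illustrative conventions, not a fixed standard.

```python
# Minimal sketch of "JSON prompting": express the instruction as an
# explicit JSON object instead of free-form prose, so the task, the
# constraints, and the expected output schema are unambiguous.
import json

prompt = json.dumps({
    "task": "summarize",
    "input": "<article text here>",
    "constraints": {"max_sentences": 3, "tone": "neutral"},
    "output_format": {"summary": "string", "keywords": ["string"]},
}, indent=2)

print(prompt)  # paste as the user message to the model
```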
Couldn't resist. Here's a pure PyTorch from-scratch re-implementation of Gemma 3 270M in a Jupyter Notebook (uses about 1.49 GB RAM): github.com/rasbt/LLMs-fro…
Gemma 3 270M! Great to see another awesome, small open-weight LLM for local tinkering. Here's a side-by-side comparison with Qwen3. Biggest surprise is that it only has 4 attention heads!
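If you want to verify the head count yourself, a quick config check works. I'm assuming the Hugging Face model id google/gemma-3-270m and the standard transformers config attribute names; gated models may require a login.

```python
# Inspect the architecture hyperparameters without downloading weights.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("google/gemma-3-270m")
print(cfg.num_attention_heads)   # attention heads
print(cfg.num_key_value_heads)   # KV heads (GQA)
print(cfg.hidden_size)           # model width
```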
Come join us on Thursday for a gpt-oss Deep Dive! We'll take a look at the model architecture, algorithmic gems, and other technical details of gpt-oss, OpenAI's first open-weight reasoning model. meetup.com/machine-learni…