
array

@arrayailabs

Model Behavior Design & Engineering http://arrayailabs.com

array reposted

Microsoft did it again! Building with AI agents almost never works on the first try. You spend days tweaking prompts, adding examples, hoping it gets better. Nothing systematic, just guesswork. This is exactly what Microsoft's Agent Lightning solves. It's an open-source…


array reposted

Now in private beta: Aardvark, an agent that finds and fixes security bugs using GPT-5. openai.com/index/introduc…


array reposted

After ~4 years building SOTA models & datasets, we're sharing everything we learned in ⚡The Smol Training Playbook

We cover the full LLM cycle: designing ablations, choosing an architecture, curating data, post-training, and building solid infrastructure.

We'll help you…


array reposted

New Anthropic research: Signs of introspection in LLMs. Can language models recognize their own internal thoughts? Or do they just make up plausible answers when asked about them? We found evidence for genuine—though limited—introspective capabilities in Claude.


array reposted

Crawling isn't innate (unlike walking). Every baby must *invent* crawling, from scratch, using extremely little data, and no reference to imitate. Which is why different babies end up with different ways of crawling. Sometimes people tell me, "you say AI isn't intelligent until…

Adaptable Intelligence. Multiple possible paths to an objective.



array reposted

The @karpathy interview

0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self…


array reposted

New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: alexiajm.github.io/2025/09/29/tin… Code: github.com/SamsungSAILMon… Paper: arxiv.org/abs/2510.04871
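For intuition, here is a minimal PyTorch sketch of the general recursive-refinement idea the abstract describes: a tiny network applied repeatedly, updating a latent state and a candidate answer over a fixed number of steps. The layer choices, sizes, and names below are illustrative assumptions, not TRM's actual architecture; see the linked code for the real implementation.

```python
# Toy sketch of recursive refinement with a tiny reused network.
# Illustrative only -- not the TRM architecture from the paper.
import torch
import torch.nn as nn

class TinyRecursiveSolver(nn.Module):
    def __init__(self, dim=128, n_steps=6):
        super().__init__()
        self.n_steps = n_steps                   # number of refinement passes
        self.encode = nn.Linear(dim, dim)        # embed the problem once
        self.update = nn.GRUCell(2 * dim, dim)   # tiny core reused at every step
        self.readout = nn.Linear(dim, dim)       # map latent state to an answer guess

    def forward(self, x):
        z = torch.zeros_like(x)                  # latent "scratchpad" state
        y = torch.zeros_like(x)                  # current answer guess
        h = self.encode(x)
        for _ in range(self.n_steps):            # same small network, applied recursively
            z = self.update(torch.cat([h, y], dim=-1), z)
            y = self.readout(z)                  # refine the answer from the new state
        return y

solver = TinyRecursiveSolver()
print(solver(torch.randn(4, 128)).shape)         # torch.Size([4, 128])
```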


array reposted

nanochat d32, i.e. the depth 32 version that I specced for $1000, up from $100, has finished training after ~33 hours, and looks good. All the metrics go up quite a bit across pretraining, SFT and RL. CORE score of 0.31 is now well above GPT-2 at ~0.26. GSM8K went ~8% -> ~20%,…


✍️


New paper: You can make ChatGPT 2x as creative with one sentence. Ever notice how LLMs all sound the same? They know 100+ jokes but only ever tell one. Every blog intro: "In today's digital landscape..." We figured out why – and how to unlock the rest 🔓 Copy-paste prompt: 🧵



What's your take on model behavior?

We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have…



array reposted

Introducing Veo 3.1 and Veo 3.1 Fast, our latest state of the art video models with:
- richer native audio
- better cinematic styles
- reference to video
- transitions between frames
- video extensions


very cool!

Introducing NotebookLM for arXiv papers 🚀

Transform dense AI research into an engaging conversation

With context across thousands of related papers, it captures motivations, draws connections to SOTA, and explains key insights like a professor who's read the entire field



array reposted

Very excited to share @theworldlabs's latest research work, RTFM!! It's a real-time, persistent, and 3D-consistent generative World Model running on *a single* H100 GPU! Blog and live demo are available below! 🤩

Generative World Models will inevitably be computationally demanding, potentially scaling beyond even the requirements of today’s LLMs. But we believe they are a crucial research direction to explore in the future of rendering and spatial intelligence. worldlabs.ai/blog/rtfm



array reposted

that's actually really awesome huggingface.co/chat/


array reposted

A big part of our mission at Thinking Machines is to improve people’s scientific understanding of AI and work with the broader research community. Introducing Connectionism today to share some of our scientific insights.

Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference” We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to…
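As a quick illustration of the kind of problem the post tackles (this snippet is not from the post itself): floating-point addition is not associative, so the same reduction computed in a different order, for example under different batching or kernel scheduling, can produce slightly different results.

```python
# Minimal illustration (not code from the blog post): floating-point addition is not
# associative, so the same sum computed in a different order can differ slightly.
import random

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

forward = sum(xs)                 # one reduction order
backward = sum(reversed(xs))      # same numbers, opposite order
print(forward == backward)        # frequently False
print(abs(forward - backward))    # tiny, but nonzero, discrepancy
```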



array reposted

Research research research research research research research research research research research research research research research research research research research research research research research research research research research research research research research…


array reposted

JSON prompting for LLMs, clearly explained:
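A hypothetical example of the pattern, not taken from the linked explainer: spell out the exact JSON schema in the prompt, then parse and validate the model's reply as data rather than free-form text. The `parse_reply` helper and the schema below are made up for illustration.

```python
# Hypothetical sketch of JSON prompting: pin down the schema in the prompt,
# then treat the model's reply as data instead of free-form text.
import json

prompt = """Extract the following fields from the review below and reply with JSON only,
no prose, matching exactly this schema:
{"product": string, "sentiment": "positive" | "negative" | "neutral", "issues": [string]}

Review: "The keyboard feels great, but two keys stopped working after a week."
"""

def parse_reply(reply: str) -> dict:
    # json.loads fails loudly if the model drifted away from pure JSON
    data = json.loads(reply)
    assert {"product", "sentiment", "issues"} <= set(data), "missing required fields"
    return data

# Simulated model reply, since the actual LLM call depends on your client library.
example_reply = '{"product": "keyboard", "sentiment": "negative", "issues": ["two keys stopped working"]}'
print(parse_reply(example_reply)["issues"])   # ['two keys stopped working']
```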


array reposted

Couldn't resist. Here's a pure PyTorch from-scratch re-implementation of Gemma 3 270M in a Jupyter Notebook (uses about 1.49 GB RAM): github.com/rasbt/LLMs-fro…
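For context on the memory figure, here is a rough back-of-the-envelope calculation (my own estimate, not from the notebook) of what 270M parameters cost to hold in memory:

```python
# Back-of-the-envelope check of why a 270M-parameter model fits in roughly 1-1.5 GB of RAM.
n_params = 270_000_000

for name, bytes_per_param in [("fp32", 4), ("bf16", 2)]:
    gb = n_params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.2f} GB for weights alone")

# fp32: ~1.08 GB for weights alone
# bf16: ~0.54 GB for weights alone
# The reported ~1.49 GB presumably also covers activations, the KV cache,
# and framework overhead on top of the raw weights.
```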


Gemma 3 270M! Great to see another awesome, small open-weight LLM for local tinkering. Here's a side-by-side comparison with Qwen3. The biggest surprise is that it only has 4 attention heads!



array reposted

Come join us on Thursday for a gpt-oss Deep Dive! We'll take a look at the model architecture, algo gems and other technical details of gpt-oss, OpenAI's latest and first open-weight reasoning model. meetup.com/machine-learni…

