
Leland Richardson

@intelligibabble

Technical Staff @AnthropicAI Prev: Jetpack Compose @Google. I like learning, discussing, and diving into challenges.

Leland Richardson reposted

Why is pasting into VSCode Terminal slow? Because it sleeps for 5ms every 50 characters.
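To put the figures in that post in perspective, here is a minimal back-of-the-envelope sketch. It assumes the sleeps are the only cost and that one sleep follows each full chunk of 50 characters; `min_paste_delay_ms` is a hypothetical helper written for illustration, not anything from the VSCode codebase.

```python
CHUNK_SIZE = 50   # characters written between sleeps (figure from the post)
SLEEP_MS = 5      # milliseconds slept per chunk (figure from the post)


def min_paste_delay_ms(num_chars: int) -> int:
    """Lower bound on paste time in ms, counting only the sleeps.

    Assumption: one sleep follows each full chunk of CHUNK_SIZE characters.
    """
    full_chunks = num_chars // CHUNK_SIZE
    return full_chunks * SLEEP_MS


for size in (1_000, 10_000, 100_000):
    print(f"{size} chars: at least {min_paste_delay_ms(size)} ms")
```

Under these assumptions throughput is capped at about 10,000 characters per second, so a 100,000-character paste would spend at least 10 seconds sleeping.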

Leland Richardson reposted

An important lesson that ARC-AGI has internalized, but not many others have, is that benchmark perf is a function of test-time compute. @OpenAI publishes single-number benchmark results because it's simpler and people expect to see it, but ideally all evals would have an x-axis.

A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task

Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task

This represents a ~390X efficiency improvement in one year
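The ~390X figure can be sanity-checked from the two per-task costs quoted in the post. This is only a rough sketch: it assumes the ratio is a plain cost-per-task quotient and ignores the difference in score (88% vs. 90.5%). The variable and function names are illustrative.

```python
# Costs per task quoted in the post (USD).
o3_preview_cost = 4500.00   # "est. $4.5k/task"
gpt_5_2_pro_cost = 11.64    # "$11.64/task"


def efficiency_ratio(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper the new cost per task is."""
    return old_cost / new_cost


ratio = efficiency_ratio(o3_preview_cost, gpt_5_2_pro_cost)
print(f"{ratio:.1f}x cheaper per task")  # prints "386.6x cheaper per task"
```

Rounded to the nearest ten, the quotient agrees with the "~390X" stated in the post.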


Leland Richardson reposted

If you understand the world you know it's actually @AmandaAskell .

Japanese Prime Minister Sanae Takaichi rockets to number 3 on the Forbes World's Most Powerful Women list, behind Christine Lagarde and Ursula von der Leyen forbes.com/lists/power-wo…


Leland Richardson reposted

derivedStateOf is explained in depth in this great post by Zach Klippenstein, including its performance overhead. Zach's posts never disappoint. Give it a read! blog.zachklipp.com/how-derivedsta…


Claude Code on Android! Check it out - just the beginning, lots more useful things coming, but I've been finding it surprisingly useful!

You can now run Claude Code tasks from the Claude Android app, in research preview. Kick off cloud-based tasks from your phone, let Claude run, then pick them up later to review and merge work. Download Claude for Android: play.google.com/store/apps/det…



FYI we just shipped Claude Code on Android!

Today we’re shipping 3 more updates for Claude Code:

- Claude Code on Android
- Hotkey model switcher
- Context window info in status lines


I'm genuinely surprised / impressed that we broke the ARC-AGI-2 50% mark in 2025.

GPT-5.2 evals


Leland Richardson reposted

I think I’ve published the first research article in theoretical physics in which the main idea came from an AI - GPT5 in this case. The physics research paper itself (on QFT and state-dependent quantum mechanics) has been published in Physics Letters B. I've written an…

Leland Richardson reposted

One of you AI companies better buy us next or I'm going to make a bunch of breaking changes and invalidate all of the code you generate.


Leland Richardson reposted

few will understand this. for the last few decades, UI/UX has been built on a stable background assumption: the human is the only general intelligence in the loop. software is procedural. UI's job is to make that procedure discoverable and efficient: - show the state of the…

UI is pre-AI.



Me: Ok. This is a really tricky bug. Gonna turn on thinking for Opus bc honestly, i don't even know if it'll be able to fix it *even with thinking*.
Opus: *reads one file, zero thinking tokens, edits file*. "Fixed!"
Me: you don't have to be a jerk about it...


Leland Richardson reposted

Google's vision of AI Tutors is amazing. This is Project Astra which is now productized inside Gemini:


Really happy to see this research on reward hacking and misalignment get published by Anthropic: anthropic.com/research/emerg… When I first heard about this internally it was a real "holy sh*t" moment for me. Reading through some of these transcripts is legitimately surreal. Aside…


Leland Richardson reposted

An emergent capability of Nano Banana Pro that took me by surprise: the ability to generate beautiful & accurate charts that are to scale.

Leland Richardson reposted

My most amusing interaction was where the model (I think I was given some earlier version with a stale system prompt) refused to believe me that it is 2025 and kept inventing reasons why I must be trying to trick it or playing some elaborate joke on it. I kept giving it images…

Cosign. It's interesting to think about what would be different if iOS and Android didn't have these astronomical toll booths for just allowing users the privilege of *installing software* on the pocket computer they bought

It's truly depressing to see the Android org's leadership repeatedly shooting themselves in the foot. It takes 20 years to build a reputation and five minutes to ruin it. f-droid.org/en/2025/10/28/… Users should be able to choose what software they run on the devices they own.



I recently decided to put a few of my astro photos on display in my office. Ended up with a few of these 18x12 metal prints, which I think really show off some punchy contrast! Very happy with how this ended up looking - will probably be filling this out with another 6 or so if…
