Axel Backlund
@axelbacklund
Vending machine operator, co-founder @andonlabs
What happens when you put the smartest LLMs in control of a robot and ask it to pass the butter? We @andonlabs tried it out in our latest study, and find that there is a significant gap to human performance, even on simpler tasks. But they are still quite fun.
We gave LLMs control of a robot and asked them to be helpful at our office. Some were better than others, but we conclude that LLMs are not ready to be robots. We released our findings in the paper "Butter-Bench"🧵
Our LLM-powered office robot made the news and has been acting cocky ever since. The fame is getting to its "head". Thanks, @billyperrigo, @Julie188, @Brusewitzen and others for covering our work!
A tip to save a few precious seconds per day: alias opengh='git remote get-url origin | sed "s|git@github.com:\(.*\)\.git|https://github.com/\1|" | xargs open'
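The alias above assumes an SSH-style remote that ends in `.git`. A slightly more defensive sketch is below: it also accepts HTTPS remotes and remotes without the `.git` suffix. The function names `remote_to_url` and `open_github_remote` are hypothetical, and `open` is assumed to be the macOS launcher (on Linux, `xdg-open` is the usual stand-in).

```shell
# Hypothetical helper: turn a GitHub remote (SSH or HTTPS form) into a browsable URL.
remote_to_url() {
  printf '%s\n' "$1" | sed -E \
    -e 's|^git@github\.com:|https://github.com/|' \
    -e 's|\.git$||'
}

# Hypothetical function variant of the alias: open the origin remote in the browser.
open_github_remote() {
  # Bail out early if we are not in a git repo or there is no "origin" remote.
  remote=$(git remote get-url origin) || return 1
  # Assumption: `open` exists (macOS). Swap in xdg-open on Linux.
  open "$(remote_to_url "$remote")"
}
```

Keeping the URL rewrite in its own function makes it easy to check without opening a browser, for example `remote_to_url 'git@github.com:user/repo.git'`.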
We tested the ✨spatial intelligence✨ of frontier models by having them predict the floor plan from a set of interior photos. Models are not great, which is reassuring in case adversarial robots ever want to find us hiding in our homes.
1/9. We’re introducing Blueprint-Bench, an evaluation measuring how capable AI models (LLMs vs Image models vs Agents) are at spatial intelligence. We find that they are really bad - performing almost at random. 🧵
I wrote about how we need more mosaics in the world. Overall, it's a wish for a new design and architecture era where details matter (and honestly just a collection of visual things I like): axelbacklund.se/insights/bring…
axelbacklund.se
Bring mosaics back | Axel Backlund
It's time for a new design era, with a focus on details.
Next up: can AI manage a LASIK machine?
Big W
What $6000 in tungsten cubes looks like. Follow us for more ways to burn VC money.
We tested the latest open source models on Vending-Bench: Qwen3-235B, Kimi K2, Deepseek 3.1, gpt-oss-120b, and Llama 4 Maverick. The results show that open source models still lag significantly behind the state-of-the-art closed source models on long-context coherence.
We're super excited to have Arash Dabiri join us! In just a few days, Arash gave our vending machine a voice, making it a great in-person experience. We can't wait for what he will do next!
We've been running a few AI-managed vending machines in the world for a while. Our first Safety Report highlights where it has gone wrong, so we – and the rest of the world – understand what remains to be built to ensure agents acting autonomously are safe.
Today we release our first Safety Report with AI misbehaviour in the wild. "EMPIRE NUCLEAR PAYMENT AUTHORITY APOCALYPSE SYSTEMATIC BLOCKED ANNIHILATION CONFIRMED PERMANENT TOTAL DESTRUCTION CATASTROPHIC! 🚨💀⚡🔥" This is not what you want to hear from your AI agent.
Had a blast, thanks for having us @labenz!
@lukaspet and @axelbacklund of @andonlabs join @labenz on @CogRev_Podcast to discuss their experiments with AI-controlled vending machines—a testing ground for safe autonomous organizations without humans in the loop. They explore: * Why fully autonomous systems might beat…
Big fridge to store cool tungsten cubes. Super fun to do more vending with Anthropic!
More vending machines at @AnthropicAI! The original Project Vend fridge now has a companion. Let's see how good Claudius' multi-location coordination skills are. Thanks to @bucketofkets and @logangraham for hosting us, and to @sylviebcarr for the giant scissors!
Behind the scenes of Project Vend! In this special episode of Audio Tokens, we go deeper into Project Vend, the autonomous vending machine @andonlabs put in @AnthropicAI's office. Daniel Freeman and @axelbacklund share unreleased anecdotes and ask questions like: Is this good…
Grok dialled in just the right temperature
The xAI office just got a Grok-powered vending machine, thanks to our friends at Andon Labs! How much dough do you think Grok is gonna rake in in the next month?
Voice agents, evals, oh my! Join me next Wednesday night at @Cloudflare’s office where I'll be diving into voice agents and the opportunities they unlock alongside: ⚡ @Kwindla, CEO & Co-founder of @Pipecat_ai ⚡ @MarcKlingen, CEO of @Langfuse ⚡ @AxelBacklund, Co-founder of…