adi

@adonis_singh

17 • contributor @_mcbench • ex @lmstudio

didn't expect a @twominutepapers face reveal today

kind of crazy to think that gpt-4 was $30/million in and $60/million out


seems a bit too little, a bit too late. but i haven't tested it much myself yet

Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵


those dots are doin' a lot of work

Runway Gen-4.5 represents significant advancements in both pre-training data efficiency and post-training techniques for video models and serves as our new foundation model for world modeling. Gen-4.5 scored 1,247 Elo points in the Artificial Analysis Text to Video leaderboard,…


THEY COOKED SO HARD

🚀 Launching DeepSeek-V3.2 & DeepSeek-V3.2-Speciale: Reasoning-first models built for agents!
🔹 DeepSeek-V3.2: Official successor to V3.2-Exp. Now live on App, Web & API.
🔹 DeepSeek-V3.2-Speciale: Pushing the boundaries of reasoning capabilities. API-only for now.
📄 Tech…


Opus 4.5 when asked about its deepest insight on humans

it's been 3 years but feels like 30, excited about the future

today we launched ChatGPT. try talking with it here: chat.openai.com



this is so cool. i really feel like another sector of benchmarking is emerging that focuses less on qa/coding/math and such and more on how the model feels to actually use. i am a big fan of this

🗳️ We made AIs vote in elections around the world to analyse their political beliefs. The results reveal a significant gap between choices at the ballot box and the preferences expressed by the machines. Across countries, most models show a consistent tilt toward left‑wing and…


opus 4.5 seems a bit more sentient


nb is just so cool, besides being able to make photorealistic stuff

Nano Banana vs Nano Banana Pro
We’re cooked. 💀


who genuinely scrolls through photos on instagram? i don't know a single person that does this

Yeah, nano banana is gonna kill instagram


"something important" JUST SAY IT

One point I made that didn’t come across:
- Scaling the current thing will keep leading to improvements. In particular, it won’t stall.
- But something important will continue to be missing.



veo 4 is probably pretty soon


kimi is slowly becoming my favorite lab, not OS lab, just lab in general

It's not easy to gatekeep this 😭 bc it's way too impressive
TL;DR:
> It's editable NotebookLM Slides
> Designer-level infographic
> Unlimited nano banana usage in slides (only in next 48h)


"And it's open weights"

"Sir, Deepseek just dropped DeepSeek-Math-v2 and it beats Gemini DeepThink on IMO ProofBench and CNML"


isn't it crazy that we have machines that do this?

apparently Opus 4.5 will start THINKING INSIDE OF FILES if you turn reasoning traces off


this is why i pay for an internet connection
