Masato Yamada

@ai_luminary

AI expert @SimularAI | Improving lives with AI 🤖🌍

Masato Yamada reposted

*AI does something wild*

Skeptics: yawn. so what. it was trained on everything

AI read every book ever written... therefore... it can't do anything interesting??

Imagine meeting a HUMAN that read every book?

[Tweet image: AISafetyMemes]

Sorry, but anybody who still says "LLMs just predict text" is embarrassing themselves at this point

It is, as OpenAI's roon says, "categorically wrong"

[Tweet image: AISafetyMemes]


Masato Yamada reposted

APEX is a first-of-its-kind benchmark that evaluates AI models based on their ability to perform economically valuable knowledge work.

At the moment, GPT-5 is leading all metrics.

But remember, this is the worst it will ever be.

[Tweet image: kimmonismus]

AI has its PhD and now it’s on the job market. Introducing the AI Productivity Index (APEX), a benchmark that measures how well we’ve automated the most valuable industries in the world. Most benchmarks study abstract capabilities. APEX evaluates model performance on real…



Masato Yamada reposted

Verifier with NO trained parameters?🤯

It even outperforms GPT-4o in verification accuracy.

We at Tencent AI Lab introduce CLUE 🕵️: a verifier based on clustering, where successful vs. failed reasoning creates separate hidden states.

Paper link: huggingface.co/papers/2510.01…

[Tweet image: LiangZhenwen]

Masato Yamada reposted

Google just dropped a new Generative AI Python library for SQL Databases.

Introducing Google GenAI Toolbox.

This is what you need to know:

[Tweet image: mdancho84]

Masato Yamada reposted

Sam Altman says technology is outpacing our wisdom, which leaves society unbalanced

"this was the year AI got smarter than us"

Life goes on, but something fundamental has changed

The future is wide open: digital immortality, disease cures, no one knows how far it goes


Masato Yamada reposted

much more convinced after getting my own results:
LoRA with rank=1 learns (and generalizes) as well as full-tuning while saving 43% vRAM usage! allows me to RL bigger models with limited resources😆

script: github.com/sail-sg/oat/bl…

[Tweet image: zzlccc]

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.…

[Tweet image: thinkymachines]


Masato Yamada reposted

🚨BREAKING: Creative jobs could be obsolete by 2026

AI can now design, test, and optimize 1000s of catalog ads without a human team.

This founder raised $10M to build the platform killing UGC ads 🧵


Masato Yamada reposted

We found that visual foundation encoders can be aligned to serve as tokenizers for latent diffusion models in image generation!

Our new paper introduces a new tokenizer training paradigm that produces a semantically rich latent space, improving diffusion model performance🚀🚀.

[Tweet image: bowei_chen_19]

Masato Yamada reposted

To the extent that modern post-training as a paradigm violates the bitter lesson by being human engineered, it is evidence against a purist view of the bitter lesson. Not the other way around. AI is not abstract. It’s a highly human-centered, utility-driven engineering domain.

Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea…



Masato Yamada reposted

LLMs aren't a dead end or hitting a wall

their issues are solvable, but will take longer than many expect

the real gap is hype vs. reality

the dot-com bubble proved the internet thesis was right, just early and overbuilt

today feels similar: same pattern, bigger and faster


Masato Yamada reposted

Aravind Srinivas argues that generic financial advice is now replicable by AI faster and more thoroughly than any human. Models like Perplexity can read every analyst report, summarize market trends, and tailor strategies in real time. But human advisors remain valuable if they…


Masato Yamada reposted

I am very excited to announce that the third CPAL conference: cpal.cc is to be held at Tübingen, Germany in March 2026 (after Hong Kong and Stanford). Complementary to the arguably too many mega conferences on machine intelligence, CPAL aims to be focused and…


Masato Yamada reposted

A secret I believe that most don’t is this:

Computers are becoming more trusted than humans.

> Humans can lie, cheat, or change their minds.
> Computers, when governed by immutable code, can’t.
> The longer they prove this reliability, the more we hand them our trust.

That’s…

[Tweet image: ShaneMac]

Masato Yamada reposted

another day, another agent

introducing "Caesr", an AI agent that uses your computer like a human

what it does:
- automates tasks across desktop, mobile, and web
- runs on any device and any OS
- handles software testing beyond the web

No APIs. No integrations.

Introducing ⚡ Caesr by @ask_ui The AI agent that operates your computer like a human. Clicks, types, navigates — across desktop, mobile, and web. No APIs. No integrations. Just tell it what to do.



Masato Yamada reposted

Lovable cloud just dropped.

Here's what it means:
• Add APIs, databases, or even Stripe directly into your build in Lovable
• One-click OpenAI API
• Baked-in MCPs
• Custom backends

No more workarounds, just unlimited possibilities!


Masato Yamada reposted

@DynaRobotics successfully zero-shot folded a new @corl_conf shirt -- even with the sleeve tucked in awkwardly! Amazing stuff. @JasonMa2020

Everyone keeps talking about this demo, if anyone won corl it was these guys



Masato Yamada reposted

It blows my mind that you can do this now!

I opened @get_mocha and built an entire application in less than 30 minutes:
• the frontend
• the backend
• with a database
• with Stripe integration

I used to charge clients 5 figures to do all of this, and I finished it today in…


Masato Yamada reposted

From replatforming to edge-powered AI 🚀 Join VSCO CTO Chris Haire live for a behind-the-scenes look at VSCO’s modernization journey—and how they built Canvas, an AI tool scaled at the edge. Register now: cfl.re/48ERv7q

[Tweet image: Cloudflare]

Masato Yamada reposted

“Sir, Dario just dropped claude 4.5 and it beats GPT-5 in coding, agentic tasks, and computer use AND deepseek dropped new model with 10x cheaper inference 50%+ cheaper API costs…”

[Tweet image: ns123abc]

Masato Yamada reposted

New Microsoft Research paper stress-tests GPT-5, Gemini 2.5, GPT-4o & others on medical benchmarks and uncovers deep fragilities beneath the leaderboard wins.

Key Findings:
- Models still guess correctly even without images
- Reasoning is often fabricated or wrong
- Simple…

[Tweet image: WesRothMoney]
