Scale AI

@scale_AI

making AI work

scale.com

Joined July 2016

2K posts, 71K followers, 478 following

Scale AI reposted

Bing Liu

@vbingliu

Oct 9

🔄RLHF → RLVR → Rubrics → OnlineRubrics
👤 Human feedback = noisy & coarse
🧮 Verifiable rewards = too narrow
📋 Static rubrics = rigid, easy to hack, miss emergent behaviors
💡We introduce OnlineRubrics: elicited rubrics that evolve as models train. arxiv.org/abs/2510.07284

Scale AI reposted

Jason Droege

@jdroege

Oct 9

Sat down with @lennysan to talk about where AI is headed and how we’re making it work for model builders, enterprises and governments. Also went down memory lane about my time at Uber Eats. 🙂

Lenny Rachitsky

@lennysan

Oct 9

In his first in-depth interview since taking over as @scale_AI CEO, @jdroege shares: 🔸 What actually happened with Meta’s $14 billion investment 🔸 Where frontier labs are heading next 🔸 Why most enterprise data is useless for AI models 🔸 What it takes to keep making AI model…

Scale AI

@scale_AI

Oct 1

“I think one of the misunderstandings is that AI is this magic wand or it can solve all problems, and that’s not true today. But there is a ton of value when you get it right.” Our CEO @jdroege shared his AI success framework with CNN's @claresduffy. cnn.com/2025/09/30/tec…

The artificial intelligence industry has a big problem: 95% of companies that try AI aren’t making any money from it, according to a report from the Massachusetts Institute of Technology last month....

Most companies aren’t seeing a return on AI investments. This tech CEO wants to change that | CNN...

Source: cnn.com

Scale AI reposted

Bing Liu

@vbingliu

Oct 1

New @Scale_AI paper! The culprit behind reward hacking? We trace it to misspecification in high-reward tail. Our fix: rubric-based rewards to tell “excellent” responses apart from “great.” The result: Less hacking, stronger post-training! arxiv.org/pdf/2509.21500

Scale AI reposted

Jason Droege

@jdroege

Sep 22

We’re introducing SEAL Showdown, the AI leaderboard that actually captures real preferences, powered by a platform used by real people. Public benchmarks today rely on contrived tasks or narrow user groups. That leaves us guessing which models are actually preferred by people.…

Scale AI reposted

Bing Liu

@vbingliu

Sep 20

🚀 Introducing SWE-Bench Pro — a new benchmark to evaluate LLM coding agents on real, enterprise-grade software engineering tasks. This is the next step beyond SWE-Bench: harder, contamination-resistant, and closer to real-world repos.
