Chi Nguyen

@NguyenSquared

Joined April 2020

5 Posts, 23 Followers, 66 Following

Chi Nguyen reposted

Caspar Oesterheld

Dec 16, 2024

How do LLMs reason about playing games against copies of themselves? 🪞We made the first LLM decision theory benchmark to find out. 🧵1/10

Chi Nguyen reposted

METR

Nov 22, 2024

How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.

Jean

@1Dvp59RHbCw856D

Daniel Filan

@dfrsrchtwts

Aengus Lynch

@aengus_lynch1

Inappropriately Deep ☄️

@KhemchandaniD

Khullani

@Khullani_

ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️

@DanielMiessler

John Mori

@hjohnmori

Rotteatear

@Rotteateare2rh

Brad pitt

@wilson_mar35320

Meihua King

@ltjawiji

Liu Ying

@Quinsealvaria

Ben Snow

@ben_snow6

James Lucassen

@jzlucassen

Yixin Lin

@yixin_lin_

Matīss Apinis

@matiss_apinis

Yannick Mühlhäuser

@yannick___m

Daniel Kokotajlo

@DKokotajlo

Ryan Kidd

@ryan_kidd44

Thomas Larsen

@thlarsen

DM

@da_me12

Shannon Yang

@shannonyangsky

Caspar Oesterheld

@C_Oesterheld

Finish It!

@FinishItPod

METR

@METR_Evals

Hannah Rose Kirk

@hannahrosekirk

Redwood Research

@redwood_ai

Yo Shavit

@yonashav

Max Nadeau

@MaxNadeau_

Eric Steinberger

@EricSteinb

Aengus Lynch

@aengus_lynch1

Laurens van der Maaten

@lvdmaaten

Samuel Marks

@saprmarks

Gillian Hadfield

@ghadfield

Rohin Shah

@rohinmshah

Zihan "Zenus" Wang

@wzihanw

Trenton Bricken

@TrentonBricken

Jerry Wei

@JerryWeiAI

Xander Davies

@alxndrdavies

Mike Krieger

@mikeyk

Misha Laskin

@MishaLaskin

Ioannis Antonoglou

@real_ioannis

Daniel Litt

@littmath

Johannes Heidecke

@JoHeidecke

Yanda Chen

@yanda_chen_

Sholto Douglas

@_sholtodouglas

Thomas Larsen

@thlarsen

Deep Cogito

@DeepCogito

Drishan Arora

@drishanarora

Tejal Patwardhan

@tejalpatwardhan

Luke Drago

@luke_drago_

Denny Zhou

@denny_zhou

Logan Kilpatrick

@OfficialLoganK

Google AI Developers

@googleaidevs

Andrew Lampinen

@AndrewLampinen

Hailey Nguyen

@hailey_huong

David Krueger

@DavidSKrueger

Yi Tay

@YiTayML

Wojciech Zaremba

@woj_zaremba

Jan Leike

@janleike

Christopher Manning

@chrmanning

Jacob Andreas

@jacobandreas

Akari Asai

@AkariAsai

Soumith Chintala

@soumithchintala

Sasha Rush

@srush_nlp

Michaël Trazzi

@MichaelTrazzi

Jack Clark

@jackclarkSF

Eric Jang

@ericjang11

Miles Brundage

@Miles_Brundage

Neel Nanda

@NeelNanda5

Kyunghyun Cho

@kchonyc

Jason Wei

@_jasonwei

Sam Bowman

@sleepinyourhat

Percy Liang

@percyliang
