
Eric Pang
@_eric_pang_
math/cs @uwaterloo | prev: ml @quora, @amazon
Here's how I (almost) got the high scores in ARC-AGI-1 and 2 (the honor goes to @jerber888) while keeping the cost low. To put things into perspective: o3-preview scored 75.7% on ARC-AGI-1 last year while spending $200/task on low setting. My approach scores 77.1% while spending…
New SOTA on ARC-AGI
- V1: 79.6%, $8.42/task
- V2: 29.4%, $30.40/task
Custom submissions by @jerber888 and @_eric_pang_ are now the best known solutions to ARC-AGI
Both:
* Are open source
* Use Grok 4
* Implement program-synthesis outer loops with test-time adaptation

Grok 5 starts training in a few weeks

I'm back at the top of ARC-AGI with my new program. I use @grok 4 and multi-agent collaboration with evolutionary test-time compute.


We just released 2 open source SOTA submissions to ARC-AGI (both v1 and v2).
Submissions by @jerber888 and @_eric_pang_ are the best we've seen. Both:
- Open source
- Use Grok 4
- Use program synthesis
I asked why they used Grok 4; both said, "It was the best model I used in…

ARC just published new #1 and #2 reproducible SOTA scores on our public leaderboard from @jerber888 and @_eric_pang_. And their code is now open source! My analysis below -- includes suggestions for application layer AI and future research directions. New SOTA: - v1: 79.6%,…
The same reason is why ARC-AGI is the most important benchmark in AI. It is the only benchmark that's not saturated after repeated attempts from players big and small.

A CONFUCIAN CONFUSION / MAHJONG: TWO FILMS BY EDWARD YANG • Coming to Criterion in August! criterion.com/boxsets/8199-a… In this pair of sharp, sprawling satires, one of Taiwan’s most celebrated filmmakers captures the anything-can-happen mood of Taipei at the end of the twentieth…

Introducing Continuous Thought Machines New Blog: sakana.ai/ctm/ Modern AI is powerful, but it’s still distinct from human-like flexible intelligence. We believe neural timing is key. Our Continuous Thought Machine is built from the ground up to use neural dynamics as…
Re: Gödel's incompleteness theorems
“People get so caught up in the fact that they have limits that they rarely exert the effort required to get close to them.” amzn.to/3eKu26n

This is true but Meta doesn't hate it as long as you are still watching reels (and hence the ads)
