Eric Pang

@_eric_pang_

math/cs @uwaterloo | prev: ml @quora, @amazon

Pinned

Here's how I (almost) got the high scores in ARC-AGI-1 and 2 (the honor goes to @jerber888) while keeping the cost low. To put things into perspective: o3-preview scored 75.7% on ARC-AGI-1 last year while spending $200/task on low setting. My approach scores 77.1% while spending…

New SOTA on ARC-AGI

- V1: 79.6%, $8.42/task
- V2: 29.4%, $30.40/task

Custom submissions by @jerber888 and @_eric_pang_ are now the best known solutions to ARC-AGI

Both:
* Are open source
* Use Grok 4
* Implement program-synthesis outer loops with test-time adaptation


Eric Pang reposted

Grok 5 starts training in a few weeks

Eric Pang reposted

I'm back at the top of ARC-AGI with my new program. I use @grok 4 and multi-agent collaboration with evolutionary test-time compute

Eric Pang reposted

We just released 2 open-source SOTA submissions to ARC-AGI (both v1 and v2). Submissions by @jerber888 and @_eric_pang_ are the best we've seen. Both: - Open source - Use Grok 4 - Use program synthesis. I asked why they used Grok 4; both said, "It was the best model I used in…

Eric Pang reposted

ARC just published new #1 and #2 reproducible SOTA scores on our public leaderboard from @jerber888 and @_eric_pang_. And their code is now open source! My analysis below -- includes suggestions for application layer AI and future research directions. New SOTA: - v1: 79.6%,…


The same reason is why ARC-AGI is the most important benchmark in AI. It is the only benchmark that's not saturated after repeated attempts from players big and small.

Eric Pang reposted

A CONFUCIAN CONFUSION / MAHJONG: TWO FILMS BY EDWARD YANG • Coming to Criterion in August! criterion.com/boxsets/8199-a… In this pair of sharp, sprawling satires, one of Taiwan’s most celebrated filmmakers captures the anything-can-happen mood of Taipei at the end of the twentieth…

Eric Pang reposted

Introducing Continuous Thought Machines New Blog: sakana.ai/ctm/ Modern AI is powerful, but it’s still distinct from human-like flexible intelligence. We believe neural timing is key. Our Continuous Thought Machine is built from the ground up to use neural dynamics as…


Re: Gödel's incompleteness theorems

“People get so caught up in the fact that they have limits that they rarely exert the effort required to get close to them.” amzn.to/3eKu26n


This is true but Meta doesn't hate it as long as you are still watching reels (and hence the ads)

most of my friends & i have completely stopped posting to insta (even stories). all we do is exchange memes & reels on messages. i remember when this happened to fb very vividly. this should be titled zuck’s law.


