
Leopold Aschenbrenner

@leopoldasch

http://situational-awareness.ai

Pinned

Virtually nobody is pricing in what's coming in AI.

I wrote an essay series on the AGI strategic picture: from the trendlines in deep learning and counting the OOMs, to the international situation and The Project.

SITUATIONAL AWARENESS: The Decade Ahead
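
To make "counting the OOMs" (orders of magnitude) concrete, here is a minimal sketch of the arithmetic — the growth rates and horizon below are illustrative assumptions, not figures from the essay series:

```python
# Minimal sketch of "counting the OOMs" of effective training compute.
# All numbers are illustrative assumptions, not the essay's figures.
years = 4                    # assumed horizon
compute_ooms_per_year = 0.5  # assumed growth in physical training compute
algo_ooms_per_year = 0.5     # assumed algorithmic-efficiency gains

total_ooms = years * (compute_ooms_per_year + algo_ooms_per_year)
print(f"{total_ooms:.1f} OOMs = {10 ** total_ooms:,.0f}x effective compute")
# 4.0 OOMs = 10,000x effective compute
```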

Leopold Aschenbrenner reposted

Driving around (where else?) SF dropping off some ✨special packages ✨. Favorite stop so far:

The Scaling Era, by the brilliant @dwarkesh_sp with @g_leech_, and edited by @rebeccahiscott, is out today: press.stripe.com/scaling

Since we announced this book, the question we’ve gotten most is ‘why now?’


Self-recommending!

What is intelligence? What will it take to create AGI? What happens once we succeed? The Scaling Era: An Oral History of AI, 2019–2025 by @dwarkesh_sp and @g_leech_ explores the questions animating those at the frontier of AI research. It’s out today: press.stripe.com/scaling



Leopold Aschenbrenner reposted

When will AI systems be able to carry out long projects independently? In new research, we find a kind of “Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.
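
The extrapolation behind that claim is simple — a minimal sketch assuming a clean 7-month doubling time (the 30-minute starting horizon is an assumed placeholder, not METR's measured value):

```python
# Minimal sketch: extrapolating a task-horizon trend that doubles
# every 7 months. The starting horizon is an assumed placeholder.
doubling_months = 7
horizon_minutes = 30.0  # assumed current task length

for years in (1, 2, 4):
    doublings = years * 12 / doubling_months
    print(f"+{years}y: ~{horizon_minutes * 2 ** doublings / 60:.1f} hours")
# +1y: ~1.6 hours, +2y: ~5.4 hours, +4y: ~58.0 hours
```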

The “compressed 21st century”: a great essay on what it might look like to make 100 years of progress in 10 years post AGI (if all goes well). Favorite phrase: “a country of geniuses in a datacenter”

Machines of Loving Grace: my essay on how AI could transform the world for the better darioamodei.com/machines-of-lo…



Leopold Aschenbrenner reposted

Leopold Aschenbrenner’s SITUATIONAL AWARENESS predicts we are on course for Artificial General Intelligence (AGI) by 2027, followed by superintelligence shortly thereafter, posing transformative opportunities and risks. This is an excellent and important read: …


Leopold Aschenbrenner reposted

what it looks like when deep learning is hitting a wall:

Strawberry has landed.

Hot take on GPT's new o1 model: It is definitely impressive. BUT

0. It’s not AGI, or even close.
1. There’s not a lot of detail about how it actually works, nor anything like full disclosure of what has been tested.
2. It is not…



Leopold Aschenbrenner reposted

OpenAI's o1 "broke out of its host VM to restart it" in order to solve a task. From the model card: "the model pursued the goal it was given, and when that goal proved impossible, it gathered more resources [...] and used them to achieve the goal in an unexpected way."

The system card (openai.com/index/openai-o…) nicely showcases o1's best moments -- my favorite was when the model was asked to solve a CTF challenge, realized that the target environment was down, and then broke out of its host VM to restart it and find the flag.

Leopold Aschenbrenner reposted

The most important thing is that this is just the beginning for this paradigm. Scaling works, there will be more models in the future, and they will be much, much smarter than the ones we're giving access to today.

Leopold Aschenbrenner reposted

o1 is trained with RL to “think” before responding via a private chain of thought. The longer it thinks, the better it does on reasoning tasks. This opens up a new dimension for scaling. We’re no longer bottlenecked by pretraining. We can now scale inference compute too.
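
One simple, generic way to picture trading inference compute for quality is best-of-n sampling — sketched below. This is not how o1 actually works (its mechanism is a private chain of thought trained with RL); the generator and scorer here are hypothetical stand-ins:

```python
import random

# Generic inference-time scaling sketch: sample n candidates, keep the
# best. Hypothetical stand-ins only -- NOT o1's actual mechanism.
def candidate_quality(prompt: str) -> float:
    return random.gauss(0.5, 0.2)  # pretend quality score for one sample

def best_of_n(prompt: str, n: int) -> float:
    return max(candidate_quality(prompt) for _ in range(n))

random.seed(0)
for n in (1, 8, 64):
    print(f"n={n:>2}: best of n = {best_of_n('some hard problem', n):.2f}")
# More samples (more inference compute) -> better best answer, on average.
```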

Leopold Aschenbrenner reposted

Today, I’m excited to share with you all the fruit of our effort at @OpenAI to create AI models capable of truly general reasoning: OpenAI's new o1 model series! (aka 🍓) Let me explain 🧵 1/

Leopold Aschenbrenner reposted

Why we should pass the ENFORCE Act, and let BIS do its job city-journal.org/article/keepin…


Leopold Aschenbrenner reposted

Why Leopold Aschenbrenner didn't go into economic research.

From dwarkeshpatel.com/p/leopold-asch…

This was a great listen, and @leopoldasch is underrated

Leopold Aschenbrenner reposted

The pace is incredibly fast


Leopold Aschenbrenner reposted

I am awestruck at the rate of progress of AI on all fronts. Today's expectations of capability a year from now will look silly, and yet most businesses have no clue what is about to hit them in the next ten years, when most rules of engagement will change. It's time to…

Leopold Aschenbrenner reposted

On average, when agents can do a task, they do so at ~1/30th of the cost of the median hourly wage of a US bachelor’s degree holder. One example: our Claude 3.5 Sonnet agent fixed bugs in an ORM library at a cost of <$2, while the human baseline took >2 hours.
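
The arithmetic behind the ORM-bug example is straightforward — a minimal sketch; the hourly wage below is an illustrative assumption, while the agent cost and human time are the tweet's bounds:

```python
# Minimal sketch of the cost comparison in the ORM-bug example.
# The hourly wage is an illustrative assumption; the agent cost and
# human time are taken at the tweet's bounds ("<$2", ">2 hours").
wage_per_hour = 40.0  # assumed wage for a US bachelor's degree holder
human_hours = 2.0
agent_cost = 2.0

human_cost = wage_per_hour * human_hours
print(f"human ~${human_cost:.0f}, agent ~${agent_cost:.0f}, "
      f"ratio ~{human_cost / agent_cost:.0f}x")
# human ~$80, agent ~$2, ratio ~40x
```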

Leopold Aschenbrenner reposted

How well can LLM agents complete diverse tasks compared to skilled humans? Our preliminary results indicate that our baseline agents based on several public models (Claude 3.5 Sonnet and GPT-4o) complete a proportion of tasks similar to what humans can do in ~30 minutes. 🧵

Leopold Aschenbrenner reposted

Google DM graded their own status quo as sub-SL3 (~SL-2). It would take SL-3 to stop cybercriminals or terrorists, SL-4 to stop North Korea, and SL-5 to stop China. We're not even close to on track - and Google is widely believed to have the best security of the AI labs!

Leopold Aschenbrenner reposted

What are the most important things for policymakers to do on AI right now? There are two:

- Secure the leading labs
- Create energy abundance in the US

The Grand Bargain for AI - let's dig in...🧵

Leopold Aschenbrenner reposted

cramming more cognition into integrated circuits

openai.com/index/gpt-4o-m…
