Sridi aziz

@Triplsr

Sridi aziz reposted

Can AI solve math research problems that have eluded human mathematicians? Our new benchmark, FrontierMath: Open Problems, is designed to help find out. AI hasn’t solved any of these yet, but the game is young!

Sridi aziz reposted

AI data center buildouts already rival the Manhattan Project in scale, but there’s little public info about them. So we spent the last few months reading legal permits, staring at satellite images, and scouring news sources. Here’s what you need to know. 🧵

Sridi aziz reposted

GPT-5 on Sudoku-Bench 🧩 Since releasing Sudoku-Bench in May 2025, when no LLM could solve a classic 9x9 puzzle, we've been evaluating the latest generation of models. GPT-5 now leads our leaderboard with 33% puzzles solved--approximately 2x the previous leader--and is the first


Sridi aziz reposted

A new eval, Remote Labor Index, measures AI's ability to automate real-world, economically valuable projects from remote work platforms. Currently entirely unsaturated (max score of 2.5%). A great collaboration between @scale_AI and @cais!

Can AI automate jobs? We created the Remote Labor Index to test AI’s ability to automate hundreds of long, real-world, economically valuable projects from remote work platforms. While AIs are smart, they are not yet that useful: the current automation rate is less than 3%.


Sridi aziz reposted

This is coming from the guy who made jQuery. If I got a compliment like this I would either faint or immediately retire. Highest honor in JS imo

I've been using @tan_stack Start for a new project and it's super good. The server functions completely replace the need for TRPC/GraphQL/REST, the middleware is composable and fully typed, and having TSRouter's nice typing and stateful search params is icing on the cake. A+!



Sridi aziz reposted

Andrej Karpathy calls AI Agents slop "Overall, the models they are not there. And I feel like the industry [...] it's making too big of a jump and it's trying to pretend that this is amazing. And it's not—it's slop! And I think they are not coming to terms with it. And maybe


Sridi aziz reposted

So, reminder: the quality of code output by these systems is *very low* and the AIs themselves don't understand the output. This is obvious to anyone who knows how to program. There are still use cases, for example, to output a large volume of low-quality code that is not

