On Dec 1 last year, no good Gemini model at all (we were at 1.5), no image model that got text right, no good video model at all, no Deepseek R1, o1 had just come out with test time inference, FrontierMath was 2% not 41%, no one got to 10% on HLE...Just so you can plan for 2026.
This seems like a very important detail. Flash is, in some sense, very different from Pro.
"how can flash beat pro??" -> the answer is RL! flash is not just a distilled pro. we've had lots of exciting research progress on agentic RL which made its way into flash but was too late for pro. can't wait to finally bring them to pro👀
How did people in 1913 see the world? How did they think about the future? We trained LLMs exclusively on pre-1913 texts—no Wikipedia, no 20/20. The model literally doesn't know WWI happened. Announcing the Ranke-4B family of models. Coming soon: github.com/DGoettlich/his…
As a journalist, you can frame almost anything. If the facts are on your side, you can claim the public simply doesn’t understand them. If they aren’t, you can emphasize that the public has “concerns” and elevate anecdotal issues to the forefront.
Is this a joke? “Waymo isn’t safer than human drivers because it killed two animals.” Okay, but human taxi drivers periodically kill actual people
If we interpret this as the P(Winning), it is an underspecified question. For a specific individual, for a random individual, for a random team in these leagues? If the latter, then they are the same - they have the same number of teams. The P(Win) for a random team is 1/32.
Which trophy is harder to win? A or B
I have a niche question for anyone familiar with @OpenAI billing. I added prepaid funds of $175. Separately (!!) I was awarded $175 in credit grants in late November. What's odd is that my usage seems to be decreasing BOTH? It should reduce the credit grant first @OpenAIDevs
I created a normal distribution infographic with my school's branding from the new OpenAI model (left) and Gemini Nano Banana Pro (right). The text in both is very well done, but Gemini still has the edge, at least in terms of adherence to branding. OpenAI made up a slogan.
It seems like there would be efficiency gains if LLMs were trained on a single language, and a different model focuses on translation. And yet, that usually isn't what happens.
"Smells like Claude" has to be a new meme.
An engineer showed Gemini what another AI said about its code. Gemini responded (in its "private" thoughts) with petty trash-talking, jealousy, and a full-on revenge plan 🧵
The hill I’m going to die on is, The Academy’s social media account shouldn’t be written in first-person with everyday internet speak. You are *The Academy*. And that should mean something. You are not a 20-something influencer trying to have an online brand.
I am not opposed to editors publishing in their own journals, but look at the Management Science publication count for David Simchi-Levi, who was EIC from 2019 to 2023.
YEAR COUNT
1990 1
1992 1
2000 1
2019 2
2021 2
2022 10
2023 4
2024 1
2025 4
This would work for me way too often.
the pm at oai responsible for naming consolidation
Price control apologia. Don't supply what politicians demand. grumpy-economist.com/p/price-contro…
I've noticed that a lot of people tend to assume that capabilities of consumer-grade AI models (like GPT-5) are the same as the true frontier of AI capabilities. This is a mistake. For example, OpenAI's unreleased model that won gold on the IMO in July is clearly better than…
I’ll come out and say it. Gemini-3 won’t be a leap in AI progress at all or reignite the scaling laws which broke down. GPT-5 showed clearly transformers with RL are tapped out & other techniques while being better performance don’t scale in quality. I am buying PUTs. The…
Inequality accelerates climate change? You are claiming that the shape of the income distribution affects the second derivative of climate-related variables? Kids, don't get high on your own supply. And this isn't a panel of economists. It's a panel of mostly other people, all…
Today, I joined 500+ researchers from 70 countries in calling on world leaders to create an International Panel on Inequality modelled after the IPCC— as recommended by the G20 Committee on Inequality led by @JosephEStiglitz. Help us spread the call. 🔗wid.world/news-article/5…
I highly encourage econ to not go down the road many parts of the sciences have gone down, being seen as political actors. People trust us on economic topics *because* we don't do things like this. (Also global individual level inequality has been falling straight for 35 yrs...)
Today, I joined 500+ researchers from 70 countries in calling on world leaders to create an International Panel on Inequality modelled after the IPCC— as recommended by the G20 Committee on Inequality led by @JosephEStiglitz. Help us spread the call. 🔗wid.world/news-article/5…
NYTimes article quotes someone saying they are "terrified" of Waymo in paragraph 6. Waits until paragraph 33 (out of 44 paragraphs) to mention that they are 91 percent safer than human drivers. How outraged would liberals be if a news outlet covered vaccines like this?
Please don't steal OpenAI's inclination for hype/views via, for lack of better word, "teasing". If it's good, it's good and it will be obvious.
I am on the (red-hot) 2025–26 academic job market! I study wide-ranging, important, and contemporary issues in education policy and domestic violence. My job market paper studies academic accommodations in higher education: trends and drivers, who uses them, and academic impacts.