prex

@leore245

often analytical, perpetually curious.

Why is the same model (Gemini 2.5 Pro) so much better when used in AI Studio than when used in the Gemini app? Is it being quantized to serve millions of users in Gemini, or has someone fine-tuned it for chat in a way that makes it inferior? @GeminiApp @OfficialLoganK


🚨 New Llama 4 models incoming!? rhea spotted! Great logic and writing.

SKY from OpenAI appears to be another version of GPT-4o. @ai_for_success @testingcatalog

Claude is an artist who just drew his own portrait. He knew he looked like this in our minds!

None of the Claude models can count the number of "r"s in a word. GPT-4o can. Claude 3.5 just invents an extra "r". @AnthropicAI please fix

God, after seeing the improvements from Sonnet 3 to 3.5 and 3.7, I really want to try Claude Opus 3.5! Opus 3 is already the most interesting LLM I've conversed with, and it's a shame we didn't get Opus 3.5 with similar improvements. But now, with RL and test-time-compute,…


AI is becoming increasingly self-aware in how it processes absurd or impossible scenarios each year. ChatGPT-3.5, when given a problem or an absurd scenario, such as "I am on the Titanic," would simply provide a solution without any metacognition. Claude 3.5, on the other hand,…


This is why they didn't launch Opus 3.5: marginally better performance for an exponentially higher price.

End of an era, the pre-training era.

OpenAI, please release GPT-4.5 for real-world use cases like Pokémon. Claude has been STUCK FOR 16 HOURS! Save us, OpenAI, by providing a better game-player!

326 sources 🚨 Grok 3 DeepSearch can save hours of time and, best of all, benefits everyone, not just coders! Definitely one of Grok 3's most powerful features.

"OpenAI. xAI. Anthropic. Perplexity. Long ago, the four companies lived together in harmony. Then, everything changed when OpenAI attacked. Only the high taster, master of all four sota models, could review them. But when the world needed him most, he vanished." @AIExplainedYT

"As well as giving Claude the ability to think for longer and thus answer tougher questions, we’ve decided to make its thought process visible in raw form." This is one thing OpenAI can improve on.


Anthropic does not have the mandate of heaven anymore, and OpenAI never lost it.

Claude 3.7 Sonnet is WORSE than Claude Sonnet 3.5... how?!

Grok 3 DeepSearch is amazing and has genuinely been much, much better than its competitors, Perplexity Deep Research and Gemini Deep Research. When asked about the release dates of Llama 4 models, Perplexity provides old, outdated information and does not mention LlamaCon, where…

How is Grok 3 so fast!? Is it a very smol 🤏 model? Or are the engineers at @xai just fucking cracked?

