Dan Mac

@daniel_mac8

AI Field Engineer @sourcegraph | Writing Token Stream | Goodness, Truth and AI | Building at http://github.com/DannyMac180

Science & Technology

United States

tokenstream.substack.com

Joined October 2013

25K posts 15K followers 4K following

Pinned

Dan Mac

@daniel_mac8

19h

Aristotle, an AI system specialized for Mathematics from @HarmonicMath, solved Erdős problem #481. Days ago the same system solved problem #124. Controversy ensued as #124 was supposedly the "easy" version. #481 is *not* an easy version. Terence Tao even commented: "Nice!"…

Dan Mac

@daniel_mac8

13h

Anthropic acquires Bun. The strategy is clear: Claude becomes the full stack AI computing platform.
> Claude Opus/Sonnet/Haiku = Compute
> Claude Code = Orchestration
> Bun = Execution
Claude isn't just the intelligence. It's the entire AI environment. Claude is a flywheel.

Anthropic

@AnthropicAI

16h

Anthropic is acquiring @bunjavascript to further accelerate Claude Code’s growth. We're delighted that Bun—which has dramatically improved the JavaScript and TypeScript developer experience—is joining us to make Claude Code even better. Read more: anthropic.com/news/anthropic…

Anthropic acquires Bun as Claude Code reaches $1B milestone

From anthropic.com

Dan Mac

@daniel_mac8

7h

Opus 4.5 is a glorious creature.

Dan Mac

@daniel_mac8

21h

If the OG is hyped, I’m hyped. Sam-ta Claus is coming to town.

Jimmy Apples 🍎/acc

@apples_jimmy

Dec 2

“Altman said OpenAI is planning to ship a new reasoning model next week that is ‘ahead of [Google’s] Gemini 3’”

Dan Mac reposted

Dan Mac

@daniel_mac8

19h

Dan Mac

@daniel_mac8

11h

Honest question: is Grok 4.1 Fast really better than Opus 4.5 at tool calling or is this pure unadulterated benchmaxxing?

Alejandro Cuadron

@Alex_Cuadron

14h

Very unexpected results! Grok 4.1 Fast Reasoning beats every frontier model in Tau2-Verified! Congrats team! I was certainly not expecting a Fast model to beat @AnthropicAI 's Opus 4.5 in agentic tasks @xai @elonmusk @Yuhu_ai_ Check it out: github.com/amazon-agi/tau…

Dan Mac

@daniel_mac8

14h

Sama declares 🔴 Code Red 🔴 at OpenAI. The below chart shows why. For the first time since Nov. '22, OpenAI is falling behind Google and Anthropic on model capability rather than only coding or cost/performance ratio. Don't count OpenAI out of the race just yet though...…

Dan Mac

@daniel_mac8

Dec 1

Oh, you think your LLM is bad at instruction following? Try getting your 2.5-year-old to go to bed…

Dan Mac

@daniel_mac8

Dec 2

Pretty sure David Sacks is safe here. Do you know anyone that regularly reads the New York Times? Not even trying to be a dick. The only regular interaction anyone I know has with NYT is Wordle. Sad because it was the best.

David Sacks

@DavidSacks

Nov 30

INSIDE NYT’S HOAX FACTORY Five months ago, five New York Times reporters were dispatched to create a story about my supposed conflicts of interest working as the White House AI & Crypto Czar. Through a series of “fact checks” they revealed their accusations, which we debunked…

Dan Mac

@daniel_mac8

Dec 1

Non-human intelligences creating new knowledge. Pretty, pretty cool.

Carina Hong

@CarinaLHong

Dec 1

Claude by @AnthropicAI proved Erdos problem #124 in Lean.

Dan Mac reposted

eric provencher

@pvncher

Dec 1

I often get asked “what are the best models”, so I added my current model recommendations list here!

eric provencher

@pvncher

Dec 1

Finally revamped all the @RepoPrompt docs, making them more approachable and up to date! And, you can now copy all the docs to the clipboard to talk to an LLM about them! It's also the last day of the Black Friday sale! repoprompt.com/docs

Repo Prompt

From repoprompt.com

Dan Mac reposted

Dan Mac

@daniel_mac8

Dec 1

Sorry DeepSeek bros, these benchmarks aren’t very impressive. Is DeepSeek still relevant?

DeepSeek

@deepseek_ai

Dec 1

🚀 Launching DeepSeek-V3.2 & DeepSeek-V3.2-Speciale — Reasoning-first models built for agents! 🔹 DeepSeek-V3.2: Official successor to V3.2-Exp. Now live on App, Web & API. 🔹 DeepSeek-V3.2-Speciale: Pushing the boundaries of reasoning capabilities. API-only for now. 📄 Tech…