Rishabh Srivastava

@rishdotblog

Co-Founder @tryfactiq (YC W23)

Pinned

Open-sourcing Introspect: MIT-licensed Deep-Research for your internal data! Works with spreadsheets, databases, PDFs, and web search. Has a remarkably simple architecture – Sonnet agent armed with recursive tool calling and 3 default tools. Best for use-cases where you want to…

Launching something new today. Thought I had everything covered and could have a chill launch week. Then, found tons of bugs and this happened 🫠

Been taking Opus 4.5 for a spin. Opus 4.5 + Claude Code is super worth it for planning, but I still prefer Codex for actual coding and reliability.

Opus +ves
- Great at using web search to get info that it needs
- Thoroughly explores the codebase
- Creates fairly concise plans…


Rishabh Srivastava reposted

@allen_ai: Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow: not just the final weights, but the entire training journey.
Best fully open 32B reasoning model & best 32B base model. 🧵

I like the new Codex Max – but it's extremely emotionally challenged when writing frontend copy 😅 It's also meh at design. Very, very good at verifiable tasks (esp. on the backend) though!

Huh, it's somehow gone to shit in the last 30 minutes. Guess they're still figuring out how to handle more traffic w/o compromising quality

Gemini Pro 3 + Antigravity is very good. Antigravity still has janky UX – but its capabilities more than make up for it. Handles major refactors and large codebases extremely well. Gemini's long-context supremacy is really shining through here.




ChatGPT has (finally) started taking credit for the thankless work it does 😅

Added `grok-4-fast` to my agentic data analysis benchmark – super cheap, super fast, super good

Haiku 4.5 hits a sweet spot for agentic data analysis workflows. Super nice blend of low cost, low latency, and high-quality outputs. I found it better than gpt-5. Will try to publish proper evals if I can find the time!

You're doing yourself a disservice if you still have not used Codex. It worked uninterrupted for 35 mins on a super complex task and got it right first try. Quite nuts: it's already a much better programmer than me (for verifiable tasks).

Man OpenAI killed it this DevDay. Tons of startups will have to pivot as a result of this. "Ride the waves caused by constant churn" seems to be the only viable strategy for an early stage co moving forward 😅


Fascinating chart. Survey from April 2024

Google's on a roll. That's a lot of performance for that tiny size! I just embedded 1.4 million documents in ~80 mins on my M2 Max for free. Would've been ~$200 with text-embedding-3-large, with worse quality.

@osanseviero: Introducing EmbeddingGemma🎉

🔥With only 308M params, this is the top open model under 500M
🌏Trained on 100+ languages
🪆Flexible embeddings (768 to 128 dims) with Matryoshka
🤗Works with your favorite open tools
🤏Runs with as little as 200MB

developers.googleblog.com/en/introducing…


Quick poll - what looks better in dark mode? First image or second image?
