
Small Data SF: Think Small, Build Big
@smalldatasf
Rethinking AI & data from the ground up. A community for those building smarter AI and data systems with real workloads, not petabytes.
Scott Haines has the scars to prove it: four big projects, four painful lessons, and the surprising power of thinking small. His story isn’t just cautionary—it’s a playbook for anyone who wants resilient systems and saner engineering. Hear it first-hand at Small Data SF.

🚀 Don’t miss Zulf's session at @smalldatasf! Hands-on lab: ⚡ Capture data in real time. 🦆 Stream into MotherDuck. 🔄 End-to-end in minutes. This is the perfect opportunity for data engineers to experience real-time data streaming without the complexity. #smalldatasf

Sometimes you need to crash, burn, and refactor to really see the value in “small first.” Scott Haines will unpack four hard lessons where embracing smallness didn’t just rescue projects, but led to better product, better performance, better dev happiness. If you want avoidable…

What if instead of chasing larger models, we chased smarter ones—models that do more with less, generalize better, and are easier to deploy? Shelby Heinecke will share what "smaller models" truly mean and how they unlock impact in real settings. If you're building AI and worried…

We ran a benchmark of DuckDB vs Spark and found that datasets under 20 GB ran about 100 times faster on DuckDB than they did on Apache Spark! You don't need a multi-node cluster for your smaller datasets! This benchmark uses plain Parquet files and COUNT DISTINCT to truly…
Are you ready to explore when Apache Spark might not be the best tool for your data projects? Join us at Small Data SF on November 4 and 5 in San Francisco for an insightful talk by Holden Karau, a prominent figure in the world of big data.

Small data means breaking existing paradigms, simplifying, speeding things up, and lowering costs. This insight from #smalldatasf 2024 is just a taste of what's coming in 2025!
🚀 Small News! Estuary is a Gold Sponsor of @smalldatasf 2025, happening Nov 4-5!! Two days of hands-on workshops, talks, and community, all centered around efficient, local-first development and smarter ways to work with data and AI. See you there! #smalldatasf

"Don't duck up the numbers: Where AI hype meets BI reality." This is going to be a fun panel with Barr Moses (Monte Carlo), Barry McCardel (Hex), Colin Zima (Omni) and Tristan Handy (dbt). Join us!

Small data slaps. Tagging all our Spark friends out there.
Tag someone who needs to hear this: Small Data slaps. 💥 Oh, and join us Nov 4-5 at Small Data SF to learn why small data is smart!

Dr Shelby Heinecke, who leads AI research at Salesforce, will speak at Small Data SF November 5th on how small models don't need more parameters, they just need better data. She'll speak about the highly efficient xLAM family of small action models her team built at Salesforce.

Miss Small Data SF in 2024? Check out our highlight reel below and learn why one attendee said: "Small Data SF was on another level. The lineup was unbeatable, the content was razor-sharp, and the people were next-level inspiring."
Small Data SF may be a small conference, but the density of talent amongst the speakers and attendees is unmatched. Learn from industry luminaries, on stage and in the audience, on how and why to make your data and AI stack more efficient.

Last year, Wes McKinney, of pandas fame, spoke about how people have written a lot of Spark code, making our industry sticky to Spark. This year, we're joined by Holden Karau, Spark PMC member and author of a number of Spark books, talking about when *not* to use Spark.
George Fraser, CEO of Fivetran, has been a pioneer in advocacy around small data. Watch his thoughts from a panel at Small Data SF 2024. This year he returns November 5th to share more wisdom in a talk, alongside many other data luminaries.
Guess who’s back… back again at Small Data SF. 🎤 The data Jedi, Benn Stancil, is returning. 💎 Missed his epic talk last year? Here’s your chance to feel the force in action. #smalldatasf