Dmitry Petrov

@FullStackML

🛠️ Build data tools for AI / ML. Ex-Data Scientist @Microsoft. PhD in CS. Telling jokes with a poker face.

DBT + Fivetran 🚀 A huge milestone for the "modern data stack". Consolidation is on - who's next? Snowflake ❄️? Databricks 🔥? But maybe that doesn’t even matter. The next wave is here: Multimodal data stack. It's not replacing the old one - it's for different users: 🤖 AI, not…

@dbt_labs and @fivetran are joining forces to define the future of data: open data infrastructure. One foundation for movement, transformation, and AI—built to be open, reliable, and interoperable. Read more about our shared vision getdbt.com/blog/dbt-labs-…


AI isn't just about text and code. What about sounds, videos, and sensors? 🎧🎬🔬 I’ll be at @MLOpsWorld Summit (Oct 6-9 in Austin, TX) sharing how to query inside the file ⚡️ Come nerd out with me in Texas 👋🤠 #MLOpsWorld2025


"90% of code will be AI-written" 🤖 Sounds insane - until you see the pattern. When the building blocks exist, coding is just connecting the dots 🔗 And nobody connects dots better than AI. That’s why AI crushes boilerplate web apps 🛠️ - the blocks are there. And why it…


Spent the weekend reading this. Easily the best agentic book so far. My agent recommends I read more. Any suggestions?

This Google engineer just released a 424-page free book on Agentic Design Patterns. Covers advanced prompt engineering, multi-agent frameworks, RAG, agent tool use and MCP. 100% free with practical code examples.



To stay ahead of the curve in vibe-coding you must rotate IDEs every 2 months: Cursor → Claude → Cursor → (???) → repeat. Productivity is temporary, but vibes are forever 😎✨


"Heavy Data": messy, multimodal, and living in object storage, not databases. I first heard this term from @RobFergusonIII - it nails it! Time to rethink how we manage and query heavy data. datachain.ai/blog/from-big-…


Dmitry Petrov reposted

2.5 years into the AI craze, and I continue to firmly believe that if your company wasn’t already interesting/succeeding without AI, then doing “whatever plus AI” isn’t going to save you. For the few that seem this way (eg Cursor), I think their moat is a lot weaker than it…


Dmitry Petrov reposted

DataChain enables reproducibility. It versions and tracks dependencies and code. A quick demo from @FullStackML:


Making sense of millions of audio files! An incredible use case for extracting actionable insights from complex data.

A small DataChain video on processing audio data from @huggingface with 🤗 models. We need more tools to do ETLs, analytics, governance, preparation for unstructured data at scale! - stream files from tar or wds archives! 🤯 - enrich, prepare, version, publish datasets 🚀 -…



🚀 datachain

1/N DataChain hit 2000 stars ⭐ on GitHub a week ago. Thanks for your interest and support 🤗 It was built to address those needs and pain points we saw in the DVC community when people have to deal with millions of files (e.g. images, pdfs, audio, etc).


Dmitry Petrov reposted

gm

After trending on Hacker News, our open-source project is now trending on GitHub. What’s next - Netflix special? github.com/iterative/data…

Now you can publish datasets from DataChain to @huggingface with a single command! ...because who has time for two? 🚀📚

Datasets + LLMs + Pydantic = DataChain
...now with @huggingface !💛

DataChain by @DVCorg just added @huggingface support! Create, load, and transform HF Datasets with LLMs easily.

- Pydantic for dataset schema
- Use your own or public HF Datasets
- Run your own or public HF Models

