#apachedatafusion search results

Mim

Mar 21, 2023

I am happy to announce 🤡 the winner of TPCH-SF30 on the free tier Colab notebook is #Tableau Hyper Engine; #DuckDB and #Apachedatafusion could not finish it as they get OOM. This is what I call serious engineering!!! colab.research.google.com/drive/1sqp_V34…

Mim

@mim_djo

Jul 31, 2023

#apachedatafusion is progressing well. Querying 42 million rows using the free tier of Colab is something; other SQL engines will crash. I think it will eventually catch up to #DuckDB colab.research.google.com/drive/1KzofqAW…

Paul Dix

@pauldix

Sep 10

A piece I wrote about rebuilding InfluxDB in #rustlang and #ApacheArrow and #ApacheDataFusion is up on InfoQ!

InfoQ

@InfoQ

Sep 10

Engineering a Time Series Database Using Open Source: Rebuilding InfluxDB 3 in Apache Arrow and Rust bit.ly/46fmuUK authored by @pauldix, reviewed by @olimpiupop

At times, to evolve your product, you need to rebuild it from scratch. The article provides the story behind the rewrite of InfluxDB from scratch using a different programming language - Rust - and...

Pierre Zemb

@PierreZ

Jun 4

Just merged our first SQL query using #ApacheDataFusion at work! 🎉 Integrating it was a fantastic experience. It will soon be part of the core of #Materia. Wrote about why DataFusion is a game-changer for #Rustlang #database systems: pierrezemb.fr/posts/thank-yo…

Mim

@mim_djo

Dec 23

#delta_rs, which is basically the standard way of writing Delta tables using Python (no Spark), just exposed a SQL interface. It was a simple change, as it is already based on #apachedatafusion. Tested using a #Microsoftfabric notebook: github.com/djouallah/Fabr…

Mim

@mim_djo

Mar 19, 2023

Started playing with #Apachedatafusion. The good thing is that it is mature enough that I could run the same test unmodified; that's a testament to SQL, I guess. Still memory issues when I increase the data size colab.research.google.com/drive/1WJ2ICxJ…

Andrey Velichkevich

@andreyvelichk

Jun 5

Check out this Kubeflow in-memory data cache solution built on #ApacheArrow & #ApacheDataFusion! It optimizes sharding of #ApacheIceberg tables and enhances #Kuberentes for #GenAI workloads. github.com/kubeflow/commu… 📽️ #KubeCon + #CloudNativeCon talk: youtu.be/s4KAe7AtN7s

YouTube: Speed up Your ML Workloads With Kubernetes Powered In-memory Data...

Jagdish Parihar

@jatin6972

May 24

Exploring #ApacheDataFusion’s Catalog: maps tables across S3/Postgres/Iceberg & turns unresolved plans into typed logical plans. Still wrapping my head around it, but once the catalog clicks the optimizer is wide open to tweak. 🦀🚀 #RustLang #DataEngineering

Kris Jenkins

@krisajenkins

Apr 25

In this week's Developer Voices, Andrew Lamb takes us through #ApacheDataFusion, exploring how this #Rust toolkit shaves years off the prospect of creating a custom database. Fascinating stuff for any data and architecture fans like me. 😁 youtu.be/8QNNCr8WfDM

YouTube: DataFusion - The Database Building Toolkit (with Andrew Lamb)

Mim

@mim_djo

Apr 8, 2023

Let's try simple sorting of a Parquet file but with a bigger VM, 16 CPU/60 GB RAM:
#DuckDB: 29.6s
#ApacheDatafusion: 1min 4s
#Apachespark: need to configure Java, not interested
#Polars: 1min 6s
#Clickhouse: 1min 24s
#pyarrow: 2min 3s
github.com/djouallah/parq…

Mim

@mim_djo

Jul 15

You can pass an #apachedatafusion dataframe directly to Delta table Python and it works great, but ... behind the scenes it calls collect(), which loads the whole damn data into memory. If you have a lot of data to process, use daft or duckdb.

Henry Medina

@CraftyTech

Jul 22

Startups are doubling down on #ApacheDataFusion for its powerful, scalable data processing capabilities! 🚀 With a flexible architecture and open-source community support, it's transforming data analytics. Discover why it's the future! #BigData #TechTrends #Startups
