vutrinh

@_vutrinh

My mom read my articles to support her son. Now, she can design a data architecture and write ETL scripts.

Parquet is not a columnar format. Rather, it’s a hybrid format combining the best of row and column formats. Parquet groups data into subsets of rows (horizontal partitioning). In each subset, the data for each column is stored close together (vertical partitioning). A Parquet file is…
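The hybrid idea above can be illustrated with a toy model in plain Python. This is only a sketch of the layout concept, not how Parquet actually encodes data on disk: `to_hybrid_layout`, `rows_per_group`, and the sample rows are hypothetical names invented for illustration. In Parquet's own terminology, the row subsets are called row groups and each column's data within a group is a column chunk.

```python
from typing import Any, Dict, List


def to_hybrid_layout(rows: List[Dict[str, Any]], rows_per_group: int) -> List[Dict[str, List[Any]]]:
    """Split rows into groups, then store each group's values column by column."""
    if rows_per_group <= 0:
        raise ValueError("rows_per_group must be positive")

    groups = []
    for start in range(0, len(rows), rows_per_group):
        # Horizontal partition: take a contiguous subset of rows.
        subset = rows[start:start + rows_per_group]
        # Vertical partition: within the subset, keep each column's values together.
        columns = {name: [row[name] for row in subset] for name in subset[0]}
        groups.append(columns)
    return groups


# Hypothetical sample data, assumed to share the same keys in every row.
rows = [
    {"id": 1, "city": "Hanoi"},
    {"id": 2, "city": "Hue"},
    {"id": 3, "city": "Da Nang"},
]

layout = to_hybrid_layout(rows, rows_per_group=2)
print(layout)
# [{'id': [1, 2], 'city': ['Hanoi', 'Hue']}, {'id': [3], 'city': ['Da Nang']}]
```

A reader that needs only `city` can scan one list per group instead of touching every row, which is the benefit the column-wise part of the layout is meant to give.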

🚀🚀 DuckDB is great. It lets us run analytical SQL on a local laptop with only minutes of setup. Here are some bullet points about its storage from my self-learning process via DuckDB’s materials and source code. ◉ Two modes: persistent and in-memory; the latter will…


vutrinh reposted

Paper I would love to read but instead have to write? 🤔

vutrinh reposted

Have you ever wondered how a Parquet dataset is written to disk? Parquet is a self-describing file format that contains all the information needed by the application that consumes the file. Parquet organizes data in a hybrid format behind the scenes.

🚀🚀 How does Apache Spark execute applications for us? A few weeks ago, I wrote an article that gave an overview of Apache Spark. Let’s revisit how Spark handles processing, from user-defined logic to execution by the executors: ◉ Defining the Application: The user defines…

🤔 My humble observation: large-scale cloud OLAP has increasingly converged toward the lakehouse paradigm. Below are some insights from my research. Feel free to discuss or share corrections if you find anything off! 📌 In this context: ➝ Internal tables refer to data loaded…


🚀🚀 How does @ApacheSpark plan the execution for us? (With the help of the Catalyst Optimizer) When we define DataFrame transformation logic, it must first go through an optimization process before execution. This involves four key phases: ◉ Analysis: Spark SQL starts by…

🚀🚀 What does the @ApacheIceberg reading process look like? ◉ The reader first visits the catalog to retrieve the table's current metadata file location. ◉ After fetching the metadata file, it collects the table’s schema and checks partition schemes to understand the data…
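The two steps named above can be sketched with plain Python dictionaries. Everything here is a hypothetical stand-in: `CATALOG`, `METADATA_FILES`, `start_read`, the storage path, and the field names are invented for illustration and do not reflect Iceberg's actual API or metadata file layout. The sketch stops where the text above is cut off.

```python
from typing import Dict, List, Tuple

# Hypothetical catalog: table name -> location of the current metadata file.
CATALOG: Dict[str, str] = {
    "db.events": "s3://bucket/events/metadata/v2.metadata.json",
}

# Hypothetical stand-in for object storage: location -> parsed metadata content.
METADATA_FILES: Dict[str, Dict[str, List[str]]] = {
    "s3://bucket/events/metadata/v2.metadata.json": {
        "schema": ["event_id", "event_date", "payload"],
        "partition_spec": ["event_date"],
    },
}


def start_read(table_name: str) -> Tuple[str, List[str], List[str]]:
    """Resolve a table through the catalog, then read its schema and partitioning."""
    # Step 1: ask the catalog where the table's current metadata file lives.
    location = CATALOG[table_name]
    # Step 2: fetch the metadata file and collect the schema and partition scheme.
    metadata = METADATA_FILES[location]
    return location, metadata["schema"], metadata["partition_spec"]


location, schema, partition_spec = start_read("db.events")
print(location)
print(schema)
print(partition_spec)
```

The point of the indirection is that the catalog holds only a pointer, so a reader always starts from whichever metadata file is current.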

vutrinh reposted

🎉 Wow. This is truly an epic masterpiece. This article from Vu Trinh (@_vutrinh), with its vivid illustrations, breaks down and explains the technical architecture of AutoMQ in a very clear and understandable way. If you're interested in the cloud-native technical architecture of…
