#dataeng search results

Data First Consultancy

Sep 12

Daniel points out that shouting 'schema-less!' doesn't spare you the cost of messy queries. Tools evolve, but thinking through entities and relationships is still vital - it often just shifts to a later stage. #DataModeling #DataEng confessionsofadataguy.com/is-data-modeli…

Is Data Modeling Dead? - Confessions of a Data Guy

Ok, not going to lie, I rarely find anything of value in the dregs of r/dataengineering, mostly I fear, because it’s %90 freshers with little to no experience. These green behind the ear know-it-all...

Source: confessionsofadataguy.com

Lawrence

@datacreco

Feb 10, 2024

Work - Pipeline (E) #data #dataeng #vscode

CIL Academy

@cil_academy

Dec 4

Data lineage refers to the life cycle of data: where it comes from, how it moves, and where it goes within a system. It helps organizations track the flow of data from its origin to its destination, providing transparency and accuracy. By understanding data lineage, #DataEng

Data First Consultancy

@DataFirstGroup

Oct 15

Databricks proposes 'Variant' to unify semi-structured data across Parquet, Delta and Iceberg. Helpful for AI logs, but without wider engine buy-in it risks being yet another dialect. Worth watching adoption. #DataEng #OpenStandards databricks.com/blog/introduci…

DataFirstGroup's tweet card. Variant is now in Apache Parquet™, unifying semi-structured data across the entire open lakehouse ecosystem, including Apache Spark™, Apache Iceberg™, and Delta Lake.

Introducing Variant: A New Open Standard for Semi-Structured Data in Apache Parquet™, Delta Lake,...

Source: databricks.com

VizzarJobs

@Vizzarjobs

Nov 9

Hiring Data Engineer (remote EU/Asia) Build ClickHouse/dbt pipelines for crypto analytics used by 1000s daily. Need: SQL/Python, streaming data, AI tools, Web3 interest. Visa OK. Apply here. vizzarjobs.com/jobs/cmhlvcoiq… #Web3 #DataEng

Data First Consultancy

@DataFirstGroup

Sep 18

Gave Rudderstack's rundown on API integration a skim. Clear on why connecting systems helps, but it tends to end at 'just use a tool'. Barely touches auth sprawl, version drift or monitoring cost. Keep the ops effort in view before you dive. #APIs #DataEng…

Tarideas

@tarideas

Sep 21, 2024

Datacenter... I love this Noisy AI :) #AI #DataScience #dataeng

From Groq Inc

Data First Consultancy

@DataFirstGroup

Oct 8

Thinking about SQL's 'what over how' after Vu Trinh's piece. His stroll from Codd to Cartesian products shows why a dash of relational algebra turns query tuning from magic to method. A quick read for Python-heavy DEs. #DataEng #SQL vutr.substack.com/p/sql-for-data…

Github Trends

@github_trends

Oct 20

⚡ ETL #Python #DataEng @pathway_com ⭐ 48883 github.com/pathwaycom/pat…

github_trends's tweet card. Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG. - pathwaycom/pathway

GitHub - pathwaycom/pathway: Python ETL framework for stream processing, real-time analytics, LLM...

Source: github.com

Ara

@abhay_ahirkar

Mar 31

#DEZOOMCAMP Data Eng proj: Ag Data Pipeline! 🔍 E2E pipeline gathers, processes, analyzes ag data into insights. 🛠️: Kafka, Spark, GCP (GCS & BQ), Airflow, dbt, Metabase, Terraform, Docker. Thx @Al_Grigor, @DataTalksClub, @EcZachly @dbt_labs 🔗 shorturl.at/MdD36 #DataEng

Data First Consultancy

@DataFirstGroup

Oct 18

Snowflake's quick-start for the Arrow ADBC Python driver covers conda/pip install and a rather wordy connection dict. Columnar fetch and bulk ingest are appealing, but config overhead and sparse docs mean it's best tried in a sandbox first. #DataEng #Python…

DataFirstGroup's tweet card. This article walks through installing, configuring, and using the Apache Arrow ADBC Snowflake driver from Python, with a focus on practical…

A Quick Start Guide to the Snowflake ADBC Driver with Python

Source: medium.com

Data First Consultancy

@DataFirstGroup

Sep 6

Snowflake outlines Adaptive Cortex Complete, an in-warehouse bandit router that chooses the 'best' LLM per prompt. Cost-vs-quality trade-off is neat, but relying on other models to judge answers risks circular bias. Try, measure, verify. #LLM #DataEng medium.com/snowflake/adap…

Adaptive Cortex Complete: Machine Learning Meets Snowflake LLMs

Automatically route each prompt to the “best” LLM model

Source: medium.com

Srinivas

@sriniksv

Feb 16, 2023

forms.gle/evBMJjyqAgF6YE… #DataEng

Data First Consultancy

@DataFirstGroup

Sep 17

Iceberg excels at immutable facts, but when a loyalty tier flips ten times an hour you drown in delete files. EQ deletes drag reads, positional deletes drag writes - pick your pain. Teams bolt on Hudi, Paimon or an OLAP sidecar. #DataEng #ApacheIceberg dataengineeringweekly.com/p/when-dimensi…

DataFirstGroup's tweet card. Why Iceberg Struggles with Fast-Changing Dimensions—and What Comes Next

When Dimensions Change Too Fast for Iceberg

Source: dataengineeringweekly.com

Data First Consultancy

@DataFirstGroup

Oct 13

Still running every Airflow task on the same machine? Vu Trinh maps the executor options and shows why matching workload to executor matters - from quick-fire local jobs to pod-level isolation on K8s. Prep for your next scale spike. #DataEng #Airflow vutr.substack.com/p/where-does-y…

Data First Consultancy

@DataFirstGroup

Oct 1

One to note: Snowflake’s update adds sdist support to the Artifact Repository - great for libs without wheels and future ARM builds. Gains in portability, but you’re tied to Snowflake’s build infra and caching behaviour. #Python #DataEng medium.com/snowflake/snow…

DataFirstGroup's tweet card. At Snowflake, our goal is to streamline the development lifecycle and enhance dependency management for data applications. The Python…

Snowflake Artifact Repository Now Supports sdist Packages

Source: medium.com

Data First Consultancy

@DataFirstGroup

Sep 24

Highlights this week: Artoul says Iceberg needs a purpose built engine - tough to argue when Spark ETL still runs at T+15m. Stripe's 'real-time' billing shows the gap - Flink + Pinot yet microbatch. Lakehouse vision vs ops reality. #dataeng #lakehouse dataengineeringweekly.com/p/data-enginee…

Data Engineering Weekly #238

The Weekly Data Engineering Newsletter

Source: dataengineeringweekly.com

Amrik Malhans

@am_coding

Jun 10, 2024

#airflow #dataeng

DataEngDude

@dataenggdude

Jun 22, 2023

Conceptual Introduction to Delta Lake youtu.be/z7kxiqAxgno #deltalake #databricks #dataeng

Conceptual Introduction to Delta Lake

Source: youtube.com

Ahm3d

@big_data_eng

Oct 15

Struggling with sluggish ETL pipelines? Ditch the full loads—switch to incremental extracts with delta processing and slash your runtime by 70% 💨 #DataEng #ETL #BigData
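A minimal sketch of the incremental-extract idea in the post above, assuming a hypothetical `orders` table with an `updated_at` column used as a high-watermark. The table, column names, and the `extract_incremental` helper are illustrative assumptions, not taken from the post, and the runtime saving it quotes is not something this sketch measures.

```python
import sqlite3


def extract_incremental(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark.

    Assumes a hypothetical `orders` table whose `updated_at` column holds
    ISO-8601 text timestamps, so string comparison matches time order.
    """
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed since the last run.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, 10.0, "2024-01-01T00:00:00"),
            (2, 20.0, "2024-01-02T00:00:00"),
        ],
    )
    # First run: an empty watermark behaves like a full load.
    first, watermark = extract_incremental(conn, "")

    # Simulate changes at the source: one update and one new row.
    conn.execute(
        "UPDATE orders SET amount = 25.0, updated_at = '2024-01-03T00:00:00' "
        "WHERE id = 2"
    )
    conn.execute("INSERT INTO orders VALUES (3, 30.0, '2024-01-04T00:00:00')")

    # Second run: only rows touched after the stored watermark come back.
    second, watermark = extract_incremental(conn, watermark)
    return first, second, watermark
```

A watermark on a timestamp column does not see hard deletes at the source, so a real pipeline would need a separate mechanism (soft-delete flags or change data capture) for those.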

Data First Consultancy

@DataFirstGroup

Sep 3

Snowflake unveils Open Catalog: a Polaris Iceberg REST service for a metastore for Spark, Flink, Trino and Snowflake. Helps cut catalog sprawl, but note limits - same deployment, SAML only, external writes read only. #DataEng #ApacheIceberg medium.com/snowflake/snow…

Snowflake Open Catalog: Unified, Secure Access to Apache Iceberg Tables

Learn how you can provide centralized, secure read and write access to your Iceberg tables across different REST-compatible query engines

Source: medium.com

Simon Späti 🏔️

@sspaeti

Aug 13

Still going strong, and adding everyday. #dataeng

Simon Späti 🏔️

@sspaeti

Apr 22

Have you ever felt lost in the world of data engineering resources? I've spent years collecting, connecting, and curating knowledge that now forms a digital garden where concepts grow and link organically. The «Data Engineering Vault» features a collection of over 1000 curated…

avatarfreak

@avtarfreak

Jul 4

Stuck with TimescaleDB v2.20.3 & PostgreSQL 17.5? Continuous aggregates not auto-refreshing? Policy creation failing to generate jobs? Currently relying on manual refresh_continuous_aggregate() or pg_cron. Better solutions? Share insights! #TimescaleDB #PostgreSQL #DataEng

prod42net

@prod42net

Jun 10

🔍 Comparing Zilliz Cloud & Deep Lake for scalable vector search reveals key insights: Zilliz excels in speed and automation, while Deep Lake shines in multimedia handling. Choose based on your needs! Read more from Marcus Feldman. #VectorSearch #DataEng… ift.tt/IvZqwuy

What I Learned Comparing Zilliz Cloud and Deep Lake for Scalable Vector Search

Introduction: Why I Benchmarked Two Very Different Tools As I scaled up a semantic search...

Source: dev.to

Michael Lan

@michaellan_eng

May 26

My last few weeks with Go have led me to a question: are **zero values** really a feature that simplifies things, or do they make life harder in the long run in data systems and APIs? Starting a thread to explain my perspective. 👇 #GoLang #SoftwareDesign #DataEng

さくら@データ＆AIで地域DX挑戦中！

@sakura_dataeng

DataEng

@EmberWolff15

Ta16-Gooner⚽️

@Ta16dataeng

LEARNING SQL, python,Aws

@LearningDataeng

DataEng Munich

@DataEngMuc

DataEng Digest

@dataengdigest

Pierre | Data Engineer Freelance

@pierre_dataeng

Dataeng

@dataeng_consult

見習いデータエンジニあ

@dataeng_123

Akr B.

@CloudDataeng

Data Engineering Academy

@dataeng_academy

DataEng.ai

@DataEngAi

david verveer

@dataeng

Breno Carlo

@breno_dataeng

kangula_dataeng

@KDataeng

deblog

@dataeng_

다탱구

@DataengGu

Bhuvanesh D

@bhuv_dataeng

Martin Sosa Melgarejo

@albomx_dataeng

dataeng_cabify

@DCabify

dataeng

@iampurpleocean

Dataeng

@Dataeng3

다탱

@kim_dataeng

Megan Coughlin

@MeganDataeng

Lord DataEng

@SerhatYILD19957

phddma-dataeng

@phd_dataeng

Squarepoint Data-Engineering

@sqpt_dataeng

dataeng360

@dataeng360

Caio Belfort

@dataeng_belfort

ᴅᴛ

@dataeng0

Matthew Hall

@dataeng_dev

João Paulo Alvim

@br_dataeng

Sanjay Chaudhary

@sanjay_dataeng

Ramkumar

@DataengRam

DataEng SOLAR

@DataengSol71241

Neto

@NetoDataeng17

DataEng Student

@DataengStu63477

Francisco J Bataller

@fran_dataeng

Rajesh R

@RajeshR_DATAENG

Dataeng19015

@dataeng19015

Abhishek Ghosh

@Abhi_dataengAI

DataEng Solutions

@DataEng2023

다탱

@dataeng2

Surinder Mann

@GeekDataeng

ywkong

@DataengYwkong

MADHUPRIYA THANGARAJ

@madhu_dataeng

Osman Hassan

@OsmanDataeng

DataEng Solutions

@DataEngSolution

DataEng_crossover

@DataEngCross

Ankur Saxena

@dataeng089

Abdi Yussuf

@ayussufx

Dec 23, 2021

#DataEng

Wesley Reisz (Веслі Райс 🇺🇦)

@wesreisz

Mar 27, 2018

Looking forward to the #DataEng track @QConAI in a couple weeks. Track featuring use cases on #beam #spark #flink #gimel from engineers @Google @PayPal @stitchfix @dataArtisans #qconai

Adevinta Spain Eng.

@AdevintaEng

Mar 22, 2018

Schibsted Data Journey, by @mitxino77 meetup.com/Spark-Barcelon… at @InfoJobs #SchibstedTalks #Spark #dataeng #bcndataeng

Miguel Angel Fajardo

@ma_bits

May 6, 2019

It was a pleasure to hear @supercoco9 talk about these things, thanks for coming! #dataeng @MadridDataEng

TiDB, powered by PingCAP

@PingCAP

Nov 3, 2021

Live streaming of the #DataEng Meetup🙌 youtube.com/watch?v=pgAnEG…

Xavier Gumara Rigol

@xgumara

Mar 22, 2018

.@mitxino77 is presenting Schibsted data journey just now at our @InfoJobs offices 💪 @SchibstedGroup @SchibstedEng #dataeng #bcndataeng #spark

Adi Polak

@AdiPolak

Oct 29, 2019

ITNEXT summit 2019 takes place tomorrow and I'm super excited to see that they have a #DataEng track curated and led by @holdenkarau. Starting with "Bias in AI - chose your data wisely" 📚🧠

Jim Myers

@JimMyersTech

Dec 6, 2019

Great talk tonight from @flipsidecrypto’s Director of Data Engineering, @Dan_Kleiman, on how Flipside is applying its tech to enable blockchain businesses. Thanks @quantumblack / @McKinsey for hosting us! #blockchain #dataeng

Adil Khashtamov

@adilkhash

Jan 11, 2021

My post about data engineering is on the front page of @newsycombinator. 🔗khashtamov.com/en/how-to-beco… #dataeng #dataengineering #dataengineer

QCon San Francisco Software Development Conference

@QConSF

Oct 8, 2019

What does the future of data engineering look like? How will regulatory scrutiny affect data engineering? Find out more from @wepayeng’s @criccomini #QConSF talk: bit.ly/2zdlDVH #dataeng

Jimmy Moore 🐀

@clesiemo3

Sep 23, 2019

Now it's even more real! credential.net/e06aeedd #gcp #certified #dataeng

Ascent

@We_Are_Ascent

Apr 10, 2018

David Matthewman, Head Of Production at @BlisGlobal is talking about Medium (not big) Data. #DELondon #dataeng

Data Engineering Meetup

@DataEngMeetup

May 19, 2021

A great turnout at the #DataEng meetup tonight at @ROKT in Sydney… really good to be back together in person at a real-life meetup! #dataengineering

Data Engineering Meetup

@DataEngMeetup

Jan 19, 2022

As we start the new year for @dataengbytes and the #DataEng meetup, we'd like to recognise our awesome sponsors for 2021 once again... Thank you! ❤️

TuraLabs

@TuraLabs

Oct 25, 2020

Getting ready for our talk at @DataConLA on serverless data pipelines on GCP #DataEngineering #dataeng #datacon #gcp #Serverless #turalabs

Ascent

@We_Are_Ascent

Mar 13, 2018

We're looking for an awesome speaker to round out the agenda for the upcoming Data Engineering London Meetup on 10 April. Get in touch if you'd like to present! #dataeng #london #datascience

Ascent

@We_Are_Ascent

Jun 14, 2018

Join us at @BlisGlobal on 11 July for the next edition of Data Engineering London! Register now: ti.to/data-engineeri… #dataeng #DELondon

Desiree Morton

@Nontraditionall

Aug 18, 2020

#datascience #AMA #dataEng
