#dataeng search results

Data lineage refers to the life cycle of data: where it comes from, how it moves, and where it goes within a system. It helps organizations track the flow of data from its origin to its destination, providing transparency and accuracy. By understanding data lineage, #DataEng
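A minimal sketch of the idea in the post above, assuming lineage is stored as directed edges from a source dataset to the dataset derived from it. The class name, method names and dataset names are hypothetical and chosen only for illustration; real lineage tools record far more (jobs, columns, run times).

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: edges point from a source dataset to a derived one."""

    def __init__(self):
        # Maps each dataset to the set of datasets it is directly built from.
        self.parents = defaultdict(set)

    def add_edge(self, source, target):
        """Record that `target` is derived from `source`."""
        self.parents[target].add(source)

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or not."""
        seen = set()
        stack = [dataset]
        while stack:
            node = stack.pop()
            for parent in self.parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical dataset names, for illustration only.
graph = LineageGraph()
graph.add_edge("raw_orders", "stg_orders")
graph.add_edge("stg_orders", "fct_revenue")
graph.add_edge("raw_customers", "fct_revenue")
origins = graph.upstream("fct_revenue")
```

Walking the edges backwards answers the "where does it come from" question; the same structure walked forwards would answer "where does it go".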

Databricks proposes 'Variant' to unify semi-structured data across Parquet, Delta and Iceberg. Helpful for AI logs, but without wider engine buy-in it risks being yet another dialect. Worth watching adoption. #DataEng #OpenStandards databricks.com/blog/introduci…


Hiring Data Engineer (remote EU/Asia) Build ClickHouse/dbt pipelines for crypto analytics used by 1000s daily. Need: SQL/Python, streaming data, AI tools, Web3 interest. Visa OK. Apply here. vizzarjobs.com/jobs/cmhlvcoiq… #Web3 #DataEng


Gave Rudderstack's rundown on API integration a skim. Clear on why connecting systems helps, but it tends to end at 'just use a tool'. Barely touches auth sprawl, version drift or monitoring cost. Keep the ops effort in view before you dive. #APIs #DataEng


Datacenter... I love this Noisy AI :) #AI #DataScience #dataeng

From Groq Inc

Thinking about SQL's 'what over how' after Vu Trinh's piece. His stroll from Codd to Cartesian products shows why a dash of relational algebra turns query tuning from magic to method. A quick read for Python-heavy DEs. #DataEng #SQL vutr.substack.com/p/sql-for-data…


#DEZOOMCAMP Data Eng proj: Ag Data Pipeline! 🔍 E2E pipeline gathers, processes, analyzes ag data into insights. 🛠️: Kafka, Spark, GCP (GCS & BQ), Airflow, dbt, Metabase, Terraform, Docker. Thx @Al_Grigor, @DataTalksClub, @EcZachly @dbt_labs 🔗 shorturl.at/MdD36 #DataEng

Snowflake's quick-start for the Arrow ADBC Python driver covers conda/pip install and a rather wordy connection dict. Columnar fetch and bulk ingest are appealing, but config overhead and sparse docs mean it's best tried in a sandbox first. #DataEng #Python


Snowflake outlines Adaptive Cortex Complete, an in-warehouse bandit router that chooses the 'best' LLM per prompt. Cost-vs-quality trade-off is neat, but relying on other models to judge answers risks circular bias. Try, measure, verify. #LLM #DataEng medium.com/snowflake/adap…

medium.com

Adaptive Cortex Complete: Machine Learning Meets Snowflake LLMs

Automatically route each prompt to the “best” LLM model
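For readers unfamiliar with bandit routing, here is a minimal epsilon-greedy sketch. It is not Snowflake's algorithm, which the post does not describe; the class name, model names and the reward formula (quality minus a weighted cost) are all assumptions made for illustration.

```python
import random


class EpsilonGreedyRouter:
    """Toy bandit: usually pick the model with the best average reward so far."""

    def __init__(self, models, epsilon=0.1, seed=0):
        self.models = list(models)
        self.epsilon = epsilon            # fraction of calls spent exploring
        self.rng = random.Random(seed)
        self.counts = {m: 0 for m in self.models}
        self.mean_reward = {m: 0.0 for m in self.models}

    def choose(self):
        """Explore with probability epsilon, otherwise exploit the best mean."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.models)
        return max(self.models, key=lambda m: self.mean_reward[m])

    def update(self, model, reward):
        """Fold one observed reward into the running mean for `model`."""
        self.counts[model] += 1
        n = self.counts[model]
        self.mean_reward[model] += (reward - self.mean_reward[model]) / n


def reward(quality, cost, cost_weight=0.5):
    """Assumed reward: answer quality penalised by a weighted cost."""
    return quality - cost_weight * cost


# Hypothetical model names and scores, for illustration only.
router = EpsilonGreedyRouter(["small-model", "large-model"], epsilon=0.0)
router.update("small-model", reward(quality=0.6, cost=0.1))
router.update("large-model", reward(quality=0.9, cost=0.4))
picked = router.choose()
```

The quality score is the weak point the post flags: if it comes from another model acting as judge, the router learns that judge's preferences, so it is worth checking the scores against human-labelled samples.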


Iceberg excels at immutable facts, but when a loyalty tier flips ten times an hour you drown in delete files. EQ deletes drag reads, positional deletes drag writes - pick your pain. Teams bolt on Hudi, Paimon or an OLAP sidecar. #DataEng #ApacheIceberg dataengineeringweekly.com/p/when-dimensi…


Still running every Airflow task on the same machine? Vu Trinh maps the executor options and shows why matching workload to executor matters - from quick-fire local jobs to pod-level isolation on K8s. Prep for your next scale spike. #DataEng #Airflow vutr.substack.com/p/where-does-y…


One to note: Snowflake’s update adds sdist support to the Artifact Repository - great for libs without wheels and future ARM builds. Gains in portability, but you’re tied to Snowflake’s build infra and caching behaviour. #Python #DataEng medium.com/snowflake/snow…


Highlights this week: Artoul says Iceberg needs a purpose-built engine - tough to argue when Spark ETL still runs at T+15m. Stripe's 'real-time' billing shows the gap - Flink + Pinot yet microbatch. Lakehouse vision vs ops reality. #dataeng #lakehouse dataengineeringweekly.com/p/data-enginee…

dataengineeringweekly.com

Data Engineering Weekly #238

The Weekly Data Engineering Newsletter


Struggling with sluggish ETL pipelines? Ditch the full loads—switch to incremental extracts with delta processing and slash your runtime by 70% 💨 #DataEng #ETL #BigData
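A minimal sketch of an incremental extract, assuming the source table carries an `updated_at` column that can serve as a high-water mark. The `orders` table, its columns and the function names are hypothetical, and SQLite stands in for the real source. Any runtime saving depends on how much of the table changes between runs, so the 70% figure above should be treated as the poster's claim rather than a guarantee.

```python
import sqlite3


def incremental_extract(conn, last_watermark):
    """Return rows changed since `last_watermark`, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Keep the old watermark when nothing changed, so the next run is a no-op.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


def demo():
    """Run two extracts against an in-memory table to show the delta."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "2024-01-01T00:00:00"), (2, 20.0, "2024-01-02T00:00:00")],
    )
    # First run: an empty watermark sorts before every timestamp (full load).
    first, watermark = incremental_extract(conn, "")
    # One new row and one updated row arrive before the second run.
    conn.execute("INSERT INTO orders VALUES (3, 30.0, '2024-01-03T00:00:00')")
    conn.execute(
        "UPDATE orders SET amount = 15.0, updated_at = '2024-01-04T00:00:00' "
        "WHERE id = 1"
    )
    second, watermark = incremental_extract(conn, watermark)
    conn.close()
    return first, second, watermark
```

The second run returns only the inserted and updated rows. A watermark on `updated_at` does not see hard deletes, so sources that delete rows need soft-delete flags or change data capture instead.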


Snowflake unveils Open Catalog: a Polaris Iceberg REST service that serves as a shared metastore for Spark, Flink, Trino and Snowflake. Helps cut catalog sprawl, but note limits - same deployment, SAML only, external writes read only. #DataEng #ApacheIceberg medium.com/snowflake/snow…

medium.com

Snowflake Open Catalog: Unified, Secure Access to Apache Iceberg Tables

Learn how you can provide centralized, secure read and write access to your Iceberg tables across different REST-compatible query engines


Still going strong, and adding every day. #dataeng

Have you ever felt lost in the world of data engineering resources? I've spent years collecting, connecting, and curating knowledge that now forms a digital garden where concepts grow and link organically. The «Data Engineering Vault» features a collection of over 1000 curated…



Stuck with TimescaleDB v2.20.3 & PostgreSQL 17.5? Continuous aggregates not auto-refreshing? Policy creation failing to generate jobs? Currently relying on manual refresh_continuous_aggregate() or pg_cron. Better solutions? Share insights! #TimescaleDB #PostgreSQL #DataEng


🔍 Comparing Zilliz Cloud & Deep Lake for scalable vector search reveals key insights: Zilliz excels in speed and automation, while Deep Lake shines in multimedia handling. Choose based on your needs! Read more from Marcus Feldman. #VectorSearch #DataEng ift.tt/IvZqwuy

dev.to

What I Learned Comparing Zilliz Cloud and Deep Lake for Scalable Vector Search

Introduction: Why I Benchmarked Two Very Different Tools As I scaled up a semantic search...


My last few weeks with Go have led me to a question: are **zero values** really a feature that simplifies things, or do they make life harder in the long run in data systems and APIs? Opening a thread to explain my perspective. 👇 #GoLang #SoftwareDesign #DataEng


Looking forward to the #DataEng track @QConAI in a couple weeks. Track featuring use cases on #beam #spark #flink #gimel from engineers @Google @PayPal @stitchfix @dataArtisans #qconai

It was a pleasure to hear @supercoco9 talk about these things, thanks for coming! #dataeng @MadridDataEng

.@mitxino77 is presenting Schibsted data journey just now at our @InfoJobs offices 💪 @SchibstedGroup @SchibstedEng #dataeng #bcndataeng #spark

ITNEXT summit 2019 takes place tomorrow and I'm super excited to see that they have a #DataEng track curated and led by @holdenkarau. Starting with "Bias in AI - chose your data wisely" 📚🧠

Great talk tonight from @flipsidecrypto’s Director of Data Engineering, @Dan_Kleiman, on how Flipside is applying its tech to enable blockchain businesses. Thanks @quantumblack / @McKinsey for hosting us! #blockchain #dataeng

What does the future of data engineering look like? How will regulatory scrutiny affect data engineering? Find out more from @wepayeng’s @criccomini #QConSF talk: bit.ly/2zdlDVH #dataeng

David Matthewman, Head Of Production at @BlisGlobal is talking about Medium (not big) Data. #DELondon #dataeng

A great turn out at the #DataEng meetup tonight at @ROKT in Sydney… really good to be back together in-person at a real life meetup! #dataengineering

As we start the new year for @dataengbytes and the #DataEng meetup, we'd like to recognise our awesome sponsors for 2021 once again... Thank you! ❤️

Getting ready for our talk at @DataConLA on serverless data pipelines on GCP #DataEngineering #dataeng #datacon #gcp #Serverless #turalabs

We're looking for an awesome speaker to round out the agenda for the upcoming Data Engineering London Meetup on 10 April. Get in touch if you'd like to present! #dataeng #london #datascience

Join us at @BlisGlobal on 11 July for the next edition of Data Engineering London! Register now: ti.to/data-engineeri… #dataeng #DELondon
