#dataeng search results
Daniel points out that shouting 'schema-less!' doesn't spare you the cost of messy queries. Tools evolve, but thinking through entities and relationships is still vital - it often just shifts to a later stage. #DataModeling #DataEng confessionsofadataguy.com/is-data-modeli…
confessionsofadataguy.com
Is Data Modeling Dead? - Confessions of a Data Guy
Ok, not going to lie, I rarely find anything of value in the dregs of r/dataengineering, mostly I fear, because it’s 90% freshers with little to no experience. These green-behind-the-ears know-it-all...
Data lineage refers to the life cycle of data: where it comes from, how it moves, and where it goes within a system. It helps organizations track the flow of data from its origin to its destination, providing transparency and accuracy. By understanding data lineage, #DataEng
Databricks proposes 'Variant' to unify semi-structured data across Parquet, Delta and Iceberg. Helpful for AI logs, but without wider engine buy-in it risks being yet another dialect. Worth watching adoption. #DataEng #OpenStandards databricks.com/blog/introduci…
Hiring Data Engineer (remote EU/Asia) Build ClickHouse/dbt pipelines for crypto analytics used by 1000s daily. Need: SQL/Python, streaming data, AI tools, Web3 interest. Visa OK. Apply here. vizzarjobs.com/jobs/cmhlvcoiq… #Web3 #DataEng
Gave Rudderstack's rundown on API integration a skim. Clear on why connecting systems helps, but it tends to end at 'just use a tool'. Barely touches auth sprawl, version drift or monitoring cost. Keep the ops effort in view before you dive. #APIs #DataEng…
Thinking about SQL's 'what over how' after Vu Trinh's piece. His stroll from Codd to Cartesian products shows why a dash of relational algebra turns query tuning from magic to method. A quick read for Python-heavy DEs. #DataEng #SQL vutr.substack.com/p/sql-for-data…
⚡ ETL #Python #DataEng @pathway_com ⭐ 48883 github.com/pathwaycom/pat…
#DEZOOMCAMP Data Eng proj: Ag Data Pipeline! 🔍 E2E pipeline gathers, processes, analyzes ag data into insights. 🛠️: Kafka, Spark, GCP (GCS & BQ), Airflow, dbt, Metabase, Terraform, Docker. Thx @Al_Grigor, @DataTalksClub, @EcZachly @dbt_labs 🔗 shorturl.at/MdD36 #DataEng
Snowflake's quick-start for the Arrow ADBC Python driver covers conda/pip install and a rather wordy connection dict. Columnar fetch and bulk ingest are appealing, but config overhead and sparse docs mean it's best tried in a sandbox first. #DataEng #Python…
Snowflake outlines Adaptive Cortex Complete, an in-warehouse bandit router that chooses the 'best' LLM per prompt. Cost-vs-quality trade-off is neat, but relying on other models to judge answers risks circular bias. Try, measure, verify. #LLM #DataEng medium.com/snowflake/adap…
medium.com
Adaptive Cortex Complete: Machine Learning Meets Snowflake LLMs
Automatically route each prompt to the “best” LLM model
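To make the "bandit router" idea concrete, here is a minimal epsilon-greedy sketch. It is not Snowflake's implementation: the class name, the model names "small" and "large", and the reward values are all hypothetical, and the reward is assumed to be a single number supplied by the caller that already blends quality and cost.

```python
import random


class EpsilonGreedyRouter:
    """Toy bandit that routes each prompt to one of several models (illustrative only)."""

    def __init__(self, models, epsilon=0.1, seed=0):
        self.models = list(models)
        self.epsilon = epsilon  # probability of exploring a random model
        self.counts = {m: 0 for m in self.models}
        self.mean_reward = {m: 0.0 for m in self.models}
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best mean so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.models)
        return max(self.models, key=lambda m: self.mean_reward[m])

    def update(self, model, reward):
        # Incremental running mean of the observed reward for this model.
        self.counts[model] += 1
        n = self.counts[model]
        self.mean_reward[model] += (reward - self.mean_reward[model]) / n


# Hypothetical feedback: epsilon=0.0 disables exploration so the choice is deterministic.
router = EpsilonGreedyRouter(["small", "large"], epsilon=0.0)
router.update("small", 0.4)
router.update("large", 0.9)
router.update("large", 0.7)
print(router.choose())
```

Where the reward comes from is the crux: if it is produced by another model acting as judge, the circular-bias concern raised above applies to every update.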
Iceberg excels at immutable facts, but when a loyalty tier flips ten times an hour you drown in delete files. EQ deletes drag reads, positional deletes drag writes - pick your pain. Teams bolt on Hudi, Paimon or an OLAP sidecar. #DataEng #ApacheIceberg dataengineeringweekly.com/p/when-dimensi…
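The read-versus-write trade-off can be illustrated with a toy model in plain Python. This is a simplified sketch, not Iceberg's API or file format: it ignores sequence numbers, partitioning and file layout, and every name in it is hypothetical.

```python
# One "data file" as a list of rows; a row's position is its index in the list.
data_file = [
    {"customer_id": 1, "tier": "silver"},
    {"customer_id": 2, "tier": "gold"},
    {"customer_id": 3, "tier": "bronze"},
]


def write_equality_delete(customer_id):
    # Cheap write: record the key only, without looking at the data file.
    return {"customer_id": customer_id}


def write_positional_delete(rows, customer_id):
    # Costlier write: the writer must scan the file to find matching positions.
    return [pos for pos, row in enumerate(rows) if row["customer_id"] == customer_id]


def read_with_equality_deletes(rows, eq_deletes):
    # Costlier read: every row's key is compared against the deleted keys.
    deleted = {d["customer_id"] for d in eq_deletes}
    return [row for row in rows if row["customer_id"] not in deleted]


def read_with_positional_deletes(rows, positions):
    # Cheaper read: skip known positions without inspecting row values.
    skip = set(positions)
    return [row for pos, row in enumerate(rows) if pos not in skip]


eq_deletes = [write_equality_delete(2)]
positions = write_positional_delete(data_file, 2)

via_equality = read_with_equality_deletes(data_file, eq_deletes)
via_position = read_with_positional_deletes(data_file, positions)
print(via_equality == via_position)
```

Both paths return the same surviving rows; what differs is which side does the lookup work, and that cost is what compounds when a dimension changes many times an hour.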
Still running every Airflow task on the same machine? Vu Trinh maps the executor options and shows why matching workload to executor matters - from quick-fire local jobs to pod-level isolation on K8s. Prep for your next scale spike. #DataEng #Airflow vutr.substack.com/p/where-does-y…
One to note: Snowflake’s update adds sdist support to the Artifact Repository - great for libs without wheels and future ARM builds. Gains in portability, but you’re tied to Snowflake’s build infra and caching behaviour. #Python #DataEng medium.com/snowflake/snow…
Highlights this week: Artoul says Iceberg needs a purpose built engine - tough to argue when Spark ETL still runs at T+15m. Stripe's 'real-time' billing shows the gap - Flink + Pinot yet microbatch. Lakehouse vision vs ops reality. #dataeng #lakehouse dataengineeringweekly.com/p/data-enginee…
dataengineeringweekly.com
Data Engineering Weekly #238
The Weekly Data Engineering Newsletter
Conceptual Introduction to Delta Lake youtu.be/z7kxiqAxgno #deltalake #databricks #dataeng
youtube.com
YouTube
Conceptual Introduction to Delta Lake
Struggling with sluggish ETL pipelines? Ditch the full loads—switch to incremental extracts with delta processing and slash your runtime by 70% 💨 #DataEng #ETL #BigData
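A minimal sketch of an incremental extract driven by a high-watermark column, using an in-memory SQLite table. The table name `source_events`, the `updated_at` column and the integer timestamps are assumptions made for illustration; actual savings depend on how much of the source changes between runs.

```python
import sqlite3


def extract_incremental(conn, last_watermark):
    """Return rows changed since last_watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, payload, updated_at FROM source_events "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Advance the watermark only if something new was read.
    new_watermark = rows[-1][2] if rows else last_watermark
    return rows, new_watermark


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_events (id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO source_events VALUES (?, ?, ?)",
    [(1, "a", 100), (2, "b", 200), (3, "c", 300)],
)

# First run: the watermark starts at 0, so this is effectively a full load.
first_batch, watermark = extract_incremental(conn, 0)

# One row changes at the source; the next run picks up only that row.
conn.execute("UPDATE source_events SET payload = 'b2', updated_at = 400 WHERE id = 2")
second_batch, watermark = extract_incremental(conn, watermark)
print(len(first_batch), len(second_batch))
```

The watermark has to be persisted between runs, and this pattern does not capture hard deletes at the source; those need a separate mechanism such as change data capture.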
Snowflake unveils Open Catalog: a Polaris-based Iceberg REST catalog service that acts as a shared metastore for Spark, Flink, Trino and Snowflake. Helps cut catalog sprawl, but note limits - same deployment, SAML only, external writes read only. #DataEng #ApacheIceberg medium.com/snowflake/snow…
medium.com
Snowflake Open Catalog: Unified, Secure Access to Apache Iceberg Tables
Learn how you can provide centralized, secure read and write access to your Iceberg tables across different REST-compatible query engines
Still going strong, and adding every day. #dataeng
Have you ever felt lost in the world of data engineering resources? I've spent years collecting, connecting, and curating knowledge that now forms a digital garden where concepts grow and link organically. The «Data Engineering Vault» features a collection of over 1000 curated…
Stuck with TimescaleDB v2.20.3 & PostgreSQL 17.5? Continuous aggregates not auto-refreshing? Policy creation failing to generate jobs? Currently relying on manual refresh_continuous_aggregate() or pg_cron. Better solutions? Share insights! #TimescaleDB #PostgreSQL #DataEng
🔍 Comparing Zilliz Cloud & Deep Lake for scalable vector search reveals key insights: Zilliz excels in speed and automation, while Deep Lake shines in multimedia handling. Choose based on your needs! Read more from Marcus Feldman. #VectorSearch #DataEng… ift.tt/IvZqwuy
dev.to
What I Learned Comparing Zilliz Cloud and Deep Lake for Scalable Vector Search
Introduction: Why I Benchmarked Two Very Different Tools As I scaled up a semantic search...
My last few weeks with Go have led me to a question: are **zero values** really a feature that simplifies things, or do they make life harder in the long run in data systems and APIs? Opening a thread to explain my perspective. 👇 #GoLang #SoftwareDesign #DataEng
Looking forward to the #DataEng track @QConAI in a couple weeks. Track featuring use cases on #beam #spark #flink #gimel from engineers @Google @PayPal @stitchfix @dataArtisans #qconai
Schibsted Data Journey, by @mitxino77 meetup.com/Spark-Barcelon… at @InfoJobs #SchibstedTalks #Spark #dataeng #bcndataeng
.@mitxino77 is presenting Schibsted data journey just now at our @InfoJobs offices 💪 @SchibstedGroup @SchibstedEng #dataeng #bcndataeng #spark
ITNEXT summit 2019 takes place tomorrow and I'm super excited to see that they have a #DataEng track curated and led by @holdenkarau. Starting with "Bias in AI - choose your data wisely" 📚🧠
Great talk tonight from @flipsidecrypto’s Director of Data Engineering, @Dan_Kleiman, on how Flipside is applying its tech to enable blockchain businesses. Thanks @quantumblack / @McKinsey for hosting us! #blockchain #dataeng
My post about data engineering is on the front page of @newsycombinator. 🔗khashtamov.com/en/how-to-beco… #dataeng #dataengineering #dataengineer
What does the future of data engineering look like? How will regulatory scrutiny affect data engineering? Find out more from @wepayeng’s @criccomini #QConSF talk: bit.ly/2zdlDVH #dataeng
David Matthewman, Head Of Production at @BlisGlobal is talking about Medium (not big) Data. #DELondon #dataeng
A great turnout at the #DataEng meetup tonight at @ROKT in Sydney… really good to be back together in-person at a real life meetup! #dataengineering
As we start the new year for @dataengbytes and the #DataEng meetup, we'd like to recognise our awesome sponsors for 2021 once again... Thank you! ❤️
Getting ready for our talk at @DataConLA on serverless data pipelines on GCP #DataEngineering #dataeng #datacon #gcp #Serverless #turalabs
We're looking for an awesome speaker to round out the agenda for the upcoming Data Engineering London Meetup on 10 April. Get in touch if you'd like to present! #dataeng #london #datascience
Join us at @BlisGlobal on 11 July for the next edition of Data Engineering London! Register now: ti.to/data-engineeri… #dataeng #DELondon