Bartosz Konieczny

@waitingforcode

Freelance Data Engineer and instructor, enjoy solving data problems with #ApacheSpark #AWS #GCP #Azure 👨‍🏭 | contact@waitingforcode.com

Bartosz Konieczny reposted

⏰ Final Reminder – Delta Lake Webinar Tomorrow! Wondering if data engineering design patterns can unlock new insights into Delta Lake? Or how Delta Lake can become a key part of your streaming data architecture? Join @newfront (@bufbuild) and @waitingforcode as they tackle…


Bartosz Konieczny reposted

Why don’t Iceberg or Delta Lake have secondary indexes? Because analytics workloads and OLTP workloads optimize for opposite I/O patterns. See my dive into data layout, pruning, and what “indexing” really means in open table formats: jack-vanlightly.com/blog/2025/10/8…


Bartosz Konieczny reposted

Are you wondering if general concepts like data engineering design patterns can help you learn about #DeltaLake? Or, if it's possible to leverage Delta Lake within your streaming data architecture? In this webinar, Scott Haines and Bartosz Konieczny will answer these two…


Bartosz Konieczny reposted

Releasing Soon! Pre-order now shroffpublishers.com/books/97893680…
Data Engineering Design Patterns
By Bartosz Konieczny @waitingforcode with @OReillyMedia
Focusing on various aspects of data engineering, including data ingestion, data quality, idempotency, and more. #dataengineering

Bartosz Konieczny reposted

If you want to understand the consistency models of the mentioned table formats of the paper, I've written about it extensively and written formal models. * jack-vanlightly.com/analyses/2024/… * jack-vanlightly.com/analyses/2024/… * jack-vanlightly.com/analyses/2024/… * github.com/Vanlightly/tab…


Bartosz Konieczny reposted

Data Engineering patterns on the cloud by Bartosz Konieczny is on sale on Leanpub! Its suggested price is $39.00; get it for $24.65 with this coupon: leanpub.com/sh/ygsnqbRD @waitingforcode #CloudComputing #AmazonWebServices #GoogleCloudPlatform #MicrosoftAzure


Bartosz Konieczny reposted

Join @newfront and @waitingforcode and learn all about streaming Delta Lake tables with Apache Spark Structured Streaming! 🦀

🗓 March 21st
🕝 9:00AM PT / 12:00PM ET
💻 Join this webinar via LinkedIn, YouTube, or Zoom!

Learn more: linkedin.com/events/streami…

#deltalake #streaming

Bartosz Konieczny reposted

I have been busy the last few months writing a book for O'Reilly about how to build ML systems (batch, real-time, and LLMs), distilling much of what I have learnt from both working with customers as well as students. Why could the book interest you? * Data Scientists - transition…


Bartosz Konieczny reposted

I don't want to start a flame war here, but IMO it is a mistake to jump straight to distributed databases (and 90% of the content below is distributed databases) without first learning fundamentals on single node databases. Here's my 10 things to understand about databases:…

Ten things to understand about your database: 1) High level Architecture 2) How writes work? (Replication, data distribution, internal organisation etc) 3) How reads work? (Consistency guarantees, tuning options, etc) 4) CAP theorem, ex. CP or AP 5) Transactions and Concurrency…



Bartosz Konieczny reposted

Data Engineering patterns on the cloud by Bartosz Konieczny is on sale on Leanpub! Its suggested price is $39.00; get it for $26.10 with this coupon: leanpub.com/sh/1T4q5Z81 @waitingforcode #CloudComputing #AmazonWebServices #GoogleCloudPlatform #MicrosoftAzure


Bartosz Konieczny reposted

Chapter 4 of The Architecture of Serverless Data Systems: CockroachDB (serverless). jack-vanlightly.com/analyses/2023/…


Bartosz Konieczny reposted

The early release of Delta Lake: The Definitive Guide is here! 🎉 The latest edition includes the addition of Chapter 12: Performance Tuning. Download here ➡️ bit.ly/472DVY7 Authors @dennylee, Prashanth Babu, Tristen Wentling, & @newfront #opensource #deltalake #oss


Bartosz Konieczny reposted

Data Engineering patterns on the cloud: How to solve common data engineering problems with cloud services? leanpub.com/data-engineeri… by Bartosz Konieczny is the featured book on the Leanpub homepage! leanpub.com @waitingforcode #CloudComputing #AmazonWebServices


Last week I spent some time understanding the #PySpark applyInPandasWithState. This week I'm refactoring the code, hoping to still understand it 2 months later ;) 👉 waitingforcode.com/apache-spark-s…


In the previous release, #PySpark got an interesting streaming feature: arbitrary stateful processing. Its API differs from the Scala version but is better adapted to the Python world. More 👉 waitingforcode.com/apache-spark-s…


Bartosz Konieczny reposted

A list of articles I share again and again when developers ask me about Kafka 🧵


Bartosz Konieczny reposted

[ANNOUNCEMENT] Congrats to the Apache Spark community and all the contributors! The Apache Spark 3.5.0 release is here. Try it out! spark.apache.org/releases/spark…


It's not a rebranding but more a regrouping 😉 All my additional #dataengineering content is now available from there waitingforcode.com/better (planning to add some stream processing materials soon)


If Delta Lake implemented only commits, I could have stopped exploring its transactional part after the previous article. But like an RDBMS, #DeltaLake implements other ACID-related concepts, such as isolation levels 👉 waitingforcode.com/delta-lake/tab…


One of the great features of table file formats is the ability to handle write conflicts. It wouldn't be possible without commits, which are the topic of my #DeltaLake blog post. waitingforcode.com/delta-lake/tab…

