Joe Reis wrote this recently:
Boring is back. Because the focus isn’t on managing the underlying technology anymore, we can now turn our attention to the boring stuff we conveniently ignored during the heyday of the modern data stack era - data governance, data management, data modeling, and other “enterprisey” practices. This is the stuff that’s boring as hell to most people, yet critical to making data work in any organization.
While I agree with this view, I think it uncovers something not boring at all: "[...] data governance, data management, data modeling, and other “enterprisey” practices. This is the stuff that’s boring as hell to most people, yet critical to making data work in any organization."
It's boring as hell to most people because it's currently difficult to move forward in these areas (data governance, data management, data modeling, etc.). We lack proper semantics and models for domains that are, most of the time, a matter of human relationships.
It's boring because we don't have the tools to hack it. We don't have that short feedback loop.
What if we had the right tools¹? The right material to finally move forward on those subjects? 🤔
📡 Expected Contents
BigQuery is cheaper than you think
Earlier this year, Jordan Tigani - founding engineer on Google BigQuery - claimed that "Big Data is Dead".
That's DuckDB's premise, and in most cases, I think they are right.
Still, it's not black and white.
In his great blog post, Alexander shows how BigQuery is not as expensive as we might think.
It resonates in some way with the recent AWS announcement of the S3 Express One Zone storage class, which makes object storage accessible for performance-critical applications...
S3 is now one of the backbones of the Internet. Could data warehouses such as BigQuery or MotherDuck become the same for data applications?
Data Engineering As Code
I'm so glad to see Antoine shipping his first blog post. The work he's been doing at Gorgias is fantastic.
It has been an eye-opener on the use of Terraform, not only for cloud resources but for all resources.
The Future of Terraform: ClickOps
Following Alexis's blog post last time, I found this great post about Terraform, especially the "what's next" part.
It's a very interesting read on what Infrastructure as Code (IaC) will look like in five years.
Here is the plot:
Infrastructure is slow to change
More people creating and modifying infrastructure is good
CDKTF, AWS CDK, and Pulumi are steps in the wrong direction
ClickOps enables more users
OneTable
It sounds like everyone wants to be the standard 😅
Current contributors across Onehouse, Microsoft, Google, and others are planning to incubate OneTable into the Apache Software Foundation.
Microsoft and Google are participating in this project to build a bridge between Apache Hudi, Delta Lake, and Apache Iceberg.
This is called OneTable:
OneTable provides omnidirectional interop between lakehouse table formats
OneTable is NOT a new or separate format, OneTable provides abstractions and tools for the translation of lakehouse table format metadata
Interesting to follow, as having so many lakehouse table formats is starting to get confusing...
📰 The Blog Post
From Data Engineer to YAML Engineer
This resonates a lot with the links shared above: together with the great Julien (Ju Data Engineering Newsletter), I cross-posted our views on what's coming next for data engineering practices. Hint: it's about YAML. Or at least about being declarative.
In the same vein, Anna wrote a fantastic blog post on Kestra's blog about YAML and how we supercharge it at Kestra.
🎨 Beyond The Bracket
That's the kind of stuff I love. It shows how early we are in that data world.
One could think of building a kind of "OSINT activity tracker" that tracks the activity of restaurants near government security offices: if people stay late at night, or if places are busier than usual, we can expect something important to happen.
Having good contextual data allows us to predict much more than we could imagine.
In some sense, it reminds me of how a coastline 100 million years ago may have influenced modern elections in Alabama, or how, 109 years after WW1 started, the pre-WW1 imperial borders are still visible on the election map of Poland.
TL;DR: don't underestimate the butterfly effect, both at the scale of society and in your own life.
And we hit December🎄!
2023 was the start of a new arc for me, in every part of my life... I can't wait to start 2024, which already sounds promising!
Thank you so much for reading this! I received lovely feedback recently, especially this one:
[...] I wanted to thank you for everything you had done for me. People like you are heroes to us.
🫶
To be honest, I write this newsletter more for myself than for an audience. I know you all come from different backgrounds, and I hope the very different subjects I approach in these writings reflect your holistic interests.
This kind of "love letter" is probably my holy grail. Thanks for that.
Don't underestimate how one-to-one relationships and discussions can change one's mind. This one hits me right in the center and is a kind of evolutionary growth or confirmation bias at play. It reinforces my tone.
Thanks again for reading this newsletter.
See you very soon 😉
¹ Hint: I don't think it's only about dashboards and data catalogs…