Issue 29 - The Meta Data Stack
SQL is not designed for analytics
I recently came across the work of Robert Kegan, a developmental psychologist, and especially his model of adult cognitive, affective, and social development.
Long story short, he identifies three stages of adult development:
Stage 3, communal:
[...] you are in relationships; and, tacitly, you find yourself defined by them. Stage 3 develops a more accurate, more complex understanding of the self/other boundary. For stage 2, other people are meaningless unless they directly affect one’s immediate interest. For stage 3, “the other’s point of view matters to us intrinsically, not just extrinsically as a means of satisfying our more egocentric purposes.” [...] The prototype relationship is the “school chum”; developing intense peer friendships is what typically drives the transition to the communal mode. [...] Stage 3’s limitation is that it cannot resolve conflicts between responsibilities to different relationships. If one person wants you to do something, and another person wants you to do something different, there is no good basis for decision, because relationships have no internal structure; they consist simply of sharing experience. [...] The communal mode is characteristic of pre-modern (“traditional”) cultures. It’s impossible to base a large-scale society on the communal mode, because it’s so ineffective at coordinating complex group activities. [...] In modern societies, stage 3 is developmentally appropriate for adolescents. It is not adequate to fully cope with what modern societies demand of adults. Stage 3 adults in the West are developmentally traditional people living in a modern world - and that causes friction.
Stage 4, rationalist:
Here relationships are relativized. They move from subject to object, and are subordinated to, and organized by, a system. You no longer are in relationships that define you; you have relationships. You no longer are a stream of transient emotional experiences; you have experiences. [...] Systematic people relate mainly on the basis of each other’s principles, projects, and commitments, rather than their feelings. To stage 3, that sounds cold and distant, but for stage 4, it means seeing the other person for who they really are. [...] A stage 4 social system is rational in at least the sense that there is a reason for the nature of each role and relationship; and the reasons together provide an interlocking structure of justification. [...] Stage 4 has the capacity to take the perspective of a social system as a whole, and to support its smooth functioning [...] Stage 4 is definitional of modernity, in the sense of European culture and society over the past 250 years or so. [...]
Stage 5, meta-rationalist:
Here systems are relativized. They move from subject to object, and are subordinated to, and organized by, the process of meaning-making itself. You are no longer defined as a system of principles, projects, and commitments. You have several such systems, “multiple selves,” none of them entirely coherent, and which have different values - and this is no longer a problem, because you respect all of them. [...] Development beyond stage 4 is driven by seeing contradictions within and between systems [...] at some point you realize that all principles are somewhat arbitrary or relative. There is no ultimately true principle on which a correct system can be built. It’s not just that we don’t yet know what the absolute truth is; it is that there cannot be one. All systems come to seem inherently empty. [...] This uncomfortable midpoint of the stage 4 to 5 transition is sometimes called “stage 4.5.” Here it’s common to commit to explicit nihilism. Understanding that there is no ultimate meaning, one comes to the wrong conclusion that there are no meanings at all. [...] The cutting edge of Western culture and society is currently at stage 4.5, in the transition from stage 4 to stage 5. Modernity/systematicity has broken down, but we haven’t yet consolidated a positive new mode (personal, social, and cultural fluidity).
But here we're talking about data, so I will use this model as a loose analogy.
After all, data is the best asset we have to describe reality, so it should stick quite closely to it.
There are still people starting a managed "modern" data stack with tools such as Airbyte, Snowflake and Hightouch - migrating from tools that work but aren’t “sexy” anymore. They think it will be easy, that it will solve every data problem and boost the business. But they're probably the late majority.
Early adopters and even the early majority are closer to what Robert Kegan introduces as stage 4.5: nihilism.
We find this idea in very recent posts, claiming that our goal with data and its promise are either unattainable or illusory.
But I think we're heading toward stage 5, i.e. post-modernism. What some could call the "meta data stack" [2].
Benn implicitly refers to it in this post. The "meta data stack" is not about one stack. Not about tools. It's about choosing the right system for the needs at hand. It's about being more mature and more skeptical.
Some would say "nothing new": "that's an architect's call". Experienced craftsmen already know that no single tool is the solution, and neither is any single system. It's a combination of tools, people, context and luck that makes a system work at a given time.
Those craftsmen are probably already at stage 5. Yet there is a gate for those in stage 4.5.
We can glimpse what stage 5 - the meta data stack - looks like by sniffing out what the latest tools and visions bring to the table.
Again, I have a bias toward tools because they make us change our practices (or because I'm still in stage 4 myself...). Having discussions and meetings where we say we should apply best practices and great ideas is fine. But I think nothing can replace action and practice in shaping our environment.
Hence it's decisive to choose the right toolset and bring along the vision attached to it. Vision here is what makes people dream, propelling them into action right after the meeting.
The last few years brought new tools focused not on a fundamental technical revolution but on semantics: Terraform, React, dbt and, more recently, Malloy, Kestra, etc. They're not really new technologies: they are just new semantics.
Declarative here is the tree hiding the forest. What those tools have in common is their ability to adapt to a broad set of systems, and to get people to sit around the table and speak the same language.
I believe this is the key aspect that will carry us through stage 5, the "post modern data stack".
📡 Expected Contents
dbt or not dbt 🤔 ?
Don't get me wrong: I was an early adopter of dbt some years ago, and like almost everyone, I now consider it a standard.
But community concerns are growing: large projects often end up as a spaghetti codebase, and the cloud version is being prioritized over open source (see recent dbt Coalesce announcements) 👀
One competitor in that space is SQLMesh. It mixes some concepts of dbt with some of Terraform. They just released a web UI, making the whole experience slick (having CLI + UI is 👌)
You can develop your SQL, view lineage and dependencies, and manage workflows (+ it's completely free and included in the open-source version)
Worth keeping on the radar. The ETL game is not in its final stage 🌱
About web scraping
There are few, if any, legal domains where hypocrisy is as baked into the ecosystem as it is with web scraping.
If, like me, you have done quite a lot of web scraping, you have probably asked yourself at some point: is it legal? What are the limits?
This blog post is a good wrap-up. It's quite dry too, but as the legal side of all those LLM things is blurry, it's a good reference to keep in the library.
Is ClickOps the new DevOps?
Alexis comes with a real banger here. YAML engineering is definitely a thing, and sometimes it feels like we're losing flexibility and speed.
[...] by enforcing such a strict infrastructure code policy, I had inadvertently constructed a labyrinth between my stakeholders and the Cloud itself. A maze, where only a select few possess the map while others stand stranded at the entrance, yearning to explore but lacking the means to do so.
Push vs Pull in Software Development
That's something I'm seeing more and more in how products are used and in problem-solving in general. When is it most appropriate to employ either pushing or pulling techniques?
📰 The Blog Post
The title is a bit clickbait, but I really think something comes next after SQL in the analytics space (not OLTP). Let me know what you think of that take 😅
SQL is not Designed for Analytics
🎨 Beyond The Bracket
“In Defence of Witches: Why women are still on trial” by Mona Chollet.
Not much to say: go read that book! Probably my best read of the year 🙏.
It truly makes me ponder how much of our history continues to hinder and underestimate 50% (and thereby 100%) of humanity.
Beyond political and ethical considerations, I often imagine what our world could be if every human were regarded as equal. To what extent are we sub-optimizing our systems and society due to “physical” disparities?
While I love the sun, rainy days are way more productive for me.
Thick socks on, wearing a heavy sweater and warm light on the desk.
You start scratching the keyboard and the flow state comes naturally.
Going outside at the end of the day feels like a breath of fresh air.
Like feeling closer to the sea or high in the mountain haze.
It also comes with a weird mix of nostalgia and ambition.
Hope you're doing well 🙏
See you soon in December 🎄
[1] Some psychologists and "collapsologists" think we are in stage 4 and moving back to a more communal state. As an optimistic person, I definitely think we're heading to stage 5; we're just having a hard time getting through stage 4.5.
[2] Nomenclature - giving names to things and ideas - provides important markers for the adoption and use of innovations. In the data industry, new names come up every month or so, sounding like buzzwords the first time we hear them. I don't know whether the "meta data stack" will stick, but I'm using it to refer to what comes next, i.e. the "post-modern data stack".