Awesome descriptions, as always! Keep it up! I keep learning more.
Thanks Morgan, I very much appreciate the kind words!
Utterly brilliant article. One paragraph I'd take issue with is the below.
"The Data Mesh philosophy and focus on data being curated and easily accessible as products have revolutionised how teams interact with and work with data. Instead of a swamp of raw, source data, it is now often organised in different layers, from raw ingested data to filtered or cleaned data to business-curated data that can be fed into analytical solutions. The focus on data as a product normalised this approach (also called the Medallion Architecture)"
Having worked with Data Warehouse & BI (or formerly Decision Support System) developers, and having been one myself in the past, I'd say the medallion architecture concept of layering (staging > curated > business) is nothing new. It was the "modern data stack" wave that got us embroiled in technology, with shiny tools like dbt, Kafka, Spark etc., and made us drop the focus on modelling the data in a valuable way. This was not the case before, when your choice of tech was limited (which had its challenges for sure): you'd have one platform in a corporate, a big DW DB like Oracle, Teradata or SQL Server, perhaps some WhereScape RED on top, coupled with a semantic layer like Business Objects Universe. The data and how it was modelled were far more the center of attention than later with things like Hadoop, which solved some problems but introduced far more. :)
Thanks Radek! I agree with you; I think the rebranding of that stage -> curated -> business layering has been the main thing. I think there is a better opportunity to delineate the layers and make things less complicated by spreading them out over a set number of technologies (rather than all in one SQL database). But that has also had the opposite effect of making things more complicated in a lot of places. It is a never-ending struggle!
Very good article. "Data products" has become the term of the year; everyone is talking about it, so I'm keen to see how it evolves over time, or whether, like every other trend, it will fizzle out by itself! Thank you.
Thanks Goutam! Yup, it is a term that is not well understood but still very popular. If companies don't understand it (and how to deal with it), it will lose its power, and people will dismiss both the term and the eventual products that get built.
Great post, Dylan! My view is slightly different, but I can see your point.
Always respect your views and opinions, Yordan! And I don’t think I will ever convince the engineering crowd because, to be fair, your products are the datasets/tables you work with. But most other people in the Data Ecosystem don’t think of it that way (maybe have internal vs. external vocabulary around data products?)
Oh, a separation would make a ton of sense!
So, drum roll… what should we call Data Mesh ‘Data Products’?
Oh I will get to that ;)
But to give you a teaser, I like to call the foundational ones Data Assets (these are overarching data elements that can feed into multiple use cases). Then, within the more curated Gold layer, you have curated Data Sets/Layers (these are more specific and focused tables that feed into data products within data marts). But I will get into that in future articles!
Great topic for clarification, Dylan.
Here is my POV - https://www.linkedin.com/posts/kamalm_datatrust-datacatalog-datagovernance-activity-7246526771485892608-tYlw
Thanks Kamal, I'll take a look!