Issue #3 - Pulling Back the Data Ecosystem Curtain
What are we even talking about? What is in this ecosystem?
Read time: 7 minutes
The last two weeks were about introducing the overarching concept of the data ecosystem and why we need to start thinking about it.
But, like so many things in data, we touched on the high-level buzzwords and never really got into the details of what is actually part of it.
Below you will see the overarching image of the Data Ecosystem, with each realm touching on and interacting with the others.
This image is a start, but let’s explain each element of this framework to truly underscore why it is included and the impact it makes within the Data Ecosystem!
1) Getting Started with the Business Drivers
Data doesn’t start with a Python script, SQL server or an EC2 instance. It starts with an ask from the business.
In fact, the heart of the data ecosystem lies within the business drivers, and any success needs to be underpinned by these factors.
The picture above shows a simplified view of these elements. Within it you have the core elements of the overall machine: how it is set up (the org structure), how it functions (the operating model) and how the business actually delivers value (the business model). These elements lead to the business strategy and its overall needs and requirements, which highlight the goals and direction of the organisation (where are we going?). Finally, all of these driving forces help dictate and define the data strategy, shaping how data aligns with and supports business objectives.
These elements work together in a symbiotic fashion. For example, a robust data strategy lays the foundation for all data work, but this must dovetail with the business needs and requirements from different business domains (like marketing and finance). The business needs, in turn, reflect the broader business model and strategic direction. Meanwhile, the organisational structure and operating model serve as the skeleton that supports this and should be set up to allow data activities to serve the business’s needs, instead of preventing progress.
In addition to these drivers, we have data investment decisions. Funding for data activities is usually decided at the business level and depends on a well-articulated need for data within the business model, strategy and domain needs. These data investment decisions then flow into the data strategy (i.e., how much money do we have to deliver against our data goals?).
Finally, we have the overall data approach/philosophy and the data use cases. These two are the tangible outcomes of the business drivers, and they inform the data work covered in the next section.
Crucial for the data use cases is stakeholder alignment with business users. Data work is useless unless its end users and consumers are bought in and aligned on the intended outcomes.
2) Delivering in the Data Lifecycle Process
The data lifecycle process takes the business’s data desires and makes them real.
Now most people have seen some form of this diagram before. This isn’t new and can likely be cast in multiple different ways. But it all starts at the same point: data sourcing.
After agreeing upon the Data Use Cases, the data team needs to source the right data from across the business. Within this category come many considerations like the types of data, different sources, how it is generated, etc.
This feeds into the enterprise data platform, where data teams have to consider platform choices (like storage or compute technologies), how to feed the data in (engineering, ingestion) and how to structure the data within the platform (data modelling or creating data assets). The overall data approach or philosophy (e.g., do we go with a Mesh approach or use a common Data Lake) plays into these decisions quite a bit.
Integration considerations follow on from this, with teams needing to ensure that the data is organised correctly, that things are done in a secure way, and that the data is ready to be consumed by the analytics teams. This step and the previous one overlap quite a bit; they could be bundled into one or take place in a different order, depending on the preferences and experience of the person doing the work.
The final step in the data lifecycle is actually consuming the data and turning it into something meaningful. This is done through BI, analytics and data science techniques, producing data solutions in an agile/DevOps (or DataOps) type methodology. The goal here is to harness insights from the raw data in a meaningful way that aligns with the original data use cases set out by the business.
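For readers who think in code, the four steps above can be sketched as a tiny pipeline. This is only an illustrative sketch, not a reference implementation: the `UseCase` class, the function names and the sample records are all hypothetical, and a real platform would swap each function for actual tooling.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    """Hypothetical data use case agreed with the business."""
    name: str
    metric: str  # the field the business wants reported


def source_data():
    # Step 1 (sourcing): made-up records standing in for business systems.
    return [
        {"region": "north", "revenue": 120.0},
        {"region": "south", "revenue": 80.0},
        {"region": "north", "revenue": None},  # incomplete record
    ]


def load_to_platform(records, use_case):
    # Step 2 (platform/ingestion): keep only records with the metric present.
    return [r for r in records if r.get(use_case.metric) is not None]


def integrate(records, use_case):
    # Step 3 (integration): organise the metric by region for consumption.
    by_region = {}
    for record in records:
        by_region.setdefault(record["region"], []).append(record[use_case.metric])
    return by_region


def consume(by_region):
    # Step 4 (consumption): turn the organised data into a simple insight.
    return {region: sum(values) for region, values in by_region.items()}


def run_lifecycle(use_case):
    records = load_to_platform(source_data(), use_case)
    return consume(integrate(records, use_case))


report = run_lifecycle(UseCase(name="Regional revenue report", metric="revenue"))
print(report)
```

Notice that the use case is passed into the steps rather than bolted on at the end, which mirrors the point that the lifecycle exists to serve what the business asked for.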
This entire lifecycle is underpinned by data management and everything that fits within that broad term. We won’t unpack that right now, but things like governance, cataloguing, contracts, and MDM all play into this process if we want the data lifecycle to be efficient and effective.
3) From Insights to Business Decisioning
In the end, the data lifecycle is useless if the business does nothing with those insights or tools. Hence the extension of the data ecosystem, where data consumption informs the business decisioning.
With a foundational level of data literacy and culture, and analytical outputs like dashboards, reporting, or self-service tools, the business can actually harness the power of data. This is where the goal of transforming raw data into tangible outputs that can influence critical business decisions is realised, making data not just a byproduct of business operations but a central tenet of strategy formulation and execution.
4) Considering the Surrounding Influences
No ecosystem exists in a vacuum, and the data ecosystem is no different.
Surrounding influences such as the growth of the data market, the proliferation of data tools, and the ever-present debate of build versus buy constantly shape data decisioning and progress (for good or for bad).
Other factors like ethics, regulations and new opportunities force teams to change how they approach data solutions and force the business to make risk and reward considerations.
And no data ecosystem is complete without training and development considerations, as your data ecosystem needs a strong team to survive and thrive.
All these external factors influence the data ecosystem in one way or another, helping dictate the pace and direction of an organisation’s data journey.
5) Dealing with Data Detractors
The last bit within this ecosystem is the Data Detractors, the classic organisational issues that often impede progress.
These include the burdens of tech debt, the costs associated with investment in data initiatives, and the often-underestimated efforts required for change management and transformation. This list probably could include a whole host of challenges that might be encompassed in other parts of the ecosystem, but those will be tackled where they sit in the ecosystem.
Instead, these are three overarching blockers that stand in the way of every project a data team undertakes. Recognising and addressing these detractors is crucial, as they always find a way of changing the minds of certain executives when it comes to their willingness to go ‘all in’ on data.
Bringing it All Together
That was A LOT of content in seven minutes of reading. I don’t blame you if your head is spinning.
But rest assured, this newsletter will dive into each of these factors in a lot more detail from week to week moving forward.
Today the goal was to understand the overarching context of the data ecosystem and everything it encompasses. In the many weeks to follow, we will shed light on each of these factors: the role each component plays, how it interacts with other elements, and what you can do with that knowledge.
As for next week, we will touch on a very pertinent topic to underpin this thinking: the rapid growth of data and how this has handicapped our ability to deliver value.
So thanks for reading and feel free to leave comments, feedback and questions below!
Thanks for the read! Comment below and share the newsletter/issue if you think it is relevant! Feel free to also follow me on LinkedIn (very active) or Medium (not so active). See you amazing folks next week!
Interesting so far. As a Quality professional in Data Analytics, I am looking forward to the testing aspects. One general question: how can one define effort estimation for large projects?
I believe it is not just me, we all struggle with ROI when it comes to Data Management. Any thoughts or inputs would help. Thanks for this.