Issue #44 - Upstream Observability to Mitigate Data Issues
To solve a problem, you first need to know it exists. Enter data observability (and doing it where it counts)
Read time: 14 minutes
The word 'trust' causes ongoing headaches in the data world.
Most companies and employees simply do not trust the data.
So, despite all the work data professionals do on building dashboards, constructing ML models, engineering pipelines, and making use of the ever-growing treasure trove of data, the business might not trust those insights because the data doesn’t “seem right.”
And it all stems from the classic culprit, poor data quality, and the cascade of downstream consequences it triggers:
Trust erosion: Once stakeholders doubt one metric or data point, they question everything
Compounding errors: Poorly managed data flows into models, dashboards, and AI systems, turning incorrect data into incorrect information or insights
Return to manual: Automation is great, but if stakeholders don’t trust it, it’s in their nature to go back to manual insight gathering, which is cumbersome and wasteful
Operational downtime: Teams spend hours or days troubleshooting instead of creating value (think about all those engineering tickets)
Perhaps worst of all, most organisations only discover data issues after they've already impacted business decisions. This causes significant downstream data quality costs, which Gartner estimates at $12.9 million per organisation, on average. Given this study was conducted five years ago, I can imagine this figure has only increased with the growth in data volumes and the number of sources.
In the data quality community, there is a 1-10-100 rule (developed from research by G. Labovitz and Y. Chang) about the cost of poor-quality data:
Prevention – At the lowest level is the cost of verifying a record as it is entered to ensure its accuracy. This comes in at a low $1 per record
Remediation – If there is an error, the cost to fix it after it is created is around $10
Failure – And finally, it can cost $100 per record to fix once it has failed within the system

It is worth noting that this research was done over 30 years ago, and the number of data records has grown exponentially since then. Every organisation now holds billions of records, and properly fixing them after failure can cost millions of dollars (not to mention all the operational delays such issues cause).
Therefore, this 100x cost multiplier creates a compelling case for shifting quality efforts upstream (and actually investing in tools to combat this).
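To make the multiplier concrete, here is a back-of-envelope illustration; the record counts and error rate are assumptions for illustration, not figures from the studies above:

```python
# Hypothetical back-of-envelope illustration of the 1-10-100 rule
records = 10_000_000          # records flowing through a pipeline per year (assumption)
error_rate = 0.01             # 1% of records contain an error (assumption)

bad_records = records * error_rate

cost_prevention = bad_records * 1     # verified at entry
cost_remediation = bad_records * 10   # fixed after creation
cost_failure = bad_records * 100      # fixed after failing downstream

print(f"Prevention:  ${cost_prevention:,.0f}")   # $100,000
print(f"Remediation: ${cost_remediation:,.0f}")  # $1,000,000
print(f"Failure:     ${cost_failure:,.0f}")      # $10,000,000
```

Even with a modest error rate, the same problem costs two orders of magnitude more once it reaches the failure stage.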
Enter the idea of ‘shifting left’!
Every organisation wants to get ahead of these data quality problems, so they are moving from remediation to prevention in how they address them. For most, this begins with Data Observability—detecting data issues that arise before they become apparent to your data consumers.
The Rise of Data Observability
Data Observability wasn’t a commonly used term 5 years ago.
Back then, companies had only 3-4 primary data sources and a handful of dashboards or models to inform, so finding and fixing errors wasn't too onerous.
Enter 2025. Data volume has exploded, growing from 33 zettabytes worldwide in 2018 to an estimated 182 zettabytes by 2025.
That number means nothing to me, but it’s the growth rate that you should pay attention to:
5-6x as much data as 7 years ago underscores the growing complexity of company data systems
This also means there are at least 3-4x more sources of data in an organisation
And finally, this data is probably coming in faster and needs to be handled quicker (e.g., streaming data)
All of this puts a significant burden on data engineering teams and makes traditional manual quality checks completely impractical.
That's why data observability has emerged as one of the fastest-growing technology categories in the data landscape, with a market size of $2.14 billion in 2023 and a growth rate estimated at over 12% CAGR. And given that the data industry is valued at around $400-700 billion (everybody has conflicting estimates, but they are all substantial) and is continuing to grow (especially with AI), this investment in preventative data quality makes sense.
I mean, the reason for this is simple: to realise value from data, organisations can't spend half their time and resources finding and fixing data errors.
And any data professional can tell you that they often spend up to half of their time dealing with backend data quality issues rather than extracting insights.
So What Is Data Observability?
Put simply, Data Observability is a magnifying glass into your data ecosystem. It gives organisations the ability to understand the health and reliability of data systems, processes, and pipelines.
Data Observability draws inspiration from application observability in software engineering but applies those concepts specifically to data systems. With data flowing through numerous sources, undergoing countless transformations, being stored in multiple locations, and being utilised via various destinations (from dashboards to AI models to activation), you need a tool to monitor all of this data.
Therefore, after setting the rules and standards, a data observability tool/platform is embedded into every stage of the data pipelines and platform to detect anomalies in how the data flows (also referred to as data health indicators).
In other words, data quality is about defining what's right, while data observability is about detecting what's wrong—often before you even know to look for it.
Now, observability encompasses a wide range of signals. To make it easy, five key pillars form the foundation of a comprehensive data observability approach (a rough sketch of what these checks might look like follows the list):
Freshness: Is the data being updated at the expected frequency?
Volume: Are we receiving the expected amount of data?
Schema: Has the structure of the data changed unexpectedly?
Lineage: Where did the data come from and how has it been transformed?
Distribution: Do the statistical properties of the data match expectations?
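To make the pillars tangible, here is a minimal sketch of what hand-written checks against a freshly loaded table might look like; the column names, expected schema, and thresholds are hypothetical, and lineage is omitted because it is tracked as metadata rather than asserted at runtime (a real observability platform learns and manages all of this for you):

```python
import pandas as pd
from datetime import datetime, timedelta, timezone

# Hypothetical expectations for an ingested 'customers' table
EXPECTED_SCHEMA = {"customer_id": "object", "lead_source": "object", "amount_usd": "float64"}
EXPECTED_MIN_ROWS = 9_000                     # volume: roughly 10k records per load
MAX_STALENESS = timedelta(hours=24)           # freshness: daily loads
MAX_NULL_RATE = 0.05                          # distribution: <=5% nulls in lead_source

def check_batch(df: pd.DataFrame, loaded_at: datetime) -> list[str]:
    issues = []

    # Freshness: did the data arrive at the expected frequency?
    if datetime.now(timezone.utc) - loaded_at > MAX_STALENESS:
        issues.append("freshness: batch is more than 24 hours old")

    # Volume: are we receiving roughly the expected amount of data?
    if len(df) < EXPECTED_MIN_ROWS:
        issues.append(f"volume: only {len(df)} rows received")

    # Schema: has the structure changed unexpectedly?
    actual = {col: str(dtype) for col, dtype in df.dtypes.items()}
    if actual != EXPECTED_SCHEMA:
        issues.append(f"schema: expected {EXPECTED_SCHEMA}, got {actual}")

    # Distribution: do the statistical properties match expectations?
    null_rate = df["lead_source"].isna().mean() if "lead_source" in df else 1.0
    if null_rate > MAX_NULL_RATE:
        issues.append(f"distribution: {null_rate:.0%} null 'lead_source' values")

    return issues
```

The point is not the code itself but how quickly this becomes unmanageable when you have hundreds of tables, which is exactly the gap observability platforms fill.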
If you recall my first article on data quality, it outlined eight data quality measures.
You can see below how observability platforms help monitor and ensure these data quality measures, allowing teams to identify and prevent issues before they become a bigger problem!

Together, these pillars form a comprehensive monitoring framework that can detect both apparent failures and subtle degradations in data quality. By focusing on sourced or ingested data within the pipelines, these tools identify data quality issues at the source before they are fed into analytical tools, thereby preventing the corruption of datasets or insights gathered further along in the data process.
For example, a CRM data pipeline might deliver customer data daily (good freshness). The team expects 10,000 records (good volume), but if those records suddenly contain null values for 'Lead Source' fields (distribution issue) due to a Salesforce API update that changed field names (schema and lineage issue), traditional monitoring might miss the problem. This gap wouldn't be discovered until the marketing attribution dashboard shows zero leads from paid channels, causing the marketing team to draw wrong insights and likely never trust the data again.
Observability represents a significant evolution from traditional data quality tools that relied primarily on predefined rules and manual checks. Now, with machine learning, modern observability platforms can learn baselines of normal behaviour within your organisation and automatically detect deviations across all five pillars simultaneously.
The result is a more proactive approach to data quality that can identify issues before they impact business outcomes.
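As a rough illustration of how a learned baseline differs from a hard-coded rule, here is a minimal sketch that flags a day's row count as anomalous when it falls outside a band derived from recent history; the window and threshold are arbitrary assumptions, and commercial platforms use far more sophisticated models:

```python
import statistics

def is_volume_anomaly(history: list[int], todays_count: int,
                      window: int = 30, z_threshold: float = 3.0) -> bool:
    """Flag today's row count if it deviates strongly from the recent baseline."""
    recent = history[-window:]
    mean = statistics.mean(recent)
    stdev = statistics.stdev(recent) or 1.0  # avoid division by zero on flat history
    z_score = abs(todays_count - mean) / stdev
    return z_score > z_threshold

# Example: ~10k rows per day historically, then a sudden drop
history = [10_000, 10_200, 9_800, 10_100, 9_900] * 12   # ~60 days of daily row counts
print(is_volume_anomaly(history, 9_950))  # False - within normal variation
print(is_volume_anomaly(history, 4_200))  # True  - likely an upstream failure
```

Because the baseline is derived from the data itself, nobody has to sit down and decide what "normal" looks like for every table.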
However, as we'll see in the next section, where you implement observability in your data ecosystem makes all the difference.
Shifting Left with Upstream Observability
The concept of "shifting left" comes from software development, where testing and quality assurance are moved earlier in the development lifecycle.
In the data world, this means assuring quality closer to the source.
Now, everybody is talking about shifting left, but not everybody is doing it.
For example, when data observability first emerged as a category, it primarily focused on monitoring analytical outputs:
Is the dashboard showing the right numbers?
Are the reports updating correctly?
Do the machine learning model outputs make sense?
Given the world was obsessed with analytics and data science, this approach made sense—it followed the money where business and data teams felt they should invest.
And while these concerns are certainly valid, this is a reactive approach to a proactive technology; observability is about getting ahead of your data. If you aren’t doing it at ingestion, you are reacting to data problems only after they've propagated through the entire data ecosystem.
The natural evolution is to move observability upstream, closer to the source systems where data originates.
Some Data Observability tools still struggle with this as they have developed their tool with downstream analytics in mind. Newer tools like Matia have built this into their foundation, focusing on detection from the source systems in the ingestion layer.
In my experience, most companies are dealing with upstream data quality issues (e.g., Salesforce IDs not matching up, duplicate data in the Oracle database, inconsistent formats in sales data feeds), so that is where I believe the big focus needs to be when it comes to data quality.
These small issues cause major downstream problems and lead to a culture of mistrust in these source systems.

Consider a typical customer data flow:
Customer details are entered into Salesforce CRM
Data is extracted via API to a data lake
Several transformation jobs enrich the data
The data is loaded into a data warehouse
Multiple dashboards, reports, and ML models consume the data
Business decisions are made based on these outputs
If a field suddenly changes format in Salesforce (e.g., currency switches from USD to GBP or phone numbers switch from international to local format), this small issue multiplies at each step:
Extraction jobs may load the data, but miss the format change
Transformation jobs might misinterpret the new format
Join operations in the warehouse could fail
Multiple dashboards will show incorrect information
ML models might produce erroneous predictions
By the time the issue is detected in dashboards or reports, it has already contaminated dozens of datasets and processes.
Finding the root cause becomes a time-consuming investigation rather than a simple fix.
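As an illustration of catching this at step 2 rather than step 6, here is a minimal sketch of a guard that runs right after extraction and refuses to pass records downstream when a field no longer looks the way the pipeline expects; the field names and patterns are hypothetical, not Salesforce's actual schema:

```python
import re

# Hypothetical expectations about the extracted records
EXPECTED_CURRENCY = "USD"
INTL_PHONE = re.compile(r"^\+\d{7,15}$")   # international format, e.g. +442071234567

def validate_extracted(records: list[dict]) -> tuple[list[dict], list[str]]:
    """Split freshly extracted records into valid rows and format violations."""
    valid, violations = [], []
    for rec in records:
        if rec.get("currency") != EXPECTED_CURRENCY:
            violations.append(f"{rec.get('id')}: currency changed to {rec.get('currency')}")
        elif not INTL_PHONE.match(rec.get("phone", "")):
            violations.append(f"{rec.get('id')}: phone not in international format")
        else:
            valid.append(rec)
    return valid, violations

records = [
    {"id": "001", "currency": "USD", "phone": "+442071234567"},
    {"id": "002", "currency": "GBP", "phone": "+442071234568"},   # currency switched upstream
    {"id": "003", "currency": "USD", "phone": "020 7123 4569"},   # phone switched to local format
]
valid, violations = validate_extracted(records)
print(violations)  # flagged at extraction, before any transformation or join runs
```

Caught here, the fix is a conversation with whoever owns the Salesforce config; caught in a dashboard, it is a multi-team investigation.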
Again, I reference my Data Quality Conundrum Part 1 article, as root causes such as changing business processes or poor data entry often lead to larger-scale data quality issues. So, by shifting left and detecting issues at the source, upstream observability more proactively addresses quality problems.
This boils down to four main benefits I see:
Earlier Detection of Issues – The most obvious benefit is that any problems are identified before they spread throughout the pipeline. This is the essence of observability: why not identify issues earlier, rather than within the data product?
More Accurate Root Cause Analysis – The closer to the source issues are detected, the more immediately apparent the root cause is. This helps with time to resolution and gives engineers a better understanding of how to add resilience to their pipelines and data flows, ultimately improving future processes.
Prevention of Downstream Multiplication – When your data pipelines move data from a raw to a clean layer, the goal is to create a ‘single source of truth’. If errors are not detected early on, they can amplify quality issues within that single source of truth and beyond. Stopping problems at the source prevents this exponential damage.
Integration with ETL and Data Pipeline Processes – Modern observability tools and platforms can integrate directly with data pipelines, enabling automated remediation workflows that resolve common issues without requiring human intervention.
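To illustrate what such an integration could look like, here is a minimal sketch of a pipeline step that quarantines rows failing a check and alerts the owning team instead of letting bad data flow on; everything here (the function names, the quarantine table, the logging channel) is a hypothetical pattern, not any specific vendor's API:

```python
import logging

logger = logging.getLogger("pipeline.observability")

def quarantine_and_alert(batch: list[dict], check) -> list[dict]:
    """Run a quality check inside the pipeline; only clean rows continue downstream."""
    clean = [row for row in batch if check(row)]
    quarantined = [row for row in batch if not check(row)]

    if quarantined:
        # Hypothetical remediation hooks: park bad rows and notify the owning team
        write_to_quarantine_table(quarantined)
        logger.warning("Quarantined %d of %d rows; downstream load continues with clean data",
                       len(quarantined), len(batch))
    return clean

def write_to_quarantine_table(rows: list[dict]) -> None:
    """Placeholder: in practice this would land in a 'quarantine' schema for review."""
    pass

# Example usage with a hypothetical rule: customer_id must be present
batch = [{"customer_id": "001"}, {"customer_id": None}]
clean = quarantine_and_alert(batch, check=lambda r: r.get("customer_id") is not None)
```

The design choice that matters is that bad rows are parked rather than silently dropped or silently loaded, so nothing downstream is contaminated and nothing is lost.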
This upstream approach aligns perfectly with the holistic platform trend we discussed in our previous article, where unified quality platforms offer end-to-end coverage. In a data domain where proactive prevention is valued, upstream makes even more sense!
How Observability Interacts with Other Data Ecosystem Components
Like anything else in the Data Ecosystem, observability doesn't exist in isolation.
It is a foundational capability that not only supports and enhances data analytics but also requires strong fundamentals to be done correctly. So, what are some key areas to consider when choosing, implementing, and using an observability tool?

Data Modelling & Architecture – Observability provides valuable feedback on how well your data architecture is functioning in practice. A tool can help detect schema or lineage issues in specific data flows, indicating architectural weaknesses that need addressing. This feedback loop will help architects and engineers refine and improve the underlying data model and flow. Many observability platforms now offer connectors or lineage/catalogue features that enhance your ability to do this as well.
Data Engineering – For data engineers, upstream observability is transformative. Rather than waiting for Slack messages from analysts, engineers receive automated alerts the moment anomalies appear. They can check health dashboards and use lineage tracking to quickly understand the performance of their pipelines and identify any issues that have occurred. Tickets become less reactive and quicker to fix. Moreover, some platforms even assist with integrations and pipeline setup to improve quality from the outset.
Data Platforms & Warehouses – Many storage and processing layers become cluttered and difficult to use. With a well-built data model and platform strategy, observability can then enhance its performance and cost-efficiency in several ways:
Identifying unnecessary data transfers or transformations
Detecting inefficient query patterns or resource utilisation
Providing insights on data usage to inform archiving strategies
BI and Analytics – Perhaps the most apparent benefit is to analytical outputs. When upstream data is reliable, dashboards and reports can be trusted. But more importantly, data teams spend less time troubleshooting and more time on value-adding analysis. Moreover, engineers are not reliant on analysts to inform them about issues when they arise.
AI Solutions – As organisations invest heavily in AI, the "garbage in, garbage out" principle becomes even more critical. Upstream observability ensures that AI systems are built on reliable foundations. Understanding what is being monitored and tracked for data quality (with lineage or catalogue visibility) allows teams to build AI systems that are trusted, know which data they can actually leverage, and are less prone to hallucination and bias.
Looking across these five areas, it is no surprise that data quality hampers each one significantly. The proactivity data observability offers improves each of these data ecosystem domains, creating a cascade of benefits and ultimately helping deliver more reliable insights and greater trust in data outputs.
What This Means in Practice
The theoretical benefits of upstream observability are compelling, but what does it look like when applied to real-world data operations? After all, investing in data quality is challenging because quantifying the benefits can be a complex process.
Well, for most organisations, it comes down to four key statistics. We will use case studies from Matia’s customer base as examples, but generalise them so you can think about it for your own company/use case:
1. Incident Response Time
The most obvious benefit is the reduction in time required to identify and respond to engineering tickets. When you have a shortage of data engineers (who are expensive resources and always in high demand), this translates into cost savings in multiple ways. Real-time alerts and schema-change notifications are two key reasons for this. For example, the logistics technology company Solvento implemented upstream observability and immediately halved its incident response time, allowing it to identify problems before they escalated and better pinpoint failure points rather than searching through systems.
Translation for your business: Calculate how much time your team currently spends diagnosing data issues. Reducing this saves engineering resources and means less data downtime for business data products.
2. Data Pipeline Speed
Improving data pipeline efficiency and reliability is not just a technical metric; it directly impacts business decisions and the customer experience. Migrations and manual ETL development sometimes lead to data pipelines that are inefficient and constantly stall, slow down data traffic, or break, leading to delayed access to data and further business implications. Observability platforms can help companies optimise these pipes (especially when combined with testing or contracts). Using Matia, the financial services company Ramp reduced its historical data resync times from 3.5 days to just 4 hours after implementing upstream quality controls, while HoneyBook saw a 30% increase in database replication speed with zero data loss.
Translation for your business: Is the data piping through correctly and efficiently? Consider the business impact of potential delays. What decisions get postponed? What customer experiences suffer (especially if your business operates in real-time)?
3. Infrastructure Costs
Companies often set up their data platform infrastructure without thinking about efficiencies. Then, as their data grows, the issues mount up. Part of an integrated observability tool is being able to optimise data processing by identifying inefficient data flows. These often waste computing resources through unnecessary processing, storage of invalid data, and redundant error-correction workflows. By understanding and implementing quality at source upstream, organisations can eliminate these inefficiencies at their root. I referenced Solvento above, which saw a 20% reduction in warehouse costs due to fewer failures and a streamlined, improved data processing system.
Translation for your business: Review your current cloud warehouse or data processing costs. Companies often don’t review these things, but the savings from eliminating inefficient data handling or storage can offset much of your investment in new resources, tools, and initiatives.
4. Engineering Productivity
Perhaps the most valuable benefit is reclaiming engineering time for innovation rather than maintenance. Without proper quality controls, engineers become stuck in endless firefighting cycles, constantly addressing symptoms rather than causes. This represents an enormous opportunity cost for your business. Engineers should be more strategic and plan workflows and pipelines in a holistic way, but don’t have time to do so! With observability platforms helping to identify, reduce, and fix data issues, engineers can spend more time on high-value work. For example, the fintech company Obligo reclaimed 25 hours per month previously spent troubleshooting pipelines and now directs that time toward product innovation and business-critical development. Hell, some observability tools/platforms can even do Reverse ETL, which saves engineers even more time on an essential process!
Translation for your business: Count how many hours your data and engineering teams spend fixing data problems each month, then multiply by their fully-loaded cost (and engineers are expensive). This is what data quality is costing you! Get them the help they need where they need it, so you can redirect their time to value-creating activities.
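As a quick back-of-envelope example (all numbers are assumptions for illustration, not benchmarks):

```python
engineers = 4                 # data engineers on the team (assumption)
hours_firefighting = 25       # hours per engineer per month spent on data issues (assumption)
fully_loaded_rate = 120       # $/hour fully-loaded cost (assumption)

monthly_cost = engineers * hours_firefighting * fully_loaded_rate
print(f"${monthly_cost:,} per month, ${monthly_cost * 12:,} per year")  # $12,000 / $144,000
```

Swap in your own headcount, hours, and rates, and you have a first approximation of what poor data quality is costing you before any downstream business impact is counted.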
When evaluating observability solutions, look for platforms that address these requirements with minimal integration complexity. The right solution should start delivering value within weeks, not months.
That being said, this should be backed up with the right people, processes, and standards to make sure data quality is well understood and aligned upon by internal stakeholders!
Conclusion
Data will continue to grow, and quality will remain a challenge for large organisations.
Early detection represents the most proactive way to manage those issues before they lead to negative business impact. Moreover, upstream observability embodies this forward-thinking, shift-left approach, where data teams build quality awareness into the foundations of their data systems before issues are noticed downstream.
Reflect on your current systems and operations and ask yourself:
Are we catching data issues at the source, or discovering them only after they've caused damage?
Your answer may help you restructure your process and think more proactively about data quality management.
As always, thanks for reading! Subscribe, comment, and have a great Sunday!
I partnered with Matia to bring you this article. Rather than push their product, Matia wants to spread the word about Data Observability (especially upstream, which I’m a big proponent of), which is why they were such a great partner to work with! Check them out here and engage with this article to support my newsletter (and keep The Data Ecosystem paywall from ever going up)!
Thanks for the read! Comment below and share the newsletter/issue if you think it is relevant! Feel free to also follow me on LinkedIn (very active) or Medium (increasingly active). If you are interested in consulting, please don't hesitate to reach out. See you amazing folks next week!