Issue #25 – The Enabling Role of Data Architecture
Architecture can be the key to delivering an enterprise view of your data operations
Read time: 15 minutes
Introduction
The use, value, and hype around data are all accelerating faster than companies can keep up.
But every data domain (e.g., engineering, data science, analytics, management, etc.) is moving in different directions, with different mandates, resulting in increased complexity and confusion.
And that’s before you layer on the operational nuances inherent in each organisation…
Is there an answer to this?
Well, there is no silver bullet, but there are ways we can do things better.
While the data ecosystem is complex, crucial components help explain it and the relationships between domains, making sense of what seems incomprehensible.
One of the best answers to this problem within data specifically is right in front of us! Data modelling and architecture are two approaches that provide a foundation for solving the most pressing issues data teams are dealing with. The problem is that data teams aren’t giving them the attention they deserve and are still focusing on short-term issues rather than solving problems for the long term.
So, today, let’s explore the topic of data architecture and understand how it can help align your data strategy, culture, and operations!
What is Data Architecture? Why is it important?
Everybody has probably heard of Data Architecture, but what the term actually means is not always clear.
Many organisations and technical individuals treat Data Architecture as a blueprint of all their data technologies mapped together—essentially their data stack. While this is a component, it doesn’t encompass the whole picture. Instead, let’s turn to the TOGAF (The Open Group Architecture Framework) and DAMA DMBoK (Data Management Body of Knowledge) definitions:
TOGAF – A description of the structure and interaction of the enterprise's major types and sources of data, logical data assets, physical data assets, and data management resources.
DAMA-DMBoK (pg. 47) – Defines the blueprint for managing data assets by aligning with organisational strategy to establish strategic data requirements and designs to meet these requirements.
Combining those two definitions gives a good view of what the domain provides for an organisation:
Data Architecture describes an organisation's data assets' structure, interaction, and management to support the overall business objectives and strategy.
So why is Data Architecture important? And what is its role in enabling your data teams with the foundational truth of what matters?
Consistent Enterprise View of Data – Creates a unified understanding of what data assets exist and how they flow across the organisation, ensuring all teams work from the same information baseline. This reduces silos, contradictory definitions, and ambiguity within data knowledge and usage
Links the Tech and Data to Business Processes – Sets a blueprint for how data assets and systems align with the business model and the operational workflows that keep the organisation running
Simplifies Integration Management – Most organisations don’t understand their data and technology integrations. An adequately defined architecture reduces the complexity inherent in most data technology designs by mapping out how systems integrate to maintain interoperability and provide transparency into the structure
Establishes Framework for Foundational Data Domains – Defines the core data categories, their relationships and how they are managed, providing a base layer for essential domains like solutions architecture, data engineering, security/ access management, lineage, data management and data governance. Without Data Architecture, these domains wouldn’t “talk to each other” in the same language
Standardises Downstream Requirements Gathering – Gives analysts and scientists a common language and framework to understand how the upstream data is structured, where it flows from and how it maps to the business strategy/ needs. This creates the foundational trust in the data and the business linkage necessary to build analytical solutions
Keeps AI Tools Informed – One of the biggest problems with AI is poor-quality data, which contributes to hallucination and poor outputs. Providing a standard blueprint for the relevant data and information within the business helps AI tools be more accurate and effective
It is worth noting that Data Architecture flows directly from Enterprise Architecture (EA). EA takes a slightly higher-level view, looking at how the business works in terms of its people, technology, processes and data. Data Architecture is one component of EA, sitting next to Business, Technical and Application Architecture, each of which digs into the details of its own area within the organisational strategy. A common misconception is that Data Architecture is a technology thing; in fact, it centres on the business and the information associated with it.
Where Will Data Architecture Help?
Genuine Data Architecture requires higher-level, strategic thinking, which (unfortunately) rarely has a place in today’s data industry.
Instead, there is a constant focus on short-term or flashy work (e.g., engineering tickets, building another dashboard, trying AI, etc.).
Data architects often find themselves managing the platform and infrastructure. While this is important, it deprioritises the strategic side of architecture/ modelling: structuring data workflows and integrations in a scalable way that aligns with the business processes. Moreover, the primary audience of architects is now data teams, when architects should actually be bridging the gap between the business and data teams.
I therefore want to outline the six main use cases of Data Architecture that deliver the benefits listed above:
1) Integration and Interoperability Optimisation
Systems should never exist in a silo; they need to communicate and work with one another. Unfortunately, this is often not the case. Integration and interoperability are constant short-term ticketing activities for engineers due to companies not appreciating the larger picture of linking their technologies. The architectural design and glossary help with the integration rules and limitations for different technologies and tools. It also identifies gaps for the dev/ data team to address or for vendors to build into their subsequent upgrades. Interoperability is, therefore, designed from an enterprise-wide technological perspective.
2) Data Technology and Application Consolidation
I’ve seen a lot of spaghetti-like diagrams of technical architecture. Often, only one person can decipher them (and then they leave, resulting in chaos). By mapping out what each technology does, what data it creates, and how it connects with other systems or data sources, organisations can develop a proper strategy for what tooling they need and how it fits together. Thought-out technical architecture diagrams and data models provide that; teams can build their tech stack strategically, reducing the clutter of apps and tools. So, instead of just buying new tech or creating new applications, working with architects to design an optimal system is imperative.
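To make the mapping exercise above a little more concrete, here is a minimal sketch in Python. Everything in it is a made-up assumption for illustration: the system names, the `produces` and `feeds` fields, and the two helper functions are hypothetical and not taken from any particular tool. The idea is simply that once you record what each technology creates and what it connects to, overlaps and orphaned systems become easy to spot.

```python
from collections import defaultdict

# Hypothetical inventory: each system lists the data assets it creates
# and the systems it sends data to. All names are illustrative only.
SYSTEMS = {
    "crm": {"produces": {"customer", "lead"}, "feeds": {"warehouse"}},
    "erp": {"produces": {"order", "inventory"}, "feeds": {"warehouse"}},
    "marketing_tool": {"produces": {"lead", "campaign"}, "feeds": {"warehouse"}},
    "legacy_reporting_db": {"produces": set(), "feeds": set()},
    "warehouse": {"produces": set(), "feeds": {"dashboards"}},
    "dashboards": {"produces": set(), "feeds": set()},
}


def overlapping_producers(systems):
    """Return data assets that more than one system creates."""
    producers = defaultdict(list)
    for name, spec in systems.items():
        for asset in spec["produces"]:
            producers[asset].append(name)
    return {
        asset: sorted(names)
        for asset, names in producers.items()
        if len(names) > 1
    }


def isolated_systems(systems):
    """Return systems that neither send data to nor receive data from any other system."""
    fed = {target for spec in systems.values() for target in spec["feeds"]}
    return sorted(
        name
        for name, spec in systems.items()
        if not spec["feeds"] and name not in fed
    )


overlaps = overlapping_producers(SYSTEMS)
isolated = isolated_systems(SYSTEMS)
print("Assets created in more than one place:", overlaps)
print("Systems connected to nothing:", isolated)
```

In this toy inventory, two tools both create lead data and one database is connected to nothing, which are the kind of findings that would start a consolidation conversation rather than settle it.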
3) Data Flows and Asset Design
What data is there? Where does it come from? Forget lineage or catalogues; the first step is architecting the data assets and how they flow into the analytical solutions that deliver front-end value. Based on the business model and strategy, architects should spend time once a quarter designing how data aligns with and helps answer those business questions. This helps prioritise the development of structured data assets that reflect the areas of information the business needs data for (e.g., customer, sales, stores, inventory, etc.). These assets provide consistent definitions that business stakeholders understand. It also helps with merging/ combining separate data sources, which is very relevant for larger organisations or those facing mergers or acquisitions. Finally, architects can map the data flows and relationships between data and different operational applications, data stores/ databases, business roles and ownership (CRUD diagrams), and network segments (DAMA-DMBOK pg. 109-10). This type of data flow design is usually built in a matrix or flow diagram, providing an overview of what data each process creates (matrix) or how data flows between systems (flow diagram).
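As a rough illustration of the matrix view described above, the sketch below builds a small CRUD matrix in Python. The processes, entities and helper functions are hypothetical assumptions, not something prescribed by the DAMA-DMBOK; each cell records which of Create, Read, Update and Delete a business process performs on a data entity.

```python
# Hypothetical CRUD mapping: process -> entity -> operations performed.
# C = create, R = read, U = update, D = delete. All names are illustrative.
CRUD = {
    "order_entry": {"customer": "R", "sales": "C", "inventory": "RU"},
    "customer_onboarding": {"customer": "CRU"},
    "stock_replenishment": {"inventory": "CRU", "stores": "R"},
    "sales_reporting": {"sales": "R", "customer": "R", "stores": "R"},
}


def all_entities(crud):
    """Collect every entity mentioned by any process, in a stable order."""
    return sorted({entity for ops in crud.values() for entity in ops})


def render_matrix(crud):
    """Render the mapping as a plain-text matrix (processes x entities)."""
    cols = all_entities(crud)
    width = max(len(name) for name in list(crud) + cols + ["process"]) + 2
    header = "process".ljust(width) + "".join(c.ljust(width) for c in cols)
    rows = [
        proc.ljust(width) + "".join(ops.get(c, "-").ljust(width) for c in cols)
        for proc, ops in crud.items()
    ]
    return "\n".join([header] + rows)


def entities_without_creator(crud):
    """Entities that are used somewhere but that no mapped process creates."""
    created = {e for ops in crud.values() for e, op in ops.items() if "C" in op}
    return [e for e in all_entities(crud) if e not in created]


print(render_matrix(CRUD))
print("No creating process mapped for:", entities_without_creator(CRUD))
```

An entity that is read but never created by any mapped process is a useful prompt: either a source system is missing from the picture, or nobody has claimed ownership of that data yet.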
4) Referential Enterprise Data Model
Directly related to the above, world-class Data Architecture teams will produce an Enterprise Data Model that can be used as a single reference point for data entities, attributes and their relationships across the enterprise (DAMA-DMBOK pg. 106-7). Designing the conceptual, logical, and physical data models (at an application or project-specific level) that link the data architecture directly to the business processes gives business stakeholders a reference point for how data helps them make decisions. The EDM illustrates how an organisation delivers value (business & conceptual model) and feeds into the logical and physical models that detail the attributes, entities and relationships (often depicted in Entity-Relationship Diagrams). It also provides a high-level abstraction of the organisation, providing a valuable lens for prioritising critical data areas. The standardised components in this model allow for the comparison of data and the design of new assets/ flows. This enables change planning and improves interoperability. A good understanding of this can reduce time and risk in making changes to the business and align what engineers and developers do with pipelines and applications with the company’s goals.
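For readers who think better in code, here is a small, hedged sketch of what the entities, attributes and relationships in a logical model might look like. The `Entity` and `Relationship` classes, the example entities and the cardinality notation are all assumptions made for illustration; real modelling tools carry far more detail (keys, data types, definitions, owners).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entity:
    """A business entity and its attributes (hypothetical, simplified)."""
    name: str
    attributes: Tuple[str, ...]


@dataclass(frozen=True)
class Relationship:
    """A directed relationship between two entities, e.g. Customer 1:N Sale."""
    source: str
    target: str
    cardinality: str


# Illustrative model only; these entities are not taken from any real organisation.
ENTITIES = [
    Entity("Customer", ("customer_id", "name", "segment")),
    Entity("Store", ("store_id", "region")),
    Entity("Sale", ("sale_id", "customer_id", "store_id", "amount")),
]
RELATIONSHIPS = [
    Relationship("Customer", "Sale", "1:N"),
    Relationship("Store", "Sale", "1:N"),
]


def dangling_relationships(
    entities: List[Entity], relationships: List[Relationship]
) -> List[Relationship]:
    """Return relationships that point at an entity missing from the model."""
    known = {e.name for e in entities}
    return [
        r for r in relationships
        if r.source not in known or r.target not in known
    ]


def describe(relationships: List[Relationship]) -> List[str]:
    """Render relationships as one-line, ER-style text descriptions."""
    return [f"{r.source} --{r.cardinality}--> {r.target}" for r in relationships]


for line in describe(RELATIONSHIPS):
    print(line)
print("Dangling relationships:", dangling_relationships(ENTITIES, RELATIONSHIPS))
```

Even a check this simple shows why a single reference model helps with change planning: a new asset that references an entity nobody has defined is flagged before it turns into a pipeline.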
5) Business Transparency & Technical Communication
Data people complain about business people not understanding what they do and the value they provide. A big reason for this is that documentation is tech- and data-focused, with language that doesn’t bridge the gap or translate well. Enter conceptual data models and well-designed technical architectures (both mentioned above). They say a picture is worth a thousand words, and in data, these models/ diagrams can be the silver bullet to getting your business team on board. A Conceptual Data Model should be built in conjunction with the business, and they should see their tasks/ activities in the output of the design, giving them confidence that data will solve their questions. A Technical Architecture should be a simplified view of their operational tools (e.g., ERP, CRM, CDP systems) and how they connect with front-end tools they query (e.g., dashboards, reporting applications, etc.). Both these documents provide transparency into the black box that is “Data” and help them understand the role data and tech play in delivering value for the business.
6) Demonstrate Value to Senior Leadership
Speaking of…value is the holy grail of what data teams are trying to demonstrate. But this is tough, especially for backend devs or data engineers. Data Architecture can help. As mentioned before, the Enterprise or Conceptual Data Model can demonstrate how data links up to the strategy and enables business processes and KPIs. Building on this, a well-crafted data flow diagram can then identify (at either a high- or low-level) which data assets are used most often and who they are managed by, ascribing value to the backend work of creating and maintaining pipelines or data quality. Within the detail, the Logical and Physical Data models show what data attributes and entities feed into the front-end applications, with metrics to track speed and quality of this data. Finally, a technical architecture helps illustrate how different operational and analytical systems feed into certain decisions, creating associations that otherwise get lost in the black boxes of the IT or Data team.
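One way to picture the "which assets are used most, and who manages them" idea is the sketch below. It assumes you can list data flows as simple pairs of asset and consuming application; the flows, owner names and the `rank_assets` helper are hypothetical and only there to show the shape of the analysis.

```python
from collections import Counter

# Hypothetical data flows: (data asset, front-end application that consumes it).
FLOWS = [
    ("customer", "churn_dashboard"),
    ("customer", "sales_report"),
    ("sales", "sales_report"),
    ("sales", "forecast_model"),
    ("customer", "forecast_model"),
    ("inventory", "forecast_model"),
]

# Hypothetical ownership register: data asset -> managing team.
OWNERS = {
    "customer": "crm_team",
    "sales": "finance_data_team",
    "inventory": "supply_chain_team",
}


def rank_assets(flows, owners):
    """Rank assets by number of distinct consumers, attaching the owning team."""
    # set() removes duplicate flow records so each consumer counts once per asset.
    consumers = Counter(asset for asset, _ in set(flows))
    ordered = sorted(consumers.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (asset, count, owners.get(asset, "unassigned"))
        for asset, count in ordered
    ]


ranking = rank_assets(FLOWS, OWNERS)
for asset, count, owner in ranking:
    print(f"{asset}: used by {count} application(s), managed by {owner}")
```

A ranking like this is only a proxy for value, but it gives backend teams something tangible to point at when leadership asks what the pipelines are for.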
These six use cases demonstrate the potential influence of Data Architecture, especially for organisations that can’t present the value delivered by data, have overworked engineers, or don’t understand how their technology/ data maps out.
And let’s be honest, this describes most organisations out there…
Operationalising Data Architecture
So, what does all this mean in practice? That depends on your organisation's size, complexity, and willingness to change.
Let’s use an analogy to define how this may play out.
Say you are building a house. It is a collection of rooms that people live in, with each room having a function that serves a need. Anyone can build a house, but the quality would vary significantly. This is why most people pay an architect to design a home: to ensure each room is crafted to its purpose, the foundations are sound, and the work is of high quality so you don’t need to fix it constantly.
Now let’s say you are building a city. A city is a lot more complex than a simple house. Instead of just the building components, you must understand the interactions between public transportation, roads, people, taxes, office towers, etc. You also have multiple business targets, departments, and tools to help make sure things run efficiently. City architects work within this, helping shape the city’s development. They interact with stakeholders to optimise the design of public spaces/ buildings, set standards to reduce maintenance costs or manage implementation projects to ensure they are done correctly.
Moving back to data, you can compare the house analogy to a small company or business domain and the city to a larger enterprise. Architecture is crucial to getting the data foundations right in any organisation. Even companies with a Data Mesh or a Modern Data Stack require planning to design data assets and flows to ensure they align with business needs and processes.
Organisations can ignore it and face constant structural challenges solved with short-term fixes and junior engineering hires. Or they can plan and design to avoid these challenges, incurring the upfront costs to design an optimal system for the long term (and updating it regularly). Overall, putting in the effort upfront (and managing it iteratively) creates data and tech consistency, reuse and standardisation that improve maintenance, interoperability, and adaptability to change (just like when architecting a house or city).
To do this correctly means approaching it in the right way. For this, I’ve listed out five crucial success factors for Data Architecture that span resourcing, processes and overall data philosophy:
Hire the Right Resources – First, hire the right resource, specifically a data architect who can create strategic architectural diagrams and map systems together. Many architects I’ve seen in organisations don’t know how to do this and can only model lower-level details, ignoring the business processes. Or this job is done by a data engineer with multiple other roles: building pipelines, maintaining systems, designing schemas, etc. Architects are invaluable for a high-performing data platform, so hire one or two depending on needs (with a clear view of why).
Use Technology to Automate/ Standardise – Another necessary resource is a data modelling tool (like ER/Studio) that helps define the company’s information in business language and uses this knowledge to design and document the organisation's data assets. This type of tool fits within a platform's data management, quality, and governance bucket, reducing the inevitable human error when mapping data flows. It also connects with data governance, lineage, or catalogue tools, creating interoperability across data management. Data architecture and modelling technology allow organisations to manage information and data assets like the city planner, incorporating standard patterns with the tool and automating complex tasks.
Centralise the Architectural Function – Many companies have aspects of decentralisation, where Data is a Product and domains own the operationalisation of data. While it has merits from an analytical perspective, this can lead to siloed thinking (which I’ve seen in many of these structures). A decentralisation strategy needs to be supported by a centralised Data Architecture team. They should support the data and technology design process from the enterprise view, ensuring that each domain clearly defines its data and how it enables overall company goals. The central architects will work with domain teams to produce models of their “foundational data products”, mapping out which domain owns what and which are shared with other domains. This creates a shared understanding of data’s purpose and mitigates the main challenges of a decentralised structure.
Ensure Business Teams are Engaged – Architects have become too data- and technology-centric; their audience needs a broader reach. Data Architecture should bridge the gap between the business strategy/ plans and execution/ implementation. Instead, most architects focus on the latter, not engaging the business stakeholders. To be successful, architects need to understand the needs, requirements and strategies of the overall enterprise and/ or business domains. Continued workshops/ touchpoints with these stakeholders also create a sense of buy-in and shared purpose, underpinning the change management necessary to realise the benefits of a data transformation program.
Don’t Be Dogmatic About Approach – Your approach may be top-down, bottom-up, Kimball, Data Vault or something else. Remember that an Enterprise Data Model is not a static outcome but a design of how the business works, containing the main actors, data points, and relationships between those elements. Don’t be stuck with one approach; take a mixed modelling approach. Combine different architectural designs and techniques as long as they focus on the most critical parts of the business. This unlocks where the most value resides and mitigates the largest risks, further connecting the organisation’s strategic direction to tangible data activities, assets, and workflows.
Foundations For Success
We’ve gone through the benefits, the use cases and how to operationalise Data Architecture.
But despite knowing these things, companies still don’t invest in architecture. They think:
It will slow us down
It won’t deliver value for our end users
A considerable amount of architectural work must be done to realise results
My take?
Initially, things will take a little more time. It may also cost a bit more and require other resources.
But if you want high-quality systems and products, you need to do it! Good planning and architecture improve internal quality, make things faster in the future, and streamline data management. Not to mention your business stakeholders get more value from it.
And if you don’t want to build a full-blown Enterprise Data Model, that’s fine. Just map out a few business processes or data flows and see the benefits that come from it.
I do not like doing things half-assed; if you want to succeed with your data team, you must ensure the fundamentals are there. And to me, Data Architecture is one of those crucial components that too many teams forget to invest in.
So be better, invest in Data Architecture and ground your data foundations for long-term success!
Thanks for the read! Comment below and share the newsletter/ issue if you think it is relevant! Feel free to also follow me on LinkedIn (very active) or Medium (increasingly active). See you amazing folks next week!
I partnered with ER/Studio to bring you this article. Rather than push their technology, ER/Studio wants organisations to understand the benefits of Data Architecture and Modelling, which is why they were such a great partner to work with! Check them out here and engage with this article to support my newsletter (and keep The Data Ecosystem paywall from ever going up).