Issue #19 – Developing an Overarching Data Technology Strategy
How to approach prioritising your technology needs
Read time: 12 minutes
Today, business relies on technology.
It is embedded throughout the organisation, and every department has its go-to core tools (ERP for supply chain or CRM for sales and marketing).
Data is no different. In fact, technology is more critical for Data than for most other departments.
However, companies don’t usually approach their data technology & tools strategically:
They buy in an ad hoc way
They don’t align purchases with goals
They stick to very basic and free software
They don’t know what technologies exist internally
They have multiple technologies doing similar things
Worse than all of this, the problem doesn't seem to be widely discussed. If you search for a data technology strategy on Google, you mostly get articles about building a Data Strategy. As a data strategist, I can tell you this is not the same thing.
So, let’s dig into how to think about and develop a data technology strategy. There are four things to consider in this exercise:
Aligning your Org/ Data Goals with Technologies
The Operational vs. Analytical Divide
Priority & Role of Each Type of Tool
Implementation & Operational Considerations
Aligning Your Org/ Data Goals with Technologies
Before you do anything else, you need to look within.
What are your organisational goals? What is the business model and strategy that helps achieve these goals? What are the data objectives that sit under that?
Starting here might seem obvious, but you'd be surprised how often companies invest in shiny new tools without considering how they fit into the bigger picture.
Instead, departments buy tools for individual, ad hoc goals, without thinking about how they mesh with the overall strategy or how they impact other departments/ domains that don't own the tool but will interact with it.
You need to approach this from four angles.
The first angle is the Data Strategy. Whether one exists or not, the organisation needs to understand its strategy and how data will enable that. I will be doing a future article on Data Strategy, but for now, let’s focus on two components:
Organisational Goals/ Direction – What are the company's overarching goals? Are you aiming for rapid revenue/ customer growth, cost reduction, or improved customer experience? Your data technology should support these objectives.
Role of Technology to Drive Goals – How does technology help us achieve those goals? This should be very high-level, literally mapping out what each technology does to help us achieve a KPI target. For example, as a logistics company we need an ERP system to (1) operate effectively, (2) understand and manage our supply chain costs, and (3) assign revenue to different customers.
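To make this concrete, here is a minimal sketch of such a mapping. The tool names, goals, and gap check are purely illustrative, not a recommended set:

```python
# Illustrative mapping of each technology to the goals/ KPIs it supports.
# Tool names and goals are made-up examples, not recommendations.
technology_goal_map = {
    "ERP": [
        "Operate the supply chain effectively",
        "Understand and manage supply chain costs",
        "Assign revenue to different customers",
    ],
    "CRM": [
        "Improve customer retention",
        "Grow revenue per customer",
    ],
    "Legacy reporting tool": [],  # maps to no goal -- a red flag
}

# A technology that supports no goal is a candidate for review or retirement.
for tech, goals in technology_goal_map.items():
    status = f"supports {len(goals)} goal(s)" if goals else "NO GOALS -- review"
    print(f"{tech}: {status}")
```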
The second angle is to conduct a Technology Audit. This is about understanding what technology exists within the organisation, what it does/ is supposed to do, and its effectiveness. I would focus on mapping out all the technologies within the data lifecycle (which I explained in detail in a previous article) and how they interact with one another. This stack is unique for every company, but I've included a helpful framework below to give you an overview of the categories.
Fill in each category of this framework with your existing technology stack and rate how each tool performs. This will help identify any gaps.
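As an illustration, here is one way to capture the audit so that gaps, weak spots, and overlaps fall out automatically. The categories, tool names, and ratings are hypothetical:

```python
from dataclasses import dataclass

# One audit entry per tool in the landscape. All values are hypothetical.
@dataclass
class AuditEntry:
    category: str  # lifecycle category from the framework
    tool: str      # tool name
    purpose: str   # what it does / is supposed to do
    rating: int    # effectiveness, 1 (poor) to 5 (excellent)

audit = [
    AuditEntry("Data Storage", "WarehouseX", "Central analytical store", 4),
    AuditEntry("Data Storage", "LegacyDB", "Old reporting database", 1),
    AuditEntry("Orchestration", "SchedulerY", "Pipeline scheduling", 2),
]

categories = ["Data Storage", "Processing & Transformation",
              "Orchestration", "Analytics & Visualisation"]

# Gaps: categories with no tool. Weak spots: only low-rated tooling.
# Overlaps: multiple tools doing similar things.
for cat in categories:
    tools = [e for e in audit if e.category == cat]
    if not tools:
        print(f"GAP: no tooling for {cat}")
    elif max(e.rating for e in tools) <= 2:
        print(f"WEAK: {cat} covered only by low-rated tooling")
    elif len(tools) > 1:
        print(f"OVERLAP: {len(tools)} tools in {cat} -- check for duplication")
```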
The third angle is to align your Data Strategy with the Technology Audit. At this point, you've already done the hard work. Next is thinking critically about how each technology/ tool delivers against your data strategy goals and initiatives. For example, if one of the data strategy goals is to increase trust and transparency, some questions you might ask are:
Where do people get their data from? Do they trust it?
Do we have tooling to understand what data is used or track the transformations made?
Are people referencing the data lineage or cataloguing?
The fourth angle is to prioritise your investment and roadmap your Technology Strategy. Usually, technology is purchased to solve one-off requirements that different departments have. Instead of that individualistic thinking, you need a holistic and logical approach to execution: what do you need, when do you need it, and how does it fit within your operations/ existing technology? For example, buying a Data Catalogue or Observability tooling is useless if your Data Storage/ Processing is not figured out. I've seen a company implement a Catalogue before they had migrated all their data into their platform, meaning they were cataloguing very small amounts of data and, predictably, nobody was using the tool.
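One way to enforce that logic is to model each tool's prerequisites explicitly and derive a valid investment order from them. The sketch below uses Python's standard-library graphlib; the dependencies themselves are illustrative assumptions:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical prerequisites: each tool maps to what must exist first.
dependencies = {
    "Data Storage": set(),
    "Processing & Transformation": {"Data Storage"},
    "BI Tooling": {"Processing & Transformation"},
    "Data Catalogue": {"Data Storage", "Processing & Transformation"},
    "Observability": {"Processing & Transformation"},
}

# A valid roadmap never funds a tool before its prerequisites are in place.
roadmap = list(TopologicalSorter(dependencies).static_order())
print(roadmap)
# e.g. ['Data Storage', 'Processing & Transformation', 'BI Tooling',
#       'Data Catalogue', 'Observability']
```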
This approach lets you understand each technology's criticality in business operations and analytical insights. Aligning your investment with that criticality saves you money and helps you get buy-in from your stakeholders—they will see the common sense in this approach rather than rolling their eyes from the sidelines as leadership makes another technology mistake.
The Operational vs. Analytical Divide
The next consideration is whether the technology deals with operational or analytical data.
One of the most common mistakes I see in data technology strategies is failing to distinguish between operational and analytical data needs. These two areas have different requirements, use cases, and often, different user bases.
Before we jump in, here is a quick definition for each type of data (a small sketch of the contrast follows the list):
Operational Data – Data produced by day-to-day operations (e.g., transactions)
Analytical Data – Data that is aggregated and curated to be analysed for business intelligence or fed into ML/ AI models
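Here is that small sketch, using made-up records: operational systems emit individual transactions, while the analytical side aggregates and curates them for analysis:

```python
# Operational data: individual transactions from day-to-day systems.
transactions = [
    {"order_id": 1, "customer": "A", "amount": 120.0},
    {"order_id": 2, "customer": "B", "amount": 80.0},
    {"order_id": 3, "customer": "A", "amount": 40.0},
]

# Analytical data: the same records aggregated for business intelligence.
revenue_by_customer: dict = {}
for t in transactions:
    revenue_by_customer[t["customer"]] = (
        revenue_by_customer.get(t["customer"], 0.0) + t["amount"]
    )

print(revenue_by_customer)  # {'A': 160.0, 'B': 80.0}
```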
I plan on doing a more thorough article on approaching both operational and analytical data, because poorly defined thinking around these two concepts lands many companies in a mess.
Let’s focus on the technology side of these two data concepts.
First, Operational Data Technologies. These concern the day-to-day business operations and transactional tooling that keep the lights on. These tools are essential to enabling and streamlining business operations, like managing supply chain logistics (ERP systems) or customer interactions and sales (CRM tools). They operate in real time or near-real time, constantly updating. Given they are built to conduct operational tasks, the data these systems produce is often very rule-driven and hardcoded, making it hard to edit or change. For this reason, these tools can create many data quality problems down the line if they are not set up to predefined standards.
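A minimal sketch of what "set up to predefined standards" can mean in practice: validating records at the source, before they flow downstream. The fields and rules here are illustrative assumptions:

```python
# Illustrative source-side validation against predefined standards.
REQUIRED_FIELDS = {"order_id", "customer", "amount"}

def validate_order(record: dict) -> list[str]:
    """Return the list of standard violations for one operational record."""
    errors = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if record.get("amount", 0) < 0:
        errors.append("amount must be non-negative")
    return errors

print(validate_order({"order_id": 1, "amount": -5.0}))
# ["missing fields: ['customer']", 'amount must be non-negative']
```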
Second, Analytical Data Technologies. These technologies comprise the nebulous Data Platform term, used to analyse and derive insights from internal business data (often generated by the operational tools mentioned above) and external sources. The analytical technology stack includes warehouses, data lakes, BI tools, catalogues, data quality tooling, etc. This large suite of technologies also makes the landscape a lot more confusing. The rationale for certain technical architectures varies by company type, data maturity, and, more than anything, leadership preference. Despite the differences, all analytical tools have the same purpose: to drive value for the business through better decision-making. But they differ in where and how they do that.
Distinguishing between the two is important. As data people, we often focus on analytical tools, and that focus results in analytical tooling getting all the hype, especially now that these tools enable ML and AI. This is a misguided view, as both need to work together.
As mentioned, operational data keeps the business running, but it is often the source of analytical data. When discussing poor data quality and complex pipelines, the root cause is often the operational systems and tools that feed into the analytical platform. Ultimately, the operational tools help stand up and enable the analytical insights that have become so important for data leaders and practitioners. The organisation is doomed to fail unless this is factored into the enterprise architecture design or the data modelling approach. I bring this up because evaluating how each technology impacts your operational and analytical data operations needs to be a crucial step in any technology strategy.
Priority & Role of Each Type of Tool
The next consideration is the priority and role of each tool.
Look at the image below. This is Matt Turck's MAD (Machine Learning, AI & Data) Landscape. It is honestly mad how complex and confusing it is.
Look at all the tooling categories and subcategories (if you can even read them). And this covers primarily analytical data technologies only!
Not all data technologies are created equal, and not all should play the same role in your organisation. Therefore, I want to break these into four categories by level of priority:
The Foundational Data Stack
Primary Scalable Add-Ons
Nice to Haves to Improve Quality
Potential Future Tooling
Note that I also have the underpinning technologies and tools in the graphic. I will not go into these in detail here, but they are necessary and exist in any technology stack (e.g., Python or R to do analytics, ERP or CRM to operate the business, external data to provide additional context, etc.).
Also note that while I introduce each of these technologies here, I won’t go into much detail. Next week, I will cover the Data Technology Landscape, which will explain why each technology is important and provide popular examples/ brands.
The Foundational Data Stack – This base layer is the backbone of the data platform, providing the necessary infrastructure for data storage, processing, and visualisation. It ensures data is collected, processed, and made available for analysis efficiently and securely. A toy sketch of these layers working together follows the list below.
Data Storage: Storage layer for both raw and cleaned data; tooling may overlap with processing/transformation and serving
Data Processing & Transformation: Automation technology that integrates, cleans, and curates data as per the business needs/ use cases
Orchestration: Tools to automate the data flow, managing and scheduling workflows to ensure tasks are executed in the correct sequence and there is automated interoperability
ETL Tooling: Tooling that extracts, transforms, and loads data from various sources into data warehouses
Analytics, Consumption & Visualisation Tools: Interfaces for analysts or business users to analyse and visualise data, providing insights through dashboards and reports
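Here is the toy sketch promised above: one end-to-end pass through the foundational layers, using only Python's standard library so it stays self-contained. Every name and value is illustrative; a real stack would use dedicated tools at each layer:

```python
import sqlite3

# Extract: raw records from a hypothetical source system.
raw = [("A", "120"), ("B", "80"), ("A", "40")]

# Transform: clean and type the data per business rules.
cleaned = [(customer, float(amount)) for customer, amount in raw]

# Load/ Storage: persist to the storage layer (in-memory SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", cleaned)

# Analytics & Consumption: serve an aggregated view to end users.
for row in conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
):
    print(row)  # ('A', 160.0) then ('B', 80.0)
```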
Primary Scalable Add-Ons – With the foundational layer sorted, companies will enhance their technology stack. Depending on your organisation's needs, these tools can be essential, enabling scalability for growing data requirements (e.g., managing customer interactions, ensuring security, performing machine learning). A small integration sketch follows the list below.
Cybersecurity & Privacy Tooling: A whole host of less popular tools that underpin a data platform to protect data from breaches, prevent unwarranted access and ensure compliance with data privacy regulations
Customer Data Platform (CDP): Integrates and unifies customer data from various sources, allowing for more personalisation and segmentation
Data Science Modelling: The tools and programming languages that allow data scientists to build and deploy machine learning models
API Services/ Integration Tools: Tooling to facilitate interoperability between different systems and ingesting varying types of data to a certain standard, bringing together disparate sources and systems
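And the integration sketch mentioned above: normalising payloads from two hypothetical source systems into one standard record shape, which is the core job of API/ integration tooling:

```python
# Hypothetical adapters: each source system has its own payload shape,
# but everything is normalised to one standard record before ingestion.
def from_crm(payload: dict) -> dict:
    return {"customer_id": payload["id"], "email": payload["contactEmail"]}

def from_erp(payload: dict) -> dict:
    return {"customer_id": payload["cust_no"], "email": payload["email_addr"]}

ADAPTERS = {"crm": from_crm, "erp": from_erp}

def ingest(source: str, payload: dict) -> dict:
    """Route a payload through the right adapter to the standard shape."""
    return ADAPTERS[source](payload)

print(ingest("crm", {"id": 42, "contactEmail": "a@example.com"}))
print(ingest("erp", {"cust_no": 42, "email_addr": "a@example.com"}))
# Both print: {'customer_id': 42, 'email': 'a@example.com'}
```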
Nice to Haves to Improve Quality – Data quality tooling has become increasingly popular. I list these as nice to haves because they should never be the initial investment. Before you buy and implement these technologies, you should have a working storage and processing layer accompanied by a Data Governance, Management or Quality team. Why? Because these tools require a lot of process and change management to use effectively. A sketch of the kind of checks this tooling automates follows the list below.
Data Observability/Quality Tooling: Monitors data pipelines and workflows to ensure data quality and integrity
MDM (Master Data Management) Tooling: Ensures the consistency and accuracy of key business data assets across different systems
Data Catalogue & Lineage: Provides a repository of data assets and tracks data flows/ transformations, enhancing data accessibility, governance and management
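Here is the promised sketch: two pipeline-level checks of the kind observability tooling automates, completeness (null rate) and freshness (time since last load). The data and thresholds are illustrative, not recommendations:

```python
from datetime import datetime, timedelta, timezone

# Illustrative table state: two rows, last loaded 30 hours ago.
rows = [
    {"customer": "A", "amount": 120.0},
    {"customer": None, "amount": 80.0},
]
last_loaded = datetime.now(timezone.utc) - timedelta(hours=30)

# Completeness check: share of rows with a missing key field.
null_rate = sum(r["customer"] is None for r in rows) / len(rows)
if null_rate > 0.05:
    print(f"ALERT: customer null rate {null_rate:.0%} exceeds 5% threshold")

# Freshness check: has the table been refreshed recently enough?
if datetime.now(timezone.utc) - last_loaded > timedelta(hours=24):
    print("ALERT: table has not been refreshed in over 24 hours")
```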
Potential Future Tooling – These tools are not widespread. Their marketing may make you think they are, but few companies use them, because they are truly nice-to-haves that only pay off once you have figured everything else out. That said, they can help facilitate and improve data literacy, driving usage and innovation if business stakeholders have enough time to learn how to use them. A tiny knowledge-graph sketch follows the list below.
Knowledge Graphs: Represent and manage complex relationships within data, enabling advanced analytics and insights
Semantic Layers: Provide a unified view of data by abstracting the underlying data sources, facilitating easier access and analysis
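As a flavour of the first of these, here is the tiny knowledge-graph sketch: relationships stored as subject-predicate-object triples, queryable from either direction. The entities and predicates are made up for illustration:

```python
# A minimal knowledge graph: subject-predicate-object triples.
triples = [
    ("Customer A", "placed", "Order 1"),
    ("Order 1", "contains", "Product X"),
    ("Product X", "supplied_by", "Supplier Z"),
]

def related(entity: str) -> list[tuple[str, str, str]]:
    """Return every triple in which the entity appears."""
    return [t for t in triples if entity in (t[0], t[2])]

print(related("Order 1"))
# [('Customer A', 'placed', 'Order 1'), ('Order 1', 'contains', 'Product X')]
```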
Next week, I will discuss each of these technologies in more detail, explaining why they are important to invest in and providing examples of the biggest players in the space.
Implementation & Operational Considerations
Having a data technology strategy on paper is one thing; implementing and operating it successfully is another.
Hence, the final section of your technology strategy considers the execution.
Phase Implementation Based on Identified Needs & Priority – Using the maturity assessment from your technology audit and the hierarchy of tools above, sequence your investments by need and priority
Vendor Selection & Investment – Do your due diligence on which vendor you pick based on your use cases, maturity and needs. Sometimes the most popular players will be extremely expensive, especially for a small company, and you could save a lot of money looking elsewhere. Also inquire about getting investment from the vendors: a lot of SaaS vendors charge based on usage, so they will often put some money upfront to get you on board and set up
Tooling Interoperability – Some tools work well with others. Some don't. Figure this out before implementation. Don't just listen to the salespeople: actually architect it out, talk to others using those tools, and evaluate what level of interoperability each technology has and how it stacks up with your current data landscape
Build with Change in Mind – Ben Rogojan nails this one. The only real constant is change, so design your technology strategy and landscape to be adaptable. This means considering best practices, design principles, and data modelling when operationalising any new technology. Think about both the short-term and long-term
Business Stakeholder Change Management – Speaking of the long-term, technology fails because companies don’t consider the long-term transformation new tools will necessitate. New tech means a new way of working. You need to develop a change management plan with the relevant business and data stakeholders that considers training, communication, processes, technical support, and tooling champions. It is a lot of work, but it's worth it!
External Support – I’m a consultant, so I must say this. However, implementation and change management are a lot easier with third-party help. Make sure they approach it strategically, facilitate the change management/ buy-in, have experience with the tooling, and act as partners. The big consultancies often have the expertise with every tool, but if you want higher quality (and lower cost) I recommend searching for a freelance SME or boutique consultancy with a good understanding of the tech
Feedback Loops – You can’t operate without continuous evaluation of the technology, its performance, and whether it is hitting its marks. Especially with SaaS tools, it is up to the company to ensure the tech is delivering as required. Conduct workshops, retros, and interviews with key stakeholders throughout the implementation and after as well
Despite what many leaders think, buying new technology is not a panacea for all data-related challenges. Moreover, many tech salespeople are snake charmers, adept at convincing you that you need something you don't.
So developing a strategy is crucial: consider the short and long term with the entire architecture in mind. It may seem like a lot of upfront work, but it will create clarity and efficiency in the long run.
Next week, we will stay with the tech topic and try to map out and define all the tools within your data landscape (note that this doesn’t include all the tools in the data ecosystem, but includes most of them). How should you structure the tooling architecture, what fits in it, and what brands exist? I look forward to sharing that with you next week, and thanks for reading!
Thanks for the read! Comment below and share the newsletter/ issue if you think it is relevant! Feel free to also follow me on LinkedIn (very active) or Medium (not so active). See you amazing folks next week!