Issue #36 – Real Life Advice on Building a Data Platform
Five crucial tips (from lived experiences) on how to ensure success for your multi-million dollar platform investment
This article was primarily written by Nik Walker, the Head of Data Engineering at Co-op, a massive cooperative in the UK that has a huge retail presence. Nik brings tons of hands-on experience when it comes to scaling data platforms (much more than a strategy consultant like myself). In this piece, Nik goes through the five things to think about when scaling your data platform. Check out his LinkedIn and blog. Enjoy the article!
Read Time: 14 minutes
If you’ve ever worked on a data team, it always seems like some huge change is underway.
Usually, it’s inspired by leaders making random decisions that impact the progress of everybody else’s work.
And one of the biggest decisions (or shall we call it whims) of leadership now is the desire to centralise, decentralise, federate or just store your data in a sensible way.
At face value, this isn’t a bad decision. But when I say “Now”, what I mean is “Forever”, and “Now” just means “Use the cloud, dummy!”
Enter the realm of “The Data Platform”, a piece of cloud-based equipment that makes data democratisation and accessibility easier (apparently). Every firm globally seems to be in a rush to get into the cloud with all the data they possibly can to solve all the issues they can ever think of. For some reason, they think a new platform on GCP, AWS, or Azure (depending on which company the key decision maker has been wooed by) will make the company ‘data-driven’.
And this is why Engineers are so stressed and technical people constantly busy; they bear the brunt of the cloud push.
But a huge piece of the puzzle seems to get sidestepped. Less a piece of the puzzle and more a solitary question.
WHY are you doing this?
Every team wants to improve, modernise or flat-out nuke-from-orbit and start again with their data platforms. Full migrations, changes to foundational models and brand-new, untested frameworks are now the cool things to do to get promoted. But we just don’t seem to start with the “Why?” question enough.
So enter me (Nik Walker): A jaded, cynical, been round the block Senior Data Engineering Leader who’s been through the rough of it from start to finish.
And enter this article, drawn from my experience of how best to approach setting up and scaling a Data Platform (I’ve actually done it while Dylan just talks about and consults on it).
Start with Sponsorship & the Need
Point 1 – You MUST start with Exec sponsorship and a list of your business needs
You’re a data person. Stakeholders do not give a crap what you’re building (genuinely), as long as it works, improves ROI and is relatively cost-effective. There’s some wiggle room to that; there are some strong stakeholders who might get the data platform, but the vast majority will not give a crap.
Therefore, you need to convince them that it will do exactly that, and get their buy-in. A huge part of this is ensuring that they understand the three elements listed below:
What already exists?
What needs to change & why?
What will be delivered and by when?
To do this, you need to communicate the above, identify the progress and challenges that arise, and create timelines for them.
Here is where the hypocrisy of a Head of Data Engineering comes through. I believe that 95% of all platform work is technical, focused on engineers, architects, modellers, etc. The last 5% is stakeholder management and prioritisation, which ensures that the job is done properly and that stakeholders realise what is being done.
Yet that last 5% is often the bit that determines success or justifies the executive sponsorship. It should include the business needs, the planning, and the overall team progress.
Unfortunately, teams forget to do this, focusing only on the technical stuff.
Getting the right platforms in place is hard, and every grizzled engineering leader has fought for years to get it done. To do this, we have to prioritise the right datasets, in the right model, for the right stakeholders based on their business needs. And then the executive sponsor has to realise it and pony up the investment to get it done!
Without all that? What’s the point?
Don’t Forget about Data Architecture
Point 2 – You’ve got sponsorship, now what about those architectural decisions?
Not every firm has bought into the Data Kool-Aid yet, so your major sponsor may not be the CTO/CIO/CDO in this endeavour.
Instead, you might be speaking to a COO or CFO. They will kind of care about what you’re doing, but may not get the technical details you want them to.
Realistically, they may see this as a promotion-enabling decision or a not-so-well-understood strategic priority.
Hence, the first thing on their mind will be to do it fast, cheap, and efficiently.
From my experience, an ‘efficiently built’ data platform usually ends up in failed execution, a sub-par solution or a platform that needs to be rebuilt in two years.
This might seem scary (and it can be), but I urge you not to fret, because this is all normal and A-okay.
Instead of jumping right to building or buying (like your sponsor may ask you to do), build out the architecture considerations. This may seem like a lot of work, but even a high-level, strategic approach to this will provide a roadmap for success (and it WILL impress your boss). So, here are five categories to think about and a few questions for those of us who might be looking at this:
Technology Audit – Before building anything new, understand what you have. This isn't just about listing tools - it's about understanding how your current technology landscape supports (or hinders) your data initiatives:
Current Data Landscape: Do you have a comprehensive view of the current data landscape? Are other teams trying to do what you’re doing? Who has what data, and where?
Architectural Framework: What architectural framework will you follow? Lambda? Delta? Medallion?
Data Quality, Lineage and Observability: Tooling, Teaching, Educating, Understanding - This stuff is expensive, hard to ROI, but unbelievably helpful. What currently exists and how will you spin this up?
Infrastructure Decisions – Infrastructure choices can make or break your efforts. Consider:
Cloud vs On-Prem: Do you need to be on the cloud, or are you buying into hyper-scaling hype? Do you have a pre-existing cloud agreement you can tap into?
Vendor Offers: Is there a vendor you trust with your platform’s life? How will you choose what infrastructure to invest in?
Data Workflow Structure – Your workflow structure needs to balance efficiency with maintainability:
Ingestion: Is it one singular, near-realtime pipeline? Or are you having many because of weird operational needs/ systems? What tooling? How will it be orchestrated?
Modelling: Data Vault? 3rd Normal Form? Kimball? What’s best for your organisation? How can you do best for your team… and best for the stakeholders?
Integration Points: Identify how different systems connect and where data flows break down. Are there redundant tools doing the same job?
Plan for Last Mile Solutions – The final step is often the most crucial - getting data into users' hands:
End Goals: Business Intelligence? Management Information? Data Science? Analytics? What exactly do you want to be provisioning from your new shiny platform?
Visualisation: What pretty tool would you like? What already exists?
Training & Adoption: How will you ensure users can effectively use the platform? What support structures will you put in place?
Ongoing Platform Maintenance & Modernisation – What should you consider beyond the build phase?
Skills: What skills already exist inside the organisation? What about the next six months’ worth? Modelling? Engineering? SRE/SOE? Science? What do you have? What do you need?
Monitoring & Operations: How will you monitor platform health? What's your incident response process?
Documentation & Knowledge Management: How will you maintain platform documentation? How will you handle knowledge transfer?
You’ve got your work cut out for you here, right?
But then there’s an even bigger question. For each of these parts, what is better: Build or Buy?
There are a lot of factors to consider here. Some of the top ones I faced in my platform build are listed below:
Time to deploy – How quickly can you get this tool deployed in your stack?
Customisation – How well can you customise it? Do you need workarounds due to a lack of customisation?
Scalability & Performance – How well does it scale? Do you need all that power?
Total Cost of Ownership – What’s the financial cost short, medium & long-term for the decision?
Vendor Lock-in – Can you avoid long-term vendor lock-in if the tool gets bought?
Talent Availability – Do you know what the market looks like for the tool, framework or methodology?
Security & Compliance – What kind of security does the tool have / need? How do you prove that?
Integration – How well does it integrate into your platform proper? How well does it fit into the wider tech stack?
Maintenance – What does maintenance look like? If it’s your own tool, how will you do it? If it’s off the shelf, when will it happen?
Risk & Reliability – SLA? SLO? Rota’d overnights?
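One way to make the factors above comparable is a simple weighted scoring matrix. The sketch below is only an illustration: the weights, the 1-5 scores and the names `WEIGHTS`, `weighted_score`, `build` and `buy` are all made up for the example, and your own organisation would set them differently.

```python
# Hypothetical weights: how much each factor matters to YOUR organisation.
# Integers are used so the arithmetic stays easy to check by hand.
WEIGHTS = {
    "time_to_deploy": 3,
    "customisation": 2,
    "scalability_performance": 2,
    "total_cost_of_ownership": 3,
    "vendor_lock_in": 2,
    "talent_availability": 1,
    "security_compliance": 2,
    "integration": 2,
    "maintenance": 2,
    "risk_reliability": 1,
}


def weighted_score(scores, weights=WEIGHTS):
    """Return the weighted average of 1-5 scores (5 = best for you)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[f] * scores[f] for f in weights) / total_weight


# Illustrative scores only, not a recommendation either way.
build = {
    "time_to_deploy": 2, "customisation": 5, "scalability_performance": 4,
    "total_cost_of_ownership": 3, "vendor_lock_in": 5, "talent_availability": 3,
    "security_compliance": 4, "integration": 4, "maintenance": 2,
    "risk_reliability": 3,
}
buy = {
    "time_to_deploy": 5, "customisation": 3, "scalability_performance": 4,
    "total_cost_of_ownership": 3, "vendor_lock_in": 2, "talent_availability": 4,
    "security_compliance": 4, "integration": 3, "maintenance": 4,
    "risk_reliability": 4,
}

for option, scores in {"build": build, "buy": buy}.items():
    print(f"{option}: {weighted_score(scores):.2f}")
```

The number itself matters less than the conversation it forces: agreeing the weights with your sponsor is a cheap way to surface what they actually care about.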

Remember, you don’t have to hit the nail on the head for everything; you can be just pretty good in one area if it helps you out in others.
For example, don’t downplay a purchased accelerator that does 60% of what you need and doesn’t take much time to get away from. Or maybe an ingest tool that links to pre-existing software will fit your needs for two years whilst you build a more robust solution yourself? Go for it!
Perfect will kill you, your team and your goals. So don’t aim to be perfect, aim to be functional!
Don’t Sweat the ‘Small’ Stuff
Point 3 - The small things that everyone forgets… like Data Quality, Data Governance and Enterprise Legalities.
After you have built your tech stack, you now have to think about those other things that techies love to dismiss.
Like what about Data Quality?
How will you show your Data Quality? How will you do your root-cause-analysis when it goes wrong (it will)? How will you then be able to take your findings to the relevant person to see how to correct the issue?
What about your in-platform Data Quality, how will you monitor that? What are your biggest axioms? What does the business truly care about? Accuracy? Timeliness? Wholeness? What do they want? How can you push and teach them?
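As a rough illustration of what monitoring two of those dimensions could look like, here is a minimal sketch of a wholeness (completeness) check and a timeliness check. The function names, field names, sample rows and the 24-hour freshness window are all assumptions for the example, not a prescribed standard.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows, required_fields):
    """Share of rows in which every required field is populated."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) not in (None, "") for field in required_fields)
        for row in rows
    )
    return complete / len(rows)


def is_timely(last_loaded, now, max_age):
    """True if the most recent load falls inside the agreed freshness window."""
    return now - last_loaded <= max_age


# Hypothetical sample data: one row is missing its store_id.
rows = [
    {"order_id": 1, "store_id": "S01", "amount": 12.5},
    {"order_id": 2, "store_id": None, "amount": 8.0},
    {"order_id": 3, "store_id": "S02", "amount": 3.2},
    {"order_id": 4, "store_id": "S01", "amount": 20.0},
]

now = datetime(2025, 1, 2, 9, 0, tzinfo=timezone.utc)
last_loaded = datetime(2025, 1, 1, 23, 0, tzinfo=timezone.utc)

print(f"Completeness: {completeness(rows, ['order_id', 'store_id']):.0%}")
print(f"Fresh within 24h: {is_timely(last_loaded, now, timedelta(hours=24))}")
```

Which thresholds count as "good enough" is exactly the conversation to have with the business, not something the code can decide.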
Or maybe legality, security and privacy?
What are the legalities behind your platform?
GDPR?
HIPAA?
Competition Law?
How will you handle PII? Any SPII in there?
You can’t just punt this down the line; you need to define these with either a Governance team OR directly with legal.
You will need to get advice on what the rest of the technical stack currently does to prevent breaches. You can then devise ways to ensure something similar: for example, a Role-Based Access Control system that prevents pricing analysts from making decisions across brands, or an attribute-based control or data mask for SPII data that only HR needs access to.
Presenting those solutions lets you really push non-technical users to get involved and consult, so you do the right thing. Even then, they may devise curveballs to try and break what you’ve done.
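To make the access-control idea concrete, here is a minimal sketch of role-based row filtering combined with masking of SPII fields. Everything in it is hypothetical (the role names, `ROLE_POLICY`, `SPII_FIELDS`, the sample records); in a real platform you would normally lean on the access controls of your warehouse or cloud provider rather than hand-rolling this, and the actual policy would come from Governance or Legal.

```python
# Hypothetical policy: which brands a role may see, and whether it may see SPII.
ROLE_POLICY = {
    "pricing_analyst_brand_a": {"brands": {"brand_a"}, "can_see_spii": False},
    "hr_partner": {"brands": {"brand_a", "brand_b"}, "can_see_spii": True},
}
SPII_FIELDS = {"health_notes"}  # assumed list of sensitive fields
MASK = "***"


def read_rows(role, rows):
    """Return only the rows a role may see, masking SPII where required."""
    policy = ROLE_POLICY.get(role)
    if policy is None:
        raise PermissionError(f"Unknown role: {role}")
    visible = []
    for row in rows:
        if row["brand"] not in policy["brands"]:
            continue  # role-based control: other brands' rows are hidden
        visible.append({
            key: MASK if key in SPII_FIELDS and not policy["can_see_spii"] else value
            for key, value in row.items()
        })
    return visible


# Hypothetical records spanning two brands.
RECORDS = [
    {"brand": "brand_a", "record_id": "R1", "health_notes": "example note"},
    {"brand": "brand_b", "record_id": "R2", "health_notes": "example note"},
]

print(read_rows("pricing_analyst_brand_a", RECORDS))
print(read_rows("hr_partner", RECORDS))
```

A toy like this is also a handy prop when talking to non-technical stakeholders: they can see exactly what each role would and would not get back.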
Building is a Team Game
Point 4 – People, players and the importance of team
Building a Data Platform is a great choice. But you cannot do it alone. Below is a list of a few people you will need to have floating around as ‘dedicated’ named job roles. Or people whose backs you will need to have as the Platform develops.
(Cloud) Data Architects – Can someone ensure what you’re building complies with the organisation’s cloud design? How about that it works with the organisation’s infrastructure? What about ensuring architectural decisions are made right? You may not get a dedicated data architect, but there’s often one person floating around who knows this stuff… find them, make sure they have time, and get them involved.
Data Engineers – The backbone of any data organisation (I’m not biased, it’s the truth). You need data engineers with good tools and enforced principles to ensure the right things get built with architectural oversight. You need code compliance and inbuilt checking to ensure the right data gets processed at the right time.
BI & Analytics – These folks are often dotted around a business if not centralised. But they’re the random fonts of knowledge who know exactly where data is kept, in what format and the business process that spits it out. They’re the spies and enforcers that can help you massively.
Data Management (Governance & Quality) – You may not have these folks right away, but you’ll have a legal department and Data Engineers who often know how Data Quality works anyway (We’ve been doing it for 40+ years!).
Data/Platform/FinOps – DataOps for when you get going, that Continuous Improvement pipeline of refining your platform and pipelines. PlatformOps for keeping you protected and ensuring you’ve got the right things being built out. FinOps… because what we do is expensive, and justifying that is life.
Delivery, Product, Business Analysts – The ‘soft’ Data folks to whom you can attribute responsibility for the prioritisation, investigation, analysis & delivery of the very goals, objectives and awesome stuff you want to deliver.
Your teams end up looking a little like this:
Remember, your team may not perfectly map out to this (maybe it is more, usually it’s less), but this is the delineation of responsibilities you should think about. If you are seeking outside help, consider these roles as well (don’t just pay for strategic consultants who will give you PowerPoints but won’t do any of these things).
Also, don’t forget about management and business stakeholders, who need to be kept involved in the process (more on that below).
A diversified team makes platform development actually work. A lack of resources (or heavy mix in 1-2 areas) takes away from success. Build it in the right way with the right mix of roles!
Communication is Key to Delivery
Point 5 – Delivering & Communication
Ways of Working
None of this works without a good Ways of Working.
Who does what?
Whose responsibility is it to do it?
Who will prioritise?
What ceremonies will we have?
How often do we meet?
How do we contribute to the greater goal around us?
How does the organisation get its input into the great pot?
This comes back to the Operating Model question/ structure and these are all questions you will need to figure out.

I often recommend that people embrace a Chapter & Tribe approach when their Data Team starts branching into big areas. It lets you quickly develop and deploy teams in set structures into different business areas, or retain them to continue developing on your platform as needed.
How you show progress (Agile delivery metrics)
It’s nice showing off your end results, isn’t it? But by what metrics do you show stakeholders you’re making progress?
Project Status Updates – Blocks of work to show when, where and what will be delivered (for example, customer data in Q1 of 2025).
Velocity of Work – How quickly you’re delivering on your work (in user stories). It’s team-dependent and not often interchangeable. But you can see (and show) quickly if a team is consistently performing.
Customer Satisfaction – How does the organisation perceive your teams? Are they getting benefits from working with you?
Planned vs Done Ratio – A view of how much work you expect to do each sprint vs how much you do get done.
Value Turnover – You’ve already got the ROI on your prioritised backlog, so use it!
PETALS – An internal team metric to show Productivity, Enjoyment, Teamwork, Average, Learning & Serenity. Each of them is extremely important to ensure a team's ongoing functionality.
What are you deploying and when – A bus matrix works so well for this. Show me your Dimensions (and what data populates them), then show me the Business Process down the side with a colour code of what quarter they’ll be deployed in.
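A bus matrix like the one described in the last item can start life as a very small data structure. The sketch below is only illustrative: the business processes, dimensions and quarters are invented, the layout here puts processes as rows and dimensions as columns, and a plain quarter column stands in for the colour code.

```python
# Hypothetical conformed dimensions (the columns of the matrix).
DIMENSIONS = ["Date", "Customer", "Product", "Store"]

# Hypothetical business processes: (dimensions used, planned delivery quarter).
BUS_MATRIX = {
    "Sales": ({"Date", "Customer", "Product", "Store"}, "Q1"),
    "Inventory": ({"Date", "Product", "Store"}, "Q2"),
    "Loyalty": ({"Date", "Customer"}, "Q3"),
}


def render(matrix, dimensions):
    """Render the bus matrix as a fixed-width text grid."""
    header = f"{'Process':<12}" + "".join(f"{d:<10}" for d in dimensions) + "Quarter"
    lines = [header]
    for process, (used, quarter) in matrix.items():
        cells = "".join(f"{('X' if d in used else '-'):<10}" for d in dimensions)
        lines.append(f"{process:<12}{cells}{quarter}")
    return "\n".join(lines)


print(render(BUS_MATRIX, DIMENSIONS))
```

Even in this crude form, a stakeholder can read off which business process lands in which quarter and which shared dimensions it depends on.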
Hopefully, with these artefacts, you can demonstrate that what you are doing is worth the enormous cost and many resources. Remember, a platform build is often delayed and goes over budget, so communicating how it is delivered is crucial to get continued support from your main stakeholder (who, if you remember, is often not technical).
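Two of the metrics listed earlier, Velocity of Work and the Planned vs Done Ratio, are simple arithmetic once you have sprint history. The sketch below assumes work is tracked in story points per sprint; the sprint figures, the three-sprint window and the function names are invented for illustration.

```python
from statistics import mean

# Hypothetical sprint history: (sprint name, points planned, points done).
SPRINTS = [
    ("S1", 30, 24),
    ("S2", 28, 28),
    ("S3", 32, 26),
    ("S4", 30, 27),
]


def velocity(sprints, window=3):
    """Average points completed over the most recent `window` sprints."""
    recent = sprints[-window:]
    return mean(done for _, _, done in recent)


def planned_vs_done(sprints):
    """Ratio of completed to planned points for each sprint."""
    return {name: done / planned for name, planned, done in sprints}


print(f"Velocity (last 3 sprints): {velocity(SPRINTS)}")
for name, ratio in planned_vs_done(SPRINTS).items():
    print(f"{name}: {ratio:.0%} of planned work done")
```

As noted above, velocity is team-dependent, so compare a team against its own history rather than against other teams.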
Wrapping it up!
This is not everything.
This is just a brain dump from a Head of Data Engineering who’s been around the block frequently.
It’s telling that only one area talks about the questions you need to ask yourself from a technical perspective, isn’t it?
You, as a leader, will need to answer a thousand questions as the going gets tough when you want to build a data platform. There are also organisational processes and procedures you’ll inevitably have to dilute, dispose of or digest to get this done, and many of the above questions will allow you to hit the ground running and deploy something good, quickly.
Good luck!
Dylan here again:
Next week we will jump into the much talked about and often misunderstood subject of Machine Learning and AI. With the first article of five, we will examine the most basic question of: what the hell is ML and why is everybody talking about it?
I’m tired of new data people jumping to ML, building a model in PyTorch, and thinking they are amazing data scientists who deserve $100k+. Or companies wanting to skip analytics (and even ML) and do AI with everything.
So, it’s time to tackle the subject and discuss where it exists in the data ecosystem. Tune in next week and have a great Sunday!
Thanks for the read! Comment below and share the newsletter/ issue if you think it is relevant! Feel free to also follow me on LinkedIn (very active) or Medium (increasingly active). And if you are interested in consulting, please do reach out. See you amazing folks next week!
“They don’t really know what they are talking about, but you nod along anyway.” Yes, but shouldn’t we inform them to help them along? Hypothetically speaking, the leader or the head may often take offense at the assumption, so if the correction will have an impact, it should be presented carefully, giving the feedback in a way they can comprehend. This would be a great help to the leaders and the businesses. Or politely say, “I will present you the effects of these at your convenience.” This doesn’t offend anyone who is smart enough to pose changes (in my business view).
Can you identify an enterprise where the executives don't already consider it to be "data driven?"
Is there a reliable method of determining the extent to which an enterprise is "data driven" today? It seems to me that responses to these two questions in terms that are easily grasped by executive management are a key to initiating programs that generate tangible value by improving the content, quality and delivery of data available to the enterprise.