Most of your data is rotten and it’s not your fault, but it is your problem
Where is the Data Quality box in the Information Product Canvas?
I gave an overview of the Information Product Canvas to a team last week, and one of the questions was how we deal with Data Quality, as there is no box on the canvas for it.
The answer is we don’t capture data quality issues in this part of the Information Value Stream, and we don’t for two reasons.
First, we typically use the canvas at the ideation stage, so there is little value in diving deep into the data quality quagmire at that point. Second, our stakeholders simply expect us to deliver Information Products in which any data quality issues have already been solved.
Your Information Value Stream is like a Restaurant Kitchen
I also used the restaurant analogy that I have been using for a while, and it seemed to help with their framing. I thought I would share it here so you can use it with your stakeholders and teams if it provides any value.
Think of your Data Team as the chefs in the kitchen. They are stuck between the food producers who provide the ingredients and the customers who want a meal (oh, should I actually call a meal “food as a product”? 😉).
Data is often delivered “rotten”
The unfortunate fact is that while sometimes we get delivered nice fresh and juicy tomatoes, sometimes we get delivered tomatoes that are rotten.
Now if we delivered a meal to our customers that contained rotten tomatoes, then they would be very unhappy. Our customers would just expect us to be professionals and deliver the meal they asked and paid for.
And so it is with data: stakeholders expect us to be data professionals and do the data work to provide data and information of a high quality.
Our natural behaviour is to try to make it the stakeholder’s problem to fix the quality of the data. But that’s like us saying to the customer:
“Would you mind talking to our tomato supplier for us? Once you have, we will start giving you meals with fresh tomatoes. Until then you will need to live with the rotten ones.”
That conversation is unlikely to go well.
So as data professionals we are stuck in the middle and therefore while it may not be our fault, it’s definitely our problem.
Again using the restaurant analogy, if a chef constantly received rotten produce they would just change suppliers.
Unfortunately, this is typically not an option in our data domain. We can’t just go and buy our organisation’s data from somebody else.
But we have a set of sharp data knives at our disposal
There are some data patterns we can use to help solve this problem.
The first is Data Contracts.
We should put a data contract in place between us and the data providers.
And we should put a separate data contract in place between us and the data consumer / stakeholder.
When we are given poor quality data, we can point out to our stakeholders that the data producers didn’t provide quality ingredients, and show them where the data contract has been breached. And we can show exactly how that resulted in the breach of our data contract with the stakeholder.
I liken it to the customer in our restaurant actually being the owner of the restaurant, who just happened to be eating there with their family. Once we help identify the problem, they will jump in and put their political might towards helping us solve it. After all, they don’t want their family, or their customers for that matter, eating rotten tomatoes.
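To make the two-contract idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the DataContract and Expectation classes, the customer_id and amount fields, and the sample rows are all hypothetical, and a real data contract would usually cover much more (schema, freshness, ownership, and so on). It only shows how a breach on the producer side can be lined up against a breach on the stakeholder side.

```python
from dataclasses import dataclass
from typing import Callable

# A row is just a dictionary in this sketch (hypothetical shape).
Row = dict[str, object]


@dataclass
class Expectation:
    """One named rule that every row must satisfy."""
    name: str
    check: Callable[[Row], bool]


@dataclass
class DataContract:
    """A named set of expectations agreed between two parties."""
    name: str
    expectations: list[Expectation]

    def breaches(self, rows: list[Row]) -> list[str]:
        """Return one message for every row and expectation that fails."""
        found = []
        for index, row in enumerate(rows):
            for expectation in self.expectations:
                if not expectation.check(row):
                    found.append(
                        f"{self.name}: row {index} failed '{expectation.name}'"
                    )
        return found


# Contract between the data producers and the data team (assumed rules).
producer_contract = DataContract(
    name="producer -> data team",
    expectations=[
        Expectation(
            "customer_id is present",
            lambda r: r.get("customer_id") is not None,
        ),
        Expectation(
            "amount is not negative",
            lambda r: isinstance(r.get("amount"), (int, float))
            and r["amount"] >= 0,
        ),
    ],
)

# Separate contract between the data team and the stakeholder (assumed rule).
consumer_contract = DataContract(
    name="data team -> stakeholder",
    expectations=[
        Expectation(
            "every revenue line names a customer",
            lambda r: r.get("customer_id") is not None,
        ),
    ],
)

# What the producer delivered: one fresh tomato and one rotten one.
delivered = [
    {"customer_id": "C1", "amount": 120.0},
    {"customer_id": None, "amount": 80.0},
]

# The information product here is just the delivered rows passed through.
product = [dict(row) for row in delivered]

producer_breaches = producer_contract.breaches(delivered)
consumer_breaches = consumer_contract.breaches(product)

for message in producer_breaches + consumer_breaches:
    print(message)
```

In this sketch the same row index shows up in both lists of breaches, which is the evidence we would take to the restaurant owner: the rotten tomato arrived from the supplier, and that is why the meal did not meet what we promised.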
Quality Data is a shared responsibility
The owner of the restaurant is invested in the food supply chain as much as the chef is. And we need to help our stakeholders to be invested in the Information Value Stream as much as we are as data professionals.
But it’s the data team’s problem to solve
So if you have bad data quality, it’s not your fault, but it is most definitely your problem.