#248 Doing Data Quality Right by Building Trust - Interview w/ Ale Cabrera
Episode 248 • 13th August 2023 • Data Mesh Radio • Data as a Product Podcast Network
Duration: 01:10:11


Shownotes

Please Rate and Review us on your podcast app of choice!

Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Ale's LinkedIn: https://www.linkedin.com/in/alejandracabre/

In this episode, Scott interviewed Ale Cabrera, Senior Data Quality Product Manager at Clearbit. To be clear, she was only representing her own views on the episode.

Some key takeaways/thoughts from Ale's point of view:

  1. A key part of understanding what data work will be impactful is a simple phrase: "Is my understanding correct?" Repeating back what you heard and making sure you're on the same page will save a ton of time and headaches!
  2. Her advice to her past self: In data, far too often, we try to jump to solutioning instead of really taking the time to understand the problem. Start from understanding the problem and assessing it first.
  3. It's very easy to make data say something it isn't actually reflecting. Quality isn't just about accuracy or similar metrics; sometimes there are intangible aspects of correctness that people intuitively grasp but usually can't measure.
  4. In data work, many people miss two crucial aspects - the voice of the customer and the why. If you build the greatest thing ever but it isn't what the customer wants, it won't be used. Similarly, if you focus on the work and not the target outcome, your results are likely to be subpar.
  5. If you want to prove data work return on investment, try to associate it to a key company metric and talk about how improving that metric will drive better business outcomes.
  6. When you want to prove out the value of data quality, attach quality issues to direct business challenges or goals. That's easy if you are a company selling data, but either way you have to understand why bad quality data negatively impacts the company in order to gain the influence to improve it.
  7. Far too often in data work, people try to exchange data - the 1s and 0s - and forget to exchange information. Get people together, make sure you are aligned, and get them actually talking so you understand each other and have a good path forward.
  8. The most crucial aspect of data quality is trust. If you can't drive trust, then all the 'quality' in the world doesn't mean anything because people won't use it.
  9. As a data person, your job is not to do data work. It is to unblock teams/projects/people and add value. Yes, that is through data work but the work isn't the point.
  10. The way to think about trust is the combination of credibility, reliability, and intimacy, divided by self-orientation.
  11. Measuring trust is hard but a good way is through interviews with users.
  12. Nothing will ever be perfect, so consumers need to understand that. A mature and healthy conversation is asking them to flag any data quality issues they see, because issues can always happen. If they won't consume data that might not be perfect, they can't consume any data at all.
  13. When you are quick to react to data quality issues, your consumers can get more value from the data because they can determine sooner whether a change in the shape of the data is an issue or a genuine reflection of something real changing.
  14. In data, one-size-fits-all typically ends up fitting no one. You don't want to over-customize a data product, but your consumers will likely want something that fits their needs rather than something generic they have to do the heavy lifting to use.
  15. How people consume the same data can have a large impact on trust too. While data people may hate Excel, sometimes you have to deliver something consumers can use in Excel. Otherwise, they will struggle to trust the data enough to rely on it.
  16. Is what makes something great the output or the impact? When applying product thinking, something doesn't really matter unless it is used to generate value.
  17. A key aspect of product thinking is prioritization. Yes, it would be great if we could build to every ask but focusing on what will most likely deliver value and why will keep you more focused on generating value through your data work.


Ale started with a bit about her career and what led her to focus on data quality, including what led her to her current role. On advice to her past self: in one of her past roles, a lead engineer gave her enduring advice: think before you code 😎 Oftentimes in data, people jump straight to solutioning instead of taking the time to understand the problem and weigh the potential solutions to choose one that will work better in the long run. She also said to respect the data when considering quality. You can get data to tell a story that doesn't reflect the reality of what is happening, so give it the respect of not shaping it to tell "statistical lies".


Another important aspect of data work for Ale is focusing on the voice of the customer and the why. What is the user telling you? What are the pain points? You can build the best solution, but if it isn't what the customer wants, they probably won't use it. Look at all the amazing data platforms that barely anyone uses. Focusing on the why - why people are looking at this data or trying to address this challenge - will keep you from jumping to solutioning; instead, meet the customer where they are and help them solve the business challenge, not just a challenge about data. As a data person, the point of your job is not to do data work. It is to unblock teams/projects/people and add value. Yes, that is through data work, but the work isn't the point.


When trying to prove return on investment for data work, Ale believes it's important to tie it to a key business metric. Something like 'Our data quality improvement from 98% accuracy to 99% accuracy will mean a decrease in churn by 5%, netting the company X amount of revenue.' In her case, she is attaching the data quality work to reducing churn because it is a key metric that is intrinsically linked to data quality in a company selling data.


For Ale, an important aspect of establishing yourself as an internal expert on data quality is tying the data work to value - what actually matters for the company. Improving data quality for something that drives 1% of revenue so it grows 25% adds less (0.25% of total revenue) than improving something that drives 10% of revenue so it grows 3% (0.3% of total revenue). But the business might still favor one over the other. What matters most to the business and why?
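The prioritization arithmetic above can be sketched in a few lines; the total revenue figure here is assumed purely for illustration and is not from the episode:

```python
# A big percentage lift on a small revenue stream can net less absolute
# value than a small lift on a large one - Ale's prioritization point.
total_revenue = 1_000_000  # hypothetical company revenue, for illustration

# Option A: a segment driving 1% of revenue grows 25%
segment_a = total_revenue * 1 // 100   # 10,000
gain_a = segment_a * 25 // 100         # 2,500

# Option B: a segment driving 10% of revenue grows 3%
segment_b = total_revenue * 10 // 100  # 100,000
gain_b = segment_b * 3 // 100          # 3,000

print(gain_a, gain_b)  # Option B adds more absolute revenue
```

Even so, as Ale notes, absolute revenue impact is only one input; the business may still prioritize the other segment for strategic reasons.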


Ale shared a story about why it's so crucial to get people actually in a space exchanging context and information instead of just exchanging the 1s and 0s of data. She was working on a project involving two teams, and neither side had ever met the other despite being months into the project. The data quality was terrible; it seemed like half the data was wrong or missing. When the other side finally asked what was happening, it turned out data was being pulled from two servers, and her team only knew about one of them. So get in the room - virtual or physical - and actually communicate!


Trust is the most crucial aspect of data quality work for Ale. If you have the best quality data but people don't trust it enough to use it, then it's still not of value. And trust is often built more through relationships than anything else (see Beth Bauer, episode #218). What is the purpose of the work if not to drive an outcome? If there isn't trust, can you really contribute significantly to an outcome? The way to think about trust is the combination of credibility, reliability, and intimacy, divided by self-orientation. She uses interviews as a good way to measure trust over time. That also encourages users to actually say something if they see an issue, which in turn increases trust after an issue. Listening to people is a fundamental building block of driving trust.


When asked whether you lose credibility by telling data consumers to immediately flag any issues, Ale was clear that in data, mistakes happen. Nothing is ever perfect, so it's crucial that we can have conversations and tell people to speak up if they see quality issues. Recognizing that there will be data issues doesn't mean you are sloppy with data, only that you are realistic. Being aware of and reacting quickly to issues also means consumers can be more sure that an anomaly or change in the data is actually something real and react to it more quickly. Recognizing you will make mistakes and working quickly to rectify them builds trust.


For Ale, there is a tendency in data toward delivering overkill and/or something the other party has difficulty trusting. Providing a dashboard for someone who wants to see the actual data is just not that helpful. Meet people where they are while you upskill them, even if that "where" is in Excel or Google Sheets.


Ale talked about switching to the product mindset in data. If you are building 'amazing things' but no one is using them, are they really all that great? Is what makes something great the output or the impact? A good way to get to impact is focusing on acceptance criteria - what would make the user happy and what are they actually expecting?


In wrapping up, Ale shared a bit about how to become a good product manager and her thoughts on how crucial prioritization is to actually applying data product thinking. How specific you should get with a solution is a very tough question, but you can start by asking what the expected impact is versus a more generic solution.


Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/


All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
