Use code DATAGOV23 for 35% off ebook copies of Designing Data Governance from the Ground Up here: https://pragprog.com/titles/lmmlops/designing-data-governance-from-the-ground-up/
Please rate and review us on your podcast app of choice!
Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Lauren's LinkedIn: https://www.linkedin.com/in/laurenmaffeo/
Designing Data Governance from the Ground Up (Lauren's book): https://pragprog.com/titles/lmmlops/designing-data-governance-from-the-ground-up/
In this episode, Scott interviewed Lauren Maffeo, author of the book Designing Data Governance from the Ground Up and Adjunct Lecturer at George Washington University. To be clear, she was only representing her own views on the episode.
Some key takeaways/thoughts from Lauren's point of view:
Lauren started with a somewhat common refrain for this podcast: data practices - governance and data work as a whole - are just not maturing as fast as other aspects of software are innovating. Even the conversation around data is not maturing as fast. Cybersecurity, for example, is maturing very quickly, but we're just not seeing that in data. So companies are just not ready to really derive a lot of value from things like machine learning (ML) or natural language processing (NLP).
One of the big issues around industry maturity and data governance for Lauren is that there isn't even a large community around the topic. So there isn't a larger cohesive conversation around data governance best practices. Scott note: it's really hard to have a broader conversation too because approaches and practices vary widely and data governance has about 15 varied subtopics that each deserve their own focus rather than being lumped under a huge umbrella of 'other' that governance has become.
Most cybersecurity breaches, Lauren pointed out, are caused by internal employees making a mistake. So how do we think about that relative to data? Is that people creating low quality data and/or is it people not being data fluent enough to actually make good decisions based on data? In cybersecurity, there is a big emphasis on training people to see what an attack looks like - should we take the same approach in data regarding bad quality data? "You really have to embed data literacy into very strategic ways of communicating with the organization and educating them that way. Without that approach, I think very little progress can actually be made."
Lauren talked about how few companies are really going broad with their data literacy programs, training up a large number of their employees. There is a lot of talk about that as part of data governance programs but few are walking the walk. And she believes it's not that hard to get people to a relatively data fluent level - understanding SQL, being able to more easily spot low quality data, etc.
"We'll do the data governance later," is something Lauren has seen and heard in conversations. Governance is seen as something that can be layered on at the end, like a coat of paint on a car that has already been manufactured. But because good governance is intrinsic to data quality and to matching data to the actual business use case, trying to do it later rarely leads to good results.
When asked about selling the return on investment of data governance work, Lauren admitted that it's often quite nebulous, but data governance is so key that people know they need it despite not being super clear on the specific value of the work. And you can roll out your data governance tech, policies, and processes at a reasonable pace, creating some definitions and a sandbox to show people how it will work. She is really big on the idea of a sandbox to get people used to new governance practices and tech. It isn't as though everything changes suddenly; rather, you're working towards better data practices that will drive value for the organization. Fail fast is "the essence of innovation in tech" so we need to embrace it far more - but still safely and sanely - in data.
"We also can't afford for leaders of any department to not know what quality data looks like for their teams, because their success, the success of their teams depends on having quality data that their customers trust," Lauren said. So we all need to be in this together and have domains really owning and understanding their data. That can't be on a central data team.
Gamification is one thing Lauren is seeing work for improving data literacy/fluency. It is a great pathway to creating a data-driven culture. Make it fun and give out rewards :)
Lauren wrapped on a simple message. Automate your standards. It is easy to have your tech and standards/processes quickly lose alignment if you aren't making things easy for people via automation.
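Scott note: as a rough illustration of what "automate your standards" could look like in practice, here is a minimal Python sketch that expresses standards as small checks and runs them against records automatically. It is not from Lauren's book or the episode; the Standard class, the violations function, the two example rules, and the field names customer_id and email are all hypothetical and only there to show the idea.

```python
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical: a "standard" is a named rule that every record must satisfy.
@dataclass(frozen=True)
class Standard:
    name: str
    check: Callable[[dict[str, Any]], bool]


# Illustrative rules only; real standards would be defined by the owning domain.
STANDARDS = [
    Standard("customer_id is present", lambda r: bool(r.get("customer_id"))),
    Standard("email contains @", lambda r: "@" in str(r.get("email", ""))),
]


def violations(records, standards=STANDARDS):
    """Return a (row_index, standard_name) pair for every failed check."""
    return [
        (i, s.name)
        for i, record in enumerate(records)
        for s in standards
        if not s.check(record)
    ]


if __name__ == "__main__":
    sample = [
        {"customer_id": "c1", "email": "a@example.com"},
        {"customer_id": "", "email": "not-an-email"},
    ]
    for row, name in violations(sample):
        print(f"row {row} fails standard: {name}")
```

Run on a schedule or as part of a data pipeline, a check like this keeps the written standard and the actual data from drifting apart without anyone having to remember to look.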
Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used in this episode was found on Pixabay and was created by the following artists (with slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf