#81 Finding Useful and Repeatable Patterns for Data - Interview w/ Shane Gibson
Episode 81 • 26th May 2022 • Data Mesh Radio • Data as a Product Podcast Network


Shownotes

Data Mesh Radio Patreon - get access to interviews well before they are released

Episode list and links to all available episode transcripts (most interviews from #32 on) here

Provided as a free resource by DataStax AstraDB; George Trujillo's contact info: email (george.trujillo@datastax.com) and LinkedIn

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here

In this episode, Scott interviewed Shane Gibson, CPO/Co-Founder of AgileData.io and Agile Data Coach.

A few takeaways from Shane to start:

- Agile methodology is about finding patterns that might work, trying them out and deciding to iterate or toss out the pattern. It's going to be hard to directly apply software engineering patterns to data but we should look for inspiration there and then tweak them.

- Any time you look at a pattern you might want to adopt or evaluate if a pattern is working for you, ask yourself: will this/does this empower the team to work more effectively?

- Applying patterns is a bit of a squishy business. Get comfortable that you won't be able to exactly measure if something is working. But also have an end goal in mind for adopting a pattern - what are you trying to achieve and is this pattern likely to help you achieve that?

- Share your patterns to not only help others but to get feedback and maybe ideas to iterate your pattern further.

Shane has spent the last 8 years as an Agile Data Coach, taking Agile practices and patterns and applying them to data. Those patterns required a lot of tweaks to make them work for data. A big learning from that work is that when applying Agile patterns in general, and specifically in data, each organization - even each team - needs to test and tweak/iterate on patterns. Patterns can also start out valuable, lose value, and then become valuable again. Shane gave the example of daily standups: they drive collaboration as a forcing function but then lose value once that collaboration becomes standard team practice. If a disruption to the team means collaboration is no longer standard practice, daily standups could regain their value. So how do we apply these Agile concepts to data?

Currently, Shane sees no real patterns emerging in the data mesh space. It is still quite early: patterns often take 5-8 years to develop, and data mesh is maybe 12 months into even moderately broad adoption. Data mesh is also such a wide practice area that patterns will need to cover many different sub-areas. But that lack of patterns makes things quite hard even for those who want to be on the leading edge of implementing data mesh rather than the true bleeding edge - having to invent everything yourself is taxing work! So we need companies to take existing patterns, iterate on them, and then tell the world what worked and what didn't. If people aren't sharing patterns, it is going to be hard for many organizations to adopt data mesh.

Shane believes that it will likely be pretty hard for many organizations - or at least many parts of large organizations - to give application developers in the domain the responsibility of creating data products. If your domains aren't already quite technically capable in building software products, it's going to be very hard for them to really handle data needs. So looking at domains that are using large, out-of-the-box enterprise platforms or SaaS solutions instead of rolling their own software, will they really have the capability to manage data as a product? If their domains don't have the most complex of data, maybe? But if they do, are they really mature enough to handle it? A very valid question.

To really be agile, using Agile methodologies, you need to first adopt the Agile mindset and not just patterns and practices, per Shane. Agile is really about experimenting with a pattern and either iterating to make it better or throwing it out. It's not about being precious. As mentioned earlier, you should also throw out patterns that were effective and aren't helping you any more. You need to do the same at the team and organizational level if you are going to successfully implement something like data mesh. Your teams and your organization are like a living, changing, evolving organism - treat them as such.


A very important point Shane made is that data mesh isn't a solution in itself - at most, it is a way of approaching your organization's data and analytical challenges with a true purpose in mind. The purpose isn't implementing data mesh. The purpose is a business objective or challenge, and data mesh is helping you tackle that. Also, data mesh is not the right approach for many organizations, especially smaller ones or ones that don't have highly complex data needs. Those organizations should review data mesh, understand the principles, and work towards them. But if your real challenge isn't the centralized team being a bottleneck, don't take on the pain of decentralizing just to be hip and trendy.


For those who haven't really dealt with Agile, a "fun" potential learning, per Shane, is that there isn't really a great pattern for measuring whether a certain pattern is working. Proving how well something is working is close to impossible, so a large part of it is feel. For example: we chose this pattern to improve collaboration. Do we believe our collaboration has improved? If yes, great, let's try to iterate and improve a bit more. If no, or if our collaboration has even gotten worse, get rid of it!


Per Shane, when evaluating whether you are effective in your Agile methodology, ask: does the organization empower this team to work effectively? You will probably need to look at this on a team-by-team basis and ask the question repeatedly over time. Trying to scale Agile to fit all teams in an organization is often an anti-pattern. And if you are in a hierarchical company, adopting Agile patterns alone is probably not going to change the way you work in the long run; you need to break the hierarchies in some way.


For Shane, there is a big question that data mesh has yet to answer: can we really move the data production and ownership to the application developers? He thinks if we look at DevOps and how developers took on the necessary work for testing and CI/CD, we can. But then the even bigger question is how. How can we map the language of what needs to get done to the software engineering semantics?


For Shane, the idea of a proof of concept - PoC - is just broken. We need to rethink it entirely, especially for data mesh. What are you really trying to prove out? He believes there are typically two types of PoCs and most default to Type 1 when potential beneficiaries expect the output of Type 2. In Type 1 PoCs, you are out to prove a high-level hypothesis that has lots of uncertainty. It's about experimentation and doing it in a "quick and dirty" way that is not ready for production. But the output of Type 1 is all about proving out the hypothesis - not a production-ready result.


Type 2 is a minimum viable product or minimum valuable product - what can we strip away from our end goal to get to something that can be used and is - mostly - productionalizeable. Literally what is the minimum that is viable? It is about proving the capability to deliver and delivering something of value sooner. So ask yourself, what are you really trying to prove in your PoC?


Shane finished on three points:

- Empower your teams to change the way they work

- Stop vendor and methodology washing data mesh

- Regarding data mesh, share what patterns you are trying to adopt, why you chose them, and what is working/not working. Data mesh can only evolve to something really great if we work together and share more information.


Shane's LinkedIn: https://www.linkedin.com/in/shagility/

Shane's Twitter: @shagility / https://twitter.com/shagility

AgileData.io website: https://agiledata.io/

AgileData Way of Working: https://wow.agiledata.io/

Shane's Podcasts: https://agiledata.io/podcasts/


Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used in this episode was found on PixaBay and was created by Lesfm, MondayHopes, SergeQuadrado, and/or nevesf (with slight edits by Scott Hirleman)

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): AstraDB
