Transcript for this episode provided by Starburst.
Jill's LinkedIn: https://www.linkedin.com/in/jillianmaffeo/
Developing Interoperable Channel Domain Data (blog post): https://vista.io/blog/developing-interoperable-channel-domain-data
In this episode, Scott interviewed Jill Maffeo, Senior Data Product Manager at Vista.
Before jumping in: Jill gives many useful examples of outcomes her team has been able to drive, such as better customer segmentation and faster time to launch new offerings, that could be abstracted to apply to your own organization's business challenges. If you are having difficulty with stakeholder buy-in, especially from someone in marketing, this episode could help you frame things in their language.
Some key takeaways/thoughts from Jill's point of view:
Jill started off by talking about her current role where she manages a "team of products that span the gamut between data ingestion, data curation, metadata curation and creation, and also [the] upper funnel portfolio around measurement and insight from a demand perspective." So, essentially she is managing a suite of data products around marketing that covers a wide variety of needs, some more technical and behind the scenes and many on the front-line of powering analytics.
Taxonomy can be a significant help in driving interoperability. According to Jill, taxonomy is ultimately about finding the right balance between flexibility and standardization. Like many decisions in data mesh, it isn't black or white; it's about finding the happy balance somewhere on the spectrum of grays. Simple tagging and taxonomy have allowed them to see how a category like marketing materials - e.g. postcards and flyers - has evolved over time without having to manually connect the data each time: when a new product is launched, it is added to the tagging and product hierarchy.
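To make the tagging and product hierarchy point concrete, here is a minimal Python sketch. It is an illustration only: the product names, category names, order records, and function names are hypothetical and do not describe Vista's actual hierarchy or tooling.

```python
from collections import defaultdict

def register_product(hierarchy, product, category):
    """Add a newly launched product to the (hypothetical) product hierarchy."""
    hierarchy[product] = category

def units_by_category_and_year(orders, hierarchy):
    """Roll order lines up to (category, year) using the hierarchy tags."""
    totals = defaultdict(int)
    for order in orders:
        # Products that were never tagged fall into an explicit bucket.
        category = hierarchy.get(order["product"], "uncategorized")
        totals[(category, order["year"])] += order["units"]
    return dict(totals)

# Hypothetical hierarchy: product -> parent category.
hierarchy = {
    "postcard": "marketing_materials",
    "flyer": "marketing_materials",
    "wedding_invite": "invitations",
}

# A new product launches; tagging it once is the only integration step.
register_product(hierarchy, "door_hanger", "marketing_materials")

orders = [
    {"product": "postcard", "year": 2021, "units": 100},
    {"product": "flyer", "year": 2021, "units": 50},
    {"product": "postcard", "year": 2022, "units": 120},
    {"product": "door_hanger", "year": 2022, "units": 30},
]
trend = units_by_category_and_year(orders, hierarchy)
```

Because the new product is tagged when it launches, the category-level trend picks it up without any one-off stitching.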
Jill also believes that good tagging and taxonomy remove - and prevent - tech debt. It means you have an easier time generating new insights without having to do manual stitching on a one-off basis and it creates a much lower barrier to people leveraging the data from outside the domain - if they can understand the general taxonomy, they don't have to be as deep in the context to leverage the data effectively. And taxonomy can help more effectively share metadata to let other domains really understand what a domain is doing.
What Jill and her team saw early in their data journey was that each team was doing a good job tracking their metrics but all the metrics were siloed. Even though they were run independently, obviously the organization needed information to be able to span the different marketing channels. So Jill and team started to really classify and say something like 'hey, it looks like you are doing what you call X and team B is doing something they call Y but we'll map it to Z.' That way, it's far easier to look at a customer journey across domains and silos. You can tell much richer stories.
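The "you call it X, they call it Y, we'll map both to Z" idea can be sketched as a small lookup table. This is a hedged illustration, not Vista's implementation: the team names, team-local labels, and shared terms below are all made up for the example.

```python
from collections import Counter

# Hypothetical mapping from (team, team-local label) to a shared taxonomy term.
SHARED_TAXONOMY = {
    ("email", "promo_blast"): "promotion",
    ("paid_social", "offer_ad"): "promotion",
    ("direct_mail", "catalog_drop"): "promotion",
    ("email", "order_update"): "transactional",
}

def to_shared_term(team, local_label):
    """Translate a team-local label to the shared term, or flag it as unmapped."""
    return SHARED_TAXONOMY.get((team, local_label), "unmapped")

def touchpoints_by_term(events):
    """Count one customer's touchpoints per shared term, across all teams."""
    return Counter(to_shared_term(e["team"], e["label"]) for e in events)

# One customer's journey, recorded independently by several teams.
events = [
    {"team": "email", "label": "promo_blast"},
    {"team": "paid_social", "label": "offer_ad"},
    {"team": "email", "label": "order_update"},
    {"team": "display", "label": "banner_v2"},  # not yet mapped
]
summary = touchpoints_by_term(events)
```

Each team keeps its own vocabulary; only the mapping is shared. The "unmapped" bucket makes gaps in the taxonomy visible rather than silently dropping events.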
Where Jill really sees the biggest value from taxonomies is at scale. They are helpful from the start, but as you add more and more complexity to your data mesh implementation, the payoff grows: instead of manually integrating data products with each other, you can quickly see historical trends across multiple domains to get a much bigger picture. The integration work, at least at the concept level, has already been done for you. But taxonomies are also not a magic bullet.
According to Jill, a key strategic goal at Vista over the last few years has been personalization. Being able to see the big picture of customer journeys and meet customers with the right offerings at the right time sounds like business nirvana. But if you can't see everything that is happening across your many offerings, is that really possible? Or are you preparing yourself with historical data for a potential ML model down the road? Getting the right metadata in place early, that early investment, set them up for value down the road.
Jill talked about how, after she finished her PoC around taxonomy and metadata creation and curation, some lines of business weren't really ready to engage. Instead of putting the taxonomy project on hold, she looked for other places where she could make progress, so that when those lines of business were ready, the necessary data would already have been collected in the meantime. Essentially, she looked to avoid the loop of 'it's not of immediate value but we'll get to it eventually' followed six months down the road by 'oh, if only we had this data over the last six months, then this would be of value.'
Once what Jill and her team had implemented started to show early wins, they circled back to the business to say 'hey, this is generating value, here's how it saved you some effort', and the business people were more ready to engage. The teams they had already helped started to advocate for more of the work too. That gave Jill's team proof points when going to execs and other stakeholders: investing time in metadata and taxonomy was driving value, and their participation would drive more value for all.
A key win for Vista, according to Jill, has been moving from a metric like how many emails were sent to someone in a certain period to what kind of emails were sent - were they all promotional or were some of them based on an order? Getting more granular about the 'what' has helped them reduce email opt-out rates, for example. The next phase is to look at the specific interactions a customer had and develop a next best action model - what can the company do to drive more business that is highly relevant? Again, that personalization. Jill gave a number of additional useful examples of what tagging and taxonomy are driving for them.
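The shift from 'how many emails' to 'what kind of emails' might look something like the following sketch. The template names and the two-way promotional/transactional split are assumptions made for illustration; the episode does not describe the actual tags used.

```python
from collections import Counter

PROMOTIONAL = "promotional"
TRANSACTIONAL = "transactional"

# Hypothetical tags: email template -> email type.
EMAIL_TYPE_BY_TEMPLATE = {
    "spring_sale": PROMOTIONAL,
    "new_product_announcement": PROMOTIONAL,
    "order_confirmation": TRANSACTIONAL,
    "shipping_update": TRANSACTIONAL,
}

def email_mix(sent_templates):
    """Count emails per type instead of reporting a single total."""
    return Counter(
        EMAIL_TYPE_BY_TEMPLATE.get(template, "untagged")
        for template in sent_templates
    )

def promotional_share(sent_templates):
    """Fraction of a customer's emails in the period that were promotional."""
    mix = email_mix(sent_templates)
    total = sum(mix.values())
    return mix[PROMOTIONAL] / total if total else 0.0

# Emails one customer received in a period (hypothetical data).
sent = [
    "spring_sale",
    "order_confirmation",
    "shipping_update",
    "new_product_announcement",
]
mix = email_mix(sent)
share = promotional_share(sent)
```

Two customers who each received four emails can look identical on volume and very different on mix, which is the kind of distinction a raw count hides.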
Jill also discussed how there shouldn't be one overarching taxonomy that everything should adhere to - you want to look at data from multiple angles. At Vista, customers may be looking for different things - like a postcard versus a wedding invite - but for manufacturing, it all looks the same. And you again want to apply personalization so you don't ask someone if they'd like to create another wedding invite a year later… But, if they are a restaurant that orders menus every 3 months with a change in the season, you want to 1) send them promotions ahead of time and 2) if they are late in their order pattern, potentially escalate to do something additional.
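The restaurant example can be sketched as a simple reorder-cadence check. This is a rough illustration under stated assumptions: the function name, the 14-day lead and grace windows, and the average-gap heuristic are all hypothetical choices, not something described in the episode.

```python
from datetime import date, timedelta

def next_action(order_dates, today, lead_days=14, grace_days=14):
    """Suggest an action based on a customer's past order cadence.

    lead_days and grace_days are illustrative thresholds, not tuned values.
    """
    if len(order_dates) < 2:
        # A single order (e.g. a wedding invite) gives no repeat pattern.
        return "no_pattern"
    ordered = sorted(order_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    typical_gap = sum(gaps) / len(gaps)
    expected_next = ordered[-1] + timedelta(days=round(typical_gap))
    if today > expected_next + timedelta(days=grace_days):
        return "escalate"  # late relative to their usual pattern
    if today >= expected_next - timedelta(days=lead_days):
        return "send_promotion"  # reorder window is approaching or open
    return "wait"

# A restaurant that reorders menus roughly every three months.
menu_orders = [date(2023, 1, 1), date(2023, 4, 1), date(2023, 7, 1)]
```

With these dates the next order is expected around the end of September, so a check in early August returns "wait", one in mid-to-late September returns "send_promotion", and one in November returns "escalate".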
Especially when looking at taxonomies, Jill believes you can entice people to participate earlier by pointing out they will influence the overall choices more than those that come later. So they get more say. But, it's important to balance that as you bring on new teams: their voice matters and their feedback is important to continue to improve and evolve your taxonomies.
Jill used the example of building a home to explain why we need to be coordinated in how we generate our data, with things like a unique ID: do you want the drywaller to do their work before or after the electricians and plumbers have done theirs? Getting the coordination done upfront on how you will look to combine data, even if not perfect, can save you a fair amount of time, money, and headaches down the road.
Most customer journeys are cross domain. Jill showed how the more you make domain data interoperable, the more insight you have into the customer, which can drive better business for your organization as a whole - globally maximizing customer value - instead of only trying to locally optimize value by domain. And for many - most? - of your executives, their questions and desired insights rarely sit in only one domain. So how are you working to really answer their questions and build the ability to answer cross domain questions?
Jill talked about phase one versus phase two of a data mesh journey. In phase one, you are focused on creating data products to meet specific use cases and it's pretty easy to end up with some overlap. When you later look at your data products as a full suite, those overlaps look like extra cost. So early in your journey, make sure domains are communicating about what they are building to prevent doubled up work. Some overlap will probably still happen, and that's okay. But part of product thinking and product management is portfolio management.
A good check for teams helping manage a data product portfolio is to take a few stakeholder questions and use what's available to try to answer them. How was the user experience? Is the information easy to understand? Etc. That will inform future platform improvements and new data products.
"When you're thinking about interoperability, it's just playing nice, right?" If you think of interoperability as a key part of your culture, it's far easier to implement. Let people know why interoperability is good for them and the whole company.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/