#151 Driving Interoperability via Taxonomies and Tagging to Power Personalization - Interview w/ Jill Maffeo
Episode 151 • 6th November 2022 • Data Mesh Radio • Data as a Product Podcast Network

Data Mesh Radio Patreon - get access to interviews well before they are released

Episode list and links to all available episode transcripts (most interviews from #32 on) here

Provided as a free resource by DataStax AstraDB; George Trujillo's contact info: email and LinkedIn

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Jill's LinkedIn:

Developing Interoperable Channel Domain Data (blog post):

In this episode, Scott interviewed Jill Maffeo, Senior Data Product Manager at Vista.

Before jumping in, Jill gives a lot of very useful examples of outcomes they've been able to drive that could be abstracted to apply to your own organization's business challenges. Outcomes like better customer segmentation, faster time to launch new offerings, etc. If you are having difficulty with stakeholder buy-in, especially for someone in marketing, this episode could help you frame things in their language.

Some key takeaways/thoughts from Jill's point of view:

  1. "When you're thinking about interoperability, it's just playing nice, right?" If you think of interoperability as a key part of your culture, it's easier to implement. Let people know why interoperability is good for them and the whole company.
  2. Taxonomies help drive interoperability because there is already an established language even if things don't fit perfectly. New concepts can emerge and your taxonomies should change. But it makes the interoperability discussions have at least a common starting point.
  3. Taxonomies are a living thing - make sure they aren't overly rigid and be prepared to continually evolve and improve them.
  4. Within your taxonomy structure, if there is a reason for things to be unique for a domain or use case, that is okay. Look for potential ways to also convert that data to best fit your taxonomy but you don't want to force a square peg through a round hole.
  5. Taxonomies really start to add a lot of value at scale. They are somewhat costly upfront with likely moderate return on investment early; but, if you do them right, will pay back a lot as you move forward. They make historical analysis - especially with interoperability - far easier because you've done the work ahead of time.
  6. Taxonomy, when done well, is about balancing standardization and flexibility. Much like most things in data mesh, it's about finding the right balance for your organization.
  7. Most customer journeys are cross domain. The more you make domain data interoperable, the more insight you have into the customer that can drive better business with your organization as a whole instead of only trying to locally optimize value by domain.
  8. Similarly, many executive questions and desired insights are not exclusive to a domain. Can we get to answering their questions much more quickly and completely? What value would that drive?
  9. To get funding for longer-term initiatives with a payoff down the road like taxonomies, look to directly attach your work to execs' strategic goals. Make it easy for them to see why this will be of value to invest in now rather than later.
  10. If a line of business isn't ready to engage on a key long-term strategic goal like very rich metadata, look for places to make progress that you control. That way, you can capture more value from data being generated now so when you look to reengage, there is more incentive.
  11. When re-engaging with a team that declined to work with you the first time, look to use as much empathy as possible. Yes, they said no but now is the time to say welcome to the party and invite them in. No 'I told you so', only 'great, how can we help?'
  12. It's easy to lose sight of differentiation in metrics. With a good taxonomy and metadata strategy, you can differentiate better between actions - e.g. not how many emails did you send but what type and why? 5 marketing emails in a week is probably bad but 4 order-related emails and 1 marketing email isn't.
  13. It's okay - and probably advisable - to have your taxonomy and tagging be able to serve multiple use cases. A postcard and a wedding invite look the same to manufacturing but are marketed very differently.
  14. A potential way to entice people to participate earlier in a data mesh journey: let them know they will have more say and influence on the general choices. However, you still want domains that are just starting to participate to know that you want their feedback and that their voice also matters.
  15. Think about building interoperability like building a structure. You need everyone coordinating - you want the electrician and the plumber to do their work before the drywaller. Redoing the work instead of setting your build schedule in place doesn't sound like a good idea.
  16. Focus on getting your standards in place for interoperability. Scott Note: no one is publishing their standards and this NEEDS to be done. People need to share their standards explicitly as everyone is reinventing these standards.
  17. When picking early use cases, potentially look for complementary use cases where having information needed for each individual use case can power even further use cases.
  18. Good - even if only rudimentary - taxonomy and tagging lets you easily see how things have changed over time without having to manually stitch data. It also lowers the barrier to external domains leveraging your data.
  19. A key benefit of taxonomies is the ability to tell richer stories. You have data across many different domains or business outcomes but you can see how they interplay.
  20. ?Controversial?: Standard tagging and taxonomy can remove - and even prevent - tech debt, partially because it prevents some manual stitching of data.
  21. A good check for teams helping manage a data product portfolio is to take a few stakeholder questions and use what's available to try to answer them. How was the user experience? Is the information easy to understand? Etc.

Jill started off by talking about her current role where she manages a "team of products that span the gamut between data ingestion, data curation, metadata curation and creation, and also [the] upper funnel portfolio around measurement and insight from a demand perspective." So, essentially she is managing a suite of data products around marketing that covers a wide variety of needs, some more technical and behind the scenes and many on the front-line of powering analytics.

Taxonomy can be a significant help to driving interoperability. Taxonomy is ultimately about finding the right balance between flexibility and standardization according to Jill. Much like many decisions in data mesh, it isn't black or white but it's about finding the happy balance somewhere on the spectrum of grays. Having simple tagging and taxonomy has allowed them to see how something like marketing materials - e.g. postcards and flyers - has evolved over time without having to manually connect the data each time - when a new product is launched, it is added to the tagging and product hierarchy.

Jill also believes that good tagging and taxonomy remove - and prevent - tech debt. It means you have an easier time generating new insights without having to do manual stitching on a one-off basis and it creates a much lower barrier to people leveraging the data from outside the domain - if they can understand the general taxonomy, they don't have to be as deep in the context to leverage the data effectively. And taxonomy can help more effectively share metadata to let other domains really understand what a domain is doing.

What Jill and her team saw early in their data journey was that each team was doing a good job tracking their metrics but all the metrics were siloed. Even though they were run independently, obviously the organization needed information to be able to span the different marketing channels. So Jill and team started to really classify and say something like 'hey, it looks like you are doing what you call X and team B is doing something they call Y but we'll map it to Z.' That way, it's far easier to look at a customer journey across domains and silos. You can tell much richer stories.
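The 'X and Y both map to Z' idea can be sketched in a few lines of Python. This is only an illustration, not how Vista implemented it: the domain names, local labels, and shared terms below are all hypothetical. Unmapped labels are kept with a domain-qualified fallback rather than forced into the taxonomy, in the spirit of not pushing a square peg through a round hole.

```python
from collections import Counter

# Hypothetical mapping from (domain, local label) to a shared taxonomy term.
# None of these names come from the episode; they only illustrate the idea.
SHARED_TAXONOMY = {
    ("email", "promo_blast"): "promotional_outreach",
    ("paid_social", "sponsored_post"): "promotional_outreach",
    ("email", "order_confirmation"): "transactional_message",
}


def to_shared_term(domain: str, local_label: str) -> str:
    """Return the shared term, or a domain-qualified fallback when unmapped."""
    return SHARED_TAXONOMY.get(
        (domain, local_label), f"unmapped:{domain}.{local_label}"
    )


def journey_summary(events):
    """Count one customer's events by shared taxonomy term across domains."""
    return Counter(to_shared_term(e["domain"], e["label"]) for e in events)


# One customer's events, as recorded by three different domains.
events = [
    {"domain": "email", "label": "promo_blast"},
    {"domain": "paid_social", "label": "sponsored_post"},
    {"domain": "email", "label": "order_confirmation"},
    {"domain": "direct_mail", "label": "catalog_drop"},
]
summary = journey_summary(events)
```

Because both promotional labels resolve to the same shared term, the cross-domain count needs no one-off stitching, and the unmapped direct mail event stays visible as a candidate for the next taxonomy revision.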

Where Jill really sees the biggest value with taxonomies is at scale. They are helpful from the start, but as your data mesh implementation grows more complex, you can quickly see historical trends across multiple domains and get a much bigger picture, rather than manually integrating data products with each other every time. The integration work, at least at the concept level, has already been done for you. But taxonomies are also not a magic bullet.

A key strategic goal at Vista over the last few years has been personalization according to Jill. So being able to see the big picture of customer journeys and being able to intersect with them with the right offerings at the right time - that sounds like business nirvana. But if you can't see everything that is happening across your many offerings, is that really possible? Or are you preparing yourself with historical data for a potential ML model down the road? Getting the right metadata in place early, that early investment, set them up for value down the road.

Jill talked about how, after doing her PoC around taxonomy and metadata creation and curation, some lines of business weren't really ready to engage. But instead of putting the taxonomy project on hold, she looked for other places where she could make progress, so that when those lines of business were ready, the value creation at that stage would not be missing the data generated in the meantime. Essentially, she looked to avoid the loop of 'it's not of immediate value but we'll get to it eventually' then six months down the road 'oh, if only we had this data over the last six months, then this would be of value.'

Once what Jill and team implemented started to show early wins, they circled back with the business to say 'hey, this is generating value, here's how it saved you some effort', so the business people were more ready. And the teams they had helped already started to advocate for more of the work too. Which gave Jill's team proof points when going to execs and other stakeholders that investing time in metadata and taxonomy is driving value and their participation would drive more value for all.

A key win for Vista according to Jill has been moving from a metric like how many emails were sent to someone in a certain period to what kind of emails - was it all promotional or were some of them based on an order? So getting more granular about the what has helped them reduce email opt-out rates for example. The next phase of that is to look at specific interactions a customer had and develop a next best action model - what can the company do to drive more business that is highly relevant? Again, that personalization. Jill gave a number of additional useful examples of what tagging and taxonomy is driving for them.
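A small Python sketch shows why the tagged view matters. It assumes each email carries a `category` tag; the tag values and the weekly marketing limit are hypothetical, not figures from the episode.

```python
from collections import Counter

# Hypothetical threshold: how many marketing emails per week is "too many".
MAX_MARKETING_PER_WEEK = 3


def email_mix(emails):
    """Count emails by their taxonomy category instead of a single total."""
    return Counter(e["category"] for e in emails)


def over_marketing_limit(emails, limit=MAX_MARKETING_PER_WEEK):
    """True when the number of marketing emails alone exceeds the limit."""
    return email_mix(emails)["marketing"] > limit


# Two customers who each received five emails in the same week.
week_a = [{"category": "marketing"}] * 5
week_b = [{"category": "order"}] * 4 + [{"category": "marketing"}]
```

Both weeks look identical on a raw 'emails sent' metric, yet only `week_a` trips the check, which is the kind of differentiation the untagged count hides.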

Jill also discussed how there shouldn't be one overarching taxonomy that everything should adhere to - you want to look at data from multiple angles. At Vista, customers may be looking for different things - like a postcard versus a wedding invite - but for manufacturing, it all looks the same. And you again want to apply personalization so you don't ask someone if they'd like to create another wedding invite a year later… But, if they are a restaurant that orders menus every 3 months with a change in the season, you want to 1) send them promotions ahead of time and 2) if they are late in their order pattern, potentially escalate to do something additional.
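The restaurant example can be sketched as a simple cadence check in Python. This is a rough illustration under stated assumptions: the function names, the 14-day promotion lead time, and the 7-day grace period are all hypothetical, and a real next best action model would be far richer.

```python
from datetime import date, timedelta
from statistics import median


def typical_gap_days(order_dates):
    """Median number of days between consecutive orders, or None if unknown."""
    ordered = sorted(order_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


def reorder_status(order_dates, today, lead_days=14, grace_days=7):
    """Classify a customer against their own ordering rhythm.

    lead_days and grace_days are hypothetical tuning values.
    """
    gap = typical_gap_days(order_dates)
    if gap is None:
        # A single order (e.g. a wedding invite) gives no pattern to act on.
        return "no_pattern"
    expected = max(order_dates) + timedelta(days=gap)
    if today > expected + timedelta(days=grace_days):
        return "late_escalate"
    if today >= expected - timedelta(days=lead_days):
        return "send_promotion"
    return "wait"


# A restaurant that has ordered menus every 90 days, three times so far.
base = date(2022, 1, 1)
menu_orders = [base + timedelta(days=90 * i) for i in range(3)]
```

Here the next order is expected 270 days after `base`, so the check suggests a promotion in the two weeks before that date and an escalation once the customer is more than a week late.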

Especially when looking at taxonomies, Jill believes you can entice people to participate earlier by pointing out they will influence the overall choices more than those that come later. So they get more say. But, it's important to balance that as you bring on new teams: their voice matters and their feedback is important to continue to improve and evolve your taxonomies.

Jill gave the example of building a home on why we need to be coordinated in how we generate our data, with things like a unique ID: do you want the drywaller to do their work before or after the electricians and plumbers have done theirs? So getting the coordination done upfront on how you will look to combine data, even if not perfect, can save you a fair amount of time, money, and headaches down the road.

Most customer journeys are cross domain. Jill showed how the more you make domain data interoperable, the more insight you have into the customer that can drive better business with your organization as a whole - globally maximize customer value - instead of only trying to locally optimize value by domain. And many - most? - of your executives' questions and desired insights are rarely confined to one domain. So how are you working to really answer their cross domain questions?

Jill talked about phase one versus phase two of a data mesh journey. In phase one, you are focused on creating data products to meet specific use cases and it's pretty easy to end up with some overlap. So when you get to looking at your data products as a full suite of data products, those overlaps look like extra cost. So early in your journey, make sure domains are communicating about what they are building to prevent doubled up work. And it will still probably happen and that's okay. But part of product thinking and product management is portfolio management.

Quick tidbits:

A good check for teams helping manage a data product portfolio is to take a few stakeholder questions and use what's available to try to answer them. How was the user experience? Is the information easy to understand? Etc. That will inform future platform improvements and new data products.

"When you're thinking about interoperability, it's just playing nice, right?" If you think of interoperability as a key part of your culture, it's far easier to implement. Let people know why interoperability is good for them and the whole company.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at or on LinkedIn:

If you want to learn more and/or join the Data Mesh Learning Community, see here:

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): AstraDB