Inside Collibra: Treat your data as a product
22nd June 2022 • The Data Download • Collibra

Shownotes

Data mesh is a relatively new concept that aims to reduce friction in maximizing the value of data. It distributes data control to different business domains that have experts in the data relevant to them. A catalog of data products contributes to the data owners' efficiency in curating and analyzing their data for business insights.

In this episode, Luis Romero, the Product Marketing Director at Collibra, talks in depth about the four pillars of data mesh and how it can empower businesses. Jay Militscher, the Head of Data & Analytics at Collibra, also shares Collibra’s humble beginnings in executing data mesh and how the team hopes to improve its already robust system.

Tune in to the episode to learn about data mesh, its significance, and how to use it within your organization.

Here are three reasons why you should listen to this episode:

  1. Understand the significance and the four pillars of data mesh.
  2. Learn how Collibra effectively implements data mesh.
  3. Discover how to get started with data mesh in your organization.

Episode Highlights

[01:50] How Data Mesh Can Help Business Domains

  • IT and data teams are not the experts on the data coming from the other departments.
  • It’s best to have data in the hands of experts who will manage, curate, and cleanse data. Eventually, they turn the data into a product for its consumers.
  • Analysts and business users waste a lot of time finding the data they need, and sometimes they struggle to trust the data they find.
  • Data should be pre-packaged and available in a catalog for anyone who needs it, making it easier to verify and extract the right insights from it.
  • The four pillars of data mesh are domain ownership, data as a product, self-service data infrastructure, and federated governance.

[05:50] Domain Ownership

  • Most organizations have multiple business domains such as finance, engineering, marketing, etc.

Luis: “We should instead put that data into the hands of the true data stewards right within these domains.”

  • The different business domains are best positioned to manage, curate, and make the data fully and readily available to be consumed by the business.

[06:48] Data as a Product

  • Data owners with full knowledge and expertise about the data should treat data like a software product.
  • A software product has a vision, strategy, and life cycle. We should treat data in the exact same way. 
  • Treating data as a product means providing all the necessary facts and documentation so that when it's in a catalog, it's ready to go.
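
As a rough illustration of the "ready to go" idea above, here is a minimal Python sketch of a data product descriptor with a documentation check. The `DataProduct` class, its fields, and the list of required documentation are all hypothetical, chosen for this example; they are not Collibra's data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product listed in a catalog."""

    name: str
    domain: str              # owning business domain, e.g. "finance"
    owner: str               # accountable data owner in that domain
    description: str = ""
    version: str = "0.1.0"   # products have a life cycle, like software
    schema: dict[str, str] = field(default_factory=dict)  # column -> type

    def missing_documentation(self) -> list[str]:
        """Return the names of required fields that are still empty."""
        required = {
            "owner": self.owner,
            "description": self.description,
            "schema": self.schema,
        }
        return [name for name, value in required.items() if not value]

    def is_catalog_ready(self) -> bool:
        """A product is ready to publish once nothing required is missing."""
        return not self.missing_documentation()


# A draft with gaps, and a fully documented product.
draft = DataProduct(name="monthly_revenue", domain="finance", owner="")
published = DataProduct(
    name="monthly_revenue",
    domain="finance",
    owner="finance-data-team",
    description="Recognized revenue per month and region.",
    version="1.0.0",
    schema={"month": "date", "region": "string", "revenue": "decimal"},
)
print(draft.missing_documentation())
print(published.is_catalog_ready())
```

The point of the check is only that a product should not reach the catalog until its owner, description, and schema are filled in; a real organization would decide its own required fields.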

[08:25] Self-service Data Infrastructure

  • Luis observed that 99% of their customers complained about their complex data landscape because they have their data across different sources.
  • Having various data sources can overwhelm companies when they retrieve and process data, and even more so when they turn it into a usable product.

Luis: “We got to figure out a way to remove the friction from both the data producers and the consumers, and make it easy for them to go and find that data, bring that data together, understand the quality of the data, and again put it out there in a data marketplace, a data catalog, but again, make it very, very self-service.”

  • Make data as self-service as possible by leveraging all kinds of cloud technology.
  • Enterprise data catalogs can enable a one-stop shop for retrieving your data across all data sources.
  • Set up a data marketplace where all the users can go to find certified data sets.

[11:07] Federated Governance

  • Large enterprises often comprise many independent business entities, accumulated through multiple acquisitions over several years.
  • Governance that reduces risk and supports compliance must be balanced against autonomy, or the different entities will feel constrained as they pursue their individual goals.
  • Some policies work for everyone within the organization, but some policies will need domain-specific context and control when dealing with their data.
  • Sharing between the different entities under privacy regulations can happen, but it's about fostering the right balance of governance while still enabling their freedom.
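
One way to picture the balance described above is a set of organization-wide policies that each domain can adjust, except for a few locked guardrails. The Python sketch below is only an illustration under that assumption; the policy names, values, and the `effective_policy` helper are invented for this example and do not come from the episode or from any Collibra product.

```python
# Organization-wide defaults (hypothetical policy names and values).
GLOBAL_POLICY = {
    "pii_masking": True,
    "retention_days": 365,
    "allow_cross_domain_sharing": True,
}

# Guardrails that no domain may override.
LOCKED_KEYS = {"pii_masking"}

# Domain-specific context: each domain tunes what it is allowed to tune.
DOMAIN_OVERRIDES = {
    "finance": {"retention_days": 2555},
    "marketing": {"retention_days": 180, "pii_masking": False},
}


def effective_policy(domain: str) -> dict:
    """Merge global defaults with a domain's overrides, keeping guardrails."""
    policy = dict(GLOBAL_POLICY)
    for key, value in DOMAIN_OVERRIDES.get(domain, {}).items():
        if key in LOCKED_KEYS:
            continue  # the global guardrail wins over the local setting
        policy[key] = value
    return policy


for name in ("finance", "marketing", "engineering"):
    print(name, effective_policy(name))
```

Here marketing's attempt to switch off PII masking is ignored, while its shorter retention period is honored; a domain with no overrides simply inherits the global defaults.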

[14:14] Data Mesh at Collibra

  • Collibra began its approach to data and analytics with business domain ownership, through business intelligence (BI) functions in its departments, before there was a central data office.
  • Collibra's data and analytics professionals received appropriate infrastructure and tools to enable BI functions in different departments.
  • The data office's job is to grow a team with data engineering, infrastructure, machine learning, and data science skills to enable these business domains.
  • Collibra already had the infrastructure for a data mesh, so it didn't have to reorganize, and it is hiring even more data engineers and data scientists.

[16:16] Initial Response Inside Collibra

  • The initial response from other business domains was to get better tools.
  • The data office worked with other departments to help them modernize their technology stack, such as cloud systems.
  • The data office built the data and analytics technology stack, but the business users had total control over the data pipeline.
  • In the beginning, Collibra faced difficulties due to not having a self-service infrastructure at scale in the cloud.

Jay: “We get to use our own product here at Collibra so that when each of those data product owners produce a data product, they're actually publishing it through the Collibra catalog so that each of those analytics folks shop for data products in the Collibra platform from each other.”

[20:07] How Organizations Can Get Started with Data Mesh

  • The organization’s management must be committed to this approach because it isn't a one-time project but a way to move the whole organization forward.
  • The management must be ready to invest in the skills, development, and cloud technology necessary to support this broadly and scalably.

[21:20] Future of Data Mesh at Collibra

  • Today, Collibra's data office is lending advanced analytics with machine learning to other domains. Later on, each domain will do its analytics directly.
  • Data mesh began centrally in the data team because it is building the infrastructure and processes necessary to regularly operationalize and retrain models for the other business domains.
  • The data office wants to implement more automation and integrations across all the analytics needs and services of the different domains to reduce friction.
  • In adopting a data-as-a-product mindset, Collibra will include all the documented data and development processes in the data catalog, available to all data product owners.
  • To implement federated computational governance, Collibra needs to start automating its governance workflows.

[25:51] Jay’s Key Takeaways

  • Data mesh is about decentralization and distribution. It can start in a central data office that provides the data infrastructure to other domain-based data professionals.
  • A data catalog can act as a marketplace where data product consumers can access data and then use it to publish their own products in the same catalog.
  • Federated governance provides global organizational oversight and some guardrails and policies while also maximizing local context.
  • Successfully implementing data mesh principles requires strong data fluency, executive-level commitment, and funding for infrastructure modernization.
  • Any company can start by picking a valuable domain ready to build a data product. Build up wins and learn to improve the implementation as you onboard more business domains.

About the Speakers

Luis Romero is the Product Marketing Director at Collibra. He helps customers keep a pulse on up-and-coming trends in the market and identify their challenges. He also ensures that Collibra is positioning its products and solutions directly in line with its customers' initiatives and business outcomes.

If you want to reach out, you can contact Luis Romero via LinkedIn.

Enjoyed this Episode?

If you did, be sure to subscribe and share it with your friends! 

Post a review and share it! If you enjoyed tuning in, then leave us a review. You can also share this with your friends and family. This episode will help them implement data mesh within their organizations through the lessons learned by Collibra.

Have any questions? You can connect with me on LinkedIn.

Thank you for tuning in! For more updates, please visit our website. You may also tune in on Apple Podcasts or Spotify.
