Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here
In this episode, Scott interviewed Marius Ingjer, Co-founder and Senior Consultant at Knirkefritt AS. Marius has been working with a few clients on implementing data mesh.
They covered 3 distinct topics: evaluating if data mesh is a fit for your organization, team structure challenges in data mesh or data mesh-like implementations, and a simplified definition of federated computational governance.
Evaluating if data mesh is right for you:
To start, Marius provided a list of evaluation questions to help you determine if data mesh might be right for you:
How many datasets are you producing?
What is the lead time to create a new dataset?
How well are your datasets serving your data needs?
How many domains do you have?
How complex are your domains?
How does the team respond to new data requirements?
How usable in general is your data?
Every company wants to share data well, but for many the centralized data team isn't the bottleneck yet. Centralization can add a lot of value until it starts to hurt more than it helps, and yes, figuring out where that point lies is easier said than done. Centralizing data fights Conway's Law and can create far too much cognitive load, so it will eventually become an issue for many organizations.
A key question in evaluating if data mesh is a fit: what is the cost of allowing your data processes to fail? Per Marius, the business consequence of failed reports has historically not been that high. But if you are driving business decisions from your data, whether through ML or just crucial day-to-day decisions, data mesh might become more attractive.
Team structures and challenges in data mesh:
In general, it's important to understand that implementing data mesh will cause cultural challenges - Marius believes developers generally don't want to ALSO share their data. It's additional work so you have to align incentives, which is far easier said than done.
That additional cognitive load on developers' plates is crucial. We need to make sure we address that load so we don't burn them out. That means realigning incentives but also providing extra help with things like grooming the work backlog. Providing extra resources helps, but that is more about tackling the work than handling the increased cognitive load. And learning how to do data well is a pretty big learning task.
Marius recommends giving teams the extra resources but also reshaping the team and business structure, such as the KPIs, to effectively prioritize and shape the requirements. He also recommends having a stick, not just carrots, or teams will try to just opt out.
Simplified definition of federated computational governance:
When talking about this topic with developers right now, it feels far too complex. To make it less complex, reduce the friction in developer decisions without adding much to their cognitive load. An example might be providing easy data masking tooling for PII or extensible data APIs so developers can focus on the value-add.
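To make the data masking example concrete, here is a minimal Python sketch of what an easy PII masking helper might look like. It was not discussed in the episode. The field names, the salt, and the function names are all hypothetical, and a real platform would likely pull the list of PII fields from a catalog or policy rather than hard-coding it.

```python
import hashlib

# Hypothetical set of field names treated as PII (an assumption for this sketch).
PII_FIELDS = {"email", "phone"}


def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Replace a value with a short, deterministic token derived from SHA-256."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "masked_" + digest[:12]


def mask_record(record: dict, pii_fields=PII_FIELDS) -> dict:
    """Return a copy of the record with PII fields masked and other fields untouched."""
    return {
        key: mask_value(str(val)) if key in pii_fields and val is not None else val
        for key, val in record.items()
    }


if __name__ == "__main__":
    row = {"customer_id": 42, "email": "a@example.com", "country": "NO"}
    print(mask_record(row))
```

Because the token is deterministic, the same input always maps to the same masked value, so masked columns can still be joined across datasets. Salted hashing of this kind is pseudonymization rather than full anonymization, so it is only one possible design choice.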
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): AstraDB