Provided as a free resource by DataStax AstraDB
In this episode, Scott interviewed Omar Khawaja, Head of Business Intelligence at Roche Diagnostics. To be clear, Omar was only representing his own viewpoints and learnings, not necessarily those of Roche.
Some interesting thoughts/takeaways from Omar's point of view and learnings:
Omar started the conversation with a definition of what Business Intelligence means to him and how it has evolved from a mostly reports-based function - or descriptive analytics - to include predictive analytics and then prescriptive analytics. But monolithic approaches - enterprise data warehouse, data lake, etc. - just haven't led to great outcomes for many (most?) large organizations. Omar quoted Albert Einstein, "Insanity is doing the same thing over and over and expecting different results." So why keep trying to throw technology and a monolithic architecture at our growing data and analytics challenges? So in 2020, Omar was happy to help evaluate whether decentralization could work in data for Roche.
They moved to a domain-aligned model with the business intelligence, analytics, and data engineers - at least those not building the platform - moving into the domains. That way, they can become the people who know the data best including the business context. They can help to shape and develop data products.
For Omar, the outcome of the analysis itself has historically been the focus of most analytics work - and of how people have thought about analytics work. An analyst creating a dashboard might spend all their time on the dashboard with little to no thought about the fragility of its inputs. What happens if upstream data changes? In data mesh, the dashboard is an output of data work that is more easily created and managed because the analyst knows the upstream data will be maintained as a product and, even if there are changes, impending changes will be communicated. And they might be able to reuse the data in another analysis.
Regarding roles needed in domains for data mesh, Omar gave the incredibly common data mesh answer of "it depends". There needs to be an owner who understands the data-as-a-product concept - not just creating data products - as the lifecycle of data is crucial to doing data mesh well. But each domain and each data product has different needs, so focus on building the cross-disciplinary team needed to take care of the job now - and in the future - and not on the exact composition of each team. The use cases aren't cookie cutter, so why make the teams cookie cutter?
At Roche, for many domains, the initial data product design is done by a senior architect and the data product development is done by a data engineer - possibly called an analytics engineer. But if the data product doesn't need extensive data engineering, it can be developed by a data/business analyst or software engineer. The role title doesn't end up mattering; the team capabilities and the needs do. Design with data consumers in mind.
Per Omar, every organization has some kind of existing analytics practice - or brownfield. It is important to leverage what you've built historically - both the types of analytics and the team. You might have great insight into what information consumers want. And many of the people involved in the data warehouse or lake can evolve into a role that is highly valuable in data mesh - they probably know your data quite well. But it is crucial that they understand why they need to evolve and are given the resources to do so.
Echoing many previous guests, Omar said to focus on the outcomes in data mesh, not the exact structure. Empower your teams to figure it out and enable them to do the work, especially via the data platform. The role of the "citizen data scientist" never really worked; now we can focus on giving many more people access to information and insights, not just access to data with little to no context. And sharing across the company is crucial to finding scalable, repeatable practices and patterns.
At Thoughtworks' State of Data Mesh conference, Omar presented how the four pillars or principles of data mesh map to Mind, Body, Heart, and Soul. Digging into data-as-a-product, or the heart of data mesh in Omar's analogy, there is far more than just creating data products. He shared their learnings in data product discovery - how do you figure out what products you need? What are the expected outcomes from creating this data product? Who is going to use it?
Omar discussed the need for a real mindset shift, especially to understand data-as-a-product. It doesn't come naturally for most people, so you need very conscious change management - change management and organizational challenges will almost certainly take considerably more of your time in a data mesh implementation than you expect. And it's not shoving people forward or dragging them - it's taking them by the hand and working with them to find a good way forward. Once they feel the empowerment, Omar is seeing that most people really like this new way of working. And they are focused on continual value delivery instead of the project-based, one-off type of value creation.
When Omar joined Roche a couple of years ago, his first big task was forming the data strategy. Should they just do a data lake in the cloud? Was that transition working well for most organizations? The answer for him was no, and he came back to the definition of insanity - trying the same thing over and over and expecting different results. He took a lot of inspiration from the book Think Again by Adam Grant. There was a lot the organization needed to unlearn or relearn. And when COVID-19 hit, it forced organizations to get much better at communication and collaboration, so data mesh was a great fit - they were already embracing decentralization.
Omar talked about how data mesh was one of the aspects of their data strategy but only one aspect. The overall goal was and is "how do we become data-driven as a company?" So you should start from the why. Why consider data mesh? What are you trying to achieve? Is it worth it?
This set of questions should also apply to everyone's day-to-day work in a data mesh implementation, per Omar. Again, focus on outcomes: what are you actually trying to deliver? Is it just a dashboard or are you trying to deliver the insights that dashboard will create? Omar recommended the Lean Value Tree approach as one way to focus your time.
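As a rough illustration of the Lean Value Tree idea, the sketch below models a tree with a vision at the root, then goals, bets, and initiatives, and traces a piece of work back to the vision. Only the vision statement comes from the episode; the class name `Node`, the helper `trace`, and the goal, bet, and initiative statements are hypothetical examples, not anything Omar or Roche described.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in a Lean Value Tree: a vision, goal, bet, or initiative."""
    level: str
    statement: str
    children: list[Node] = field(default_factory=list)

    def add(self, level: str, statement: str) -> Node:
        """Attach a child node and return it so calls can be chained downward."""
        child = Node(level, statement)
        self.children.append(child)
        return child


def trace(root: Node, statement: str) -> list[str]:
    """Return the statements from the root down to the matching node, or [] if absent."""
    if root.statement == statement:
        return [root.statement]
    for child in root.children:
        path = trace(child, statement)
        if path:
            return [root.statement] + path
    return []


# Hypothetical tree: only the vision wording is taken from the episode notes.
vision = Node("vision", "Become data-driven as a company")
goal = vision.add("goal", "Decisions rest on trusted insights")
bet = goal.add("bet", "Domain-owned data products reduce fragile dashboards")
bet.add("initiative", "Sales dashboard")

why = trace(vision, "Sales dashboard")
print(" <- ".join(reversed(why)))
```

An initiative that cannot be traced back to the vision (an empty result from `trace`) is a prompt to ask why the work is being done at all.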
Omar returned to the concept of product mindset and product thinking in data. What value are you trying to deliver? Then how much value do you expect it will deliver? Then how will you measure if you are successful? Then were you successful in delivering the expected value? A big part of this is the discovery process - drive towards a business-focused discussion about outcomes.
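One way to keep those four questions attached to a data product is to record them as a small value hypothesis and check it once a measurement exists. This is only a sketch under assumptions: the `ValueHypothesis` class, its fields, and the example numbers are hypothetical, and the "measured at or above expected" rule is a simplification of what a real review would weigh.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValueHypothesis:
    """Hypothetical record of the value a data product is expected to deliver."""
    outcome: str                      # what value are we trying to deliver?
    metric: str                       # how will we measure success?
    expected: float                   # how much value do we expect?
    measured: Optional[float] = None  # filled in once results are available

    def verdict(self) -> str:
        """Answer 'were we successful?' from the recorded numbers."""
        if self.measured is None:
            return "not yet measured"
        return "met" if self.measured >= self.expected else "missed"


# Illustrative values only.
hypothesis = ValueHypothesis(
    outcome="Analysts reuse order data instead of rebuilding it",
    metric="analyses per quarter built on the data product",
    expected=5,
)
before = hypothesis.verdict()

hypothesis.measured = 7
after = hypothesis.verdict()
print(before, "->", after)
```

Writing the expected value down before building makes the later "were we successful?" conversation a comparison rather than a debate.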
In Omar's view, no one will get everything exactly right, much less in something as large, complex, and new as data mesh. It's okay to make mistakes. That's part of learning. But you need to get to a "good enough" place and move forward. Measure along the way and adjust. It's a journey; there will be trials and tribulations along the way. Learn from it and adjust. Collaborate and move forward together.
In Roche's early journey, they found some teams were duplicating work so they moved to fix that. What they learned was they should provide very early visibility into plans to prevent teams from spending time on the same things. The data product discovery and design phases are now quite public and that's worked well. Instead of teams duplicating work, they are often early consumers of data work other teams have done.
In wrapping up, Omar reiterated the need to focus on what you are trying to deliver and what the value is, and that it's okay to move forward with an incomplete picture. You'll make some mistakes but prepare to learn and adjust and just get to making progress. Pretty sound advice.
Omar's LinkedIn: https://www.linkedin.com/in/kmaomar/
Omar's State of Data Mesh presentation: https://www.youtube.com/watch?v=S5ABE4bevn4
Adam Grant book Think Again: The Power of Knowing What You Don't Know: https://www.amazon.com/Think-Again-Power-Knowing-What/dp/B08HJQHNH9
Lean Value Tree definition: https://openpracticelibrary.com/practice/lean-value-tree/
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/