#192 Diagnosing the Analytics Gap - All About Diagnostic Analytics - Interview w/ João Sousa
Episode 192 • 9th February 2023 • Data Mesh Radio • Data as a Product Podcast Network


Sign up for Data Mesh Understanding's free roundtable and introduction programs here:

Please Rate and Review us on your podcast app of choice!

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

João's LinkedIn:

João's Medium:

Brent Dykes' LinkedIn:

In this episode, Scott interviewed João Sousa, Director of Growth at To be clear, he was only representing his own views on the episode.

The "four types" will often be referenced throughout this summary. The four types refer to the types of analytics: descriptive - what is happening; diagnostic - why is it happening; predictive - what might happen in the future; and prescriptive - what actions should we take.

Some key takeaways/thoughts from João's point of view:

  1. Of the four types of analytics, diagnostic analytics is VERY underserved. The other three - descriptive, predictive, and prescriptive - are where most organizations are focusing more so there's a "diagnostic analytics gap."
  2. ?Controversial?: Of the four, diagnostic analytics requires the most domain/business expertise.
  3. Tips for improving your diagnostic analytics: 1) show the value of drilling down into the why - find a few use cases and communicate the value well; 2) promote closer collaboration between data and business people; 3) improve your definitions around data roles; 4) communicate expectations and who does what very clearly; 5) don't get into firefighting mode, have a structured approach to diagnostic analytics; and 6) automate the repetitive parts.
  4. There are 3 levels of diagnostic analytics immaturity: getting "stuck in the what" instead of the why; "the usual suspects" where you look at things from the same angle, same slice and dice, and only leverage a small portion of the data available; and "need for speed" where you have the right data culture of drilling down to the why but often get stuck in the trade-off between speed and analysis thoroughness.
  5. Signals you are "stuck in the what": 1) in reviews, teams are just reviewing what happened instead of data-driven recommendations or data-driven hypotheses. 2) You lack impactful stories when asked "what real data-driven insights have you shared recently that drove action?"
  6. Signals you are stuck in "the usual suspects": 1) few to no real new major insights or hypotheses. 2) Lack of incremental data work / no new slices and dices to further analyze data. 3) Indirectly, data teams become more disconnected from the business.
  7. Signals you are stuck in "need for speed": 1) continuously cutting corners on thorough analysis in the name of speed. 2) Juggling too many tasks / priorities with constant 'hair on fire' type requests.
  8. Culture and people are the biggest levers in data but the hardest to change. We need better processes and tooling to enable accelerating diagnostic analytics. Tooling and processes specifically for diagnostic analytics are few and far between.
  9. Many companies do not put much value and/or effort into diagnostic analytics - that is highly correlated to analytics maturity.
  10. Diagnostic analytics work can be seen as boring compared to predictive and prescriptive work. It's typically not as technically challenging and many data people are not as interested in the business aspects.
  11. ?Controversial?: The best teams segment their diagnostic analytical questions into strategic, tactical, and operational. High performing teams also adjust thoroughness versus speed to best suit the specific need. They also automate as much as possible to reduce burden on the human in the loop.
  12. !Very Important!: Decentralization presents a big potential risk to diagnostic analytics. Analytics within the domain seems covered but many questions are cross domain...
  13. Scott note: There isn't a clear owner of diagnostic analytics in data mesh - if domains know their own data well, they should be able to do diagnostic analytics on information internal to the domain but it will be far harder cross domain. And that is end-state, not mid mesh journey. Diagnostic analytics likely falls to where you have your business analysts, whether that is embedded only, centralized only, or a mix.
  14. Insight definition criteria (from Brent Dykes): 1) provides a shift in understanding; 2) is something unexpected that the organization was not previously aware of; and 3) it's relevant and/or aligns to what stakeholders care about. João added a fourth: it must be delivered on time and communicated effectively.

João started the conversation discussing the four types of analytics: descriptive - what is happening; diagnostic - why is it happening; predictive - what might happen in the future; and prescriptive - what actions should we take. Most analytics work over the last 30 years has been descriptive, and both descriptive and diagnostic analytics are typically owned by the analytics team. Data science, ML, and AI have moved the needle on predictive and prescriptive analytics over the last few years. But diagnostic analytics remains underserved.

That diagnostic analytics gap exists for a number of reasons in João's view. On the people side, diagnostic analytics requires two sets of skills/knowledge: the analytical + technical and the business + domain. Without the domain knowledge, it is far harder to connect the dots around the why - a key concept in data mesh in shifting data ownership left*. Yes, we know sales in this region are falling, but why, what changed? João believes diagnostic analytics requires the most domain knowledge of any of the four types.

* Scott note: I always think of the Pastafarian - or Church of the Flying Spaghetti Monster - figure as to why they dress like pirates and how the number of pirates is strongly inversely correlated to global temperatures. See here. Correlation doesn't mean causation, relevant XKCD here

On the tools and processes side, João believes diagnostic analytics is far less developed than any of the other three types. Dashboards are great for descriptive analytics - what is happening - including some exploration, but they are difficult to use to actually understand the why, drilling down to the root cause. Culture around diagnostic analytics is another large issue for many organizations - there are many varied approaches and lots of differing views on the actual value of doing deep diagnostic analytics.
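To make "drilling down to the why" a bit more concrete, here is a minimal sketch of one common diagnostic step: breaking the change in a metric between two periods into per-segment contributions. Everything in it is a hypothetical illustration rather than something from the episode: the sales numbers, the region names, and the `contribution_by_segment` helper are all made up.

```python
from collections import defaultdict

# Hypothetical sales rows: (period, region, amount). Illustrative data only.
SALES = [
    ("before", "north", 120.0),
    ("before", "south", 100.0),
    ("before", "west", 80.0),
    ("after", "north", 118.0),
    ("after", "south", 60.0),
    ("after", "west", 82.0),
]


def contribution_by_segment(rows):
    """Return (segment, change) pairs, sorted so the largest drop comes first."""
    totals = defaultdict(lambda: {"before": 0.0, "after": 0.0})
    for period, segment, amount in rows:
        totals[segment][period] += amount
    # Change per segment = total after minus total before.
    changes = [(seg, t["after"] - t["before"]) for seg, t in totals.items()]
    return sorted(changes, key=lambda pair: pair[1])


if __name__ == "__main__":
    changes = contribution_by_segment(SALES)
    total_change = sum(change for _, change in changes)
    print(f"total change: {total_change:+.1f}")
    for segment, change in changes:
        print(f"{segment:>6}: {change:+.1f}")
```

A breakdown like this only says where the change is concentrated. Explaining why that segment moved still takes the domain knowledge João describes.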

João described three diagnostic analytics immaturity stages that come before a well-functioning approach. The least mature is "stuck in the what," where the business stakeholders are the ones trying to do diagnostic analytics with low data fluency* to drive to the why. They are only reporting the what, the descriptive analytics. The second maturity level is "the usual suspects" - essentially, the team builds lots of slice and dice dashboards and then just monitors things using those; they don't think to keep adjusting their angles and dig in. The third maturity level is "need for speed." The domains have the capabilities - usually via embedded analysts - to analyze their own data but are almost always siding with speed over comprehensiveness of analysis. The world and business are changing fast but it takes time to do analytics well enough to generate an actual insight.

Scott note: this brings up the question of where diagnostic analytics lives in a data mesh implementation. If domains have high data fluency, then presumably they can do their internal analysis but what happens if the information to drive to the why is cross domain? This is why I believe many domains are likely to have their own business analysts but organizations will still have a centralized business analyst team too.

On the question of who does the diagnostic analytics in most organizations - an empowered and highly data literate domain or a centralized analytics team - João said it depends. In a low data maturity organization, it's typically the analytics team - hopefully pairing closely with the business. In a higher data maturity team, it's about upskilling the subject matter experts in data and providing the right tools so they can do the analysis themselves.

João shared two signals that you might be "stuck in the what" and need more diagnostic analytics maturity. The first is that in your weekly or monthly review meetings, you are talking about what is happening and there are only some high-level guesses as to why - "oh, that's _probably_ because we changed the website" - and not much more. There are no data-driven answers or even hypotheses. The second is realizing, on reflection, that you haven't taken any real data-driven actions with a large impact recently. If you aren't driving your actions from your data, it's likely you aren't answering the "why" questions.

It's a lot harder to detect if you are in "the usual suspects" phase of maturity per João. The data teams aren't getting lots of additional requests. The business people are generally happy because they have dashboards that show a lot of information sliced and diced in how they typically look at things. But they are only testing existing hypotheses and not really coming up with fresh/new insights. So two signals are that there aren't really any new insights or hypotheses and there aren't many requests from the domain to the data team. The third signal, one that's indirect, is that because there is that lack of incremental data work and requests, the data teams start to become more disconnected from the business.

For teams stuck in the "need for speed" stage, João said that while it's a better place to be, it's still frustrating. Teams are always trying to balance thoroughness of analysis versus speed. Some signals you are there: pressure to cut corners on thoroughness of analysis in the name of speed - that actually happening is another signal - and constant high-priority interruptions for diagnostic analysis, juggling too much and putting aside the long-term work to take care of the fast turnaround requests.

When teams break past the immaturity stages for diagnostic analytics, João pointed to a few things high performing teams do well. The first is to segment questions/requests into tactical, strategic, and operational. Strategic questions are typically more big picture so they change less frequently and thus are typically less urgent than operational or tactical requests. Strong teams also adjust their thoroughness versus speed depending on what the situation calls for. Lastly, they automate as much as possible - there is still a human in the loop but repetitive tasks aren't value-add tasks for someone to do.

João shared Brent Dykes' definition of an insight, which is probably much stricter than the one many people use. First, it must provide a shift in understanding - so not just "we found this anomaly"; it changes what people know. Second, it must be unexpected - so those teams stuck in "the usual suspects" won't meet this because they are only testing against the expected. And third, it must actually matter: it must be relevant and/or aligned to what stakeholders care about. João added his own criterion: it must be on time and communicated effectively. These are all necessary to actually drive the right action.

João wrapped with a few tips for improving your diagnostic analytics. First, show the value of drilling down into the why - find a few easy initial use cases to really show the value, not your most difficult questions that will take months to really answer. Second, have the data and business people collaborate more closely so the data people can better understand requests and business people can start to think about new analytical approaches. Third, really get clear around your data role definitions: who does what and why, and what _aren't_ they supposed to do. Fourth, start to get very clear on expectations and improve that communication so everyone is on the same page. Fifth, plan ahead and don't get stuck in firefighting mode, grasping at straws - it's too easy to approach diagnostic analytics in an unstructured, reactive manner. Sixth and finally, look to automate away the repetitive parts as much as possible.
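As one possible reading of "automate away the repetitive parts," the sketch below loops over several dimensions and ranks them by how concentrated the metric change is in a single segment, so an analyst knows which slice to look at first. This is an assumption about what such automation could look like, not a method described in the episode. The order data, the dimension names, and the `segment_changes` and `rank_dimensions` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical order rows with two dimensions. Illustrative data only.
ORDERS = [
    {"period": "before", "region": "north", "channel": "web", "revenue": 50.0},
    {"period": "before", "region": "north", "channel": "store", "revenue": 50.0},
    {"period": "before", "region": "south", "channel": "web", "revenue": 50.0},
    {"period": "before", "region": "south", "channel": "store", "revenue": 50.0},
    {"period": "after", "region": "north", "channel": "web", "revenue": 30.0},
    {"period": "after", "region": "north", "channel": "store", "revenue": 50.0},
    {"period": "after", "region": "south", "channel": "web", "revenue": 30.0},
    {"period": "after", "region": "south", "channel": "store", "revenue": 50.0},
]


def segment_changes(rows, dimension):
    """Revenue change (after minus before) for each value of one dimension."""
    changes = defaultdict(float)
    for row in rows:
        sign = 1.0 if row["period"] == "after" else -1.0
        changes[row[dimension]] += sign * row["revenue"]
    return dict(changes)


def rank_dimensions(rows, dimensions):
    """Rank dimensions by the share of absolute change held by one segment."""
    ranked = []
    for dim in dimensions:
        changes = segment_changes(rows, dim)
        total_abs = sum(abs(c) for c in changes.values())
        top_segment, top_change = max(changes.items(), key=lambda kv: abs(kv[1]))
        share = abs(top_change) / total_abs if total_abs else 0.0
        ranked.append((dim, top_segment, top_change, share))
    return sorted(ranked, key=lambda item: item[3], reverse=True)


if __name__ == "__main__":
    for dim, segment, change, share in rank_dimensions(ORDERS, ["region", "channel"]):
        print(f"{dim}: {segment} changed {change:+.1f} ({share:.0%} of absolute change)")
```

In this made-up data the drop is spread evenly across regions but sits entirely in the web channel, so the channel slice ranks first. A human still has to decide whether that pattern is meaningful.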

Quick tidbit:

Beware the 'boring' label for diagnostic analytics. Many data people want to focus on the more technically challenging predictive or prescriptive analytics. Show people diagnostic analytics is valued.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn:

If you want to learn more and/or join the Data Mesh Learning Community, see here:

All music used in this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf