#23 Where and How Can Data Virtualization Work in Data Mesh - Interview w/ Dr. Daniel Abadi

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice!

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Dr. Abadi's blog post on data federation and data virtualization: https://blog.starburst.io/data-federation-and-data-virtualization-never-worked-in-the-past-but-now-its-different

Dr. Abadi's contact info:

LinkedIn:

Twitter: @daniel_abadi / https://twitter.com/daniel_abadi

Starburst blog posts: https://blog.starburst.io/author/daniel-abadi

Don't forget to catch Dr. Abadi at Datanova - the Data Mesh Summit on Feb 9-10th. Thanks to Starburst for sponsoring the transcripts for Data Mesh Radio; check out the transcript here. And check out Starburst's other free data mesh resources here.

In this episode, Scott interviewed Dr. Daniel Abadi, the Darnell-Kanal Professor of Computer Science at the University of Maryland, whose research focuses on scalable data management. Dr. Abadi will be presenting next week at the Data Mesh Summit on Data Fabric and Data Mesh alongside Zhamak and Sanjeev Mohan.

This was a pretty wide-ranging and freewheeling conversation about data virtualization in general and how it can be used in data mesh. Both agreed that there are many places where data virtualization can play a role in data mesh, whether in extracting information from operational systems, stitching together a data product once data processing has been done, or combining data across multiple data products at the mesh experience plane. Dr. Abadi specifically mentions something like a query fabric that makes use of a data virtualization approach, not just tools that only do data virtualization.

There is a natural side effect of having multiple different technologies in use: when you give the domains the ability to use what they choose, the difficulty of combining data from multiple sources needs to be solved. There is always a balance between how much data you copy and how much you access in the source system, and data virtualization can offer more options than all or nothing.
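As a rough illustration of the "query in place rather than copy" idea, here is a minimal sketch in Python. It is not taken from the episode: the domain names, tables, and the `revenue_by_region` helper are all hypothetical, and two attached SQLite databases merely stand in for separate domain-owned stores. A real query fabric would federate across different engines, which this toy example does not attempt.

```python
import sqlite3

# One connection plays the role of the query layer. Each attached in-memory
# database stands in for a separate domain-owned store (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS orders_domain")
conn.execute("ATTACH DATABASE ':memory:' AS customers_domain")

# Each domain keeps its own table in its own store.
conn.execute(
    "CREATE TABLE orders_domain.orders "
    "(order_id INTEGER, customer_id INTEGER, amount REAL)"
)
conn.execute(
    "CREATE TABLE customers_domain.customers "
    "(customer_id INTEGER, region TEXT)"
)
conn.executemany(
    "INSERT INTO orders_domain.orders VALUES (?, ?, ?)",
    [(1, 10, 25.0), (2, 11, 40.0), (3, 10, 15.0)],
)
conn.executemany(
    "INSERT INTO customers_domain.customers VALUES (?, ?)",
    [(10, "EMEA"), (11, "APAC")],
)


def revenue_by_region(connection):
    """Join across both stores at query time, without copying either table."""
    rows = connection.execute(
        """
        SELECT c.region, SUM(o.amount)
        FROM orders_domain.orders AS o
        JOIN customers_domain.customers AS c
          ON c.customer_id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
        """
    ).fetchall()
    return dict(rows)


result = revenue_by_region(conn)
print(result)
```

The consumer writes one query against both stores and never materializes a combined copy, which is the middle ground between "copy everything" and "query each source separately" described above.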

Because data virtualization has been around as a concept for 30+ years, the term carries a lot of baggage, but Dr. Abadi sees recent advancements that mean more people should take a second look at where it can be useful. He warns, however, that you should do your homework and really think through whether it fits your use case. A query fabric can make your user experience much more pleasant. Trying to create data products entirely within a data virtualization platform probably won't, at least according to Scott.

Additional topics included retransmitting or reprocessing data, versioning, the importance of denormalizing data for analytics and how that plays with data virtualization, and much more. It is a really fascinating deep dive into the history of computing and how it impacts what we are trying to do today.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used in this episode was found on Pixabay and was created by the following artists (with slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
