#10 Ensuring Data Quality via Data Testing and Versioning – Interview w/ Jesse Paquette
Episode 10 • 1st January 2022 • Data Mesh Radio • Data as a Product Podcast Network

Shownotes

Data Mesh Radio Patreon - get access to interviews well before they are released

Episode list and links to all available episode transcripts (most interviews from #32 on) here

Provided as a free resource by DataStax AstraDB; George Trujillo's contact info: email (george.trujillo@datastax.com) and LinkedIn

In this episode, Scott and Jesse Paquette, Chief Science Officer and Co-founder at Tag.bio (a data platform vendor in the life sciences space), dive a bit deeper into data quality in general, especially data testing and versioning.

You can see the LinkedIn post that sparked this discussion here

Jesse recommends a number of things to ensure data quality, especially data testing and versioning. This includes versioning of 1) the code used to create the data (generally the ETL code), 2) the schema, 3) the business logic layer, and 4) the data itself over time, via timestamp / temporality-based versioning.
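
As a rough illustration of the four versioning dimensions above, here is a minimal Python sketch of a version stamp that could travel with a dataset. The class name, field names, and sample values are all hypothetical and are not taken from the episode or from Tag.bio's platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DatasetVersion:
    """Hypothetical version stamp covering the four dimensions listed above."""

    etl_code_version: str        # e.g. the commit of the ETL code that built the data
    schema_version: str          # version of the schema the data conforms to
    business_logic_version: str  # version of the business logic layer
    as_of: datetime              # timestamp / temporal version of the data

    def label(self) -> str:
        """Render all four versions as a single string for logs or metadata."""
        return (
            f"etl={self.etl_code_version};"
            f"schema={self.schema_version};"
            f"logic={self.business_logic_version};"
            f"as_of={self.as_of.isoformat()}"
        )


# Sample values are made up for illustration only.
version = DatasetVersion(
    etl_code_version="3f2a9c1",
    schema_version="2.1.0",
    business_logic_version="1.4.0",
    as_of=datetime(2022, 1, 1, tzinfo=timezone.utc),
)
print(version.label())
```

Keeping the four versions in one immutable record means a consumer can tell which of the four changed between two releases of the same dataset.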

Jesse's general calls to action are 1) make data testing frameworks so testing is much less tedious and time-consuming; 2) work with stakeholders to gain trust in the data and then continue the dialogue to keep that trust; and 3) create schema/domain model blueprints so that domains have a starting point - whether they end up using the blueprint is irrelevant, but shortening the path to a working domain model is crucial.
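
To make the first call to action a bit more concrete, here is a minimal sketch of what a small, reusable data testing helper could look like in Python. It is not the framework discussed in the episode; the function name, check names, and sample rows are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable

Row = Dict[str, Any]
Check = Callable[[Row], bool]


def run_checks(rows: Iterable[Row], checks: Dict[str, Check]) -> Dict[str, int]:
    """Return the number of rows that fail each named check."""
    materialized = list(rows)  # allow several passes over the same rows
    return {
        name: sum(1 for row in materialized if not check(row))
        for name, check in checks.items()
    }


# Hypothetical checks: declared once, reusable across datasets.
checks: Dict[str, Check] = {
    "patient_id_present": lambda r: bool(r.get("patient_id")),
    "age_in_range": lambda r: r.get("age") is not None and 0 <= r["age"] <= 120,
}

# Made-up sample rows for illustration only.
rows = [
    {"patient_id": "p1", "age": 42},
    {"patient_id": "", "age": 37},
    {"patient_id": "p3", "age": 150},
]

failures = run_checks(rows, checks)
print(failures)
```

Declaring checks as named functions keeps each test short, and the per-check failure counts give stakeholders something concrete to discuss when building trust in the data.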

Jesse's contact info:

Email: jesse at tag.bio

LinkedIn: https://www.linkedin.com/in/jessepaquette/

Twitter: @bzdyelnik / https://twitter.com/bzdyelnik

Website: https://tag.bio/

Tag.bio vendor interview for Data Mesh Learning: https://www.youtube.com/watch?v=acQADu7ttqQ

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used in this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): AstraDB

Links