What's Hiding in Your Unstructured Data?
Episode 90 • 10th May 2021 • This Week Health: News • This Week Health
00:00:00 00:08:24

Transcripts

This transcription is provided by artificial intelligence. We believe in technology but understand that even the most intelligent robots can sometimes get speech recognition wrong.

Today in Health IT, this story is: what's hiding in your unstructured data? My name is Bill Russell. I'm a former CIO for a 16-hospital system and creator of This Week in Health IT, a channel dedicated to keeping health IT staff current and engaged. VMware has been committed to our mission of providing relevant content to health IT professionals since the start.

They recently completed an executive study with MIT on the top healthcare trends shaping IT resilience, covering how the pandemic drove unique transformation in healthcare. This is just one of the many resources they have for healthcare professionals. For this and several other great content pieces, check out vmware.com/go/healthcare.

All right, here's today's story: the top five nightmares hiding in a healthcare organization's unstructured data. This is from Healthcare IT News, and to be honest with you, it's probably a pitch for this company, which is Text IQ. I'm gonna give them credit because they give us a fair amount of information in this.

And the reason I'm covering this is I think it's interesting. I think it's interesting how much unstructured data we have, and they point out some really interesting things about the information that could be in the unstructured data. So let me just go into the article. Unstructured data is information that does not have any predefined data model or schema, so it can be difficult for an enterprise to locate and digest.

Examples include physicians' notes in the EHR, emails, text files, photos, videos, call transcripts, and recordings, and business chat apps. Unstructured data can make up upward of 80% of the data within a healthcare organization. So it's important to be aware of what exists and not be caught off guard and accidentally share sensitive information.

The CEO of Text IQ, a vendor of technology that uses artificial intelligence and machine learning to work with sensitive unstructured data, lists five nightmares when it comes to what can be found in unstructured data. So I'm gonna give you those five real quick. The first nightmare: personally identifiable information (PII) and personal health information (PHI). Failing to redact that information can leave the enterprise at risk.

The CEO goes on to say that at times people will share personal information, from credit card numbers and Social Security numbers to personal health information, when communicating with a company as a customer or within a company as an employee. Oftentimes they may not realize the significance of doing this, may not remember that they did it, and may not understand that this information is now part of the company's unstructured data.

Obviously, this information is extremely sensitive, and the last thing companies want is for it to end up in the wrong hands. And you see how this happens, right? Somebody says, hey, can you provide this information to HR? And they say, yeah, here's my Social Security number.

They send the email back, and now we all assume, hey, that's internal and those kinds of things, but at the end of the day, it becomes part of the body of unstructured data for your health system.

The second nightmare: code words. This is really about fraud and detecting fraud. I thought this one was interesting because I hadn't really thought about it.

Words that aren't normally used in a conversation, or that are repeatedly used only between a small group of participants, can indicate code words. By using AI to automate the process, enterprises can save both time and money while understanding their data and avoiding issues like potential fraud. So people are using code words to identify things and to have a conversation across normal channels.

And really they are perpetrating fraud or conspiring around those things. You can actually identify those things with AI. AI and machine learning, as we've talked about in the past, are really good at identifying patterns.

The third nightmare: the same person appearing under different names.

Organizations risk added expense and reputational damage if they notify the same person multiple times because they weren't able to identify that Bob, Robert, and R. Burns are indeed the same person. This happens over and over again.

The fourth nightmare: sexual harassment can rear its ugly head in unstructured data.

Sexual harassment is never acceptable, but having it pop up unknowingly can create a new set of issues during a company audit or litigation, or prior to a merger or acquisition. Organizations risk having a surprise like this if they are unaware of what is in their unstructured data. As with sharing PII, employees may be engaged in this activity via email or business chat apps.

Even if it is not reported, there will still be a trail of evidence, as the information will remain hidden until identified and addressed.

And finally, the fifth nightmare is unconscious bias. Unconscious bias can refer to comments made unknowingly that expose a bias against a gender, race, or culture.

Unconscious bias is easy to miss and much more pervasive in the workplace than blatant discrimination, and it can be blamed for lower wages, fewer opportunities for advancement, and high turnover. If left undetected and unaddressed, unconscious bias hurts individuals as well as the business. Employees who feel they have experienced negative bias are likely to withhold their best creative thinking, ideas, and solutions from the organization, are less likely to refer others to the organization, and will eventually leave the job for other opportunities. And unconscious bias can lead to expensive lawsuits. You don't say. All right, so I like the topic.

It brings up something that we have to be aware of in IT. Here's my so what on this: 80% of the data in your organization being unstructured may not be entirely accurate, but it's probably closer to that number than it is below 50%. It used to be enough for IT to house the data and not know entirely what was in the data.

We were sort of the Swiss bank account of data: it's safe, it's secure, and we back it up every night. Well, the world has changed. Now it's important to know what is in the data. The data can lead to all kinds of serious issues. Plus, once it is in your possession, it becomes a liability for the organization.
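The name-matching problem from the third nightmare (Bob, Robert, and R. Burns all being one person) gives a feel for why tooling matters here. As a minimal sketch only, here is a rule-based approach in Python; the nickname table and the (first initial, last name) key are my own illustrative assumptions, not how Text IQ or any real entity-resolution product works. Real tools use probabilistic record linkage over many fields.

```python
# Illustrative nickname table (assumption, not a real product's data).
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william"}

def name_key(full_name: str) -> tuple[str, str]:
    """Reduce a name to (first initial, last name) after nickname expansion."""
    parts = full_name.replace(".", "").lower().split()
    first, last = parts[0], parts[-1]
    first = NICKNAMES.get(first, first)  # "bob" -> "robert"
    return (first[0], last)

def same_person(a: str, b: str) -> bool:
    """Crude match: identical keys mean the names may refer to one person."""
    return name_key(a) == name_key(b)

print(same_person("Bob Burns", "Robert Burns"))  # True
print(same_person("R. Burns", "Robert Burns"))   # True
```

A rule this coarse would over-merge in practice (every "R. Burns" collapses together), which is exactly why production entity resolution scores matches probabilistically instead of using a single key.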

How many people do you think it would take to ensure that your unstructured data doesn't include any PII or PHI? And that's just the first nightmare. How many people would you have to hire to go through all the data to make sure there was no PII or PHI? The answer is too many to count. Tools are required.

Tools that can sift large amounts of data and look for patterns of abuse, fraud, and mistakes that can lead to exposing personal information. I'm often asked where the best place for AI in healthcare is, and this is probably one of the best places to start using AI. This is a place where you can get used to it and figure out what works and what doesn't.
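To make "tools that sift data and look for patterns" concrete, here is a hedged sketch of a rule-based PII scanner. The two patterns (US Social Security numbers and card numbers validated with a Luhn check) are illustrative assumptions; real products like the one discussed layer machine learning on top of rules and cover far more PII types.

```python
import re

# Illustrative patterns only -- not exhaustive PII coverage.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")          # e.g. 123-45-6789
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")        # 13-16 digit runs

def luhn_ok(digits: str) -> bool:
    """Luhn checksum: filters out most digit runs that aren't card numbers."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:   # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (kind, match) pairs for suspected PII in free text."""
    hits = [("SSN", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if 13 <= len(digits) <= 16 and luhn_ok(digits):
            hits.append(("credit_card", m.group().strip()))
    return hits

print(find_pii("Sure, my SSN is 123-45-6789 and card 4111 1111 1111 1111."))
# -> [('SSN', '123-45-6789'), ('credit_card', '4111 1111 1111 1111')]
```

Even this toy version shows the scaling argument: a regex pass over millions of emails is cheap, while a human pass is the "too many to count" hiring problem described above.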

was something like this, it's:

the closing question is, it's:

Apple, Google, Overcast, Spotify, Stitcher, you get the picture. We are everywhere. We wanna thank our channel sponsors who are investing in our mission to develop the next generation of health leaders: VMware, Hillrom, Starbridge Advisors, McAfee, and Aruba Networks. Thanks for listening. That's all for now.
