Untangling AI Bias with Dr. Ziad Obermeyer
Episode 25 • 20th March 2024 • The Other 80 • Claudia Williams


Show Notes

Using AI in healthcare comes with a lot of promise - but access to data, lack of clarity about who will pay for these tools and the challenge of creating algorithms without bias are holding us back.

In 2023, TIME named Dr. Ziad Obermeyer one of the 100 most influential people working in AI. As a professor at UC Berkeley School of Public Health, and the co-founder of a non-profit and a startup in the AI healthcare space, his work centers on how to leverage AI to improve health and avoid racial bias.

We discuss:

  • The idea of a safe harbor for companies to discuss and resolve AI challenges
  • How his company Dandelion Health is helping solve the data logjam for AI product testing
  • Why academics need to spend time “on the shop floor”
  • The simple framework for avoiding AI bias he shared in his recent testimony to the Senate Finance Committee

Ziad says without access to the right data, AI systems can’t offer equitable solutions: 

“I think data is the biggest bottleneck to these things, and that bottleneck is even more binding in less well-resourced hospitals… When we look around and we see, ‘well, there are all these health algorithms that are in medical journals and people are publishing about them’. The majority of those things come from Palo Alto, Rochester, Minnesota [and] Boston. And, those patients are wonderful and they deserve to have algorithms trained on them and learning about them, but they are not representative of the rest of the country – let alone the rest of the world. And so, we have these huge disparities in the data from which algorithms are learning. And then those mirror the disparities and where algorithms can be applied.”

Relevant Links

Dr. Obermeyer’s profile at UC Berkeley School of Public Health

Ziad Obermeyer’s testimony to the Senate Finance Committee on how AI can help healthcare

More about Nightingale Open Science

More about Dandelion Health

Article on dissecting racial bias in algorithms

Article: On the Inequity of Predicting A While Hoping for B, AER Papers & Proceedings 2021 (with Sendhil Mullainathan)

About Our Guest

Dr. Ziad Obermeyer is the Blue Cross of California Distinguished Associate Professor of Health Policy and Management at UC Berkeley School of Public Health. His research uses machine learning to help doctors make better decisions, and help researchers make new discoveries—by ‘seeing’ the world the way algorithms do. His work on algorithmic racial bias has impacted how many organizations build and use algorithms, and how lawmakers and regulators hold AI accountable. He is a cofounder of Nightingale Open Science and Dandelion Health, a Chan Zuckerberg Biohub Investigator, a Faculty Research Fellow at the National Bureau of Economic Research, and was named one of the 100 most influential people in AI by TIME. Previously, he was Assistant Professor at Harvard Medical School, and he continues to practice emergency medicine in underserved communities.

Connect With Us

For more information on The Other 80 please visit our website - www.theother80.com. To connect with our team, please email claudia@theother80.com and follow us on Twitter @claudiawilliams and LinkedIn.

Transcripts

00:02 - Ziad Obermeyer (Guest)

Tomorrow, a bunch of cancers in this country are going to metastasize, a bunch of people are going to drop dead of sudden cardiac death, and if we had algorithms a year ago to predict who those people were and get them the treatment that they needed to stop this outcome, we'd be in much better shape. But we don't know what we're missing. There's just this huge deficit, because people can't get their hands on data to build these products that are going to transform clinical care.

00:39 - Claudia Williams (Host)

Welcome to the Other 80. I'm Claudia Williams.

Artificial intelligence is a powerful tool to identify people at the greatest risk of really bad outcomes like cardiac events and sepsis. This means getting people the right care faster, but that's only if AI can be more equitable than humans, making decisions about care without bias.

Yet over and over again, we're learning that algorithms are only as good as the data they're trained on. Ziad Obermeyer, a professor at the UC Berkeley School of Public Health, is one of the leading thinkers on how AI can inadvertently reinforce health inequity and bias. He also has a surprisingly simple formula for avoiding that terrible outcome.

So please welcome Ziad Obermeyer to the Other 80.

Ziad, it's so wonderful to welcome you to the Other 80. It's also just delightful to consider you a new colleague, and we'll get to some of that at the end. What do you think it's important for people to know about you before we dive into health and AI?

02:00 - Ziad Obermeyer (Guest)

I think maybe the place that I should start is the place that all of my research starts, which is my clinical work. AI has a lot to add to the science of medicine, and I think that practicing better medicine, making better decisions in the hospital and in the clinic, aggregates up to much better policy and better cost-effectiveness at the macro level. So even though the work touches on a lot of things outside of medicine, at its core, it's research on medicine.

02:33 - Claudia Williams (Host)

So we met by email a couple of weeks ago when you were preparing to testify in front of the Senate Finance Committee, and one of the phrases you used was "AI is going to change health care for better or for worse." Those are the two sides of the coin I want us to explore today. Let's start with the "for better": what kinds of applications and uses are you really excited about, in the work you do and, more broadly, in the field?

03:02 - Ziad Obermeyer (Guest)

Yeah. So let me give you an example that comes from my work in the emergency department. One of the things that's drilled into you from day one when you're training in emergency medicine is that there's a short list of things you cannot miss in the ER, and one of them is heart attack. And yet we still miss heart attacks. And it's weird, because when you see a heart attack on TV, it looks really obvious and it's like, yeah, of course, middle-aged…

03:32 - Claudia Williams (Host)

We can all picture it right.

03:33 - Ziad Obermeyer (Guest)

Yeah, yeah, of course that guy's having a heart attack. But in reality people don't always look like middle-aged men, and the symptoms can be really subtle. It can be a little twinge in your chest that isn't even painful, it can be a little shortness of breath or even just nausea. And so we have this dilemma because, on the one hand, we really need to not miss heart attacks, because the return to catching it and treating it is so huge in terms of preventing death and disability. On the other hand, if you started testing everyone who came into the ER with a little bit of nausea, you'd bankrupt the health care system not in 10 years but in 10 days. So that's the trade-off: it's really important to catch, but really hard to test everyone.

In some studies, when we're confident enough, worried enough about heart attack to test a person, that test comes back negative 80 or 90% of the time. So we're back where we started, but we've exposed the patient to actual risk, and financial toxicity as well.

04:37

And on the other hand, lots of people still die of heart attack every year, including people who come through the ER with symptoms that we don't take seriously or we just go in the wrong direction. So the current state, I think, is a microcosm of our health care system in general. We have this amazing technology and we both fail to get it to people who need it and also give it to people who don't need it, and I think that's one of the big reasons that our healthcare system is so expensive and underperforms in terms of the outcomes that we get. And so some research that my colleague Sendhil Mullainathan, who's at University of Chicago, and I did basically used data from the hospital where I was working as a doctor in Boston at the time, and trained an algorithm to help me diagnose heart attack and also to evaluate my own performance. So basically we could look at what the AI would do for this patient, and then we can look back and say, oh, what did you do with this patient?

05:34 - Claudia Williams (Host)

Wow!

05:36 - Ziad Obermeyer (Guest)

So a patient comes into the ER, they check in at the triage desk, and already we know a bunch of stuff about them: we know where they live, we know their insurance, we know their vital signs and we know the symptoms that they've mentioned to the triage nurse. But then we can also look back in the electronic medical records, so if they've been to that hospital system before, there's a ton of other data. So we feed all that into the AI and we train it to identify people who are at high risk: if they had one of these tests for heart attack, would the test be positive? So the algorithm learns from the test result, which is great, and then we can see how that algorithm is doing on a bunch of new patients that it's never seen. And what we find is two things.

06:19

One is that, and this will not be surprising to anyone, doctors test a lot of people who are low risk, where the algorithm says: do not test this person, it's never going to be positive. If we just followed the algorithm's recommendation, we could cut about two thirds of the tests that doctors at this very good Harvard teaching hospital were doing. Two thirds we could just get rid of, because they're so low risk and so predictably negative that we just wouldn't do them if we knew that going in.

So that's the first thing we found. But the second thing we found is that there are a bunch of high-risk people coming through the ER that we're not testing, and those people are getting diagnosed with other things. They're being treated for other things. In some cases the doctor didn't even think about heart attack, because they didn't do an electrocardiogram. But when the algorithm says this person looks really high risk, you should test them, and the doctor doesn't test them, those people go on to have really bad outcomes over the next three or four weeks, in a way that suggests they had a heart attack that we just missed and failed to diagnose in the ER. It was very eye-opening to see that. Through this lens we now have a new tool through which to look at all of the healthcare we're currently providing, and that tool is letting us see both the overuse that was always very obvious, but that is hard to do something about in real time, and the underuse.

It's easy to look at a negative test and say you shouldn't have done that test. It's a lot harder to say, here's a new patient that you're worried about, you don't need to do the test. So being able to have a prospective system for doing that is great, but also having a system for catching all of these high-risk people that we're currently missing the boat on was really exciting. And that's just one decision; the healthcare system is full of decisions with a similar problem: we've got an amazing technology, we're just bad at allocating it to the right people. That's why there's so much potential for AI to fix these problems of overuse and underuse that coexist, unfortunately, side by side.
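To make that concrete, here is a minimal sketch of the kind of test-positivity model Ziad describes, assuming hypothetical triage columns ("chest_pain_flag", "prior_mi", and so on) and illustrative risk cutoffs; the study's actual features, model and thresholds may differ:

```python
# Hedged sketch: predict whether a troponin test would come back positive
# using only information available at ER triage, then flag (a) predictably
# negative tests and (b) high-risk patients who went untested.
# All file names, column names and cutoffs are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

visits = pd.read_csv("er_visits.csv")  # hypothetical triage + EHR extract

features = ["age", "heart_rate", "systolic_bp", "chest_pain_flag",
            "prior_mi", "n_prior_visits"]   # known at check-in
label = "troponin_positive"                 # test result, where a test was done

# The label only exists for visits where doctors actually ordered the test.
tested = visits[visits["troponin_tested"] == 1]
X_tr, X_te, y_tr, y_te = train_test_split(
    tested[features], tested[label], test_size=0.3, random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("held-out AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# Score every visit, tested or not.
visits["risk"] = model.predict_proba(visits[features])[:, 1]

low, high = 0.01, 0.20  # assumed decision cutoffs
predictably_negative = visits[(visits["troponin_tested"] == 1) & (visits["risk"] < low)]
possibly_missed = visits[(visits["troponin_tested"] == 0) & (visits["risk"] > high)]
```

Note the caveat built into this setup: the model only ever sees outcomes for patients doctors chose to test, which is part of why checking the untested, high-risk group against later adverse events, as described above, matters so much.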

08:21 - Claudia Williams (Host)

And we'll get to some of the technology decision-making and governance, but just sitting there for a minute in that hospital, using AI in the way you describe involves a set of decision-making structures and environments that sometimes exist and sometimes don't. So, as you just think about being a clinician in a health system, what kinds of structures would need to exist so that you could take advantage of a tool like that effectively?

08:51 - Ziad Obermeyer (Guest)

I think data is the biggest bottleneck to these things, and that bottleneck is even more binding in less well-resourced hospitals. And so when we look around and we see, well, there are all these health algorithms that are in medical journals and people are publishing about them, the majority of those things come from Palo Alto, Rochester, Minnesota, Boston.

09:14

And those patients are wonderful and they deserve to have algorithms trained on them and learning about them, but they are not representative of the rest of the country, let alone the rest of the world. So we have these huge disparities in the data from which algorithms are learning, and those mirror the disparities in where algorithms can be applied, because if your data are not online, you can't benefit from these algorithms, even ones trained on a different population. Those disparities in data are, I think, a huge and underrated problem. We're used to thinking about too much access to data as a problem, because of privacy and risks, and I don't mean to minimize those at all, but there are also risks to too little data.

09:59 - Claudia Williams (Host)

And we'll come back to that in a little while, when we talk about Dandelion and Nightingale as two potential solution types, and I think there are some others I'd love your input on. So we have this construct of "for better or for worse"; let's do the flip side. I want to go back in time to an example that you've shared several times. Five years ago, you did a study and had a surprising finding. Let's talk a little bit about what you found and what happened next.

10:30 - Ziad Obermeyer (Guest)

So in about 2019, we published a study of algorithms that health systems use to identify which patients are going to get sick, so that they can offer those patients extra help managing their health.

Algorithms need to predict a specific variable in a specific data set, and there's no variable called "get sick". The choice that they made when training these algorithms was to say, look, this is complicated and I don't want to deal with it, so I'm going to use a convenient variable that does exist in my data set, called "healthcare costs". It's not unreasonable, because when people get sick, they tend to generate healthcare costs by, you know, seeing doctors and going to the ER and being hospitalized. So what all these algorithms did, and it's not just one company, this is a whole family of algorithms that are collectively being applied to about 150 million people every year in the US, so the scale of this market is already huge in healthcare, which is something I didn't know before I started working on this, was score patients by their predicted costs. The health system takes those highest-scoring people and says, okay, we're going to give them help, and we're going to screen everybody else out.

12:07

So the highest risk people get help, the others don't. But remember, risk is just cost, and the problem with using cost as a way to allocate help with health is that not everybody who has low costs should have low costs.

Some people who face barriers to accessing healthcare, who are treated differently by doctors, who are not taken as seriously, who are not listened to, who are misunderstood, those people get less healthcare than they should. It doesn't mean that they had less healthcare need. They just had less healthcare cost, because of all of the social inequalities that affect people's ability to access and get good healthcare. So the algorithm saw that accurately and said, oh look, all of these black patients are going to generate very low costs, so we're going to score them low. And so this set of algorithms contains this enormous racial bias that we found and documented in that study. For me this was a huge red flag, both because of how widely adopted these algorithms were and because nobody caught this problem.
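One way to see the label-choice problem is to train two otherwise identical models, one on the convenient cost variable and one on a direct health measure, and compare who each would select for extra help. A hedged sketch, with assumed column names ("total_cost", "n_chronic_conditions", "race") and an assumed enrollment cutoff, not the study's actual specification:

```python
# Hedged sketch of a label-choice audit: same features, same model class,
# two different target variables. Columns and the 3% cutoff are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

df = pd.read_csv("patients.csv")  # hypothetical claims + clinical extract
features = ["age", "n_prior_visits", "hba1c", "creatinine"]  # assumed predictors

cost_model = RandomForestRegressor(random_state=0).fit(df[features], df["total_cost"])
need_model = RandomForestRegressor(random_state=0).fit(
    df[features], df["n_chronic_conditions"])

df["cost_score"] = cost_model.predict(df[features])
df["need_score"] = need_model.predict(df[features])

# Select the top 3% by each score for the care-management program.
k = max(1, int(0.03 * len(df)))
by_cost = df.nlargest(k, "cost_score")
by_need = df.nlargest(k, "need_score")

# If cost understates need for groups facing access barriers, the two
# selected populations will differ in composition and in average need.
print(by_cost["race"].value_counts(normalize=True))
print(by_need["race"].value_counts(normalize=True))
print(by_cost["n_chronic_conditions"].mean(), by_need["n_chronic_conditions"].mean())
```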

13:21 - Claudia Williams (Host)

I'm curious how they responded. So one question is whether you can catch it, and it was sort of luck that you were there. And then the second is once it was identified, what was the reaction? Were their responses what you'd hoped they would be?

13:36 - Ziad Obermeyer (Guest)

In many ways, yes, because before we published the paper, we actually got in touch with the particular company whose algorithm we studied, one of many companies that was making similar products, and this company was actually very responsive.

13:51

They allocated their technical teams to work with us to create a new version of the algorithm that was substantially less biased. So it was great. We had these calls with people on the technical team, and one of them was like, we're so glad that you found this problem. I decided to work in health care instead of some tech company because I care about health; this is not what I wanted to do, and so we're so glad to have the opportunity to fix it. So the initial response was very positive. Sadly, and as you know, companies, like universities and governments, are not one entity; there are many entities, and one of those entities is risk management and legal. So over the evolution of that company's response to our work, things took a turn for the worse. I understand why, but it's a shame.

14:48 - Claudia Williams (Host)

It was interesting that you were on the podium with Michelle Mello, because her work is well known from the malpractice space and creating safe environments for people to share risky decisions, so it'd be interesting to sort of think about the mashup of those two concepts.

15:06 - Ziad Obermeyer (Guest)

That's a really nice parallel and a really great insight, and I wish that there were something similarly safe, I mean it sounds kind of insane, but a safe space for these companies to acknowledge that mistakes were made. We're all learning. This is a new field. Lots of these problems were not known by anyone: not by government agencies, not by the hospitals, not by researchers. And so I think, unfortunately, the environment of risk aversion because of litigation is a huge determinant of how not just that company but other companies responded, and I think that's a shame.

15:43 - Claudia Williams (Host)

A cynical person might say, well no, what health plans cared about was cost, actually. Somebody else in the food chain might have cared about the outcome, but health plans do care deeply about costs. So did the algorithm study the thing they wanted it to study or, in fact, were they wanting to look at outcomes?

16:05 - Ziad Obermeyer (Guest)

Yeah, it's a great question, and it was certainly the first question that crossed my mind too. The reason I don't think that's the case is that it's very natural to assume a trade-off between health and cost.

16:23

I think the whole point of population health management is that there's no trade-off, like if you can prevent a heart attack or a stroke or an exacerbation of congestive heart failure or dialysis or a diabetic foot amputation, that's good for everyone.

16:41

That's good for the patient because they still have their foot, and it's good for the healthcare system because that would have cost a lot of money. And so, in this case, the deep irony of the whole thing is that this algorithm was not some sort of evil-genius move to reduce cost at the expense of health. It's not doing either thing well, because it's finding the wrong people. It's finding people who already have a lot of access to health care, when in fact, if you want to save money, you have to go find the people who don't have access to health care, who are not spending money now, but on whom you're going to spend a lot of money tomorrow to fix the problems you could have fixed today. So even if you were just a totally cynical, profit-maximizing insurance company that wanted to hold down costs, predicting cost is not the right way to do it.

17:33 - Claudia Williams (Host)

Interesting. That's a really helpful point. And circling back to your point that we shouldn't, for our governance of this technology, rely on academics and their selection of particular studies, you and colleagues a couple years ago put together a playbook, and I think the playbook did a really beautiful job of laying out a very practical framework for folks, whether they're regulators or C-suite, or developers, to think about how to guard against the kinds of biases you just talked about. So if you wouldn't mind just laying out what that four-part strategy is and how it works in practice.

18:16 - Ziad Obermeyer (Guest)

Yeah, thanks for saying so. When you're an academic, you have a natural tendency to write for other academics, but I think in this case that was not the right move. So what we tried to do is we tried to just distill the lessons from that work down into a few key things that were feasible in real organizations.

And the first, which sounds dumb, is: make a list of all of the algorithms that are being used in your company. It sounds ridiculous, but no such list exists in the vast majority of organizations we work with. It's really hard to do anything about bias in AI if you don't know what AI is actually being used. So the first technology is what we call a list, and that's very useful. And what's funny is that the whole point of having AI is that you can make important decisions at scale. These things are incredibly powerful, and yet there's no actual oversight at the C-suite level, and so I think the list is just a good way for the C-suite to start exerting some control, through awareness of what's going on in the organization.

19:26

For each of the algorithms that are being used, I think the single most important thing to know is: what is the variable they are predicting, and is that the right variable? So, to go back to the example we talked about, this algorithm is being used to allocate help with health, but it is predicting costs. That seems like a bad combination, because we all know that health and cost are not the same, and that some people have less cost than they should, given their health. And so, basically, you want to get to that level of specificity, and it's hard, because when we talk about AI there are all these words, like, oh, this is a risk predictor. And it's like, great, risk of what? There is a variable that is being predicted, and I just want to know that variable. You can read a whole academic paper about AI and health and still be very confused about what variable the thing is predicting. But there's always a variable.

20:25

Even ChatGPT, these generative AI models, how are they trained? You have to remember that this is a word predictor, and specifically a predictor of words that humans think are good words. It's very simple. And then the last component of this evaluation, or oversight framework, is you want to know: is that algorithm doing the thing? So you've said, okay, great, it's predicting health, it's being used to allocate health, great. Now I just want to know, is it doing the thing we think it's doing overall, for everybody? Is it performing well? And then, if I disaggregate that performance by race, ethnicity, geography, socioeconomics, etc., how is it doing for those people? That's it, and that's basically what we did in this exercise. It's not rocket science or brain surgery. It's pretty simple.
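That last step, checking that the algorithm "does the thing" overall and for each group, can be a few lines of analysis. A sketch, assuming a scored cohort with hypothetical "score", "outcome" and "race" columns:

```python
# Hedged sketch of the disaggregated-evaluation step in the playbook.
import pandas as pd
from sklearn.metrics import roc_auc_score

audit = pd.read_csv("algorithm_audit.csv")  # hypothetical scored cohort

# Is the algorithm performing well overall?
print("overall AUC:", roc_auc_score(audit["outcome"], audit["score"]))

# And how is it doing for each group?
for group, rows in audit.groupby("race"):
    auc = roc_auc_score(rows["outcome"], rows["score"])
    print(group, "AUC:", round(auc, 3), "n =", len(rows))
```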

21:21 - Claudia Williams (Host)

And I think that framework does two things. It does a beautiful job of demystifying what is often shrouded in the secrecy of technology, and it also would require people at many levels of the organization to have deeper conversations with the technology providers and vendors and even people within their own organization. It invites a different dialogue about the use of technology in organizations than I think always exists today.

21:50 - Ziad Obermeyer (Guest)

Yeah, I think that's right, and I totally agree. It's a particular kind of dialogue that often falls through the cracks in current work structures, or even academic departments, because there are the engineers who are building the thing in the data, and then there are the decision makers who are making the decision.

And right now it's like an export model: okay, you build the algorithm and you ship it over here and then I use it. For example, with the algorithm we studied: sure, you're working with the data, but you need to know that this cost variable is produced by all of these social forces that shape who gets costs recorded and who doesn't. Because these data, even though they look like health data, are actually insurance data about claims that get paid in the course of a transaction between a patient and the hospital and a payer. And so you can't build good products without knowing where the data are coming from and what the output of your algorithm is being used for.

22:54

But those conversations rarely happen, not just because people live in different units, but because those people often don't share the same language. They don't have ways to communicate about these really important things, and so these interdisciplinary issues often just like fall through the cracks, and I think that's how we get a lot of problems with technology.

23:20 - Claudia Williams (Host)

During that testimony you gave for Senate Finance, the question, of course, was: what are policymakers' biggest opportunities in this space? I think the framework you've provided is as applicable to policymakers as it is to other decision makers. But having been a policymaker, you know you become very attached to the tools you have, whether that's payment rates or regulations, or even your purchasing power or the data you have. So I'd just love to have you explicate what those opportunities might look like, based on the different policy levers we might have in front of us.

23:55 - Ziad Obermeyer (Guest)

Yeah, and I think the good news at least for me as I was preparing for that session, was that I think the government, and specifically the finance committee, has a lot of tools that, out of the box, they could use to make a huge difference in this space, starting with just purchasing.

So you know, this is about 20% of GDP, and there's an enormous amount of power that comes with controlling that amount of money. One of the reasons that's really important: I don't know if you've been to your doctor recently, but I went this morning, and it doesn't feel that different from the way it did 20 years ago. It's not like you walk in and get scanned, say, by some AI thing. It looks pretty similar. And I think one of the reasons is that people who are building algorithms in health are worried about the FDA, in the sense that you have to get cleared or approved to get your product out, but what they're really worried about is: is anyone going to pay for this? If so, how much?

25:04

There's this thing that happens early in markets where nobody knows what things are actually worth. We need to figure out the price of things, and the sooner we figure it out the better; it almost doesn't matter what that price is. There's so much uncertainty right now that there's dramatic underinvestment at any price we're willing to pay for these products.

And so Medicare and Medicaid, but also child services and all the other big government programs, have this huge opportunity to say: look, AI can be transformative for the quality and the cost of the things we are currently paying for. Here are some high-value things, and here is how much we will pay for them; specifically, here's how much we'll pay for an algorithm that meets this performance target in the whole population and in legally protected groups. By doing that, by just setting some clear payment rates and some criteria for what qualifies for that rate, Medicare could, at zero cost, unleash an enormous amount of innovation from the rest of the market, because now people will say: great, I know the price, I can invest at that price, so I'm going to create a product.

26:23 - Claudia Williams (Host)

Yeah, that was an amazing hearing. If I had any frustration, it was that it didn't really delve into the particulars of the Medicaid market and the additional complexity of adopting technology in those 50-plus markets. I've had a few conversations since the hearing with people who are very concerned that all of this opportunity is just going to leave Medicaid in the dust. Any thoughts about how to proceed to avoid that outcome?

26:58 - Ziad Obermeyer (Guest)

Not to sound like a broken record, but there's one big problem with Medicaid relative to Medicare, and it's not resource related: if you're a researcher or a product developer who wants to build algorithms on data, you can get all the Medicare data for the whole country in one stop, whereas if you want the Medicaid data, it's a 50-stop process. And especially given how much less lucrative that market is, who's going to work on Medicaid? It doesn't make any sense.

And so having some way for researchers and product developers to actually access the Medicaid data is another just easy win for policymakers who want more innovation in the space. Just make it easier for people to actually do the thing you want them to do. I think, all that said, if I thought about where AI is so powerful, which is in prediction and prevention, there are a lot more opportunities in Medicaid than there are in Medicare.

28:04

In Medicare, either you're already sick or you're already old, or both, and so the train has often left the station. I think there are huge gains from applying these kinds of methods to Medicaid, in a way that will eventually help Medicare. And one of the frustrating things about the space is that there are so many independent buckets: each of these programs has its own P&L, but it's really all the same P&L.

28:44 - Claudia Williams (Host)

Right, right, we can talk to Congress about that next, right?

Let's do a deeper dive on the data piece. You've described how many of the biases, not all, but many of the biases in algorithms can be tied back to gaps in data; that in our current system, the largest buckets of data tend to sit in large, very risk-averse organizations, sometimes frankly protecting their business interests by not sharing data; and that developers in particular, if they're outside the health system or at a vector with it, simply don't have access to those data and can't test their products.

And so I was really interested in the two organizations that you've helped found, one being Nightingale, the other Dandelion Health. I'd love to have you talk about how each of those, in its own way, is going about trying to solve that basic core dilemma.

29:45 - Ziad Obermeyer (Guest)

Yeah, thanks for pointing out that they're both trying to solve the same problem; they're just taking two different approaches. And the problem is one that in my research life is so clear, because I have to spend so much time negotiating for access to data. And it's not just me. If a PhD student at Berkeley comes to me and says, I know how to do this interesting AI thing, it seems like a natural application to health, and I would like to get access to some health data to see if it works, my typical answer is: great, we're going to add you to the data use agreement, and then you'll go get your fingerprints done, and then your criminal background check will clear, and then in like three years you can start working. It's just not a good model for research. I think what's surprising is that even people in the private sector who have money to spend on getting access to data, because they want to develop a product, face the same problems, because it's not a money problem, it's an institutional and incentive problem. I know of a few startups that have just run out of their Series A funding waiting for a hospital to sign an agreement and deliver the data that everyone agreed was a good idea to get delivered. So there are all of these frictions that are, I think, one of the reasons we haven't seen as much AI innovation as we'd like. I think that's a huge problem for society.

31:25

Whenever you're working with data, whether it's research or product development, there's a risk to privacy, and there always will be; there's no way to get that risk to zero. But I think there's another risk, which is that we almost don't know what we don't have. Tomorrow, a bunch of cancers in this country are going to metastasize, a bunch of people are going to drop dead of sudden cardiac death, and if we'd had algorithms a year ago to predict who those people were and get them the treatment they needed to stop this outcome, we'd be in much better shape. But we don't know what we're missing, and right now there's just this huge deficit, because people can't get their hands on data to build these products that are going to transform clinical care. So my view of our current situation is: clearly there are risks to privacy and people doing bad things with data, and that will never go away, although there are, I think, a lot of ways to minimize those and get them very low. But there's also this other huge problem: the social value of this data is not being captured by anyone, and that's a tragedy, many, many different kinds of tragedy.

32:35

So Nightingale is a nonprofit. It's philanthropically funded, initially thanks to Tom Kalil.

And what Nightingale does is work with hospitals and health systems and government agencies, anyone with interesting medical data, focusing on imaging, like X-rays and electrocardiograms and biopsy slide images. It creates data sets around interesting medical questions, like: why do some people die of COVID-19 while other people get a runny nose? Why do some cancers metastasize and others don't? So: interesting medical question, cool, large data set. We de-identify it and bring it out onto our cloud platform, where researchers can access it for free. The goal is connecting researchers to data they otherwise would not be able to get, so they can write papers and make breakthroughs and, you know, just do the research thing. That's one vector. One of the things that was frustrating about that was how small it was, so Dandelion is trying to solve that problem a little bit differently.

33:39

It's a for-profit company, and Dandelion has relationships with a handful of really big non-academic health systems around the country that were specifically chosen for their diversity: geographically, racially, ethnically, in terms of the equipment that they use, all kinds of different definitions of diversity. And we de-identify all of the data from the hospitals and bring it over onto our cloud.

34:11

And then someone who wants to build an AI product, for example, to predict which mammograms are actually high risk of metastasizing, would say to us okay, I want a data set of at least 50,000 mammograms, linked to follow-up data over at least five years, so that we see all the biopsies and all of the other things that happen.

34:34

And one other advantage of the hospitals that we work with is that, by virtue of their geography or their model, they're very longitudinal, so they usually see people over the whole course of their lives. It's not like a fragmented market like the Bay Area, where you've got, you know, one doctor at Sutter and another at wherever.

So they come to Dandelion with a spec for a data set they want, to build a product. We build that data set for them, and then they lease access to the data. So they don't own the data, but they can build their product on top of it, and they own that. Then Dandelion shares the revenue back with the health systems. And there's an exclusive focus on tools that will actually benefit patients, so there's no access to data for hedge funds or other people doing non-clinical things. It's all about clinical care.

35:30 - Claudia Williams (Host)

I don't know if you know this, but before taking this role, I led a large data network in California, Manifest MedEx, that brings together claims and clinical data for about 30 million people. It was exclusively used in a HIPAA construct, so treatment, payment, and operations, by plans as well as providers, when I was there, but more so since. I've been asking myself this question: how might that infrastructure be reused or leveraged for what you're talking about? Any thoughts on what the pathway could look like?

36:07 - Ziad Obermeyer (Guest)

Yeah, I think those are such amazing resources for all sorts of policy applications. My impression is that historically, and I don't know whether this is because of the agreements that were set up when they were founded, getting those data used for research, and certainly for product development, has been hard, because nobody knows who has the right to do stuff with the data.

So I don't know enough about them to have an opinion, but I would imagine that resolving that uncertainty would be a first-order thing. That would at least unlock a lot more interest in using the data, once you know what you can and can't do with it.

36:57 - Claudia Williams (Host)

I think of you as a prototype, as an example of an academic who has really structured your time, what you choose to spend time on, and your products to maximize impact. And I see that in a few different ways: one, the high degree of cross-organizational collaboration; two, the active work of bridging from your research into policy, into the application of the findings; three, the actual building of platforms and tools and services. So the School of Public Health at Berkeley is about to hire a bunch of new academics for next year, which is very exciting. If they came to you and said, how do I be a prototype like you, what would you say? What's the path for that?

37:43 - Ziad Obermeyer (Guest)

I think the first thing I'd ask them is why they would ever want to do that. But if they insisted: I think the thing that's a little bit different about me is that it's a huge advantage to be above the bar in two things. Academia promotes a lot of specialization, and that's one playbook to succeed. But, I don't know if you watched the Super Bowl, one striking thing about the Super Bowl was that both quarterbacks actually started off playing baseball.

38:22 - Claudia Williams (Host)

Interesting.

38:23 - Ziad Obermeyer (Guest)

I think it's very hard to win a race that everyone else is running, and I think it's a lot easier to find your own race and just run that. There are a lot of problems that nobody's working on, but in order to find those problems, you have to spend time in the actual world, understanding what the problems are and how you can get solutions from this other world and bring them to bear on those problems. And for me, to go back to where we started, it's medicine. I think there are not a lot of other people on the faculty at Berkeley who have spent as much time in hospitals as I have.

And that's just a source of a lot of richness and inspiration and interesting new stuff to work on, that also happens to be really, really life-and-death important for a lot of people. It doesn't have to be in the hospital. If you're a technical person, spend a lot of time in any area that produces data.

39:27 - Claudia Williams (Host)

I'm going to use that as a segue to the last question, which is where I get my guests to give me advice in my new role. It's the question of how you make good on this third pillar of impact for universities, education and research being the first two. One of my focuses is the need to shrink the distance between praxis, between the field, and academia.

And I'm curious, thinking about our shared organization, the Berkeley School of Public Health, what would be some of the easiest ways to do that in this particular organization?

40:04 - Ziad Obermeyer (Guest)

Let me stall by saying that I completely agree with that. I think people often think about translation as, oh, you have an idea and then you translate it into the real world, and I think that's just empirically very rare. I would imagine that just having some way for academics to spend time, even just hours, outside of academia is the best thing that you could do.

One of my favorite parts of my residency was basically when you spent a bunch of time for like a day being a nurse's assistant and just starting all the IVs. I think it would be a great experience to go be a personal care assistant in a nursing home for a week if you study long-term care.

40:56 - Claudia Williams (Host)

I love that. I mean it's very human-centered design-ey, but it's also just being without agenda. I think we often find ourselves in those spaces with a very pointed agenda. Our time is up and I know you keep probably a very tight schedule these days. What a pleasure to have you on. Thank you so much.

Ziad Obermeyer (Guest)

Oh, thank you so much, Claudia. It was so nice.

Claudia Williams (Host)

I feel like we could talk for another hour, but maybe next time.

Ziad Obermeyer (Guest)

No, me too.

41:24 - Claudia Williams (Host)

Right now, the AI and health conversation is a tech roller coaster. There are high hopes and deep fears. What struck me most in this conversation with Ziad Obermeyer is the idea that, while we should focus on the risks, we need to place equal emphasis on the benefits. If we don't figure out how to pay for AI tools or unlock access to the data needed for model training, we will squander this opportunity to improve health, address equity, and reduce costs. That risk is particularly acute for public health and the safety net, which are often the last to benefit from any new innovation or technology.

This podcast was created by me, Claudia Williams. My podcast producer is Avery Moore Kloss. Check out the show notes for more information on Ziad Obermeyer, his work at UC Berkeley and his company, Dandelion Health. There is more information on my background, this podcast and our guests on our website, www.theother80.com.

Until next time, I'm Claudia Williams.
