Today, we’re speaking to Dr Garth Funston, a GP and Clinical Senior Lecturer in Primary Care Cancer Research at Queen Mary University of London.
Title of paper: Using large language models to identify pre-diagnostic clinical features of ovarian cancer from healthcare records: a population-based case-control study
Available at: https://doi.org/10.3399/BJGP.2025.0366
Most women with ovarian cancer present with symptoms, but many symptoms are recorded only in free text healthcare records and missed by studies and clinical decision support tools that rely on coded data. We found that using large language models (LLMs) to extract symptoms from free text records substantially increased symptom detection and strengthened associations with ovarian cancer. Incorporating LLM-extracted symptom information into research and clinical decision tools may support identification of women at higher risk of cancer and aid appropriate investigation.
Transcript
This transcript was generated using AI and has not been reviewed for accuracy. Please be aware it may contain errors or omissions.
Speaker A
00:00:00.800 - 00:00:50.940
Hi and welcome to BJGP Interviews. I'm Nada Khan and I'm one of the Associate editors of the Journal. Thanks for listening to this podcast today.
In today's episode, we're talking to Dr. Garth Funston, who is an academic GP and clinical senior Lecturer in Primary Care Research at Queen Mary University of London.
We're here to talk about his recent paper in the BJGP, which is titled Using Large Language Models to Identify Pre-diagnostic Clinical Features of Ovarian Cancer from Healthcare Records.
So, Garth, thanks so much for talking to us again today, but I wonder, just before we get into the AI side of this paper, can you briefly explain the clinical problem you're trying to address here with ovarian cancer diagnosis in general practice?
Speaker B
00:00:51.500 - 00:01:55.010
So most women with ovarian cancer are diagnosed after they develop symptoms and see their doctor. The challenge is that most symptoms are really non-specific. There are no real red flag symptoms for ovarian cancer.
That makes it a real clinical challenge for the GP to kind of recognize it and perform tests.
So the symptoms are things like abdominal and pelvic pain, persistent bloating, urinary urgency and frequency, things that we see really frequently in general practice. So knowing when to consider ovarian cancer is the big challenge.
And we know that certainly a proportion of women see their GP multiple times before the diagnosis. Now we're lucky for ovarian cancer in that we have reasonably good triage tests in CA125 and transvaginal ultrasound.
So the challenge really is to identify women with these non-specific symptoms early so that we can work out who to test and hopefully improve early diagnosis and outcomes in that way.
Speaker A
00:01:55.250 - 00:02:14.530
Yeah, and I'm sure you're well aware of the body of work around this area and people like Willie Hamilton, who's done work around early diagnosis of ovarian cancer, along with Claire Bankhead, and they did some really interesting work around things like bloating, didn't they? But that was slightly different, I think, and that's some time ago now, isn't it?
Speaker B
00:02:14.930 - 00:02:39.230
Yeah, it was some time ago. I think all of that is, you know, fundamental and still holds true.
And they did a lot of work around things like IBS in women over 50 and things like that, that are kind of these subtle signs that we need to be aware of with ovarian cancer.
So, yeah, we know there's lots of features that are associated with ovarian cancer, but it's recognizing when to investigate those features, because they're so common.
Speaker A
00:02:39.630 - 00:02:49.310
Yeah. And do you think that's why it's described as difficult to diagnose early in general practice? Is it because the symptoms are so common?
What are your thoughts on that?
Speaker B
00:02:49.390 - 00:03:48.750
I think there's a few reasons.
I think ovarian cancer used to be called, certainly in the media, the "silent killer", terminology which really, really frustrates me, because we know it's not. We know that most women have symptoms before diagnosis. We actually know from this paper and other papers that there are symptoms in early-stage cancer.
But that kind of thought around ovarian cancer still holds. Secondly, the symptoms are non-specific, they're reasonably common. I mean, you know, I probably see a patient with abdominal pain most days and it's kind of working out which ones to investigate for ovarian cancer. Yeah. And so I think those are the main things. And thirdly, it's, you know, it's not the most common cancer.
A GP will probably only encounter a new case of ovarian cancer every three to five years. And that's the extra challenge. It's kind of suspecting it when it's a rare thing in primary care.
Speaker A
00:03:49.100 - 00:04:03.500
Yeah. And one thing I found really interesting about this work is that you're using free text clinical records rather than just coded data.
So can you tell us a little bit about the data you accessed here and why it was so important to use this free text data?
Speaker B
00:04:04.220 - 00:05:09.600
So a lot of the work that we do with primary care data focuses on coded data, certainly within the UK, because that's really the data we can actually access within the UK for research purposes. But up to 80% of clinical information is not in that coded format, it's in the free text.
And work from people like Sarah Price in the past has shown that often the subtle things that we need to pick up are in the free text and GPs don't code that.
So it's something I've been really keen to use in research for many years now to try and look at what extra information is there in the free text that could help us in both research and clinical practice and kind of picking up these cancers. And the data we accessed was from the United States, it was from healthcare clinics associated with the University of Washington.
And that included kind of coded data, but also the free text medical records of patients which had been anonymized and were accessed in a kind of a safe and appropriate way.
Speaker A
00:05:10.000 - 00:05:40.140
Yeah.
And I think a lot of clinical staff listening to this will certainly appreciate that a lot goes into the notes that we just type in that doesn't really get coded. So it's phenomenal that you're able to access that data.
And this paper uses large language models or LLMs, which some people might associate with tools like ChatGPT. But just at a very basic level, can you talk us through what a large language model actually is and what it was used for in this study?
Speaker B
00:05:40.950 - 00:06:49.130
Large language models, lots of people use them on a daily basis. Absolutely right.
Things like ChatGPT, they're essentially a tool which, for our purposes, we use to extract information from the free text medical records. Now natural language processing approaches have actually been used for many years, kind of rule-based approaches and other models.
These require lots of training. You need lots of highly annotated records and notes to train the models.
The advantage of large language models, things like GPT, is they need fewer annotated notes. We did still do some of that, but they require less and that makes them much easier to apply and use in practice. We use them in this setting to effectively pull out key information on symptoms.
We predefined a list of 17 symptoms from the literature which were associated with ovarian cancer and we used the large language models to go through the notes, pull out information on those symptoms that we could use in the study alongside the coded data.
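As an illustrative aside (not from the interview), the extraction step described above can be sketched as a function that maps each note to a set of symptom flags. Everything here is an assumption made for illustration: the symptom subset, the phrase lists, and the function names are hypothetical, and the keyword matcher is only a stand-in for the LLM call, which the interview does not describe in detail.

```python
import re
from typing import Callable, Dict, List

# Hypothetical subset of a predefined symptom list (the study used 17 symptoms).
SYMPTOMS: Dict[str, List[str]] = {
    "bloating": ["bloating", "bloated"],
    "abdominal pain": ["abdominal pain", "tummy pain"],
    "urinary frequency": ["urinary frequency", "passing urine more often"],
    "appetite loss": ["loss of appetite", "reduced appetite", "appetite loss"],
}


def keyword_extractor(note: str) -> Dict[str, bool]:
    """Stand-in for an LLM call: flag a symptom if any listed phrase appears."""
    text = note.lower()
    return {
        symptom: any(
            re.search(r"\b" + re.escape(phrase) + r"\b", text) for phrase in phrases
        )
        for symptom, phrases in SYMPTOMS.items()
    }


def extract_symptoms(
    notes: List[str],
    extractor: Callable[[str], Dict[str, bool]] = keyword_extractor,
) -> Dict[str, bool]:
    """Mark a symptom as present for the patient if any note mentions it."""
    found = {symptom: False for symptom in SYMPTOMS}
    for note in notes:
        for symptom, present in extractor(note).items():
            found[symptom] = found[symptom] or present
    return found


if __name__ == "__main__":
    notes = ["Feels bloated for 3 weeks, reduced appetite."]
    print(extract_symptoms(notes))
```

A plain keyword matcher like this cannot tell "bloating" from "denies bloating", which is one reason a language model is attractive for this step; the `extractor` parameter is there so a model-backed function could be swapped in.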
Speaker A
00:06:50.090 - 00:07:03.350
And I think that, as we've been discussing, these large language models are probably really useful for this kind of data, especially because a lot of general practice is narrative and contextual.
Speaker B
00:07:03.350 - 00:07:38.940
Yeah, I think, I mean there's two challenges with using free text data. One is access requirements because there's lots of concerns around confidentiality. The other is just the volume of it.
You've got these massive records that you know, contain lots of information, lots of writing, go back years. How do you actually process that to find the key information that you need?
I think large language models are a really useful tool here because with a bit of training you can use them to actually extract the information that's pertinent to your kind of question.
Speaker A
00:07:39.340 - 00:07:48.620
So let's go into what you found and I'm really interested to know about what kind of patterns or features was this model able to identify before an ovarian cancer diagnosis.
Speaker B
00:07:49.180 - 00:09:06.690
So we looked at 17 features. We found actually that 14 of the features were more frequently recorded within the free text than in coded information.
And often those were the more non-specific features. Things like appetite loss, actually things like weight loss as well and urinary symptoms, whereas actually pelvic mass was pretty frequently coded.
And 40% of bloating, for example, was recorded in free text and not recorded in codes at all. And the model was able to pull out those features.
And when we combined the extracted information from the free text with the information from the coded data, we found that 14 of the features were actually associated with ovarian cancer in the regression models.
Now, if we only used coded information and didn't use the information extracted using the large language models, six of those features were no longer associated.
So really it showed that applying large language models, pulling out those extra features, made a big difference in terms of the associations that we were able to identify.
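As an illustrative aside (not from the interview), the effect of adding free-text symptoms to coded ones can be sketched with a simple unadjusted odds ratio from a two-by-two table. This is a simplification: the study used regression models, and the patient counts below are invented purely so the arithmetic can be followed. The names `two_by_two` and `odds_ratio` are hypothetical.

```python
from typing import Dict, List, Tuple

Patient = Dict[str, bool]  # keys: "case", "coded", "free_text"


def make_patients(n: int, case: bool, coded: bool, free_text: bool) -> List[Patient]:
    """Build n identical toy patient records."""
    return [{"case": case, "coded": coded, "free_text": free_text} for _ in range(n)]


def two_by_two(patients: List[Patient], use_free_text: bool) -> Tuple[int, int, int, int]:
    """Return (exposed cases, unexposed cases, exposed controls, unexposed controls)."""
    a = b = c = d = 0
    for p in patients:
        # A symptom counts as recorded if coded, or found in free text when enabled.
        exposed = p["coded"] or (use_free_text and p["free_text"])
        if p["case"]:
            a, b = (a + 1, b) if exposed else (a, b + 1)
        else:
            c, d = (c + 1, d) if exposed else (c, d + 1)
    return a, b, c, d


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Unadjusted odds ratio (a*d)/(b*c); assumes no cell is zero."""
    return (a * d) / (b * c)


# Invented toy cohort: 10 cases and 20 controls, for illustration only.
cohort = (
    make_patients(2, case=True, coded=True, free_text=False)
    + make_patients(4, case=True, coded=False, free_text=True)
    + make_patients(4, case=True, coded=False, free_text=False)
    + make_patients(1, case=False, coded=True, free_text=False)
    + make_patients(2, case=False, coded=False, free_text=True)
    + make_patients(17, case=False, coded=False, free_text=False)
)

if __name__ == "__main__":
    print(odds_ratio(*two_by_two(cohort, use_free_text=False)))  # 4.75
    print(odds_ratio(*two_by_two(cohort, use_free_text=True)))   # 8.5
```

In this toy cohort, symptoms recorded only in free text are missed by the coded-only analysis, so cases are misclassified as unexposed and the estimated association is weaker.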
Speaker A
00:09:07.730 - 00:09:13.970
And did any of those findings surprise you in terms of the associations from a clinical point of view at all?
Speaker B
00:09:13.970 - 00:09:42.430
I think we focused on 17 features that had been reported in some studies. Some are only reported in a few studies.
Pretty non-specific, not always in NICE guidelines, but I think I was not completely surprised by any. But things like appetite loss, which are more subtle, I was excited that we were able to identify even within a relatively small study like this, because we had access to that free text data and were able to kind of pull out that information.
Speaker A
00:09:43.070 - 00:10:03.630
Yeah.
And I know there has been some work done about how GPs enter information into patient records, and things like symptoms often don't get coded and they are in the free text.
So I'm interested to know your thoughts about what this kind of approach adds beyond these existing symptom based risk tools that might be just based on coded data.
Speaker B
00:10:03.950 - 00:11:27.920
Absolutely. I mean, we know that, certainly for some symptoms, Sarah Price's work has also shown that 43% of symptoms are not coded.
So I think this really chimes with that. I see kind of the use of this in two ways. One is a research context and one is kind of a clinical context.
And in the research context, I think moving towards using these LLMs to pull information out of the record could really be game changing in how we understand disease, how we understand the symptomatology of disease, actually how we understand risk factors as well, which aren't always coded either.
So I think not just for cancer, but applying this across different diseases, we could do some really exciting work looking at risk factors and predictors of disease.
And secondly, in the clinical setting at the minute we use tools fairly frequently, such as QRISK and QCancer, and those are developed based on coded data. They pull coded data from the GP record and then the GP enters extra details. I think there's real potential here to use LLMs to inform those risk prediction models.
So you could have those LLMs extract information and add it to the model. There's a potential here to give more accurate predictions and guidance for GPs.
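As an illustrative aside (not from the interview), the idea of feeding LLM-extracted symptoms into a risk prediction model can be sketched as merging two sets of symptom flags before scoring. This is not how QRISK or QCancer work internally; the logistic form, the weights, and the intercept below are all invented for illustration and have no clinical meaning.

```python
import math
from typing import Dict

# Hypothetical log-odds weights and intercept, invented for illustration only.
WEIGHTS: Dict[str, float] = {
    "bloating": 1.2,
    "abdominal pain": 0.8,
    "appetite loss": 0.9,
}
INTERCEPT = -6.0


def merge_sources(coded: Dict[str, bool], extracted: Dict[str, bool]) -> Dict[str, bool]:
    """A symptom is present if it is coded or was extracted from free text."""
    names = set(coded) | set(extracted)
    return {name: coded.get(name, False) or extracted.get(name, False) for name in names}


def predicted_risk(features: Dict[str, bool]) -> float:
    """Toy logistic model: probability from the summed weights of present symptoms."""
    score = INTERCEPT + sum(
        weight for name, weight in WEIGHTS.items() if features.get(name, False)
    )
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    coded = {"abdominal pain": True}
    extracted = {"bloating": True, "appetite loss": False}
    print(predicted_risk(coded))
    print(predicted_risk(merge_sources(coded, extracted)))
```

With these toy weights, the merged input gives a higher predicted risk than the coded input alone, because the bloating recorded only in free text now contributes to the score.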
Speaker A
00:11:28.080 - 00:11:51.180
And, you know, you mentioned that this is a really exciting area and I think there is a lot of excitement around AI in healthcare at the moment.
Where do you think the opportunities are now in general practice, especially with, as you mentioned, some of the difficulties around accessing free text data and this kind of approach to identifying symptoms or things in the free text that clinicians are entering in?
Speaker B
00:11:51.260 - 00:13:02.640
Yeah, so I think there's a lot of work going on using the free text in different countries.
So in the US already, and Scandinavia and the Netherlands, there's been a lot of work actually using free text and applying natural language processing approaches to kind of do the studies and build that into models. So I think already we're going to start to see potential impacts from this.
There's models out there looking at pancreatic cancer, for example, which have shown free text helps considerably, and there's work, actually probably going to be in the BJGP, also looking at lung cancer using free text.
So I think for me, the opportunities are to start to do these studies that are pretty novel, certainly in the UK, to kind of identify factors that are associated with disease, identify symptoms, but then also move from that to incorporate them into models and put them into practice, which is a challenge, certainly with the infrastructure, governance and other restrictions, and has to be done properly and ethically. But I think if we don't start using this data, now that we have the tools, it's a real missed opportunity.
Speaker A
00:13:02.720 - 00:13:20.540
And what are the next steps for this work?
So you've identified these factors that are associated with an ovarian cancer diagnosis, and as you mentioned, there are a few features that were found above and beyond just what you might expect to find from the coded data. So where are you taking this next? How do you want to put this into practice?
Speaker B
00:13:21.260 - 00:14:05.700
Yeah, so in this work we looked only at symptoms, whether they're present or absent. Actually, large language models can do much more than that.
They can look at duration of symptoms, they can look at information on severity, and that's quite powerful. We can't capture that in coded data at all, and our risk prediction models don't capture that.
I'm really interested in starting to look at that information, see what we can pull out from that and see how it affects risk of cancer.
So I think the next step for this work is to look at that, look at other cancers and diseases as well, and also start to move from a proof of concept that we can do this into building risk prediction models, to actually try and make something that we can move into primary care.
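As an illustrative aside (not from the interview), capturing duration and severity alongside presence could be sketched as a small structured record that a model is asked to fill in. The field names, the JSON layout, and the example response below are all hypothetical; the interview does not describe an output format.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SymptomMention:
    """One extracted symptom; duration and severity are optional extras."""
    symptom: str
    present: bool
    duration_weeks: Optional[float] = None
    severity: Optional[str] = None  # e.g. "mild", "moderate", "severe"


def parse_mentions(raw_json: str) -> List[SymptomMention]:
    """Parse a JSON list of symptom objects into typed records."""
    items = json.loads(raw_json)
    return [
        SymptomMention(
            symptom=item["symptom"],
            present=bool(item["present"]),
            duration_weeks=item.get("duration_weeks"),
            severity=item.get("severity"),
        )
        for item in items
    ]


# Hypothetical model response, written by hand for illustration.
EXAMPLE_RESPONSE = (
    '[{"symptom": "bloating", "present": true, '
    '"duration_weeks": 6, "severity": "moderate"}, '
    '{"symptom": "weight loss", "present": false}]'
)

if __name__ == "__main__":
    for mention in parse_mentions(EXAMPLE_RESPONSE):
        print(mention)
```

Keeping duration and severity as optional fields means a note that only says a symptom is present still produces a valid record, while richer notes can carry the extra detail into later analysis.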
Speaker A
00:14:06.100 - 00:14:12.260
Yeah. And for any GPs listening to this, what's the main thing you'd want them to take away from this current study?
Speaker B
00:14:13.080 - 00:14:44.120
So I think from ovarian cancer perspective, to recognise that actually ovarian cancer has symptoms, they're often subtle, they can occur in early stage disease.
So to be aware of symptoms such as persistent bloating, abdominal pain and urinary changes, and to recognise them, particularly if women have worsening symptoms or it doesn't seem right or there's no obvious cause, and perform investigations such as CA125 and/or ultrasound.
Speaker A
00:14:44.600 - 00:14:50.920
Any other final thoughts that you want to add just based on this work or anything that you want to sort of highlight from this paper?
Speaker B
00:14:51.400 - 00:15:22.150
So, for me, this work shows how much information, how much important information is contained in that free text.
I think from a UK perspective, we need to work with governance bodies, we need to work with data providers to look at how we can make this data available in the UK. The UK has some of the best healthcare resources in the world, but if we're unable to access and use this free text data, it's a real missed opportunity.
It's a real opportunity to use it to benefit patients in the NHS.
Speaker A
00:15:22.630 - 00:15:31.830
Brilliant. Thanks very much for that, Garth. It's a great paper.
So, yeah, I'd recommend anyone listening to go back and have a listen to it, but I just wanted to say thanks very much for your time.
Speaker B
00:15:32.390 - 00:15:33.270
Thank you very much.
Speaker A
00:15:34.630 - 00:15:49.800
And thank you all very much for your time here and for listening to this podcast today. Garth's original research article can be found on bjgp.org and the show notes and podcast audio are at bjgplife.com. Thanks again. Bye.