Artwork for podcast Learning Bayesian Statistics
#120 Innovations in Infectious Disease Modeling, with Liza Semenova & Chris Wymant
Modeling Methods • Episode 120 • 27th November 2024 • Learning Bayesian Statistics • Alexandre Andorra
00:00:00 01:01:39


Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

-------------------------

Love the insights from this episode? Make sure you never miss a beat with Chatpods! Whether you're commuting, working out, or just on the go, Chatpods lets you capture and summarize key takeaways effortlessly.

Save time, stay organized, and keep your thoughts at your fingertips.

Download Chatpods directly from App Store or Google Play and use it to listen to this podcast today!

https://www.chatpods.com/?fr=LearningBayesianStatistics

-------------------------

Takeaways:

  • Epidemiology focuses on health at various scales, while biology often looks at micro-level details.
  • Bayesian statistics helps connect models to data and quantify uncertainty.
  • Recent advancements in data collection have improved the quality of epidemiological research.
  • Collaboration between domain experts and statisticians is essential for effective research.
  • The COVID-19 pandemic has led to increased data availability and international cooperation.
  • Modeling infectious diseases requires understanding complex dynamics and statistical methods.
  • Challenges in coding and communication between disciplines can hinder progress.
  • Innovations in machine learning and neural networks are shaping the future of epidemiology.
  • The importance of understanding the context and limitations of data in research. 

Chapters:

00:00 Introduction to Bayesian Statistics and Epidemiology

03:35 Guest Backgrounds and Their Journey

10:04 Understanding Computational Biology vs. Epidemiology

16:11 The Role of Bayesian Statistics in Epidemiology

21:40 Recent Projects and Applications in Epidemiology

31:30 Sampling Challenges in Health Surveys

34:22 Model Development and Computational Challenges

36:43 Navigating Different Jargons in Survey Design

39:35 Post-COVID Trends in Epidemiology

42:49 Funding and Data Availability in Epidemiology

45:05 Collaboration Across Disciplines

48:21 Using Neural Networks in Bayesian Modeling

51:42 Model Diagnostics in Epidemiology

55:38 Parameter Estimation in Compartmental Models

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan, Francesco Madrisotti, Ivy Huang, Gary Clarke, Robert Flannery, Rasmus Hindström and Stefan.

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.


Speaker:

Welcome to the second ever live LBS episode recorded at StanCon on September 11, 2024.

2

:

In this episode, Liza Semenova and Chris Wymant bring computational biology and

epidemiology to life, making, I have to say, science seriously cool.

3

:

You'll learn how Bayesian statistics and causal inference help in advancing the frontier

4

:

of our knowledge in these fields and enjoy, I hope, the live Q&A with the fantastic

StanCon audience who attended this episode.

5

:

Again, a huge thank you to the organizing committee and to the audience, you folks were

absolutely wonderful.

6

:

This is Learning Bayesian Statistics, episode 120.

7

:

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods,

the projects, and the people who make it possible.

8

:

I'm your host, Alex Andorra.

9

:

You can follow me on Twitter at alex-underscore-andorra.

10

:

like the country.

11

:

For any info about the show, learnbayesstats.com is Laplace to be.

12

:

Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on

Patreon, everything is in there.

13

:

That's learnbayesstats.com.

14

:

If you're interested in one-on-one mentorship, online courses, or statistical consulting,

feel free to reach out and book a call at topmate.io/alex_andorra.

15

:

See you around, folks.

16

:

and best Bayesian wishes to you all.

17

:

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

18

:

Check us out at pymc-labs.com.

19

:

Hey folks, before we start the show, I just wanted to share something.

20

:

You can guess that I love podcasts.

21

:

I have one.

22

:

So I listen to a lot of podcasts and something that happens a lot when I listen to podcasts

is I hear something on a show and I think, that's awesome.

23

:

I gotta write this down.

24

:

But then I miss it.

25

:

I have to pause.

26

:

I have to rewind the episode.

27

:

That's a hassle.

28

:

But...

29

:

That's where I actually discovered Chatpods, and they agreed to sponsor the show.

30

:

It's really cool because whenever I catch a quote then I just tap a button and the app

will instantly transcribe and save that moment for me to revisit later.

31

:

Honestly, that's super practical when you listen to a technical show, because then you can

just do that and then, boom, if you're working on a model, you know, a time series model.

32

:

Boom, you can already have that in Chatpods and you have the moment that you are

interested in.

33

:

So really, now that's how I listen to my favorite shows and I'd love for you to try it

with me.

34

:

So if you're interested, just check the podcast description or search for Chatpods in

your favorite app store.

35

:

And as they say at Chatpods, capture podcast highlights anytime, anywhere.

36

:

Now, onto the show.

37

:

Well people, officially welcome to the second ever live episode of the Learning Bayesian Statistics

podcast.

38

:

Please welcome Liza Semenova and Chris Wymant.

39

:

So let's start.

40

:

What do you want to talk about?

41

:

How to make your guests very, you know, uncomfortable.

42

:

So that's... No, no, no, that was a joke.

43

:

That's fine.

44

:

No, so let's start with your backgrounds.

45

:

So Liza, you were already on the podcast, which was like, what, three years ago, something

like that.

46

:

And you do so many things that I feel like we should update your

47

:

background section, your origin story.

48

:

So, yeah, can you start by telling us how you ended up doing what you're doing today?

49

:

Like, how did you end up using Stan, using Bayesian stats, using PyMC, using a lot of

different stuff to do very interesting research?

50

:

Okay, I don't want to be the buzzkill at StanCon, but right now, my favorite...

51

:

probabilistic programming language is NumPyro.

52

:

And it's not just a matter of taste, but a matter of functionality,

because the work I do these days requires easy integration of neural network architectures

53

:

into a PPL, which of course is possible in Stan as well, but there you need to write the

architecture down by hand.

54

:

And it's easy for a trivial architecture, but once the architecture becomes a little bit

more complicated, you really don't want to be doing it manually.

55

:

So this is the reason I'm not an active PyMC user right now, not a super active Stan user

right now, but rather a NumPyro user for the time being.

56

:

And how did you end up doing Bayesian stats?

57

:

Well...

58

:

I did my PhD in epidemiology and as we have heard yesterday and today, Bayesian statistics

is very prevalent in epidemiology for a ton of reasons.

59

:

The type of epidemiology I was and still am doing is in the space of spatial statistics

where the main tool, the hammer of spatial statistics is

60

:

Gaussian processes.

61

:

yeah, I think that was my introduction.

62

:

And I tried to get out of that world a couple of times by working in Pharma and building

models based on Joe's work, for example, and trying to do other things.

63

:

But somehow this black hole keeps attracting me back again and again.

64

:

OK, thanks.

65

:

Chris, what about you?

66

:

So my undergraduate, my PhD and my first postdoc were all in particle physics.

67

:

I went on a bit of a journey through my PhD of kind of questioning what I was working on.

68

:

When I started my PhD, it was in a big group of particle physicists and I asked to be put

with the most theoretical person there because I thought theory and maths is really fun.

69

:

And then as I was going through my papers, I thought, but the paper I'm

writing isn't really changing anything.

70

:

I don't feel like it's that useful.

71

:

And so I kind of moved, I just kind of crept towards the more useful, what I consider the

more useful end of the discipline, namely where experiments were being done to try and

72

:

confront these, all these different theories of physics with the experimental data to see

which ones are ruled out.

73

:

There wasn't really any signal at the Large Hadron Collider for new physics beyond the

Higgs boson, which everybody had been expecting.

74

:

And so this kind of stepwise, I want to do something a bit more useful, a bit more useful,

eventually ended up with, well, if I broaden my horizons beyond particle physics, I can

75

:

think of things that I

76

:

consider to be a lot more useful.

77

:

And somebody in my group worked with or was friends with somebody who did mathematical

modeling of HIV.

78

:

And I had no idea that something that sounded so interesting and something so applied

existed.

79

:

And so I got in contact with that guy and did a month long free internship just to test

the water.

80

:

And then I started my postdoc.

81

:

by that point, I thought I want to get out of this field.

82

:

And then got in touch with my current boss.

83

:

And I'm still with him 10, 11 years later.

84

:

doing infectious disease epidemiology.

85

:

So mostly HIV and some COVID stuff as well.

86

:

Huh, okay.

87

:

Yeah, that's fascinating.

88

:

I didn't know you started with physics, so you can see the amount of background work

that's going into the episodes.

89

:

No, but that's really fascinating.

90

:

So you add to the cohort of ex-physicists who are doing amazing things in the Bayesian

world.

91

:

Yeah, I sometimes feel in quantitative disciplines, physicists are like rats that you're

never more than two meters away from even if you didn't know about it.

92

:

Yeah, and I mean, so I guess you understood the topic for today.

93

:

That's like yesterday, that was the nerd panel.

94

:

We talked about samplers and tuples and very technical things.

95

:

Today, we're gonna...

96

:

It's a dummy panel?

97

:

And today, we're going to see basically how to apply that on real data and what are the

fascinating things that you guys are doing.

98

:

And I think that's awesome because that's going to make science look really good, which is

also the goal of this podcast, make better educational scientific content.

99

:

If you have any questions for Liza and Chris, write them down and in the last 10 minutes of

the show you'll be able to ask them whatever you want.

100

:

Again, the questions are recorded, the sound, so your voice, but you won't be filmed, but

you will be recorded and you will get to be, if you ask a question, in one of the episodes

101

:

of Learning Bayesian Statistics, so you know, that's something to brag about.

102

:

to somebody who knows what that is.

103

:

So anyway, so write down your questions, blah, blah.

104

:

So let's start diving a bit more.

105

:

Actually, talking with you guys before the show, I realized there is something very

fundamental that I didn't know and understand.

106

:

It's that there is a difference between computational biology and epidemiology.

107

:

For me, it was kind of the same thing.

108

:

So maybe can you explain what the difference is and what you guys do actually in these

realms?

109

:

Yeah, surprise.

110

:

So epidemiology generally is a science about health and health in the most generic sense

possible.

111

:

It could be physical health, could be mental health, could concern infectious diseases.

112

:

It could concern non-communicable diseases.

113

:

It could concern health in a particular region.

114

:

could be epidemiology of a particular region or a particular disease.

115

:

Or it could be global health, looking at the distribution of health on a very large scale.

116

:

if you like, epidemiology is a macro science, while biology is looking rather into tiny,

tiny details.

117

:

I think actually if you pay attention, most of the epidemiologists you would meet, they

wear glasses.

118

:

Because we can't see things really well.

119

:

We just more or less look at the globe like, okay, there's malaria in this part of the

world.

120

:

There is dengue in this part of the world, more or less.

121

:

Biologists are not like that.

122

:

Biology, I feel, is much more of a precise science.

123

:

They look into details of things.

124

:

Mitochondria and this cell and that cell, how does this cell become a brain cell out of a

stem cell and so on.

125

:

So they try to understand the world at a micro level.

126

:

Yeah, I would agree with that, except I think there are sort of biological areas of study at

higher scales.

127

:

I'm just less familiar with them.

128

:

So biologists would study ecosystems and, you know, yeah, just other things beyond

the microscopic level.

129

:

But some of the things you're studying might not have any relevance to human

health.

130

:

So you might be studying sort of what's happening in a given animal, or even a plant, and just

understanding how that function works.

131

:

Whereas epidemiology is almost always focused on humans, I mean you can do veterinary

science as well, but it's related to health and health at the individual level, the

132

:

population level, but it's about understanding, there's very commonly an applied aim, we

want to understand what's good health, what's bad health, so that we can improve health.

133

:

Whereas biology, I think, at a basic level is just about understanding these processes,

whatever they are.

134

:

And computational biology is of course just the computational side of that. And in both

disciplines, biology and epidemiology,

135

:

there's work which is not computational

and not even quantitative.

136

:

So you can do qualitative work, which is very important, particularly with epidemiology,

you know, understanding people's attitudes to healthcare, people's attitudes to certain kinds

137

:

of new treatment, getting those kinds of things in place and understanding cultural

differences with interventions coming in can be really important to make these things

138

:

work.

139

:

Absolutely.

140

:

Yeah.

141

:

So the field of epidemiology actually spans several sciences.

142

:

with social sciences on one end, passing through economics and so on and so on, ending up

at the statistics and machine learning and computational epidemiology.

143

:

Okay, thanks.

144

:

That's really useful and fascinating.

145

:

I'm curious if your background in physics is actually helping you in this new field,

because it sounds to me like it would, but I'm curious if it does and

146

:

If it does, how?

147

:

I think the short answer is no.

148

:

In that, helping compared to what? Compared to a counterfactual of

having had a PhD in epidemiology, you know, I don't think it is as useful.

149

:

But I think that's sort of the reason you see physicists everywhere in this type of work

like rats is that an education in physics teaches you how to describe phenomena

150

:

quantitatively, to make predictions, to understand the mechanisms.

151

:

And then if you can take that kind of skill set to different phenomena.

152

:

then that can still be useful.

153

:

That's, I think, why it is useful, being able to describe things using maths and the

skill set associated with that, like coding and making plots and sort of understanding the

154

:

relationships with things.

155

:

So at that level, it was helpful.

156

:

Okay.

157

:

So, if, if like someone from high school, you know, went to you and asked you if she

should pursue a degree more in epidemiology and computational biology to do the kind

158

:

of work you're doing, would you recommend doing that or would you say, well, physics is

very useful because you're going to learn these building blocks basically of exactly what

159

:

you describe and then you can apply that to any field.

160

:

I think it would depend what area of epidemiology you wanted to go into.

161

:

So if you wanted to go into this kind of qualitative work and understanding the cultural

differences, obviously the background in physics is helping you not at all there.

162

:

If you wanted to end up in the area of epidemiology,

163

:

the epidemiology I work in, I think physics isn't bad, but a more useful route would be through

something like mathematical biology or biostatistics, to kind of learn the methods and

164

:

some of the kind of the areas you're going to be applying them to at the same time.

165

:

Anything to add, Liza?

166

:

You don't have to, but it's just I'm checking before.

167

:

My default answer if you don't know what to study, study maths.

168

:

Yeah.

169

:

You can't go wrong with that.

170

:

Yeah.

171

:

Yeah.

172

:

Especially algebra.

173

:

Yeah, but something I'm wondering about then is how do Bayesian statistics fit in

all that?

174

:

You know, why are you folks even using Stan or NumPyro or PyMC?

175

:

Why are you even interested in being at these kinds of conferences like StanCon?

176

:

And I think that's going to give us and the audience a better

177

:

concrete idea of what you're doing every day.

178

:

So, just to give a...

179

:

So far I've been talking about epidemiology.

180

:

work in...

181

:

Both of us work in infectious disease epidemiology.

182

:

Me completely, I think you partly.

183

:

Anyway, so I work completely in infectious disease epidemiology and just to clarify what

that is, so it's of course epidemiology of infectious diseases.

184

:

So you have infectious diseases and non-communicable diseases, ones that don't spread from

one person to another via a pathogen.

185

:

And so for infectious disease epidemiology...

186

:

We're interested in the infection process at lots of different levels.

187

:

So when pathogens get inside our cells and inside our organs and our bodies and our

households and our workplaces and our cities, countries and the whole world.

188

:

So there's kind of lots of processes going on at lots of different levels.

189

:

And we want to understand those a lot of the time using quantitative data.

190

:

So sort of, you know, the most sort of familiar level would be the individual level.

191

:

So when an individual gets an infection.

192

:

what's the probability of certain outcomes happening?

193

:

So getting this symptom or not getting it would just be kind of a single probability

parameter, but conditional on getting symptoms or a certain set of symptoms, when would

194

:

you get them?

195

:

So then you start to think about timing distributions, and there's lots of those in

infectious disease epidemiology.

196

:

So how long after I get infected do things happen?

197

:

Am I getting symptoms?

198

:

Am I getting hospitalized?

199

:

Am I dying?

200

:

And am I transmitting to somebody else?

201

:

So you have all these kind of timing distributions.

202

:

And you have observations of those, sometimes censored, sometimes incomplete.

203

:

And so getting estimates of what these distributions are is clearly a question for

statistics.

204

:

And a lot of the time studying the dynamics as well.

205

:

So one of the key differences between infectious disease epidemiology and the epidemiology

of non-communicable diseases is the dynamics, essentially, which is that when you have a

206

:

process that spreads from person to person, that naturally gives rise to exponential

dynamics.

207

:

until something interrupts it, like an intervention or population building up immunity,

whereas non-communicable diseases don't have those exponential dynamics.

208

:

mathematical models for the dynamics of the system are very different between the two

things.

209

:

But you still want to estimate those a lot of the time using statistical models.

210

:

For example, we had this talk earlier, I think, from Judith about estimating the R number

over time, so the average number of people I pass the disease on to, and they pass the

211

:

disease on as well.

212

:

So what is that number?

213

:

How does it change over time and in response to what?

214

:

So a lot of these are statistical questions.

215

:

So yeah, to complement this answer, long story short, Bayesian statistics is a great way,

A, to connect models to the data, B, to get uncertainty, C, to allow your models to be as

216

:

complex, well, within limit, of course, a reasonable limit, as complex as you would like

them to be.

217

:

And what is interesting is that, yes indeed, there is separation in epidemiology between

infectious disease and NCDs, non-communicable diseases, but in terms of modeling, there is

218

:

also some overlap that exists.

219

:

So how do we model infectious diseases, right?

220

:

For A, there are compartmental disease transmission models, which tell us how...

221

:

how do agents, individuals move from one compartment to another.

222

:

B, there are agent-based models where we model every individual by themselves and then try

to compute summary statistics to also fit them to the data.

223

:

There are semi-mechanistic models which sit somewhere in between.

224

:

And there are spatial models which might or might not consider the temporal component.

225

:

So.

226

:

What's happening in the NCD world, the non-communicable diseases?

227

:

Okay.

228

:

ABM is probably not very appropriate.

229

:

Semi-mechanistic models not very appropriate.

230

:

Spatial models, they don't care about the nature of your data.

231

:

They are spatial, right?

232

:

They don't know whether the spatial pattern is present due to how an infectious disease

was developing in a population or due to...

233

:

some environmental exposures that caused the pattern of this type of cancer, for example,

in space.

234

:

So spatial models are number one.

235

:

That is one common denominator.

236

:

And second, also the state space models or compartmental models.

237

:

Turns out they're useful in the non-communicable space as well.

238

:

Because,

239

:

rather than viewing each compartment as all the people together, which are in the same

state, we can view this as a state of one person during the course of their disease or

240

:

condition.

241

:

Okay, Chris, you have...

242

:

What's a recent project you've been working on that you're particularly excited about that

you can share with us so that...

243

:

then that gives us an even more concrete idea of what your job involves every day.

244

:

So for the benefit of the listeners of the podcast and with apologies to the audience who

will hear this again tomorrow, the most interesting

245

:

statistical project I've worked on is about understanding how having two HIV infections is

different from having one HIV infection.

246

:

Because once you've got HIV, you have it for life, but you could get infected

again.

247

:

Or sometimes when you're infected in a single event, you get two quite different viral

particles and both of them go on to establish a productive infection.

248

:

So you can be in this state, which might persist or might disappear over time, of having two

viruses simultaneously.

249

:

And there's been lots of work, lots of studies done on trying to figure out, is that worse

for you?

250

:

If you had to pick an answer, even if you didn't know much about it, you might think it's

probably not better for you than having HIV just once.

251

:

And so people are trying to estimate:

252

:

Is it worse for you, and if so, how much?

253

:

And so we came to this question with a lot more data than has been used previously.

254

:

The way we were able to have a lot more data was by using next generation sequencing

instead of more traditional older Sanger sequencing methods.

255

:

So it's kind of high throughput sequencing methods, which means you can get lots of

samples through.

256

:

You can have bigger sample sizes.

257

:

But it means the data is a bit noisier.

258

:

And so you have to think a bit more carefully about interpreting it.

259

:

And so something that

260

:

I thought was particularly interesting to use Bayesian stats for here is we built a simple

causal model, sort of linking together a number of covariates with causal effects.

261

:

And so obviously the different variables are affecting each other.

262

:

And so it's important to propagate uncertainty through this kind of model because the

uncertainty in one part is relevant for the uncertainty in another.

263

:

And so getting an overall answer, like to what extent does my immune system decline more

quickly, if at all, when I have two HIV infections,

264

:

is a function of many different parameters in the model.

265

:

And so it's nice to use a Bayesian model in this context to bring that uncertainty

together properly.

266

:

OK, yeah.

267

:

Yeah, that's fascinating.

268

:

Preparing for the episode, you were kind enough to share with me some of the material

you're going to show tomorrow.

269

:

Definitely recommend coming and checking out Chris's talk, because that's going to be

fascinating.

270

:

And we'll put it, of course, in the show notes of that episode for the podcast listeners.

271

:

Before moving on to Lisa in your project, I'm curious, what does a simple causal model

mean in your field?

272

:

Well, I use the term simple in the context of this conference, in the sense that it has

maybe seven overall types of variables.

273

:

I think it's 50 parameters estimated numerically in a model, but I have seven nodes in my

DAG, essentially.

274

:

I have certain kinds of predictors like age and sex, which are often

275

:

relevant in epidemiology, coming in together with the genetic data.

276

:

So after we've sequenced the person's virus, we determine, you know, what

genetic sequences there are, and there are many of them in a given person.

277

:

So how do we connect that together with clinical longitudinal data about how their

infection is progressing together with these predictors?

278

:

So how do you kind of link these things up?

279

:

So there's only about seven variables, seven classes of variable overall.

280

:

So the DAG only has seven things in it, so that's simple for me.

281

:

Yeah, yeah, yeah, indeed, indeed.

282

:

But that's

283

:

That's pretty incredible.

284

:

I love that that simplicity still gives you a model that can be extremely powerful and

then that can be interpreted causally.

285

:

So, last question.

286

:

After that, I give you the stage, Lisa, but I'm curious about your workflow in these

cases.

287

:

Like, are you, like, how do you work?

288

:

Are you setting up the DAG yourself and then you go to other domain experts and you like

basically

289

:

test your DAG on them?

290

:

Or are several of you setting up the DAG?

291

:

And when are you satisfied enough with the DAG to say, that's a good DAG, now we can go

build that in Stan?

292

:

So in this case, we actually intuited our way towards the likelihood first, and then I

realized only afterwards, you know, what DAG does this correspond to?

293

:

And so yes, those two things make sense together.

294

:

within our team and particularly with the network of collaborators we have around us

who've contributed to the data, we are the domain experts, I guess.

295

:

So it's not, as I'll clarify a bit more tomorrow, I don't consider myself a statistician,

I'm an infectious disease epidemiologist.

296

:

So we're coming with the domain expertise and then sort of learning the stats we need to

then make these models work.

297

:

Okay, okay.

298

:

That's awesome.

299

:

I'd really love to see one of these team meetings.

300

:

That must be absolutely fascinating.

301

:

Liza?

302

:

What about you?

303

:

What have you been up to recently?

304

:

What are you especially excited about?

305

:

And an example that you can share with us.

306

:

There are three directions which are very close to my heart right now.

307

:

And I'll start with the one that I've been working on for the last two or three years, I

think.

308

:

And that is building emulators using deep generative models.

309

:

So what I mentioned already today once, an example is trying to build an emulator for

quantities which are computationally very unpleasant.

310

:

In epidemiology, what are examples of unpleasant quantities?

311

:

They're Gaussian processes.

312

:

They are systems of ordinary differential equations.

313

:

We really don't like them.

314

:

We don't like them within every MCMC step.

315

:

So we would like to get rid of them or we would like to

316

:

create a quantity, a surrogate that quacks like a duck but is not a duck.

317

:

So something that behaves like that quantity of interest but does not have all the

computational burden.

318

:

So that's one direction, building emulators.

319

:

Second is again about Gaussian processes.

320

:

I told you this is a black hole.

321

:

That is building Gaussian processes on graphs.

322

:

because I think if we started talking about Gaussian processes now, the whole conversation

would be mostly around this kernel, that kernel, that representation, that representation,

323

:

but all of that most likely would concern R^n, so building Gaussian processes in maybe

multidimensional but real-valued space.

324

:

However, in epidemiology, very often we encounter network data, graph data.

325

:

So imagine the start of COVID, we observe cases in a couple of countries, and we would

like to predict what's going on in other countries.

326

:

And it's not really appropriate.

327

:

And some countries are separated by ocean.

328

:

We can't just take geographical coordinates of those countries and say, this is the

distance between the United States and the United Kingdom.

329

:

That's why we will now infer what's going on in the US based on what's happening in the

United Kingdom.

330

:

Still, we would like to use some notion of similarity, maybe airline data, maybe, I don't

know how many sharks based on Viannese talk, swim every day from New York to London.

331

:

And it turns out some very clever people worked out the maths not so long ago, in 2020,

but it's a similar story to the HSGP paper, I guess.

332

:

There was a paper laying out the theory, but then as a practitioner, you're like, but what

do I do with it?

333

:

Staring at this maths, how do I implement it?

334

:

How do I choose the priors?

335

:

yada, yada.

336

:

So that is the project which I'm mostly excited about right now.

337

:

There's two of us, with the brilliant collaborator Slava, Viacheslav Borovitskiy, who is the first

author of the paper.

338

:

Big shout to him.

339

:

I don't have anything to show for it yet, but it's working.

340

:

Once we are ready, we will share.

341

:

How do you know it's working?

342

:

MCMC is finally running and effective sample size is larger than one, which was not the

case all the way throughout.

343

:

just by running the model and comparing, say, raw data to our layers, we figured out there

was one country that was just sticking out.

344

:

The data did not look right.

345

:

And then it was France.

346

:

Of course, probably a strike.

347

:

Yeah, but then we went and checked the data and indeed either the data is corrupted or

there's something strange going on, but we understood it by having run the model, not by having

348

:

investigated the data.

349

:

What does it say about us as modelers?

350

:

Well, you're the judge.

351

:

But yeah, that's the second project.

352

:

And the third, I can't say it's one project, it's an overall direction that is

353

:

sequential data collection, and that's related to iterative methods that Anna and I will

cover in the workshop on Friday.

354

:

Can you already tell a bit more?

355

:

Because I won't be here on Friday.

356

:

Sure.

357

:

So the context of surveys is that we would like to make judgment about populations, say we

would like to estimate one quantity about a population, say the average

358

:

wealth of everyone in this room, but it's impossible to survey absolutely everyone.

359

:

So we would like to collect a representative sample of everyone who is in this room.

360

:

So we would like to reach out to poor people and rich people and sample a little bit from

each of the groups and then compute, say, an average or an aggregate.

361

:

And there are in real life certain issues associated with that.

362

:

First of all, non-response.

363

:

So if we go out to very rich people, who probably are not declaring their taxes properly,

they might not be willing to tell us how much they earn or what their total wealth is.

364

:

Similarly with poor people, they might be coming from marginalized populations, they just

are not cooperative, they do not like organized research.

365

:

So our estimates, hence, will be biased.

366

:

This phenomenon that I now qualitatively described quantitatively means that there is a

correlation between the recording mechanism, basically who and where we sample, and the

367

:

response.

368

:

So whether you are rich is correlated with the fact whether you give us an answer or not.

369

:

That is a big problem.

370

:

And surveys often are designed statically.

371

:

Basically, we decide who and where to sample before we go to the field.

372

:

And the idea is to use sequential methods to try and solve those problems as we go.

373

:

It's a little bit like building a plane as you fly, but it does have benefits.

374

:

So we can go and field a batch of samples, collect a little bit of data, come back home,

run our Bayesian model.

375

:

see where we are certain, where we are uncertain, and then use the exploration-exploitation

trade-off, meaning if we're already certain about a subpopulation, do we

376

:

really need to go and ask them again, or do we rather go and spend our budget where we are

uncertain about that subpopulation?

377

:

Where we would like to learn more.

378

:

Okay, yeah, I mean, that sounds fascinating, at least for the political scientist nerd

that I am.

379

:

I started like that, so it's very, very close to basically all the issues that polls are

having.

380

:

So I'll definitely study and watch this tutorial.

381

:

Thanks, Liza.

382

:

Actually, once it's available, we should put that in the show notes for that episode.

383

:

And also, if you can send me the papers you mentioned, we should definitely put that in

the show notes.

384

:

Thanks.

385

:

Another question for the both of you, let's start with you, Chris.

386

:

I'm curious, what are your most significant challenges when you're developing a model like

the kind of model you told us about a few minutes ago?

387

:

So I think the main thing that was a time sink so far, something I will explain in more

detail tomorrow, has been that in working with these...

388

:

datasets with longitudinal data from many individuals, so 2,600 individuals in this

project, using that many numerical parameters for individual specific random effects, I

389

:

think wouldn't be feasible in this case.

390

:

And so I did analytical marginalization, so I'm calculating some of the integrals myself,

and then Stan doesn't have to know about the values of these parameters.

391

:

And so it's the time involved just in doing the maths, and then if you want to know what

the posteriors for these things are,

392

:

you have to kind of undo that maths afterwards.

393

:

And so doing this and testing it on simulation to make sure you're getting it right and

getting some of the linear algebra with the matrices, you know, I spent some time chasing

394

:

up a bug I'd made in the maths.

395

:

So that had been a time sink.

396

:

But otherwise, it's been relatively smooth sailing for what has been sort of one of my

first Bayesian projects and definitely the biggest.

397

:

And so I guess I was just quite lucky that things tended to work pretty much as I went

along.

398

:

But testing on simulation, as many people have said, has been critical to catching bugs at

each level of adding more complexity to the model.

399

:

And it's often something I would never have caught otherwise, like putting the standard

deviation instead of the variance in the function call or something like that.

400

:

OK, yeah.

401

:

So basically, simulating data, running the model on that, and that helps you debug that

nasty bug that was hidden somewhere in the code.

402

:

Yeah.

403

:

Thanks, very practical.

404

:

I like that kind of tip.

405

:

Liza?

406

:

I think coding bugs definitely, but also speaking very different languages with people

from the domains.

407

:

So coming to survey design, there is literature on surveys which speaks more of a very

practical and statistical language.

408

:

They write down...

409

:

with a sum sign, they have also set up their own jargon, such as sampling frame and many,

many other words which I did not know the meaning of.

410

:

On the other hand, there is computer science literature and they talk all the time about

information gain and formulate everything in terms of information.

411

:

And then you understand, well, the solution you're looking for is somewhere in that

literature.

412

:

but because it's written using completely different language and jargon, it is absolutely

impossible to compare apples with pears.

413

:

Yeah, so I mean, I noticed from your answers that it seems like using Stan or any PPL is

actually not really your bottleneck in these cases, which is amazing to hear, nonetheless.

414

:

I'm going to ask you, if Stan or any PPL you're using could make your life easier, what

would that look like?

415

:

So far I don't know and I think it's because I haven't used Stan enough yet to come up

against any limitations really.

416

:

A key point of learning was something which Stan does handle already.

417

:

What's normally said is you can't use discrete parameters in Stan, or I guess any Hamiltonian

Monte Carlo or...

418

:

maybe within reason if other languages let you explore a few different parameter spaces

separately.

419

:

But in Stan you can't, it doesn't natively support discrete parameters, but if your

discrete parameter corresponds to which different process or which different component of

420

:

a finite mixture model is contributing, then it effectively does and it's just that you

need to do the maths to handle that.

421

:

So that was sort of what I initially perceived to be a limitation of Stan, but then you

just need to know the right way of dealing with it and then it's fine.

422

:

No suggestion for me yet.

423

:

For me, I think it comes back to the first answer, that's integration with neural network

libraries.

424

:

As long as I'm able to combine easy ways to formulate an architecture that I would like to

use within a PPL, then bingo, I'm sold.

425

:

OK.

426

:

And what are the neural network libraries you go to?

427

:

PyTorch and JAX. In particular, because I work now with NumPyro, JAX is our best choice.

428

:

Having said that, it's a little bit fiddly.

429

:

So what people do, they try to build a neural network in PyTorch to figure out all the

details of the architecture.

430

:

And then once they have the final answer, they translate it to JAX.

431

:

Okay.

432

:

So, of course, you're,

433

:

like, working in a field that we've heard a lot about, at least since COVID, right?

434

:

So it doesn't happen that much that you have like such a big event that I'm guessing has a

lot of consequences in your fields.

435

:

So I'm curious if you have noticed any, yeah, any new trends post-COVID in the work you're

doing.

436

:

in the people you were able to talk to, were able to reach, things like that.

437

:

I don't think it's changed much.

438

:

It was through COVID that I got more into statistics.

439

:

I'd done very little statistical work before that.

440

:

So it's been a change for me in terms of what I've worked on.

441

:

It meant a change in the level of news coverage our papers received in a way that was kind

of uncomfortable.

442

:

I decided at the very start of the pandemic, didn't want to have a sort of, I didn't want

to...

443

:

engage with reporters, partly because I felt like there's so many people who know much

more about it than me.

444

:

I don't want to contribute noise, partly because I thought another white male being a face

of something that could already do with more diversity wouldn't be helpful.

445

:

And partly just feeling not comfortable in that scenario.

446

:

each paper coming out and getting a lot of news coverage was just very strange.

447

:

That's definitely died down now.

448

:

I see things are sort of going a bit more back to normal.

449

:

In terms of changing the work, I would imagine

450

:

quite a lot of people decided it was an interesting thing to go and work in or a useful

thing to go and work in, having lived through some of the negative experiences of not

451

:

controlling infectious diseases.

452

:

I haven't seen that sort of feed through very much yet in terms of like new people coming

into the group, things have been relatively stable for us.

453

:

But I imagine that is a kind of a macro level trend.

454

:

Yeah, yeah, I was going to ask you that because I was talking with Bob Carpenter yesterday

and he was telling me that in computer science, for instance, like the

455

:

number of students just boomed in NLP and all these fields.

456

:

So I was curious if you had also seen a boom like that in the number of students that

joined your fields.

457

:

Liza, maybe on my first question, basically, did you notice any new trends?

458

:

How do you feel about that?

459

:

Or are there also things that you'd love to see but that you don't see yet?

460

:

First, a very personal trend: after people ask me, what do you do, what's your profession?

461

:

And I say, epidemiologist, they stopped asking back, what is it?

462

:

Second trend is data.

463

:

Never ever in epidemiology have we had so much data which is so precise, global, of such high

quality, and also so much international cooperation.

464

:

So databases were created where global data was collected, which of course enabled

research which was not possible up till that point at this scale, on this level.

465

:

Okay.

466

:

Yeah.

467

:

And what about funding?

468

:

Did that make it a bit easier to get funding?

469

:

I've not been involved in any grant applications yet.

470

:

I'm happy staying as a postdoc forever.

471

:

I like spending all my time just doing the analysis and somebody else thinks about money.

472

:

That's great for me.

473

:

So I haven't had personal experience of it.

474

:

I think I heard during the pandemic that something like a billion was promised in the US

for viral genomics.

475

:

And the Biden administration created, I think, a new center for forecasting analytics for

infectious diseases.

476

:

in DC and they've received a lot of money which they're distributing throughout the US to

do a lot more mathematical modeling of infectious diseases.

477

:

So I think that a lot more money has become available.

478

:

I haven't seen it personally, but I think it's around.

479

:

Can I add to my previous answer with a specific example of data?

480

:

So here in the UK an unprecedented survey has been conducted, and that's REACT.

481

:

It's a national-level survey, which is aimed to be representative, and it was taking place in

several waves.

482

:

And it has served now as a...

483

:

So it laid the foundation for surveys which can continue.

484

:

So that only happened due to COVID and during COVID.

485

:

Yeah, okay.

486

:

What was the last question?

487

:

No, it's just, you know, curious if there was something you had helped...

488

:

Yeah, yeah, funding also.

489

:

Sorry, funding, yeah.

490

:

I think I was the lucky one to ride this wave because I was holding a fellowship until I

started my last job here at Oxford, funded by Schmidt Sciences.

491

:

And that was, I think, due to the timing.

492

:

So they're very keen on applications of AI/ML and adjacent methods.

493

:

in real life.

494

:

So sciences which have impact in real life.

495

:

So I guess my pitch came at the right time of AI for epidemiology.

496

:

I guess Gaussian processes qualify as AI, right?

497

:

Yeah.

498

:

Thankfully.

499

:

Do we have already some questions?

500

:

Thank you very much.

501

:

My name is Mpatswa, I'm in the third year of my PhD.

502

:

My question goes mostly to Chris, but I think Liza had some really good points that she

might also want to contribute on.

503

:

So I'm currently at Imperial, your former institution, Chris, and I've loved both your

explanations and definitions for what epidemiology is, but I think you've downplayed how

504

:

much physicists and people from other backgrounds bring to epidemiology.

505

:

So I come from a medical background and then did a masters in epi, but then I've been blown

away by how much, to borrow a term that I've heard from someone's podcast called

506

:

Free Associations, how much people feel free, you know, to make causal inferences

about, you know, small surveys that they've done.

507

:

So the clarification in the thinking when you bring together, you know, your DAG thinking,

and you force people almost to connect the mechanisms, you know, in a principled way

508

:

and then make inferences.

509

:

I found that really helpful and mind-blowing.

510

:

So my question is, dealing with people from other domains, which Liza also commented on,

medical people and epidemiologists specifically, what advice do you have for someone like

511

:

me to be more toned down and more humble about how we connect data and make conclusions

about things in the world?

512

:

Thank you.

513

:

I guess the first thing that leaps to mind is don't be more humble because your experience

is so valuable.

514

:

But collaboration is key, right?

515

:

So it's there in the background or the foreground with a lot of Bayesian analysis that

domain expertise is so relevant.

516

:

Not just what functional form do I choose for my prior, where do I concentrate it, but

what kind of likelihood, what kind of data should I be looking at, what question should I

517

:

be trying to answer.

518

:

So domain expertise, so making sure you collaborate with people.

519

:

You can decide for yourself what kind of question you want to work on.

520

:

And then if you're not the expert in sort of how that kind of process works, how that kind

of data works, how it's generated, work with the people who are.

521

:

If you're not the expert in the kind of methods you need to analyze that data and draw

conclusions, work with the people who are.

522

:

So I guess, yeah, don't be humble, but value what you can bring, but value what everybody

else can bring as well.

523

:

Liza?

524

:

Anything to add?

525

:

Okay, that was another question.

526

:

Yeah, I think you were before.

527

:

This is for Liza.

528

:

Can you just speak more about what your requirements are for a deep PPL, or how do you use the deep

learning part of the framework?

529

:

Yes.

530

:

Okay, let's talk about surrogates for a little bit.

531

:

Let's write an imaginary PPL program here.

532

:

Define a model where I have some outcome data and the last line is the likelihood.

533

:

So the last line is y is distributed as something something likelihood, but then in this

likelihood I have some difficult term, a Gaussian process.

534

:

I don't like it, so let's say I want to model on a grid of a million by a million.

535

:

If I actually run MCMC on this program, then at every step of MCMC, it will have to

deal with the Gaussian process.

536

:

In fact, the one line above the likelihood is the sampling statement where I sample from

the prior of the Gaussian process.

537

:

So I say F is distributed as a multivariate normal with

538

:

a covariance matrix K.

539

:

So what I want to be able to do is to write the exact same program, just scratch out this line

where F is distributed as something, and write instead that F hat is distributed as something

540

:

else very simple, where F hat though looks very close to how F would look.

541

:

So where do I sample this f hat from?

542

:

The samples of f hat are given by a pre-trained deep generative model.

543

:

And what is the structure of the deep generative model?

544

:

And we need to go back to the different ways to parameterize multivariate normals.

545

:

So we want to use a non-centered parameterization where, actually, to sample f,

546

:

instead of, sorry, f is distributed as multivariate normal, we'd rather write

f equals Lz, and we write one line above: z is distributed as standard normal.

547

:

Sampling from standard normal is not hard, right?

548

:

They are all uncorrelated, z does not depend on any parameters, wonderful, so it doesn't

get better than that, z, standard normal.

549

:

Next line, f equals Lz,

550

:

L is really problematic.

551

:

It's that Cholesky factor, cubic complexity, includes all the GP parameters.

552

:

So this is the line we don't like, F equals LZ.

553

:

So F hat then equals some function phi of Z.

554

:

Z is again very nice, no problem sampling from Z, but this phi now is a neural network.

555

:

So basically we've pre-trained a neural network phi which learns

556

:

how to pass a random but simple vector z through a deterministic function phi to create

draws of priors for f hat that look very similar to f.

557

:

Does it make sense?

558

:

Yeah, yeah, no, well done.

559

:

I think that was really impressive to have that without the blackboard.

560

:

So you're basically like

561

:

approximating a Gaussian process with the neural network.

562

:

Precisely.

563

:

And now let's perform this mental exercise.

564

:

You say, I don't care about spatial statistics.

565

:

I don't need Gaussian processes.

566

:

I don't care.

567

:

I'm Julien Riou.

568

:

I'm Judith.

569

:

I'm Nicholas.

570

:

I care about mechanistic disease transmission models.

571

:

How can you help me?

572

:

Well, and I can say, all right, let's write your PPL how you usually do it.

573

:

What do you do?

574

:

Okay, always start writing a PPL from the last line.

575

:

Bad habit.

576

:

That's how I do it.

577

:

Okay, last line is always likelihood.

578

:

Y is distributed as something.

579

:

Let's model now virtually the number of daily counts of a particular disease.

580

:

Y is distributed as Poisson with intensity lambda.

581

:

Lambda is a function of t, lambda of t.

582

:

would have the, okay, so Poisson where the mean is i of t, where i is the function which I

got as a solution of the SIR model, right?

583

:

So I write a very complicated ODE here, S prime equals, I prime equals, R prime equals,

then within every MCMC step, I have to solve the system of differential equations

584

:

with three compartments, best case, just to get this one tiny solution, one compartment

out, I of T, plug this into my likelihood.

585

:

So why on earth was I solving that whole ODE, right?

586

:

All I need is I of T.

587

:

What do I do?

588

:

I say, PPL, wait a second.

589

:

I'll go pre-train a neural network.

590

:

What do I show through the neural network?

591

:

I show it many solutions.

592

:

I give it one solution of ODE after the other, after the other.

593

:

and I don't need to give it all three compartments.

594

:

I only need to show it the solution I of t.

595

:

Again, I pre-train that model, call it phi of t.

596

:

Coming back to my PPL, I scratch out the ODE, saying y of t, daily counts, is distributed as

a Poisson distribution with mean i hat of t, where i hat is the pre-trained neural

597

:

network.

598

:

Impressive.

599

:

Liza Semenova, ladies and gentlemen.

600

:

I think there were still like two questions.

601

:

Do we have time?

602

:

Yeah.

603

:

Thank you.

604

:

I have one question for you each.

605

:

So for Liza, the top one.

606

:

So imagine you're the modeler who needs to handle all three models, like agent-based, and

semi-mechanistic, and compartmental.

607

:

I think you already are.

608

:

So what are some characteristics of good test diagnostics for you?

609

:

The context of this question is that different data generating processes require different

evaluation measures for the model.

610

:

And even conditional on one model, say, a compartmental model for SIR, diffusion of an idea

and diffusion of a pathogen should differ in how we evaluate the fit of the model.

611

:

So I'm just generally curious, since epidemiology seems to be where different

modeling philosophies come and convene, just because it's a macro field, I want to ask about some

612

:

philosophy behind, or what test diagnostic would be most comfortable for you?

613

:

useful for you.

614

:

And for Chris, sorry.

615

:

What's your workflow for choosing which parameters to estimate as opposed to assume?

616

:

For instance, reproduction number is a ratio of two different parameters.

617

:

And I always get confused which to set as assumed and which to set as estimated.

618

:

One very easy way to frame this is, do you first start by estimating every

619

:

parameter, and then somehow lower the uncertainty of the model, or do you

start very small by setting everything except one as assumed and then start to increase

620

:

the uncertainty of the model.

621

:

Thank you.

622

:

Could you clarify the question, please?

623

:

So you asked for the three types of models.

624

:

What are the useful diagnostics?

625

:

For instance, the test diagnostics I have in mind are compartment-related, because that's mostly representable as a function.

626

:

But if you're doing agent-based modeling, what are the test diagnostics and how does

that relate to other models?

627

:

I'm not sure I understand test diagnostic in this context.

628

:

fitting, model fitting, yeah, okay.

629

:

We do need to link different parts of models to the data.

630

:

In cases like compartmental models, we already get the daily mean.

631

:

Then in terms of agent-based models, we do need to create summary statistics.

632

:

So, for example, it's a very popular topic right now where people create digital twins of entire countries.

633

:

So for example, here in the UK, there are several versions of digital twins, particularly trying to model the spread of infectious disease, and they try to be as realistic

634

:

as possible and include the existence of schools and shops and wherever else people would go and meet each other, and at what rate.

635

:

like insanely detailed models.

636

:

And of course, we don't have data at that level, but we still have data at the exact same

level as we have it for SIR, which is what is the number of infected cases per day that we

637

:

record.

638

:

So we do need in that case to...

639

:

So the model itself can be as complex as we want it to be, but then in order to fit it to

the data, we do need to create summary statistics.

640

:

So we run the system simulation forward and then we sum across all individuals and say,

okay, in this country on this day, as a result of this very complex process, what is the

641

:

number of infected individuals?
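
A minimal sketch of that linking step, with a toy agent-based simulation whose contact process and parameter values are illustrative assumptions: the model itself can be arbitrarily detailed, but the quantity compared to data is just the daily sum of infected individuals.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_agents(n_agents=1000, n_days=60, p_contact=0.005, p_recover=0.1):
    """Toy agent-based simulation (all parameter values are illustrative).

    Each day, every susceptible agent is infected with a probability that
    grows with the number of currently infectious agents; infectious agents
    recover with a fixed daily probability.
    """
    infected = np.zeros((n_days, n_agents), dtype=bool)
    infected[0, 0] = True                      # seed a single infection
    recovered = np.zeros(n_agents, dtype=bool)
    for day in range(1, n_days):
        prev = infected[day - 1]
        p_inf = 1 - (1 - p_contact) ** prev.sum()
        new = ~prev & ~recovered & (rng.random(n_agents) < p_inf)
        recov = prev & (rng.random(n_agents) < p_recover)
        recovered |= recov
        infected[day] = (prev & ~recov) | new
    return infected

# Summary statistic that links the model to data: sum over individuals, per day.
daily_infected = simulate_agents().sum(axis=1)
print(daily_infected[:10])
```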

642

:

Yeah, and so just one more point to add following up on that.

643

:

So some of my colleagues have written some of the best agent-based models for infectious

diseases.

644

:

And as far as I know, they only ever use approximate Bayesian computation for model fitting.

645

:

I mean, maybe it's possible, but I don't know how you would write an agent-based model in Stan, for example.

646

:

Maybe you can embed it in some of these other alternative methods.

647

:

But as far as I know, ABC tends to be the go-to for agent-based models in infectious

diseases.
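
For readers unfamiliar with ABC (approximate Bayesian computation), here is a minimal rejection-ABC sketch; the chain-binomial forward simulator standing in for an agent-based model, the uniform priors, and the distance threshold are all illustrative assumptions, not anything the guests used.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_daily_counts(beta, gamma, n_days=60, n_pop=10_000):
    """Cheap stochastic stand-in for an agent-based forward simulation:
    a chain-binomial epidemic returning daily counts of infectious individuals."""
    s, i = n_pop - 10, 10
    counts = []
    for _ in range(n_days):
        new_inf = rng.binomial(s, 1 - np.exp(-beta * i / n_pop))
        new_rec = rng.binomial(i, 1 - np.exp(-gamma))
        s, i = s - new_inf, i + new_inf - new_rec
        counts.append(i)
    return np.array(counts)

# "Observed" data simulated from known parameters, for illustration only.
y_obs = simulate_daily_counts(beta=0.4, gamma=0.2)

def abc_rejection(y_obs, n_draws=5000, eps=5000.0):
    """Rejection ABC: draw parameters from the prior, simulate forward, and keep
    draws whose simulated daily-count summary lies within eps of the data."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.1, 1.0)            # assumed uniform priors
        gamma = rng.uniform(0.05, 0.5)
        y_sim = simulate_daily_counts(beta, gamma)
        if np.abs(y_sim - y_obs).sum() < eps:   # L1 distance between summaries
            accepted.append((beta, gamma))
    return np.array(accepted)                   # approximate posterior draws

posterior_draws = abc_rejection(y_obs)
print(len(posterior_draws), "accepted draws")
```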

648

:

And for your question about which parameters to estimate and which ones to fix.

649

:

I haven't done any of that. So one of the most common examples you see, which we've already heard about, is sort of estimating dynamics using statistical models, so the R number over time, for

650

:

example.

651

:

I've not done that kind of work.

652

:

I've been sort of estimating parameters of a static model, which I'll be talking more about

tomorrow.

653

:

So like, what are the static parameters of this probability distribution?

654

:

What's the static value of this effect size and so on.

655

:

So just a bit of background for those who aren't familiar with the SIR model.

656

:

So this is the type of compartmental model where you are in S, I, or R, and you kind of flow between them.

657

:

And in those ODE equations, the R number, you can think about it both at a population

level and an individual level.

658

:

And the R number corresponds to the instantaneous hazard for infecting somebody else if you're infected, divided by the instantaneous hazard

659

:

for recovering and no longer being infected.

660

:

So if you think about the ODEs for a little while, you'll see why that makes sense.

661

:

So it's this ratio of two parameters that determines the R number.
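
Spelling that out in the usual textbook notation (a notational assumption, not something stated in the episode): with β the infection rate, γ the recovery rate, and S, I, R as population fractions with S ≈ 1 early in an epidemic,

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I, \qquad
\frac{dI}{dt} = \beta S I - \gamma I, \qquad
\frac{dR}{dt} = \gamma I, \\[4pt]
R_0 &= \frac{\text{hazard of infecting someone else}}{\text{hazard of recovering}}
     = \frac{\beta}{\gamma}.
\end{aligned}
```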

662

:

And I mean, if you have no data to tell you anything more than just the number of people

infected over time, I think you should just come out and say, I can only estimate the

663

:

ratio of these two parameters.

664

:

But ideally, you would find something to try and separate those two.

665

:

For example, there is very good data at this stage on how long it takes people to recover from COVID.

666

:

A limitation of these kind of compartmental models is that because they're working with

kind of constant hazards for transition, the waiting time in any given compartment is

667

:

necessarily exponential.
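
The constant-hazard point, spelled out: if the hazard of leaving the I compartment is a constant γ, the waiting time T in that compartment is exponential,

```latex
P(T > t) \;=\; \exp\!\Big(-\int_0^{t} \gamma \, du\Big) \;=\; e^{-\gamma t},
\qquad \mathbb{E}[T] = \tfrac{1}{\gamma},
```

and the exponential density γe^{-γt} is largest at t = 0, so under this model the most likely recovery times are the earliest ones, which is the unrealistic feature described next.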

668

:

Whereas as soon as I become infected with COVID, it's not that I spend an exponentially distributed amount of time infected, because my hazard is not going to be constant; I'm not going to recover

669

:

instantly, basically.

670

:

So you might want to think about going to more realistic models in that case anyway, if

you wanted to estimate the rate at which people recover.

671

:

Yeah, we're over time, so thank you very much.

672

:

And please join me in giving a huge round of applause to Liza Semenova and Chris Wymant.

673

:

This has been another episode of Learning Bayesian Statistics.

674

:

Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more

675

:

episodes to help you reach a true Bayesian state of mind.

676

:

That's learnbayesstats.com.

677

:

Our theme music is Good Bayesian by Baba Brinkman, featuring MC Lars and Mega Ran.

678

:

Check out his awesome work at bababrinkman.com.

679

:

I'm your host.

680

:

Alex Andorra.

681

:

You can follow me on Twitter at Alex underscore Andorra, like the country.

682

:

You can support the show and unlock exclusive benefits by visiting Patreon.com slash LearnBayesStats.

683

:

Thank you so much for listening and for your support.

684

:

You're truly a good Bayesian.

685

:

Change your predictions after taking information in.

686

:

And if you're thinking I'll be less than amazing, let's adjust those expectations.

687

:

Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.
