#136 Bayesian Inference at Scale: Unveiling INLA, with Haavard Rue & Janet van Niekerk
Open-Source News • Episode 136 • 9th July 2025 • Learning Bayesian Statistics • Alexandre Andorra
Duration: 01:17:36


Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways:

  • INLA is a fast, deterministic method for Bayesian inference.
  • INLA is particularly useful for large datasets and complex models.
  • The R-INLA package is widely used for implementing the INLA methodology.
  • INLA has been applied in various fields, including epidemiology and air quality control.
  • Computational challenges in INLA are minimal compared to MCMC methods.
  • The Smart Gradient method enhances the efficiency of INLA.
  • INLA can handle various likelihoods, not just Gaussian.
  • SPDEs allow for more efficient computations in spatial modeling.
  • The new INLA methodology scales better for large datasets, especially in medical imaging.
  • Priors in Bayesian models can significantly impact the results and should be chosen carefully.
  • Penalized complexity priors (PC priors) help prevent overfitting in models.
  • Understanding the underlying mathematics of priors is crucial for effective modeling.
  • The integration of GPUs in computational methods is a key future direction for INLA.
  • The development of new sparse solvers is essential for handling larger models efficiently.

Chapters:

06:06 Understanding INLA: A Comparison with MCMC

08:46 Applications of INLA in Real-World Scenarios

11:58 Latent Gaussian Models and Their Importance

15:12 Impactful Applications of INLA in Health and Environment

18:09 Computational Challenges and Solutions in INLA

21:06 Stochastic Partial Differential Equations in Spatial Modeling

23:55 Future Directions and Innovations in INLA

39:51 Exploring Stochastic Differential Equations

43:02 Advancements in INLA Methodology

50:40 Getting Started with INLA

56:25 Understanding Priors in Bayesian Models

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor,, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan, Francesco Madrisotti, Ivy Huang, Gary Clarke, Robert Flannery, Rasmus Hindström, Stefan, Corey Abshire, Mike Loncaric, David McCormick, Ronald Legere, Sergio Dolia, Michael Cao, Yiğit Aşık, Suyog Chandramouli and Adam Tilmar Jakobsen.

Links from the show:

SPDE-INLA book and other resources: 

Penalizing complexity priors: 

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.


Speaker:

Today I'm honored to welcome Håvard Rue and Janet van Niekerk, two researchers at the cutting

edge of Bayesian computational statistics.

2

:

Håvard, a professor at KAUST, is the person behind integrated nested Laplace

approximations, or INLA, a robust method for efficient Bayesian inference that has

3

:

transformed how we approach large-scale latent Gaussian models.

4

:

Janet, a research scientist at KAUST, specializes in applying INLA methods to complex

problems, particularly in medical statistics and survival analysis.

5

:

In this conversation, Håvard and Janet guide us through the intuitive and technical

foundations of INLA, contrasting it with traditional MCMC methods and highlighting

6

:

its strengths in handling massive complex data sets.

7

:

We dive into real-world applications ranging from spatial statistics and air quality

control to personalized medicine.

8

:

We also explore the computational advantages of stochastic partial differential equations,

discuss penalized complexity priors, and outline exciting future directions like GPU

9

:

acceleration and advanced sparse solvers.

10

:

This is Learning Bayesian Statistics, episode

11

:

136, recorded May 13, 2025.

12

:

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the

projects, and the people who make it possible.

13

:

I'm your host, Alex Andorra.

14

:

You can follow me on Twitter at alex_andorra,

15

:

like the country.

16

:

For any info about the show, learnbayesstats.com is Laplace to be.

17

:

Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on

Patreon, everything is in there.

18

:

That's learnbayesstats.com.

19

:

If you're interested in one-on-one mentorship, online courses, or statistical consulting,

feel free to reach out and book a call at topmate.io/alex_andorra.

20

:

See you around, folks.

21

:

and best Bayesian wishes to you all.

22

:

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

23

:

Check us out at pymc-labs.com.

24

:

Håvard Rue and Janet van Niekerk.

25

:

Welcome to Learning Bayesian Statistics.

26

:

Thank you.

27

:

Thank you.

28

:

Yes, and thank you for tolerating my pronunciation of your name.

29

:

I'm getting there.

30

:

I love this podcast also because it's very cosmopolitan, so I get to kind of speak a lot

of different languages.

31

:

uh

32

:

So I am super happy to have you on the show today because I've been meaning to do an

episode dedicated to INLA, per se, for a few months now.

33

:

um I've mentioned it here and there with some guests, but now today is the INLA episode.

34

:

So let's do this.

35

:

But before that, maybe can you...

36

:

um

37

:

Let's start with you, Janet.

38

:

Can you tell us what you're doing nowadays and how you ended up working on these?

39

:

Okay, good.

40

:

Thank you for having me.

41

:

What I'm doing nowadays, so I work in the INLA group at KAUST and we do a lot of things

that pertain to INLA and we also do some other Bayesian things.

42

:

How did I get to work in the INLA group?

43

:

I just sent an email to Håvard when I finished my PhD and I said, well, I don't know what

to do next.

44

:

Can I come and visit you?

45

:

And he said yes.

46

:

And I came and then I started to work here and that's how I ended up here.

47

:

That's a long time ago.

48

:

Yes.

49

:

More than seven years ago, Okay, damn.

50

:

And Håvard, what about you?

51

:

What are you doing nowadays and how did you end up doing that?

52

:

Nowadays I try to keep up with the rest of the people in the group.

53

:

So how I ended up here at KAUST, you mean?

54

:

No, mainly how you ended up working on these topics because I think your name is pretty

much associated with INLA.

55

:

So I'm curious how these all started.

56

:

What's your origin story basically?

57

:

It started like 25 years ago.

58

:

So we were doing a lot of this Markov chain Monte Carlo.

59

:

And we are trying to do, essentially working on this, what we know as a latent Gaussian

model.

60

:

We're trying to really solve this Markov chain Monte Carlo issue to get good samplers.

61

:

Working a lot with this, with Gaussian Markov random fields, to get these sparse matrix

computations, all these things.

62

:

Then that kind of

63

:

came to an end, and you're realizing this is not

64

:

going to work in the sense that this MCMC, even if you can do like a big, we can update

everything in one block.

65

:

Everything is efficient as it could, but it's still way too slow.

66

:

And then we wrote a book about this thing.

67

:

And then you'll see in the end of that book from 2005, there is a short outline of how to

proceed.

68

:

And it's basically kind of computing the results from the proposal distribution that we

had at that time.

69

:

And that's how things got started.

70

:

then PhD student Sara Martino, she joined.

71

:

That was January 2005.

72

:

From that point, I've been working on this.

73

:

Yeah, it's over 20 years ago.

74

:

From when it started and another five years before that, where we did all the prep work.

75

:

In parallel to that, we also had Finn Lindgren, who joined.

76

:

He's working on the spatial models, but yeah, that's also all connected.

77

:

Yeah.

78

:

the big adventure.

79

:

uh Actually, Håvard, could you...

80

:

Well, no, let's go with Janet and then I'll ask...

81

:

I'll just hover about a follow up with that.

82

:

But Janet, for listeners who are unfamiliar with INLA, so first, that stands for

Integrated Nested Laplace Approximations, if I'm not mistaken.

83

:

Could you give us an intuitive explanation of what that is and how it differs from

traditional MCMC methods?

84

:

Yeah sure.

85

:

um So in MCMC methods what you essentially need to do is you need to keep drawing samples

and at some point these samples will come from the distribution you're actually looking to

86

:

find and then from those samples you can calculate things like the mean or the mode and so

on.

87

:

So what INLA does is...

88

:

It actually approximates with mathematical functions the posterior in its totality.

89

:

So there is no sampling waiting for things to arrive at the stationary distribution.

90

:

You compute the approximation and then you have the posterior.

91

:

So then you can from that get also the mean and the credible intervals and so on.

92

:

So it's a deterministic method.

93

:

It's not a sampling based method, um which is why it's fast.

94

:

But yeah, essentially that's kind of how it works.
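
For reference, the core of what Janet describes can be written in a couple of lines. This is a rough sketch of the classic INLA construction (notation mine, condensed from the literature): the latent field x is replaced by a Gaussian approximation at its conditional mode, which gives a deterministic approximation of the hyperparameter posterior, and the latent marginals are then obtained by numerical integration over the hyperparameters rather than by sampling.

```latex
% Gaussian approximation \tilde{\pi}_G(x \mid \theta, y) built at the mode x^*(\theta):
\tilde{\pi}(\theta \mid y) \;\propto\;
\left. \frac{\pi(x, \theta, y)}{\tilde{\pi}_G(x \mid \theta, y)} \right|_{x = x^*(\theta)},
\qquad
\tilde{\pi}(x_i \mid y) \;=\; \int \tilde{\pi}(x_i \mid \theta, y)\, \tilde{\pi}(\theta \mid y)\, \mathrm{d}\theta .
```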

95

:

Yeah, thanks.

96

:

That definitely cleared that up.

97

:

um And I'm curious actually, Håvard, when you started developing INLA, um I think from

your previous answer, basically you were motivated by making inference faster.

98

:

Yes.

99

:

But also, was that the main challenge you were trying to address or was that also...

100

:

Were there also other challenges you were trying to address with INLA?

101

:

No, it was simply trying to...

102

:

Yeah, I said first we tried to make MCMC work because at that time, early 2000, we

believed that this was the way to go and then you're realizing it's not.

103

:

At some point you make a switch.

104

:

It's like MCMC is not going to work.

105

:

It's never going to give us the kind of the speed.

106

:

that we need.

107

:

And then we worked on, started on these approximations and working our way through that

one.

108

:

And that has been the goal all the time, you know, to make this inference for these kind

of models fast.

109

:

Yeah, fast is more important than, yeah, to make them quick.

110

:

This have been, to make them quick kind of scalable in a way.

111

:

And there are two,

112

:

two types of scales in a way you can scale with a number of data points.

113

:

And we can also scale with a number of the kind of the model size itself.

114

:

So these are two different things.

115

:

And in the first version of INLA, we didn't, we kind of

116

:

scale okay with both of them.

117

:

And now in the second generation of INLA, then we scale way, way better, both with respect to

model size and also the data size.

118

:

So that's another redevelopment.

119

:

yeah.

120

:

That's a- uh

121

:

It's a second generation INLA.

122

:

So it was almost like a complete rewrite.

123

:

And we pump the main methods.

124

:

Yeah.

125

:

Yeah.

126

:

This is super exciting.

127

:

mean, well, we'll dive a bit more into that during the rest of the show, but ah yeah, high

level.

128

:

I think now we have a good idea of what that's ah helpful for.

129

:

I mean, what's that doing?

130

:

But um

131

:

I want us to also dig into the cases where that could be helpful.

132

:

So, um Janet, actually I'm curious, how did you personally get introduced to INLA um and

what drew you to this particular computational approach before we dive a bit more into the

133

:

use cases?

134

:

Yeah, so I did my PhD in like Bayesian statistics.

135

:

um and the main focus was kind of to write samplers for covariance and correlation

matrices.

136

:

So of course that doesn't go well no matter which sampler you write.

137

:

um So I think very similar motivation just there has to be kind of a better way to do this

even if you lose a little bit of generality for specific cases.

138

:

you have to be able to do better.

139

:

And that's kind of where I started looking into approximate methods like ABC for the

likelihoods, and there is VB and so on.

140

:

And then INLA specifically, which combines many, many different approaches together,

just to do things with the same kind of accuracy, but just faster.

141

:

Okay, yeah, I see.

142

:

uh And actually, I read while preparing the questions for the show that latent Gaussian

models are particularly suited to INLA.

143

:

em So I'm curious why, and also more in general, which cases are particularly appropriate

and you would recommend to...

144

:

uh listeners to give INLA a try.

145

:

um So maybe, Janet, you can take that one and Håvard if you have anything to complete

afterwards.

146

:

Sure.

147

:

So actually, a latent Gaussian model, this is the assumption on which INLA is built.

148

:

So INLA is developed to do inference for latent Gaussian models.

149

:

So if you do not have a Gaussian model, then none of the math will hold because it's

developed to do inference for latent Gaussian models.

150

:

So can you maybe just briefly explain to us what a latent Gaussian model is?

151

:

Yes.

152

:

I have to say when I started at KAUST with Håvard, we had another colleague, Håkon, who

was also Norwegian and he did his PhD with Håvard.

153

:

So was me and them two.

154

:

And at the first uh few months, it felt like they spoke a different language.

155

:

Even though I had a PhD in statistics, I could not figure out what's Gaussian model and

all these things.

156

:

So I totally get this question when it comes up.

157

:

So what is a latent Gaussian model?

158

:

Okay, latent Gaussian model is when you have data points for which you can assign a

likelihood.

159

:

and maybe the mean or some parameter in the likelihood will have a regression model.

160

:

This regression model can contain fixed effects, random effects, different components.

161

:

And conditional on the model, the data is then independent so that the likelihood is then

just this product.

162

:

So where the latent Gaussian part, sorry.

163

:

Yeah, yeah, no, exactly.

164

:

I was going to give you the same way to go.

165

:

It's like a GAM up to now, right?

166

:

Like if you have a GAM.

167

:

Then the latent Gaussian part comes into the fact that all the fixed effects and the

random effects should have a joint Gaussian prior.

168

:

So you can think of any random effect like an IID random effect or like a time series

model.

169

:

As long as they have a Gaussian joint distribution, then they will be a latent Gaussian

model.
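
For reference, here is the three-stage structure Janet is describing, in rough notation (a sketch, not quoted verbatim from any paper):

```latex
% Stage 1: observations, conditionally independent given the latent field and hyperparameters
y_i \mid x, \theta \;\sim\; \pi(y_i \mid \eta_i, \theta), \qquad \eta_i \;=\; \text{linear predictor (fixed effects + random effects)}
% Stage 2: the latent part (all fixed and random effects jointly) gets a Gaussian prior
x \mid \theta \;\sim\; \mathcal{N}\!\big(0,\; Q(\theta)^{-1}\big)
% Stage 3: hyperparameters (variances, correlations, over-dispersion, ...)
\theta \;\sim\; \pi(\theta)
```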

170

:

So at first thought, it might seem like, this is like a very restrictive class, but it's

actually not.

171

:

A lot of models that we use every day are actually latent Gaussian models.

172

:

And even some very complicated models like survival analysis models, they're also latent

Gaussian models, a lot of them.

173

:

ah If you do like co-regionalization where you have spatial measurements at different

locations, you can model them jointly and this is also latent Gaussian model.

174

:

So it's really much broader than you would think initially.

175

:

Yes.

176

:

em Yeah.

177

:

Thanks, Jan.

178

:

It's super, super clear.

179

:

That makes me think about, yeah, state space models where most of the time you linear or

Gaussian state space models.

180

:

So that means m Gaussian likelihood and Gaussian innovations.

181

:

uh yeah, like basically what you just...

182

:

talked about here, even though the structures of the models are different.

183

:

And yeah, as you were saying, that may seem pretty restrictive, but that actually covers a

lot of the cases.

184

:

I guess some cases that can be more complicated when you have count data.

185

:

No, they are the same, actually.

186

:

So the likelihood has no restrictions.

187

:

oh Yeah, it can be any likelihood.

188

:

It can be a zero-inflated likelihood.

189

:

You can have...

190

:

multiple likelihoods, like you can do a multivariate data analysis, uh each data type with

their own likelihood, as long as the latent part, so just the fixed effects and the random

191

:

effects should have a Gaussian prior, but what you put on the data, there is no

restriction on that.

192

:

Nice, nice.

193

:

Okay, so that's less restrictive than the classic Kalman filter then, because the classic

Kalman filter only can take normal likelihood.

194

:

So, okay, yeah.

195

:

Yeah, that's really cool.

196

:

then I agree that's even less restrictive than what I had understood.

197

:

uh Håvard, it looks like you had something to add.

198

:

it's not to explain everything.

199

:

Of course, if you have one latent Gaussian model, and if you have another one, you can put them

together.

200

:

So we also can do these kind of joint models where you share

201

:

kind of effects.

202

:

So all these kind of joint models is also covered.

203

:

As soon as you have one, you can also do two and then you can combine them together.

204

:

So it's almost like, I think it's easier to classify all the models that are not

latent Gaussian models. It's a very surprising result in the sense of

205

:

When you write it down, it looks so simple, but actually to make the connection from the

model that you have to rewrite in that form we needed, it's, many struggled with that, you

206

:

know, because the form is so simple.

207

:

You have some parameters like variances, correlations, over dispersions, and then you have

something Gaussian, and then you have observations of the Gaussian.

208

:

And that's it.

209

:

So this structure covers almost...

210

:

almost everything that is done in practice today.

211

:

You can say, okay, mixture models, this kind of thing is not in the class.

212

:

Yeah, but they are not that much used, you in the sense of in daily life of people who

working with this.

213

:

Like more in research, yeah, you can do it.

214

:

It's a little different, but in our kind of in the practical life of people who do these

things, it's not.

215

:

It's a very, very surprising thing.

216

:

Yeah.

217

:

Yeah.

218

:

Yeah.

219

:

No, for sure.

220

:

As soon as you can handle normally distributed likelihoods and count data, you

have, I would say, 80, 90 % of the use cases.

221

:

Mixtures are indeed possible, but they are way less common for sure.

222

:

Yeah.

223

:

Yeah.

224

:

And actually, Håvard, um you've been doing that for

225

:

quite some time now as you were saying.

226

:

So I'm curious what you've seen as the most impactful applications of INLA, especially

maybe at extreme data scales because that's really where INLA shines.

227

:

Janet, help me out.

228

:

You know this better.

229

:

Yeah, so I mean, we've had um the World Health Organization has used INLA for some air

quality control methods.

230

:

We've had the CDC use INLA for, like, epidemiology.

231

:

We've had the Malaria Atlas Project use INLA to

232

:

model the prevalence of malaria to kind of help inform interventions and where they should

be.

233

:

And for instance, I we are past COVID, but a lot of people still work on COVID.

234

:

And I just checked quickly this morning and there is more than 800 papers who used INLA

for COVID modeling.

235

:

yeah, I mean, there's a lot of impactful applications, but on a very large scale, there...

236

:

um

237

:

There has been applications recently, there is a paper by Finn Lindgren and others who've

used it to model temperature on the global scale with lots and lots and lots of stations

238

:

and they can do a very high resolution um model.

239

:

So like Håvard said before, the kind of modern framework for INLA that came after that...

240

:

It can scale now very, very high with data.

241

:

Nice, yeah.

242

:

And actually, I'm wondering, um when you say people are using INLA to model these data

sets, what do they use?

243

:

What's a package you recommend people check out if they want to start using INLA in their

own analysis?

244

:

So the INLA methodology is implemented in the R-INLA package.

245

:

So it's an R package.

246

:

There is a Python wrapper or soon there will be, or it is there, but anyway, so there will

be like a Python wrapper where you can use INLA in Python.

247

:

um So yeah, I think, yeah, most people just use the R-INLA package and it's...

248

:

Håvard does a great job.

249

:

There's kind of almost a new testing version every few days.

250

:

So the development is very, very fast.

251

:

Usually the package is faster than the papers.

252

:

Like we would implement something and then a year later the paper would come out

explaining what was changed.

253

:

So users always have the latest version immediately.

254

:

Yeah, this is super cool.

255

:

So the R-INLA package is in the show notes already, folks, for those who want to check

it out.

256

:

ah So yeah, check out the website.

257

:

There are some examples in there, of course, of how to use it and so on.

258

:

And if you guys can add the Python wrapper to the show notes, that'd be great because I

think, yeah, definitely that will increase even more your number of users.

259

:

I know, I will use that.

260

:

because I mainly code in Python.

261

:

yeah, like that's for sure.

262

:

Something I will check out.
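
In the meantime, one pragmatic way to drive R-INLA from Python is through rpy2. This is a minimal sketch, assuming R plus the INLA package are already installed from its own repository as discussed above; it is not the official Python wrapper mentioned in the episode.

```python
# Minimal sketch: fit a Poisson GLM with R-INLA from Python via rpy2.
import rpy2.robjects as ro

summary_fixed = ro.r('''
  suppressMessages(library(INLA))
  set.seed(1); n <- 200
  df <- data.frame(x = rnorm(n))
  df$y <- rpois(n, lambda = exp(0.5 + 0.8 * df$x))
  # Deterministic Bayesian fit: no chains, no convergence diagnostics to babysit.
  fit <- inla(y ~ x, family = "poisson", data = df)
  fit$summary.fixed      # posterior means, sds and credible intervals of the fixed effects
''')
print(summary_fixed)
```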

263

:

Håvard, anything you want to add on that?

264

:

No, I think it's fine.

265

:

It is what I trying to say.

266

:

But we are not on CRAN.

267

:

We are not on this public repository or the standard R repository, simply because the R

code is just a wrapper.

268

:

So inside there is a C code.

269

:

It's a program that runs independently.

270

:

This is simply too complicated to compile on this automatically built system.

271

:

So we have to build it manually and include it in the package.

272

:

for this, because it's contained in binary, it cannot be in this kind of public

repositories.

273

:

So we have our own that you need to kind of download from.

274

:

Okay.

275

:

Yeah, damn that, that adds to the, the maintaining complexity.

276

:

So thank you so much for doing that for us.

277

:

Um, know it can be complicated.

278

:

And, uh, Janet, actually, I think, if I understood correctly, you apply a lot of, um, you

apply a lot in medical statistics and survival analysis.

279

:

Um, so

280

:

Can you share with us maybe an example of how you've done that and why INLA was

particularly efficient in this setting?

281

:

Yes.

282

:

So when I started at KAUST, actually, Håvard told me, we need someone to do survival

analysis with INLA.

283

:

So that's kind of how it started.

284

:

um And I think up until that stage in 2018,

285

:

There was very few, maybe two or three works on using INLA for survival analysis.

286

:

actually, the survival analysis models are also latent Gaussian models.

287

:

But to make this connection is not so clear at first glance.

288

:

Like you really have to just sit down and think about your model on a higher level to be

able to see the connection to a latent Gaussian model.

289

:

And of course, if we can then use INLA.

290

:

then we can do a lot more complicated models.

291

:

Like we can do spatial survival models, which a lot of survival packages cannot do.

292

:

um We can do then these joint models.

293

:

We now have another package built on top of INLA called INLAjoint that has a very nice

user interface for joint models.

294

:

So where you have something that you monitor uh over time.

295

:

So in event, it could be like relapse of cancer, for instance.

296

:

And then you would have many biomarkers, like lots of blood test values and maybe x-ray uh

image and so on.

297

:

And you would have a lot of these longitudinal series and then you would jointly model

them and assuming that there is some common process driving all of this.

298

:

And these models are very computationally expensive because you you have a lot of data, a

lot of uh high velocity data.

299

:

You can have multiple biomarkers.

300

:

You have this hazard function that you model, which is different from generalized linear

models where you model the mean, because we don't model the mean time to an event.

301

:

We model the hazard function, like this instantaneous risk at every time point.
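
As a quick reference for what "modelling the hazard" means here, in standard survival-analysis notation (my sketch, not from the episode):

```latex
% Hazard: instantaneous event rate at time t, given survival up to t
h(t) \;=\; \lim_{\Delta t \to 0} \frac{P\big(t \le T < t + \Delta t \mid T \ge t\big)}{\Delta t}
% A latent-Gaussian survival model puts the (structured) linear predictor on the log-hazard, e.g.
\log h_i(t) \;=\; \log h_0(t) \;+\; \beta^\top z_i \;+\; u_i,
\qquad u_i \ \text{a spatial / IID / multi-level random effect with a Gaussian prior.}
```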

302

:

um But then if you can see that your survival model or your very complicated joint model,

even if it contains splines over time or it contains

303

:

a spatial effect, different IID effects like multi-level random effects for hospital

effect and doctor and patient and so on, you can easily see that this model becomes very

304

:

complicated if you want to just do the right thing.

305

:

um But then InLa makes it possible that we can actually fit these models.

306

:

And one thing that we've been trying to do in this regard in terms of medical statistics

307

:

is to really position INLA to be a tool um in kind of the drive to personalised medicine.

308

:

Because in the end, if you want a doctor to run something locally uh and see, okay, the

probability for relapse is lowest on medicine A versus B and C and D, you need something

309

:

that's going to run fast, like it cannot come out tomorrow.

310

:

So that was kind of a main motivation to kind of position INLA towards medical statistics,

also to just show the potential to actually achieve this personalized medicine target.

311

:

Yeah, this is really cool.

312

:

Yeah, definitely agree that has a lot of potential basically because that makes more

complex models still um practically uh inferable.

313

:

um Whereas with classic MCMC, it wouldn't be possible.

314

:

yeah, that expands the universe of Bayesian models in a way.

315

:

That's really fantastic.

316

:

Something I'm wondering, uh Håvard, is about the computational challenges.

317

:

Are there any such challenges when you're using INLA, um numerical stability, efficiency of

the algorithms, all these diagnostics that we get with MCMC samplers?

318

:

How does that work with...

319

:

INLA when you're running a model.

320

:

How do you know that convergence was good?

321

:

How do you diagnose the convergence?

322

:

How do you handle convergence issues?

323

:

Are there even convergence issues?

324

:

How do you go about that practically?

325

:

The convergence issue is very different.

326

:

This is like a numerical optimization.

327

:

So if it doesn't work, you will be told that this doesn't work.

328

:

Over the years, know, there's some code working on, been developed for 20 years, you know,

so we are, we are getting a good experience.

329

:

What is working, what is not.

330

:

Of course there is a specialized version and tailored version of everything, numerical

optimization algorithm.

331

:

We don't use just a library.

332

:

Every code is from some kind of

333

:

or just on the library, but it's tailored to exactly what we want.

334

:

Just computing gradients, we do it in a very different way.

335

:

That is also tailored to what we do.

336

:

Everything is kind of tailored, every kind of custom made, everything kind of refined.

337

:

So that it's amazing how.

338

:

how well it runs in the sense that if you have any decent model will run.

339

:

If you have a problem with convergence, it's usually because you have a model that to be

honest doesn't make sense.

340

:

In the sense that you have very little data, using weak priors, there is no

information in the model, there is no information in the data and then this is harder.

341

:

So about convergence is like,

342

:

This is kind of different from kind of Markov chain Monte Carlo where you have to do kind of

diagnostics.

343

:

And here is more if you work with contain, if you do the optimisation, this is on the

highest level on the parameters on the top.

344

:

We're talking variance, correlation, over dispersion, this kind of parameters.

345

:

If you come to a point that is a well-defined maximum,

346

:

And essentially you're good.

347

:

And we're looking around that point to correct for kind of skewness that is not perfect

kind of symmetric in that sense.

348

:

And these are also kind of small diagnose, but they are never, it's never a kind of

serious issue that it was for MCMC.

349

:

Now it's like it's, if you put something reasonable in, you'll get something reasonable

out.

350

:

is very little.

351

:

don't think we have serious convergence issues, no?

352

:

I don't think so.

353

:

We might have it like 15 years ago or 10 in beginning.

354

:

There was, well, there was more trouble computing gradients and Hessians, this kind of thing.

355

:

But now we we do this very differently.

356

:

We are being smarter.

357

:

And then, no, there isn't any...

358

:

It works pretty good, actually.

359

:

Yes.

360

:

So everything is very...

361

:

It's different.

362

:

Cool, yeah.

363

:

I mean, that's great to hear.

364

:

And Janet, do you have any tips to share with the listeners about that?

365

:

No, I think I will just comment on the gradient.

366

:

There is a paper describing the gradient method that's used in INLA.

367

:

It's called Smart Gradient.

368

:

uh Describing uh how this kind of gradient descent methods and so on work with this

different

369

:

type of uh way to get a gradient.

370

:

And it is really good.

371

:

It's also the one we use inside INLA, but of course, people who use gradient-based methods

would also benefit from that as a standalone contribution.
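
I have not reproduced the Smart Gradient paper here; as I understand it, the flavour is to take finite differences along an orthonormal frame aligned with the recent descent direction instead of along the canonical axes. Below is a generic sketch of a directional finite-difference gradient in an arbitrary orthonormal basis; the function names and the basis construction are mine, not the paper's.

```python
import numpy as np

def gradient_in_basis(f, x, G, h=1e-5):
    """Central-difference gradient of f at x, using directional derivatives along the
    orthonormal columns of G instead of the canonical axes: grad f = G @ g_hat."""
    g_hat = np.array([(f(x + h * G[:, k]) - f(x - h * G[:, k])) / (2 * h)
                      for k in range(G.shape[1])])
    return G @ g_hat

def descent_aligned_basis(direction):
    """Orthonormal basis whose first column points along a previous descent direction
    (built with a QR factorization); roughly the 'smart' ingredient."""
    d = direction / np.linalg.norm(direction)
    n = d.size
    seed = np.column_stack([d, np.eye(n)[:, 1:]])   # seed with d, pad with canonical vectors
    Q, _ = np.linalg.qr(seed)
    if np.dot(Q[:, 0], d) < 0:                      # QR may flip signs
        Q[:, 0] *= -1
    return Q

# Toy check on a quadratic whose exact gradient is known.
f = lambda x: 0.5 * x @ np.diag([1.0, 10.0, 100.0]) @ x
G = descent_aligned_basis(np.array([1.0, -2.0, 0.5]))   # pretend previous descent direction
print(gradient_in_basis(f, np.ones(3), G))              # approximately [1, 10, 100]
```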

372

:

Yeah.

373

:

Okay.

374

:

Yeah.

375

:

Thanks, Janet.

376

:

That's uh helpful and feel free to add that to the show notes uh if you think that's

something that's going to be interesting for listeners.

377

:

em And Håvard, actually, um I read that you've applied stochastic partial differential

equations.

378

:

em So let's call them SPDEs.

379

:

Very poetic name.

380

:

And so yeah, you've used that mainly for spatial modeling.

381

:

So can you give us an overview of why these SPDEs are interesting?

382

:

And I think you use that to model Gaussian fields.

383

:

yeah, maybe if you can, you know, talk a bit more about that because that sounds super

interesting.

384

:

Yes.

385

:

So normally Gaussian fields like normal distribution is described for the covariance or a

covariance function.

386

:

uh

387

:

traditional way of doing things and there is nothing wrong with that except that it's not

very

388

:

is not very smart way of doing things.

389

:

from the case and uh like in the beginning, like 20, 25 years ago, there was a big, this

was also one of the problems we wanted to solve in a way, some of these kind of spatial

390

:

stat problems.

391

:

But of course there is a version, there are models that are kind of Markov.

392

:

Markov in the sense that instead of the covariance matrix, you're working with the

precision matrix.

393

:

inverse of the covariance and that is sparse.

394

:

So these were often called regional models or Markov models.

395

:

And then you had the spatial models.

396

:

The Gaussian fields with kind of a Matérn covariance, you choose a covariance function and

they were dense and they were kind of a mess.

397

:

And then of course it's like in parallel to the development of this now professor in

Edinburgh, Finn Lindgren.

398

:

He's also part of this overall INLA project.

399

:

He started, or he and I started, to work on this thing because we had, like, I think we had

the first version in... that

400

:

was almost Markov.

401

:

They were almost Markov, but we had to compute, we had to fit them kind of numerically.

402

:

And from that point, we can kind of move on to working with precision matrices and stuff.

403

:

And this is one of the key things in INLA, that you don't work with covariance, you work

with precision.

404

:

Because precision can be put together.

405

:

This is like playing with Lego.

406

:

You have one component, you have another one, you just stick it together and it's easy.

407

:

If you work with covariance, there is a lot of math.

408

:

uh

409

:

that go in just to put them together.

410

:

So this falls directly out.

411

:

In addition to this, you have this, the fact that they are Markov.

412

:

Markov, you mentioned the state space models where if you condition on pass, you only need

the latest one.

413

:

This really simplified computations.

414

:

And of course, this apply more generally.

415

:

And these are called Gaussian Markov random fields.

416

:

Yeah, this connects back to the book we had 20 years ago, 2005, with Leonhard Held.

417

:

And the point is that the computations are very efficient for these kind of models.

418

:

Instead of using dense matrices, you can use algorithms for sparse matrices.

419

:

And these scales way, way better.
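
A small illustration of the "Lego" point with sparse precision matrices, using generic SciPy code (a sketch with made-up sizes, not R-INLA internals): independent model components simply stack block-diagonally on the precision scale, and conditioning on Gaussian observations just adds another sparse term.

```python
import numpy as np
import scipy.sparse as sp

n = 100

# Precision of a first-order random walk: sparse, tridiagonal (D is the difference operator).
D = sp.diags([np.ones(n - 1), -np.ones(n - 1)], offsets=[0, 1], shape=(n - 1, n))
Q_rw1 = (D.T @ D) * 10.0                 # precision parameter tau = 10, say

# Precision of an IID random effect: a scaled identity.
Q_iid = sp.identity(n) * 4.0

# "Lego": the joint prior precision of independent components is block-diagonal.
Q_prior = sp.block_diag([Q_rw1, Q_iid], format="csc")

# Gaussian observations y = A x + noise add A^T Q_noise A to the precision -- still sparse.
A = sp.hstack([sp.identity(n), sp.identity(n)], format="csc")   # each y_i sees rw1_i + iid_i
Q_post = Q_prior + A.T @ (sp.identity(n) * 2.0) @ A             # observation precision 2

print(Q_post.shape, "with", Q_post.nnz, "nonzeros out of", Q_post.shape[0] ** 2)
```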

420

:

So back to the SPDE thing, and there was a quest, there was a hunt for trying to solve this

in more, again, the motivation was computational speed.

421

:

We need to do things faster.

422

:

And Finn, he's a small genius.

423

:

And no, he figured out that, okay, we can connect these to these.

424

:

stochastic differential equation and this go actually back to very old work in the late

50s and early 60s that show that these Gaussian fields are exactly solutions of these

425

:

stochastic differential equations.

426

:

As soon as you realize that, then you can say, okay, let's solve this thing.

427

:

We don't need to solve them.

428

:

We need to represent the solution of.

429

:

And that is done by this finite element method from applied math.

430

:

And then you get something that is sparse.

431

:

And the matrices you get there will be our kind of precision matrices.

432

:

So everything up to that point was done for computational speed.

433

:

And we can do things faster.

434

:

Of course, when at some point you're realizing the most important thing about this SPDE

approach is not

435

:

speed.

436

:

It's actually that you can use this way of thinking to do things very easily that was up

to that point almost impossible.

437

:

And now it's just a course we can do it.

438

:

It's like having, yeah, Janet mentioned the postdoc, Håkon Bakka.

439

:

That was here in beginning.

440

:

He worked on these barrier models.

441

:

What happens if you have islands and you want the dependence to go around the islands, or

442

:

follow rivers, and adjust for the coastline and all these things.

443

:

This is super complicated unless you do it with the SPDEs.

444

:

Then it's just follow from the definition.

445

:

Very little things you need to do and then it's just do the right thing.

446

:

And also this is connected to the kind of the physical laws that we think

447

:

these kind of processes were almost or they will follow or almost follow.

448

:

So it's like stochastic differential equations.

449

:

It's just more or less elliptic ones that we make kind of stochastic.

450

:

Now, so this is super super useful.

451

:

But of course the complexity is way higher.

452

:

It's very hard.

453

:

You need tools to create the mesh.

454

:

You have to work with a mesh.

455

:

You have to work with a triangulation of an area.

456

:

You have to do all these kind of things.

457

:

But Finn has written all these kind of tools for doing that.

458

:

He's also pretty good in this coding, you know?

459

:

And as soon as you have the tools, everything of this is quite easy.

460

:

But there is a huge kind of step to take before you get to that point.

461

:

But when this is done, when somebody has done it for you, then it's pretty easy.

462

:

Also, these have been very, very useful.

463

:

Also, you can do these non-separable models going in time as well.

464

:

You just have a time dependence of this.

465

:

And this follows the same procedure.

466

:

Yeah, damn.

467

:

Yeah, for sure.

468

:

That sounds like a big project, but really fascinating.

469

:

If you have any link you can share with us in the show notes, that'd be great because I

think that makes for a very interesting case study that we can follow along.

470

:

It's a very large project.

471

:

It's like if we're giving kind of tutorials courses on this thing, you...

472

:

You can almost say half of the time is spent on the INLA package itself and the second

half is on this SPDE thing.

473

:

Of course, this is integrated, but the complexity of that part is of the same order as the

complexity of the whole INLA package itself.

474

:

Right.

475

:

ah Janet, to get back to you, um, I'm curious, uh, like given, given your experience in

applying Bayesian models, especially in complex medical scenarios, what new features or

476

:

improvements, if any, would you most like to see incorporated into INLA?

477

:

Oh, so

478

:

I'm probably biased, but I think the new methodology has solved a lot of issues that were

there before.

479

:

Because a lot of the medical statistics models are very data heavy and not model size

heavy.

480

:

For instance, you think about MRI data,

481

:

you have many, many data points, but maybe not so many parameters.

482

:

And in the classical uh INLA from the 2009 paper, this did not scale so well to very many

data points.

483

:

But now the new implementations scales extremely well.

484

:

And I think, for instance, in medical problems, especially medical imaging,

485

:

INLA, the new INLA, now can be applied to that, whereas before it could not.

486

:

in my biased opinion, there is not uh anything I see at the moment that I would like to be

incorporated.

487

:

There are still many applications that can be explored and see how far we could push this

uh kind of new INLA, I would say.

488

:

Håvard, kind of a related question.

489

:

What do you see since you're um developing the package at the forefront?

490

:

What do you see as the next major frontier for INLA, whether that's methodological

advancement or new application domains?

491

:

Yeah, I think that the most pressing issue now is to have a sparse solver that is more

scalable.

492

:

It's scaled better in parallel and it's also able to kind of start taking advantage like

everything with GPUs, you know.

493

:

Now GPUs are everywhere and it's coming.

494

:

They are really good.

495

:

and we need to take advantage of them.

496

:

But this is not easy.

497

:

It's not easy at all.

498

:

So there is one postdoc in the group, Esmaltata, who

499

:

has been working on a new implementation of a new sparse solver that is targeted towards

this.

500

:

So it's more a modern design, has modern ideas.

501

:

And this is going really well.

502

:

So to have this kind of a

503

:

better in numerical kind of backend supporting this kind of calculations.

504

:

This had been a main struggle for a long time.

505

:

The thing is that sparse matrix solvers are not as...

506

:

I said developed.

507

:

It's easier to work with if you're doing supercomputing than working with dense matrices

is far more easier and it's far more kind of relevant.

508

:

This large sparse matrices is less relevant.

509

:

So therefore there are fewer implementations.

510

:

Those who are often kind of so what are now close.

511

:

not open source, and they don't scale too well in terms of in parallel.

512

:

Because nowadays it's like for the older, the modern machines we have now, it's a lot of

problems.

513

:

It's more about memory, not about CPU.

514

:

It's more important to have the data you want to multiply.

515

:

You have them right there in front of you and then you can do it instead of doing fewer

competitions.

516

:

You so it like a memory management and this kind of everything with memory is much more.

517

:

important now than it was before.

518

:

So then looking at the sparse matrix as

519

:

Instead of a sparse matrix of just elements, you can look at that sparse matrix of dense

blocks.

520

:

And these dense blocks is called a tile.

521

:

So then you can work with these small dense blocks instead.

522

:

Even computing too much, but it's faster to do that than figuring out only what needs to

be computed.

523

:

And that's the key thing.

524

:

And this scales way, way better.
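
The "tile" idea in toy form: SciPy's block sparse row (BSR) format stores a sparse matrix as a sparse pattern of small dense blocks, which is the kind of memory layout being described. This only illustrates the storage idea, not the solver under development.

```python
from scipy import sparse

# A random sparse matrix, then the same matrix viewed as a sparse pattern of 4x4 dense tiles.
A = sparse.random(64, 64, density=0.05, format="csr", random_state=0)
A_tiled = A.tobsr(blocksize=(4, 4))

# Each stored tile is a small dense 4x4 block (cache/GPU friendly), even though some of its
# entries are zeros -- "computing a bit too much" in exchange for regular memory access.
print("stored tiles:", A_tiled.data.shape[0], "out of", (64 // 4) ** 2)
print("explicit zeros carried inside tiles:", int((A_tiled.data == 0).sum()))
```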

525

:

It can connect to GPUs and connect

526

:

to all these things.

527

:

So there are, in fact, in the group, are works in two directions of this.

528

:

One is to kind of build a new sparse matrix solver to go directly into the INLA code that

is there now.

529

:

We also have initiative on writing a completely distributed solver.

530

:

This is another postdoc, Lisa.

531

:

is doing that with some colleagues in Zurich to have a complete distributed solver with

necessarily in Python.

532

:

That also have this ability you could distribute the calculations, you have GPUs, it can

take care of this.

533

:

And this is aimed for a different data scale.

534

:

So we have to work on two parallel tracks.

535

:

One is to kind of keep the current code developing.

536

:

And the other one is try to prepare also for the future, maybe also for larger models.

537

:

They need different things.

538

:

So I think that's main pressing issues.

539

:

As Janet said, there were a lot of things that were a problem before.

540

:

The problem, yes.

541

:

things that we would like to have slightly better.

542

:

But now I think they are basically solved.

543

:

So there are two main things are developing applications, as Jan said, but also this more

modern computing to get this integrated.

544

:

It's also the major task.

545

:

oh This is the reason you saw it take years.

546

:

There's years of work.

547

:

It's not something you do in a weekend.

548

:

It's a scale of years.

549

:

Yes, that sounds about right.

550

:

uh But definitely looking forward to ah seeing these advancements in the coming months and

years.

551

:

um

552

:

Janet, since you've been there, you know, at some point you've been a beginner with INLA,

you had to start somewhere.

553

:

So I'm curious, uh for listeners who want to get started with INLA, what resources or

practices would you recommend as first steps?

554

:

I think this also depends a little bit on why you want to learn it.

555

:

So if you want to use it, you're going to be an applier of INLA.

556

:

On the website, there are links to some tutorials that people have done.

557

:

But also the papers we do from the group always have code available.

558

:

On the website, there are a few open source books.

559

:

There is one book from Virgilio that goes through a lot of common models with code.

560

:

So it's written like, the R code, the output and so on.

561

:

The book is in this format.

562

:

So if you want to learn how to code INLA and like where to get the posteriors from and so

on, then that kind of book is a very good uh tool.

563

:

Then to learn really what is

564

:

what INLA is, not just to be able to use it.

565

:

I this is a little bit harder.

566

:

ah I think we've tried since we have uh developed the new methodology, we've tried to

write it up um in a way that's easy to understand.

567

:

Yeah, but there is also this uh gentle introduction.

568

:

the one I used, it's also linked on the website, it's called A Gentle Introduction to

INLA.

569

:

ah But yes, this will be on the old methodology.

570

:

But it just gives a good intuition, like how it's different and questions about

convergence and so on, and kind of makes it clear that it's not based on samples.

571

:

So everything with samples that comes with samples is not there.

572

:

But then what is there if there is not samples and you know how it works and so on.

573

:

And something that's really nice um that I think maybe a lot don't know initially is that

you can always draw samples after because you have the full posterior.

574

:

So if you need to calculate whatever for some reason, you can draw samples.

575

:

So you can still have samples.

576

:

It's not like, there now, you know, we have like a built-in way to draw samples from an

INLA object.

577

:

So it's very versatile in that way that you can have the samples, but it's not based on

samples.

578

:

It's based on math, I would say.
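
A sketch of the standard trick behind "you can always draw samples afterwards", for a Gaussian approximation with mean mu and precision Q (generic code using a dense Cholesky factor; R-INLA would use the sparse factorization it already has):

```python
import numpy as np

def sample_from_precision(mu, Q, n_samples, rng=None):
    """Draw samples from N(mu, Q^{-1}): if Q = L L^T and z ~ N(0, I),
    then x = mu + solve(L^T, z) has covariance Q^{-1}."""
    rng = np.random.default_rng(rng)
    L = np.linalg.cholesky(Q)                      # lower-triangular factor of the precision
    z = rng.standard_normal((len(mu), n_samples))
    return mu[:, None] + np.linalg.solve(L.T, z)

# Toy usage on a 3-dimensional Gaussian approximation.
mu = np.array([0.2, -1.0, 3.0])
Q = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
samples = sample_from_precision(mu, Q, n_samples=2000, rng=42)
print(samples.mean(axis=1))   # close to mu
print(np.cov(samples))        # close to inv(Q)
```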

579

:

Yeah, yeah.

580

:

That all makes sense.

581

:

Janet.

582

:

Håvard, anything you would add here to basically recommend resources to beginners?

583

:

It is, as Janet said, there are a few books out there that are also open.

584

:

You can read them just on the web.

585

:

They are pretty good.

586

:

We have one.

587

:

This is more target...

588

:

to what SPDE models are, this whole book about only these.

589

:

I'll just introduce, yeah, there are some of these books to the background theory.

590

:

They just as Janna said, this is more.

591

:

This less clear because it's a serious, there is a development and there is a series of

key paper.

592

:

have to read one and then you another one and do another one and do a third one.

593

:

And then you do another one that we do a lot of things and there's another, it's like a

sequence of things.

594

:

So that's less.

595

:

That is, I understand this is less clear.

596

:

And the first thing you do is to write, to read the book.

597

:

about Gaussian Markov random fields.

598

:

So that's a harder thing.

599

:

I see that because it contains a lot.

600

:

There is a lot of things going on.

601

:

For us, I don't think we see it anymore.

602

:

But you realize it when, because we have been working on this a lot.

603

:

But I understand there is a lot of concept, there is a lot of things that.

604

:

You just put together, if you put on top of each other is new things.

605

:

It's, it's a lot of details.

606

:

And many of the details are never written down, you because, um, where should you write

them?

607

:

Or journals would have them.

608

:

So it's like, there are these things.

609

:

It's like, it gives a kind of skeleton and then you have to figure out everything in

between yourself.

610

:

But, it's basically there, but it's.

611

:

It's hard.

612

:

I see it's kind of hard to get a complete picture of what is going on.

613

:

Maybe we have to write another book in the end, you know, with all the details.

614

:

That sounds good, yeah.

615

:

And um maybe another question I have that's very practical, but...

616

:

um

617

:

Is there any difference in the way you set appropriate priors when you're using the INLA

algorithm for inference compared to when you would use Stan or PyMC for your models?

618

:

Or is that pretty much the same?

619

:

It's just the inference engine or the hood is changing.

620

:

Yeah, so you can set your priors whatever you want, but of course the latent ones has to

be Gaussian, but you can set like the parameters of the Gaussian if you want.

621

:

But then also we have default priors.

622

:

So you can run your model without putting any prior inside, and a lot of Bayesians I've

spoken to say this is really bad because then I don't know what's going on.

623

:

But for practitioners, a lot of them just kind of want to run a model and not make a

decision really about priors.

624

:

So this brought up a whole new field of research, I would say, within the INLA group.

625

:

Because if we need to decide on a default prior, we have to make sure that this works.

626

:

And the definition of works is what?

627

:

But generally, I think in the field I see this a lot, maybe not so much in the Bayesian

statistics community, but more in the applied community, where people would choose a prior

628

:

that's kind of the most used prior in the literature.

629

:

And they just use the other papers as motivation.

630

:

I choose this prior because everybody else has chosen it, right?

631

:

And this has caused that we've ended up with a lot of priors that's been used for like

variances and so on.

632

:

That's really bad.

633

:

but has been accepted because everybody else have used them and a lot of the priors we use

for hyperparameters or nuisance parameters I would say like variance components and so on

634

:

their initial motivation often was to be a conjugate prior

635

:

But then if we are anyway doing something like MCMC where we code and we have a computer,

then the idea of conjugacy is a little strange in that sense, right?

636

:

So kind of the motivation for this gamma, let's say gamma prior for this inverse variance,

the motivation was different when it was proposed, but it's still used because it's been

637

:

cited in 5,000 papers, right?

638

:

So this opened up

639

:

the idea of, okay, but what should we then do?

640

:

What should we put as a default prior that works well?

641

:

And this is how the idea of penalizing complexity priors, or in short, it's called PC

priors, were born.

642

:

So how can we derive and propose priors for all these hyperparameters that are not in the

latent part of the model that we know will do a good job?

643

:

in the sense that they will not overfit.

644

:

So for instance, if you include a random effect in your model, let's say just an IID

effect, you use the variance or the estimated variance to see how big is the random

645

:

effect, right?

646

:

But if you have a prior that's never going to be able to estimate a variance that's small,

then you could get a larger variance just even if it's not true.

647

:

But if your prior does not have sufficient mass to take it small,

648

:

then you're going to get a big value for the variance thinking this random effect is

important even when it's not.

649

:

So the PC priors are developed for each type of these parameters.

650

:

It's not like uniform prior on all of it.

651

:

Like you have to derive it case by case.

652

:

But essentially what they do is they will shrink the parameter to the value where it means

the model is going to be simpler.

653

:

So, and this value depends on what the parameter is.

654

:

So, for instance, for the Weibull model, if you have one of the parameters equal to one,

then it's actually the exponential model.

655

:

But then, if you think about it, you will almost never estimate it equal to one for any

data, even if you simulate from exponential data, will not estimate it to be one.

656

:

um So, the penalizing complexity prior will put a lot of mass, then, for instance, at one.

657

:

But...

658

:

it's derived based on the distance between the models.

659

:

So we put a prior on the distance between the models, not directly on the parameter,

because the distance we can understand.

660

:

So the default priors in INLA for the hyperparameters, most of them are PC priors.

661

:

So even if you don't set the prior and use the default, now the defaults are mostly good.

662

:

There is still some development at the moment going on, especially for

663

:

priors of correlation matrices, um but in general most of them have good default priors now.
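
For reference, the construction Janet describes, sketched from the penalised complexity prior idea: measure how far the flexible component moves the model from its simpler base version with a Kullback-Leibler based distance, and put a constant-rate (exponential) prior on that distance.

```latex
% Distance of the flexible model f(\cdot \mid \xi) from its base model f(\cdot \mid \xi = 0):
d(\xi) \;=\; \sqrt{\,2\,\mathrm{KL}\!\left( f(\cdot \mid \xi)\,\|\, f(\cdot \mid \xi = 0) \right)}
% PC prior: exponential on the distance (mode at the base model), then change variables back to \xi:
\pi(d) \;=\; \lambda\, e^{-\lambda d}, \qquad d \ge 0 .
```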

664

:

Okay, this is really cool.

665

:

Yeah, I didn't know you were using PC priors for ah that by default, but yeah, like, I can

definitely vouch for them.

666

:

ah Also, like, not only because they have these great mathematical properties that you're

talking about, but also they are much easier to set uh because they have a more intuitive

667

:

explanation.

668

:

ah

669

:

So I know they are derived differently for um different parameters, but I use them all the

time now where I'm, as you were saying, setting um standard deviations on varying effects.

670

:

So basically the shrinkage factor of the random effects.

671

:

uh And yeah, so that's an exponential.

672

:

And then there is a formula that at least I hard-coded.

673

:

in Python that's coming from the paper.

674

:

I guess that's what you did in INLA.

675

:

um And also I use that for the amplitude of Gaussian processes, which are basically a

standard deviation of Gaussian processes.

676

:

So yeah, that's the same.

677

:

And I really love that because you can think about them on the data scale.

678

:

So it'd be like...

679

:

I think there is a 60 % chance that the amplitude of the GP is higher than X.

680

:

X being uh a number that you can interpret on the data scale.

681

:

And that makes it much, much, much easier uh to think about for me.
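
Here is what that hard-coded formula presumably looks like, a sketch of the PC prior for a standard deviation: pick U and alpha so that P(sigma > U) = alpha, which gives an exponential prior with rate lambda = -log(alpha)/U. The numbers below are made up for illustration.

```python
import numpy as np
from scipy import stats

def pc_prior_rate(U, alpha):
    """Rate of the exponential PC prior for a standard deviation,
    calibrated so that P(sigma > U) = alpha."""
    return -np.log(alpha) / U

# "There is a 60% chance that the amplitude of the GP is higher than 1.5" (illustrative numbers).
U, alpha = 1.5, 0.60
lam = pc_prior_rate(U, alpha)

prior = stats.expon(scale=1.0 / lam)              # exponential distribution with rate lam
print("lambda =", round(lam, 4))
print("P(sigma > U) =", round(prior.sf(U), 2))    # recovers alpha = 0.6
```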

682

:

So yeah, this is awesome that you guys do that.

683

:

um Actually, do you have any...

684

:

um

685

:

Any readings we can link to in the show notes because the only thing I know about are some

papers who talk about penalized complexity priors, but they are not super digestible for

686

:

people.

687

:

So yeah, I don't know if you know about code demonstrations that are a bit clearer and

also which penalized complexity priors are appropriate for different kinds of parameters.

688

:

Yeah, we have some works like for specific things so I can link them in the show notes for

sure.

689

:

Yeah, yeah, yeah, for sure.

690

:

Because I think, I mean, that's not always still quite new research, but yeah, it's still

not, it still hasn't distilled yet a lot in practice.

691

:

But I think it should be faster because these are extremely helpful and practical for

people.

692

:

So yeah, that's awesome that you guys do that.

693

:

I really like that.

694

:

And I think it's also something we want to try doing ah on the PyMC side to be able to

make it easier for people to just use PC priors.

695

:

Awesome.

696

:

Damn.

697

:

Håvard, do you have anything to add on these priors in general or PC priors in particular?

698

:

No, but it is exactly what Janet is saying.

699

:

It's like at some point you realize you need some kind of structured framework to think

about priors and to derive priors.

700

:

And this is not guessing a prior.

701

:

This is like putting up

702

:

All

703

:

like principles, how to think about them, and then you can derive them.

It has just become math. And they work the same way all the time, because it is the same thing: it's about the distance between distributions. Whether the model is parameterized by a standard deviation or a log precision, it's the same prior, because on the distance scale it is the same. It's just materialized differently for different parameters.
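To make the distance idea concrete, this is the construction from the penalised complexity priors framework (Simpson et al., 2017), written here in generic notation (d, xi, Q, U, and alpha are our symbols, not the speakers'): the flexible model is measured against its simpler base model through the Kullback-Leibler divergence, an exponential prior is put on that distance, and the result is transformed back to the parameter scale.

```latex
% Distance of the flexible model f(x \mid \xi) from the base model f(x \mid \xi = 0):
d(\xi) = \sqrt{2\,\mathrm{KLD}\!\left( f(x \mid \xi) \,\middle\|\, f(x \mid \xi = 0) \right)}

% Penalise complexity: exponential prior on the distance, then change of variables:
d(\xi) \sim \operatorname{Exp}(\lambda)
\quad\Longrightarrow\quad
\pi(\xi) = \lambda\, e^{-\lambda\, d(\xi)} \left| \frac{\partial d(\xi)}{\partial \xi} \right|

% The only user input is a tail statement on an interpretable quantity Q(\xi):
P\!\left( Q(\xi) > U \right) = \alpha
```

For a Gaussian random effect, where Q is the standard deviation, this recipe is exactly what yields the exponential prior used in the sketch above.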

But the scary part is that I, at least, realized that I don't understand parameters. If I see a parameter and a prior, how does this effectively impact the distributions? I can try, but I'm not very good at it, so I don't trust myself. I trust the distance-based approach because it's doing the same thing all the time.

And this was derived... how many years ago? Fifteen years? No, ten years ago? More than ten years ago? Yes. That was in Trondheim. A key person there was Daniel Simpson; it was part of his group. Great guy. Yeah.

So this really solves it, because before that point I think our stance was: OK, the prior is not my problem; if you want this prior, it's your problem. But at some point it actually becomes your problem, because you realize how many of the problems come from bad priors. And then you cannot keep thinking in terms of these standard priors that are supposed to be asymptotically best.

They are often defined that way: you have to do the math and let something go to infinity to make sense of it, right? And then you want a prior that doesn't do anything. You have all these approaches that are usually the standard, but we want the prior to do something; we just want to prevent it from doing bad things.

So these priors are derived not to be the best. They are priors you can use when you don't know what else to do, and they are never bad. If you look at how they perform against other choices, they are always in the top three; they are never bad. They just do the same thing, because they are derived from this distance way of thinking.

No matter how you do the parameterization, whether you are looking at the overdispersion in a negative binomial or at a variance in some other component, you don't need to understand the parameters themselves; you just need to understand the concept of distance.

And as you said, Alex, you can connect it to a property of the data. That's what you do: you have to do some kind of calibration, you have to set some kind of scale. That is your part, and the rest the math does for us.

No, it's really nice.

There are still some parameters that have bad default priors, because there is no generally good one and we don't want to change whatever was there. But usually most parameters have some kind of PC prior you can use.

Yes.

Which makes life easier.

Yes.

Yeah. I mean, I completely second everything you just said here, Haavard. And I think it's also very telling that somebody as mathematically inclined and experienced as you are still has difficulties thinking about how the different parameters interact in complex models. That means everybody has that problem. I definitely have it. Sure, I can always put a prior on that standard deviation or that covariance matrix, but once you are deep enough into the layers of a model, you don't really know how that impacts the model.

And the only way you can figure that out is mainly by painstakingly going through prior predictive checks. That's still possible, but it's a bit inefficient.
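As a concrete illustration of what such a prior predictive check can look like, here is a minimal, generic sketch in plain NumPy; the model, the calibration numbers, and the variable names are illustrative rather than anything discussed in the episode.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws, n_groups, n_obs = 1_000, 8, 50

# PC-style exponential prior on the group-level standard deviation,
# calibrated so that P(sigma > 2) = 0.1  =>  rate = -log(0.1) / 2.
lam = -np.log(0.10) / 2.0
sigma = rng.exponential(scale=1.0 / lam, size=n_draws)

# Simulate group effects and observations implied by the prior alone.
group_effects = rng.normal(0.0, sigma[:, None], size=(n_draws, n_groups))
group_of_obs = rng.integers(0, n_groups, n_obs)       # assign each observation to a group
obs_noise = rng.normal(0.0, 1.0, size=(n_draws, n_obs))
y_prior_pred = group_effects[:, group_of_obs] + obs_noise

# Inspect the implied spread of the outcome on the data scale.
print("prior predictive sd of y (5%, 50%, 95%):",
      np.quantile(y_prior_pred.std(axis=1), [0.05, 0.5, 0.95]))
```

The point of the check is simply to see whether the outcomes the prior generates live on a plausible data scale before any data are used.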

Sometimes there is no other way, but I'm sure there are faster and better ways most of the time, and PC priors give you the Pareto effect there. So yeah, folks, let's all try to start using them instead of choosing our priors blindly. I think that's a good way to start closing up the show. Actually, before I ask you the last two questions:

Is there anything you wanted to talk about or mention that I didn't get to ask you?

No, I would just say that, honestly, if you have a latent Gaussian model, there's nothing better you can do to infer your model. You basically achieve the accuracy of MCMC, but in almost real time. So really, if you have a latent Gaussian model, just try it, and if you need any help there is the R-INLA discussion group. I also linked it in the show notes. Haavard is very good at replying almost instantly, and there are others who reply too, so if you try INLA and anything comes up that you're unsure about, just send an email to that group.

Yeah. Nice. Great. Fantastic.

Thank you so much, folks, for taking the time. That's really wonderful, especially just before you get to go to bed, because it's very late for you. So again, thank you so much for that. I will ask you the last two questions I ask every guest at the end of the show.

The first one is: if you had unlimited time and resources, which problem would you try to solve? Haavard, do you want to start?

I think I would try to solve the problems I'm working on now. We are in a situation here where we have quite good resources and we are able to work on the kinds of problems we want to solve. So I think I would just stick to those, you know. I'm good.

Yeah, that's great. I mean, I'm sure people can hear that you're passionate about what you're doing, so I'm not that surprised. Janet?

Yeah, I think we have a lot of interesting problems already, and KAUST makes it very easy to work on big problems. We have a very nice academic environment, so I can't think of any other big problem I would want to solve.

Awesome. Yeah, that's cool, folks. And well, since you have the floor, let's continue with you: if you could have dinner with any great scientific mind, dead, alive or fictional, who would it be?

Okay, this is quite hard, because I have had dinner with a lot of them. So, thinking about someone I've not had dinner with and who is also South African, I would love to have dinner with either Trevor Hastie, who is still alive and is a South African, or Danie Krige, who was the inventor of kriging and was also a South African. One of these two would work for me.

That sounds good. And Haavard, what about you?

If I could choose, it would be nice to meet Isaac Newton. But he's been gone for a long time. I think he must have been quite special. I'm not sure it would be a very pleasant experience, but still, it would be nice to meet someone like him. Just meeting him... yeah, I don't think it would be pleasant, but it would be an experience.

Yeah, for sure. That sounds very interesting. At least you could speak English with him. Maybe there would be some vocabulary difficulties, but at least you would have English in common.

Well, I think we can call it a show. I'm super happy, because I had a lot of questions for you and we could cover everything. So thank you so much for that. Thank you so much also to Hans Monchow for putting me in contact with you guys. He said, "You should talk to this group, they are doing amazing work." So thank you so much, Hans, for the recommendation, and also for listening to the show; you obviously have good taste.

And on that note, thanks again, Janet and Haavard, for taking the time and being on this show. As usual, folks, we'll put a lot of links in the show notes, so if you're interested and want to dig deeper, make sure to check those out. Janet, Haavard, thank you again for being on the show.

Thank you.

Thank you, it's been very nice. Thank you.

This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com. Our theme music is "Good Bayesian" by Baba Brinkman, feat. MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/learnbayesstats. Thank you so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in, and if you're thinking, "I'll be less than amazing," let's adjust those expectations. Let me show you how to be a good Bayesian. Those predictions that your brain is making, let's get them on a solid foundation.
