Speaker:
00:00:04
Today I'm honored to welcome Havard, Ru and Yanet Manikirk, two researchers at the cutting
edge of Bayesian computational statistics.
2
:
00:00:14
Havard, a professor at Kaost, is the person behind integrated nested Leblas
approximations, or INLA, a robust method for efficient Bayesian inference that has
3
:
00:00:26
transformed how we approach large-scale latent Gaussian models.
4
:
00:00:31
Yanet, a research scientist at Kaost, specializes in applying INLA methods to complex
problems, particularly in medical statistics and survival analysis.
5
:
00:00:43
In this conversation, Havard and Yanet guide us through the intuitive and technical
foundations of INLA, contrasting it with traditional MCMC methods and highlight the
6
:
00:00:54
its strengths in handling massive complex data sets.
7
:
00:00:58
We dive into real-world applications ranging from spatial statistics and air quality
control to personalized medicine.
8
:
00:01:04
We also explore the computational advantages of stochastic partial differential equations,
discuss penalized complexity priors, and outline exciting future directions like GPU
9
:
00:01:17
acceleration and advanced SPAS solvers.
10
:
00:01:20
This is Learning Vision Statistics, episode.
11
:
00:01:23
136, recorded May 13, 2025.
12
:
00:01:32
Welcome Bayesian Statistics, a podcast about Bayesian inference, the methods, the
projects, and the people who make it possible.
13
:
00:01:53
I'm your host, Alex Andorra.
14
:
00:01:55
You can follow me on Twitter at alex-underscore-andorra.
15
:
00:01:59
like the country.
16
:
00:02:00
For any info about the show, learnbasedats.com is Laplace to be.
17
:
00:02:04
Show notes, becoming a corporate sponsor, unlocking Beijing Merch, supporting the show on
Patreon, everything is in there.
18
:
00:02:11
That's learnbasedats.com.
19
:
00:02:13
If you're interested in one-on-one mentorship, online courses, or statistical consulting,
feel free to reach out and book a call at topmate.io slash alex underscore and dora.
20
:
00:02:24
See you around, folks.
21
:
00:02:26
and best patient wishes to you all.
22
:
00:02:27
And if today's discussion sparked ideas for your business, well, our team at Pimc Labs can
help bring them to life.
23
:
00:02:35
Check us out at pimc-labs.com.
24
:
00:02:41
Harvard Roo and Janette Manigerk.
25
:
00:02:44
Welcome to Learning Vision Statistics.
26
:
00:02:47
Thank you.
27
:
00:02:49
Thank you.
28
:
00:02:51
Yes, and thank you for tolerating my pronunciation of your name.
29
:
00:02:55
I'm getting there.
30
:
00:03:00
I love this podcast also because it's very cosmopolitan, so I get to kind of speak a lot
of different languages.
31
:
00:03:06
uh
32
:
00:03:08
So I am super happy to have you on the show today because I've been meaning to do an
episode dedicated to Inla, per se, a few months now.
33
:
00:03:23
um I've mentioned it here and there with some guests, but now today is the Inla episode.
34
:
00:03:30
So let's do this.
35
:
00:03:32
But before that, maybe can you...
36
:
00:03:34
um
37
:
00:03:35
Let's start with you, Yanet.
38
:
00:03:36
Can you tell us what you're doing nowadays and how you ended up working on these?
39
:
00:03:42
Okay, good.
40
:
00:03:43
Thank you for having me.
41
:
00:03:46
What I'm doing nowadays, so I work in the In-law Group at KAUST and we do a lot of things
that pertain to In-law and we also do some other BASION things.
42
:
00:03:58
How I got to work at the In-law Group?
43
:
00:04:03
I just sent an email to Howard when I finished my PhD and I said, well, I don't know what
to do next.
44
:
00:04:11
Can I come and visit you?
45
:
00:04:13
And he said yes.
46
:
00:04:14
And I came and then I started to work here and that's how I ended up here.
47
:
00:04:20
That's a long time ago.
48
:
00:04:22
Yes.
49
:
00:04:24
More than seven years ago, Okay, damn.
50
:
00:04:29
And Howard, what about you?
51
:
00:04:31
What are you doing nowadays and how did you end up doing that?
52
:
00:04:36
Nowadays I try to keep up with the rest of the people in the group.
53
:
00:04:42
So how I ended up here at KAUST, you mean?
54
:
00:04:46
No, mainly how you ended up working on these topics because I think your name is pretty
much associated with Inlass.
55
:
00:04:54
So I'm curious how these all started.
56
:
00:04:58
What's your origin story basically?
57
:
00:05:01
It started like 25 years ago.
58
:
00:05:04
So we're doing a lot of this market chain Monte Carlo.
59
:
00:05:09
And we are trying to do, essentially working on this, what we know as a latent Gaussian
model.
60
:
00:05:19
We're trying to really solve this Markov chain Monte Carlo issue to get good samplers.
61
:
00:05:27
Working a lot with this, because Markov random fields to get this sparse matrix
computations, all these things.
62
:
00:05:37
Then we cut up.
63
:
00:05:41
come to an end and you're realizing this is not.
64
:
00:05:45
going to work in the sense that this MCMC, even if you can do like a big, we can update
everything in one block.
65
:
00:05:56
Everything is efficient as it could, but it's still way too slow.
66
:
00:06:01
And then we wrote a book about this thing.
67
:
00:06:06
And then you'll see in the end of that book from 2005, there is a short outline of how to
proceed.
68
:
00:06:15
And it's basically kind of computing the results from the proposal distribution that we
had at that time.
69
:
00:06:23
And that's how things got started.
70
:
00:06:25
then PhD student Sarah Martino, she joined.
71
:
00:06:31
That was January 2005.
72
:
00:06:37
From that point, I've been working on this.
73
:
00:06:39
Yeah, it's over 20 years ago.
74
:
00:06:43
From when it started and another five years before that, where we did all the prep work.
75
:
00:06:51
In parallel to that, also have Finlingan who's joined.
76
:
00:06:55
He's working on the spatial models, but yeah, that's also all connected.
77
:
00:07:03
Yeah.
78
:
00:07:04
the big adventure.
79
:
00:07:06
uh Actually, Howard, could you...
80
:
00:07:15
Well, no, let's go with Janet and then I'll ask...
81
:
00:07:18
I'll just hover about a follow up with that.
82
:
00:07:21
But Yannett, for listeners who are unfamiliar with INLER, so first, that stands for
Integrated Nested Laplace Approximations, if I'm not mistaken.
83
:
00:07:32
Could you give us an intuitive explanation of what that is and how it differs from
traditional MCMC methods?
84
:
00:07:42
Yeah sure.
85
:
00:07:42
um So in MCMC methods what you essentially need to do is you need to keep drawing samples
and at some point these samples will come from the distribution you're actually looking to
86
:
00:07:58
find and then from those samples you can calculate things like the mean or the mode and so
on.
87
:
00:08:06
So what Inla does is oh
88
:
00:08:10
It actually approximates with mathematical functions the posterior in its totality.
89
:
00:08:17
So there is no sampling waiting for things to arrive at the stationary distribution.
90
:
00:08:22
You compute the approximation and then you have the posterior.
91
:
00:08:26
So then you can from that get also the mean and the credible intervals and so on.
92
:
00:08:30
So it's a deterministic method.
93
:
00:08:33
It's not a sampling based method, um which is why it's fast.
94
:
00:08:40
But yeah, essentially that's kind of how it works.
95
:
00:08:46
Yeah, thanks.
96
:
00:08:48
That definitely cleared that up.
97
:
00:08:51
um And I'm curious actually, Howard, when you started developing Inla, um I think from
your previous answer, basically you were motivated by making inference faster.
98
:
00:09:08
Yes.
99
:
00:09:09
But also, was that the main challenge you were trying to address or was that also...
100
:
00:09:16
Were there also all other challenges you were trying to address with Inlet?
101
:
00:09:20
No, it was simply trying to...
102
:
00:09:24
Yeah, I said first we tried to make MCMC work because at that time, early 2000, we
believed that this was the way to go and then you're realizing it's not.
103
:
00:09:35
At some point you make a switch.
104
:
00:09:37
It's like MCMC is not going to work.
105
:
00:09:40
It's never going to give us the kind of the speed.
106
:
00:09:45
that we need.
107
:
00:09:47
And then we worked on, started on these approximations and working our way through that
one.
108
:
00:09:54
And that has been the goal all the time, you know, to make this inference for these kind
of models fast.
109
:
00:10:03
Yeah, fast is more important than, yeah, to make them quick.
110
:
00:10:08
This have been, to make them quick kind of scalable in a way.
111
:
00:10:13
And there are two,
112
:
00:10:14
two types of scales in a way you can scale with a number of data points.
113
:
00:10:21
And we can also scale with a number of the kind of the model size itself.
114
:
00:10:26
So these are two different things.
115
:
00:10:28
And in the first version of the lab, we didn't, we kind of.
116
:
00:10:35
scale okay with both of them.
117
:
00:10:39
And now in the second generation of Inda, then we scale way, way better, both respect to
model size and also the data size.
118
:
00:10:50
So that's another redevelopment.
119
:
00:10:52
yeah.
120
:
00:10:52
That's a- uh
121
:
00:11:01
It's a second generation in that.
122
:
00:11:04
So it was almost like a complete rewrite.
123
:
00:11:08
And we pump the main methods.
124
:
00:11:10
Yeah.
125
:
00:11:11
Yeah.
126
:
00:11:12
This is super exciting.
127
:
00:11:14
mean, well, we'll dive a bit more into that during the rest of the show, but ah yeah, high
level.
128
:
00:11:22
I think now we have a good idea of what that's ah helpful for.
129
:
00:11:27
I mean, what's that doing?
130
:
00:11:29
But um
131
:
00:11:31
I want us to also dig into the cases where that could be helpful.
132
:
00:11:36
So, um Yanet, actually I'm curious, how did you personally get introduced to Inlaid um and
what drew you to this particular computational approach before we dive a bit more into the
133
:
00:11:51
use cases?
134
:
00:11:54
Yeah, so I did my PhD in like Bayesian statistics.
135
:
00:11:58
um and the main focus was kind of to write samplers for covariance and correlation
matrices.
136
:
00:12:06
So of course that doesn't go well no matter which sampler you write.
137
:
00:12:11
um So I think very similar motivation just there has to be kind of a better way to do this
even if you lose a little bit of generality for specific cases.
138
:
00:12:26
you have to be able to do better.
139
:
00:12:28
And that's kind of where I started looking into approximate methods like ABC for the live
lutes and there is VB and so on.
140
:
00:12:35
And then InLa specifically, which like combines many, many different approaches together,
just to do things the same kind of accurate, but just faster.
141
:
00:12:50
Okay, yeah, I see.
142
:
00:12:52
uh And actually, I read while preparing the questions for the show that latent Gaussian
models are particularly suited to Inla.
143
:
00:13:05
em So I'm curious why, and also more in general, which cases are particularly appropriate
and you would recommend to...
144
:
00:13:20
uh listeners to give try to give InLaw a try.
145
:
00:13:24
um So maybe, Yannett, you can take that one and Harvard if you have anything to complete
afterwards.
146
:
00:13:34
Sure.
147
:
00:13:35
So actually, a latent Gaussian model, this is the assumption on which InLaw is built.
148
:
00:13:42
So InLaw is developed to do inference for latent Gaussian models.
149
:
00:13:47
So if you do not have a Gaussian model, then none of the math will hold because it's
developed to do inference for latent Gaussian models.
150
:
00:13:57
So can you maybe just briefly explain to us what a latent Gaussian model is?
151
:
00:14:04
Yes.
152
:
00:14:04
I have to say when I started at KAUST with Howard, we had another colleague, Håkon, which
was also Norwegian and he did his PhD with Howard.
153
:
00:14:14
So was me and them two.
154
:
00:14:17
And at the first uh few months, it felt like they spoke a different language.
155
:
00:14:23
Even though I had a PhD in statistics, I could not figure out what's Gaussian model and
all these things.
156
:
00:14:30
So I totally get this question when it comes up.
157
:
00:14:34
So what is a latent Gaussian model?
158
:
00:14:36
Okay, latent Gaussian model is when you have data points for which you can assign a
likelihood.
159
:
00:14:46
and maybe the mean or some parameter in the likelihood will have a regression model.
160
:
00:14:53
This regression model can contain fixed effects, random effects, different components.
161
:
00:14:59
And conditional on the model, the data is then independent so that the likelihood is then
just this product.
162
:
00:15:08
So where the latent Gaussian part, sorry.
163
:
00:15:12
Yeah, yeah, no, exactly.
164
:
00:15:13
I was going to give you the same way to go.
165
:
00:15:16
It's like a gam up to now, right?
166
:
00:15:18
Like if you have a gam.
167
:
00:15:20
Then the latent Gaussian part comes into the fact that all the fixed effects and the
random effects should have a joint Gaussian prior.
168
:
00:15:32
So you can think of any random effect like an IID random effect or like a time series
model.
169
:
00:15:39
As long as they have a Gaussian joint distribution, then they will be a latent Gaussian
model.
170
:
00:15:45
So at first thought, it might seem like, this is like a very restrictive class, but it's
actually not.
171
:
00:15:51
A lot of models that we use every day are actually latent Gaussian models.
172
:
00:15:56
And even some very complicated models like survival analysis models, they're also latent
Gaussian models, a lot of them.
173
:
00:16:02
ah If you do like co-regionalization where you have spatial measurements at different
locations, you can model them jointly and this is also latent Gaussian model.
174
:
00:16:12
So it's really much broader than you would think initially.
175
:
00:16:20
Yes.
176
:
00:16:21
em Yeah.
177
:
00:16:22
Thanks, Jan.
178
:
00:16:23
It's super, super clear.
179
:
00:16:25
That makes me think about, yeah, state space models where most of the time you linear or
Gaussian state space models.
180
:
00:16:36
So that means m Gaussian likelihood and Gaussian innovations.
181
:
00:16:42
uh yeah, like basically what you just...
182
:
00:16:46
talked about here, even though the structures of the models are different.
183
:
00:16:50
And yeah, as you were saying, that may seem pretty restrictive, but that actually covers a
lot of the cases.
184
:
00:16:57
I guess some cases that can be more complicated when you have count data.
185
:
00:17:03
No, they are the same, actually.
186
:
00:17:06
So the likelihood has no restrictions.
187
:
00:17:09
oh Yeah, can be an likelihood.
188
:
00:17:11
can be a zero-input likelihood.
189
:
00:17:13
You can have...
190
:
00:17:14
multiple likelihoods, like you can do a multivariate data analysis, uh each data type with
their own likelihood, as long as the latent part, so just the fixed effects and the random
191
:
00:17:25
effects should have a Gaussian prior, but what you put on the data, there is no
restriction on that.
192
:
00:17:31
Nice, nice.
193
:
00:17:31
Okay, so that's less restrictive than the classic Kalman filter then, because the classic
Kalman filter only can take normal likelihood.
194
:
00:17:39
So, okay, yeah.
195
:
00:17:40
Yeah, that's really cool.
196
:
00:17:41
then I agree that's even less restrictive than what I had understood.
197
:
00:17:48
uh Havard, it looks like you had something to add.
198
:
00:17:54
it's not to explain everything.
199
:
00:17:56
Of course, if you have one Leiden-Goshen model, if you have another one, you can put them
together.
200
:
00:18:04
So we also can do these kind of joint models where you share
201
:
00:18:09
kind of effects.
202
:
00:18:10
So all these kind of joint models is also covered.
203
:
00:18:14
As soon as you have one, you can also do two and then you can combine them together.
204
:
00:18:20
So it's almost like, I think it's easier to classify all the models who are not
late-engagement model is that it's very surprising result in the sense of
205
:
00:18:39
When you write it down, it looks so simple, but actually to make the connection from the
model that you have to rewrite in that form we needed, it's, many struggled with that, you
206
:
00:18:55
know, because the form is so simple.
207
:
00:18:57
You have some parameters like variances, correlations, over dispersions, and then you have
something Gaussian, and then you have observations of the Gaussian.
208
:
00:19:08
And that's it.
209
:
00:19:10
So this structure covers almost...
210
:
00:19:17
almost everything that is done in practice today.
211
:
00:19:20
You can say, okay, mixture models, this kind of thing is not in the class.
212
:
00:19:24
Yeah, but they are not that much used, you in the sense of in daily life of people who
working with this.
213
:
00:19:34
Like more in research, yeah, you can do it.
214
:
00:19:38
It's a little different, but in our kind of in the practical life of people who do these
things, it's not.
215
:
00:19:47
It's a very, very surprising thing.
216
:
00:19:50
Yeah.
217
:
00:19:51
Yeah.
218
:
00:19:51
Yeah.
219
:
00:19:52
No, for sure.
220
:
00:19:53
I as soon as you start being able to normally distributed likelihood and count data, you
have, I would say, 80, 90 % of the use cases.
221
:
00:20:07
Measures are indeed, they are possible, but they are way less common for sure.
222
:
00:20:12
Yeah.
223
:
00:20:12
Yeah.
224
:
00:20:13
And actually, Harvard, um you've been doing that for
225
:
00:20:17
quite some time now as you were saying.
226
:
00:20:20
So I'm curious what you've seen as the most impactful applications of Inla, especially
maybe at extreme data scales because that's really where Inla shines.
227
:
00:20:37
Janette, help me out.
228
:
00:20:38
You know this better.
229
:
00:20:43
Yeah, so I mean, we've had um the World Health Organization has used INLAW for some air
quality control methods.
230
:
00:20:53
We've had the CDC used INLAW for like epidemiology.
231
:
00:21:01
We've had the Malaria Atlas Project use INLAW to
232
:
00:21:06
model the prevalence of malaria to kind of help inform interventions and where they should
be.
233
:
00:21:12
And for instance, I we are past COVID, but a lot of people still work on COVID.
234
:
00:21:17
And I just checked quickly this morning and there is more than 800 papers who used INLA
for COVID modeling.
235
:
00:21:25
yeah, I mean, there's a lot of impactful applications, but on a very large scale, there...
236
:
00:21:32
um
237
:
00:21:33
There has been applications recently, there is a paper by Finn Lindgren and others who've
used it to model temperature on the global scale with lots and lots and lots of stations
238
:
00:21:46
and they can do a very high resolution um model.
239
:
00:21:50
So like Howard said before, the kind of modern framework for InLa that came like after
:
2021
240
:
00:21:59
It can scale now very, high with data.
241
:
00:22:05
Nice, yeah.
242
:
00:22:06
And actually, I'm wondering, um when you say people are using Inla to model these data
sets, what do they use?
243
:
00:22:21
What's a package you recommend people check out if they want to start using Inla in their
own analysis?
244
:
00:22:34
So the InLaw methodology is implemented in the R InLaw package.
245
:
00:22:40
So it's an R package.
246
:
00:22:42
There is a Python wrapper or soon there will be, or it is there, but anyway, so there will
be like a Python wrapper where you can use InLaw in Python.
247
:
00:22:54
um So yeah, think, yeah, most people just use the R InLaw package and it's...
248
:
00:23:01
Howard does a great job.
249
:
00:23:02
There's kind of almost a new testing version every few days.
250
:
00:23:06
So the development is very, very fast.
251
:
00:23:09
Usually the package is faster than the papers.
252
:
00:23:13
Like we would implement something and then a year later the paper would come out
explaining what was changed.
253
:
00:23:20
So users always have the latest version immediately.
254
:
00:23:27
Yeah, this is super cool.
255
:
00:23:29
So the R in Lab Package is in the show notes already, folks, for those who want to check
it out.
256
:
00:23:34
ah So yeah, check out the website.
257
:
00:23:36
There are some examples in there, of course, of how to use it and so on.
258
:
00:23:42
And if you guys can add the Python wrapper to the show notes, that'd be great because I
think, yeah, definitely that will increase even more your number of users.
259
:
00:23:53
I know, I will use that.
260
:
00:23:55
because I mainly code in Python.
261
:
00:23:57
yeah, like that's for sure.
262
:
00:23:59
Something I will check out.
263
:
00:24:02
Howard, anything you want to add on that?
264
:
00:24:05
No, I think it's fine.
265
:
00:24:08
It is what I trying to say.
266
:
00:24:10
But we are not on this C run.
267
:
00:24:12
We are not on this public repository or the standard R repository, simply because the R
code is just a wrapper.
268
:
00:24:23
So inside there is a C code.
269
:
00:24:25
It's a program that runs independently.
270
:
00:24:30
This is simply too complicated to compile on this automatically built system.
271
:
00:24:36
So we have to build it manually and include it in the package.
272
:
00:24:41
for this, because it's contained in binary, it cannot be in this kind of public
repositories.
273
:
00:24:50
So we have our own that you need to kind of download from.
274
:
00:24:54
Okay.
275
:
00:24:55
Yeah, damn that, that adds to the, the maintaining complexity.
276
:
00:24:59
So thank you so much for doing that for us.
277
:
00:25:02
Um, know it can be complicated.
278
:
00:25:06
in, uh, Yanet, actually, I think, if I understood correctly, you apply a lot of, um, you
apply a lot in medical statistics and survival analysis.
279
:
00:25:19
Um, so
280
:
00:25:22
Can you share with us maybe an example of how you've done that and why InLa was
particularly efficient in this setting?
281
:
00:25:32
Yes.
282
:
00:25:32
So when I started at KAUST, actually, Howard told me, we need someone to do survival
analysis with InLa.
283
:
00:25:42
So that's kind of how it started.
284
:
00:25:44
um And I think up until that stage in 2018,
285
:
00:25:51
There was very few, maybe two or three works on using INLA for survival analysis.
286
:
00:25:58
actually, the survival analysis models are also latent Gaussian models.
287
:
00:26:02
But to make this connection is not so clear at first glance.
288
:
00:26:08
Like you really have to just sit down and think about your model on a higher level to be
able to see the connection to a latent Gaussian model.
289
:
00:26:17
And of course, if we can then use INLA.
290
:
00:26:20
then we can do a lot more complicated models.
291
:
00:26:23
Like we can do spatial survival models, which a lot of survival packages cannot do.
292
:
00:26:28
um We can do then these joint models.
293
:
00:26:32
We now have another package built on top of Imla called Imla Joint that has a very nice
user interface for joint models.
294
:
00:26:41
So where you have something that you monitor uh over time.
295
:
00:26:46
So in event, it could be like relapse of cancer, for instance.
296
:
00:26:50
And then you would have many biomarkers, like lots of blood test values and maybe x-ray uh
image and so on.
297
:
00:26:58
And you would have a lot of these longitudinal series and then you would jointly model
them and assuming that there is some common process driving all of this.
298
:
00:27:10
And these models are very computationally expensive because you you have a lot of data, a
lot of uh high velocity data.
299
:
00:27:20
You can have multiple biomarkers.
300
:
00:27:22
You have this hazard function that you model, which is different than general linear
models where you model the mean, because we don't model the mean time to an event.
301
:
00:27:32
We model the hazard function, like this instantaneous risk at every time point.
302
:
00:27:37
um But then if you can see that your survival model or your very complicated joint model,
even if it contains splines over time or it contains
303
:
00:27:50
a spatial effect, different IID effects like multi-level random effects for hospital
effect and doctor and patient and so on, you can easily see that this model becomes very
304
:
00:28:02
complicated if you want to just do the right thing.
305
:
00:28:06
um But then InLa makes it possible that we can actually fit these models.
306
:
00:28:12
And one thing that we've been trying to do in this regard in terms of medical statistics
307
:
00:28:19
is to really position INLA to be a tool um in kind of the drive to personalised medicine.
308
:
00:28:28
Because in the end, if you want a doctor to run something locally uh and see, okay, the
probability for relapse is lowest on medicine A versus B and C and D, you need something
309
:
00:28:43
that's going to run fast, like it cannot come out tomorrow.
310
:
00:28:48
So that was kind of a main motivation to kind of position INLAT towards medical statistics
also to just show the potential for to actually achieve this personalized medicine target.
311
:
00:29:06
Yeah, this is really cool.
312
:
00:29:08
Yeah, definitely agree that has a lot of potential basically because that makes more
complex models still um practically uh inferable.
313
:
00:29:24
um Whereas with classic MCMC, it wouldn't be possible.
314
:
00:29:30
yeah, that expands the universe of patient models in a way.
315
:
00:29:36
That's really fantastic.
316
:
00:29:40
Something I'm wondering, uh Harvard, is the computational challenges.
317
:
00:29:48
Are there any such challenges when you're using NLA, um numerical stability, efficiency of
the algorithms, all these diagnostics that we get with MCMC samplers?
318
:
00:30:02
How does that work with...
319
:
00:30:04
in when you're running a model.
320
:
00:30:06
How do you know that conversions was good?
321
:
00:30:09
How do you diagnose the conversions?
322
:
00:30:13
How do you handle conversions issues?
323
:
00:30:16
Are there even conversions issues?
324
:
00:30:19
How do you go about that practically?
325
:
00:30:22
The conversions issue is very different.
326
:
00:30:25
This is like a numerical optimization.
327
:
00:30:28
So if it doesn't work, you will be told that this doesn't work.
328
:
00:30:33
Over the years, know, there's some code working on, been developed for 20 years, you know,
so we are, we are getting a good experience.
329
:
00:30:45
What is working, what is not.
330
:
00:30:46
Of course there is a specialized version and tailored version of everything, numerical
optimization algorithm.
331
:
00:30:56
We don't use just a library.
332
:
00:30:57
Every code is from some kind of
333
:
00:31:01
or just on the library, but it's tailored to exactly what we want.
334
:
00:31:05
Just computing gradients, we do it in a very different way.
335
:
00:31:09
That is also tailored to what we do.
336
:
00:31:12
Everything is kind of tailored, every kind of custom made, everything kind of refined.
337
:
00:31:17
So that it's amazing how.
338
:
00:31:22
how well it runs in the sense that if you have any decent model will run.
339
:
00:31:28
If you have a problem with convergence, it's usually because you have a model that to be
honest doesn't make sense.
340
:
00:31:35
Since in the way that you have very little data, using wake priors, there is no
information in the model, there is no information in the data and then this is harder.
341
:
00:31:48
So about convergence is like,
342
:
00:31:52
This is kind of different from kind of Mark-Chain Monte Carlo where you have to do kind of
diagnostics.
343
:
00:31:59
And here is more if you work with contain, if you do the optimisation, this is on the
highest level on the parameters on the top.
344
:
00:32:10
We're talking variance, correlation, over dispersion, this kind of parameters.
345
:
00:32:16
If you come to a point that is a well-defined maximum,
346
:
00:32:21
And essentially you're good.
347
:
00:32:23
And we're looking around that point to correct for kind of skewness that is not perfect
kind of symmetric in that sense.
348
:
00:32:31
And these are also kind of small diagnose, but they are never, it's never a kind of
serious issue that it was for MCMC.
349
:
00:32:42
Now it's like it's, if you put something reasonable in, you'll get something reasonable
out.
350
:
00:32:50
is very little.
351
:
00:32:54
don't think we have serious convergence issues, no?
352
:
00:33:00
I don't think so.
353
:
00:33:01
We might have it like 15 years ago or 10 in beginning.
354
:
00:33:08
There was less, there was more trouble computing gradient session, this kind of thing.
355
:
00:33:13
But now we we do this very differently.
356
:
00:33:19
We are being smarter.
357
:
00:33:21
And then, no, there isn't any...
358
:
00:33:25
It works pretty good, actually.
359
:
00:33:29
Yes.
360
:
00:33:29
So everything is very...
361
:
00:33:31
It's different.
362
:
00:33:35
Cool, yeah.
363
:
00:33:36
I mean, that's great to hear.
364
:
00:33:39
And Yanet, do you have any tips to share with the listeners about that?
365
:
00:33:46
No, I think I will just comment on the gradient.
366
:
00:33:49
There is a paper describing the gradient method that's used in Inland.
367
:
00:33:54
It's called Smart Gradient.
368
:
00:33:56
uh Describing uh how this kind of gradient descent methods and so on work with this
different
369
:
00:34:05
type of uh way to get a gradient.
370
:
00:34:07
And it is really good.
371
:
00:34:09
It's also the one we use inside Inla, but of course, people who use gradient-based methods
would also benefit from that as a standalone contribution.
372
:
00:34:23
Yeah.
373
:
00:34:23
Okay.
374
:
00:34:23
Yeah.
375
:
00:34:25
Thanks, Jan.
376
:
00:34:25
That's uh helpful and feel free to add that to the show notes uh if you think that's
something that's going to be interesting for listeners.
377
:
00:34:35
em And Howard, actually, um I read that you've applied stochastic partial differential
equations.
378
:
00:34:45
em So let's call them SPDEs.
379
:
00:34:50
Very poetic name.
380
:
00:34:53
And so yeah, you've used that mainly for spatial modeling.
381
:
00:34:57
So can you give us an overview of why these SPDs are interesting?
382
:
00:35:04
And I think you use that to model Gaussian fields.
383
:
00:35:07
yeah, maybe if you can, Vivian, talk a bit more about that because that sounds super
interesting.
384
:
00:35:12
Yes.
385
:
00:35:12
So normally Gaussian fields like normal distribution is described for the covariance or a
covariance function.
386
:
00:35:21
uh
387
:
00:35:21
traditional way of doing things and there is nothing wrong with that except that it's not
very
388
:
00:35:32
is not very smart way of doing things.
389
:
00:35:34
from the case and uh like in the beginning, like 20, 25 years ago, there was a big, this
was also one of the problems we wanted to solve in a way, some of these kind of spatial
390
:
00:35:47
stat problems.
391
:
00:35:50
But of course there is a version, there are models that are kind of Markov.
392
:
00:35:56
Markov in the sense that instead of the covariance matrix, you're working with the
precision matrix.
393
:
00:36:02
inverse of the covariance and that is sparse.
394
:
00:36:04
So this was often called this regional models or marker models.
395
:
00:36:09
And then you had the spatial models.
396
:
00:36:14
The Gaussian field with kind of a turn covariance, you choose a covariance function and
they were dense and they were kind of a mess.
397
:
00:36:23
And then of course it's like in parallel to the development of this now professor in
Edinburgh, Finn Lindgen.
398
:
00:36:32
It's also part of this overall in a project.
399
:
00:36:36
He started or him and me started to work on this thing because we had like, I think we had
the first version in:
1999
400
:
00:36:50
was almost Markov.
401
:
00:36:53
They were almost Markov, but we had to compute, we had to fit them kind of numerically.
402
:
00:37:00
And from that point, we can kind of move on to working with precision matrices and stuff.
403
:
00:37:06
And this is one of the key things in Inland that you don't work with covariance, you work
with precision.
404
:
00:37:13
Because precision can be put together.
405
:
00:37:15
This is like playing with Lego.
406
:
00:37:21
You have one component, you have another one, you just stick it together and it's easy.
407
:
00:37:26
If you work with covariance, there is a lot of math.
408
:
00:37:28
uh
409
:
00:37:30
that go in just to put them together.
410
:
00:37:34
So this falls directly out.
411
:
00:37:37
In addition to this, you have this, the fact that they are Markov.
412
:
00:37:43
Markov, you mentioned the state space models where if you condition on pass, you only need
the latest one.
413
:
00:37:51
This really simplified computations.
414
:
00:37:53
And of course, this apply more generally.
415
:
00:37:56
And these are called the Smokov random fields.
416
:
00:37:59
Yeah, this connect back to the book we had 20 years ago, 2005, but Leona held.
417
:
00:38:07
And the point is that the computations are very efficient for these kind of models.
418
:
00:38:13
Instead of using dense matrices, you can use algorithms for sparse matrices.
419
:
00:38:19
And these scales way, way better.
420
:
00:38:23
So back to the SPD thing, and there was a quest, there was a hunt for trying to solve this
in more, again, the motivation was computational speed.
421
:
00:38:38
We need to do things faster.
422
:
00:38:41
And Finn, he's a small genius.
423
:
00:38:47
And no, he figured out that, okay, we can connect these to these.
424
:
00:38:53
stochastic differential equation and this go actually back to very old work in the late
50s and early 60s that show that these Gaussian fields are exactly solutions of these
425
:
00:39:07
stochastic differential equations.
426
:
00:39:10
As soon as you realize that, then you can say, okay, let's solve this thing.
427
:
00:39:15
We don't need to solve them.
428
:
00:39:17
We need to represent the solution of.
429
:
00:39:22
And that is done by this finite element method from applied math.
430
:
00:39:27
And then you get something that is sparse.
431
:
00:39:32
And the matrices you get there will be our kind of precision matrices.
432
:
00:39:37
So everything up to that point was done for computational speed.
433
:
00:39:41
And we can do things faster.
434
:
00:39:44
Of course, when at some point you're realizing the most important thing about this SPD
approach is not
435
:
00:39:51
speed.
436
:
00:39:52
It's actually that you can use this way of thinking to do things very easily that was up
to that point almost impossible.
437
:
00:40:01
And now it's just a course we can do it.
438
:
00:40:05
It's like having, yeah, Janet mentioned his post-hocumbaca.
439
:
00:40:12
That was here in beginning.
440
:
00:40:13
He worked on these barrier models.
441
:
00:40:15
What happened if you have islands and you want the dependence to go around the islands or
through
442
:
00:40:21
following rivers and do this follow adjust for the coastline and all this thing.
443
:
00:40:28
This is super complicated unless you do it with the SPDs.
444
:
00:40:34
Then it's just follow from the definition.
445
:
00:40:39
Very little things you need to do and then it's just do the right thing.
446
:
00:40:45
And also this is connected to the kind of the physical laws that we think
447
:
00:40:52
these kind of processes were almost or they will follow or almost follow.
448
:
00:40:57
So it's like stochastic differential equations.
449
:
00:41:01
It's just more or less elliptic ones that we make kind of stochastic.
450
:
00:41:07
Now, so this is super super useful.
451
:
00:41:11
But of course the complexity is way higher.
452
:
00:41:17
It's very hard.
453
:
00:41:18
You need tools to create the mesh.
454
:
00:41:21
You have to work with a mesh.
455
:
00:41:22
You have to work with transation of an area.
456
:
00:41:25
You have to do all these kind of things.
457
:
00:41:30
But Finn has written all these kind of tools for doing that.
458
:
00:41:34
He's also pretty good in this coding, you know?
459
:
00:41:38
And as soon as you have the tools, everything of this is quite easy.
460
:
00:41:46
But there is a huge kind of step to take before you get to that point.
461
:
00:41:50
But when this is done, when somebody has done it for you, then it's pretty easy.
462
:
00:41:56
Also, these have been very, very useful.
463
:
00:41:58
Also, you can do these non-separable models going in time as well.
464
:
00:42:03
You just have a time dependence of this.
465
:
00:42:05
And this follows the same procedure.
466
:
00:42:09
Yeah, damn.
467
:
00:42:11
Yeah, for sure.
468
:
00:42:12
That sounds like a big project, but really fascinating.
469
:
00:42:18
If you have any link you can share with us in the show notes, that'd be great because I
think that makes for a very interesting case study that we can follow along.
470
:
00:42:30
It's a very large project.
471
:
00:42:32
It's like if we're giving kind of tutorials courses on this thing, you...
472
:
00:42:39
You can almost have half of the time is spent on the in-app package itself and the second
half is on this SPD thing.
473
:
00:42:51
Of course, this is integrated, but the complexity of that part is of the same order as the
complexity of the whole inlet package itself.
474
:
00:43:02
Right.
475
:
00:43:03
ah Yannett, to, get back to you, um, I'm curious, uh, like given, given your experience in
applying patient models, especially in complex medical scenarios, what new features or
476
:
00:43:22
improvement, if any, would you most like to see incorporated into Inla?
477
:
00:43:27
Oh, so
478
:
00:43:31
I'm probably biased, but I think the new methodology has solved a lot of issues that were
there before.
479
:
00:43:44
Because a lot of the medical statistics models are very data heavy and not model size
heavy.
480
:
00:43:54
For instance, you think about MRI data,
481
:
00:43:58
you have many, many data points, but maybe not so many parameters.
482
:
00:44:03
And in the classical uh INLA from the 2009 paper, this did not scale so well to very many
data points.
483
:
00:44:14
But now the new implementations scales extremely well.
484
:
00:44:19
And I think, for instance, in medical problems, especially medical imaging,
485
:
00:44:25
In-law, the new in-law now can be applied to that, whereas before it could not.
486
:
00:44:31
in my biased opinion, there is not uh anything I see at the moment that I would like to be
incorporated.
487
:
00:44:38
There are still many applications that can be explored and see how far we could push this
uh kind of new in-law, I would say.
488
:
00:44:53
Harvard kind of a related question.
489
:
00:44:56
What do you see since you're um developing the package at the forefront?
490
:
00:45:02
What do you see as the next major frontier for Inla, whether that's methodological
advancement or new application domains?
491
:
00:45:15
Yeah, I think that the most pressing issue now is to have a sparse solver that is more
scalable.
492
:
00:45:27
It's scaled better in parallel and it's also able to kind of start taking advantage like
everything with GPUs, you know.
493
:
00:45:37
Now GPUs are everywhere and it's coming.
494
:
00:45:40
They are really good.
495
:
00:45:43
and we need to take advantage of them.
496
:
00:45:45
But this is not easy.
497
:
00:45:46
It's not easy at all.
498
:
00:45:48
So there is one post-hoc in the group, Esmaltata.
499
:
00:45:53
has been working on a new implementation of a new spar solver that is targeted towards
this.
500
:
00:46:02
So it's more a modern design, has modern ideas.
501
:
00:46:09
And this is going really well.
502
:
00:46:11
So to have this kind of a
503
:
00:46:12
better in numerical kind of backend supporting this kind of calculations.
504
:
00:46:18
This had been a main struggle for a long time.
505
:
00:46:21
The thing is that sparse matrix solvers are not as...
506
:
00:46:28
I said developed.
507
:
00:46:31
It's easier to work with if you're doing supercomputing than working with dense matrices
is far more easier and it's far more kind of relevant.
508
:
00:46:44
This large sparse matrices is less relevant.
509
:
00:46:50
So therefore there are fewer implementations.
510
:
00:46:53
Those who are often kind of so what are now close.
511
:
00:46:58
not open source, and they don't scale too well in terms of in parallel.
512
:
00:47:09
Because nowadays it's like for the older, the modern machines we have now, it's a lot of
problems.
513
:
00:47:18
It's more about memory, not about CPU.
514
:
00:47:21
It's more important to have the data you want to multiply.
515
:
00:47:24
You have them right there in front of you and then you can do it instead of doing fewer
competitions.
516
:
00:47:32
You so it like a memory management and this kind of everything with memory is much more.
517
:
00:47:40
important now than it was before.
518
:
00:47:42
So then looking at the sparse matrix as
519
:
00:47:48
Instead of a sparse matrix of just elements, you can look at that sparse matrix of dense
blocks.
520
:
00:47:55
And these dense blocks is called a tile.
521
:
00:47:58
So then you can work with these small dense blocks instead.
522
:
00:48:03
Even computing too much, but it's faster to do that than figuring out only what needs to
be computed.
523
:
00:48:11
And that's the key thing.
524
:
00:48:12
And this scales way, way better.
525
:
00:48:15
It can connect to GPUs and connect
526
:
00:48:18
to all these things.
527
:
00:48:20
So there are, in fact, in the group, are works in two directions of this.
528
:
00:48:26
One is to kind of build a new sparse matrix solver to be directly into the inlet code that
is now.
529
:
00:48:35
We also have initiative on writing a completely distributed solver.
530
:
00:48:44
This is a not a post of Lisa.
531
:
00:48:47
is doing that with some colleagues in Zurich to have a complete distributed solver with
necessarily in Python.
532
:
00:48:59
That also have this ability you could distribute the calculations, you have GPUs, it can
take care of this.
533
:
00:49:06
And this is aimed for a different data scale.
534
:
00:49:13
So we have to work on two pilot tracks.
535
:
00:49:16
One is to kind of keep the current code developing.
536
:
00:49:20
And the other one is try to prepare also for the future, maybe also for larger models.
537
:
00:49:28
They need different things.
538
:
00:49:30
So I think that's main pressing issues.
539
:
00:49:35
Janice said, there is a lot of things that was a problem before.
540
:
00:49:40
The problem, yes.
541
:
00:49:43
things that we would like to have slightly better.
542
:
00:49:47
But now I think they are basically solved.
543
:
00:49:51
So there are two main things are developing applications, as Jan said, but also this more
modern computing to get this integrated.
544
:
00:50:03
It's also the major task.
545
:
00:50:06
oh This is the reason you saw it take years.
546
:
00:50:12
There's years of work.
547
:
00:50:14
It's not something you do in a weekend.
548
:
00:50:18
It's a scale of years.
549
:
00:50:22
Yes, that sounds about right.
550
:
00:50:27
uh But definitely looking forward to ah seeing these advancements in the coming months and
years.
551
:
00:50:38
um
552
:
00:50:40
Yannet, since you've been there, you know, at some point you've been a beginner of In-Lay,
you had to start somewhere.
553
:
00:50:48
So I'm curious, uh for listeners who want to get started with In-Lay, what resources or
practices would you recommend as first steps?
554
:
00:51:03
I think this also depends a little bit on why you want to learn it.
555
:
00:51:08
So if you want to use it, you're going to be an applier of Inna.
556
:
00:51:14
On the website, there are links to some tutorials that people have done.
557
:
00:51:19
But also the papers we do from the group always have code available.
558
:
00:51:27
On the website, there are a few open source books.
559
:
00:51:31
There is one book from Virgilio that goes through a lot of common models with code.
560
:
00:51:39
So it's written like oracle, the output and so on.
561
:
00:51:43
The book is in this format.
562
:
00:51:44
So if you want to learn how to code inla and like where to get the posteriors from and so
on, then that kind of book is a very good uh tool.
563
:
00:51:58
Then to learn really what is
564
:
00:52:01
what is Inla, not just to be able to use it.
565
:
00:52:05
I this is a little bit harder.
566
:
00:52:07
ah I think we've tried since we have uh developed the new methodology, we've tried to
write it up um in a way that's easy to understand.
567
:
00:52:22
Yeah, but there is also this uh gentle introduction.
568
:
00:52:31
the one I used, it's also linked on the website, it's called A Gentle Introduction to
INLA.
569
:
00:52:35
ah But yes, this will be on the old methodology.
570
:
00:52:39
But it just gives a good intuition, like how it's different and questions about
convergence and so on, and kind of makes it clear that it's not based on samples.
571
:
00:52:49
So everything with samples that comes with samples is not there.
572
:
00:52:54
But then what is there if there is not samples and you know how it works and so on.
573
:
00:53:00
And something that's really nice um that I think maybe a lot don't know initially is that
you can always draw samples after because you have the full posterior.
574
:
00:53:13
So if you need to calculate whatever for some reason, you can draw samples.
575
:
00:53:19
So you can still have samples.
576
:
00:53:21
It's not like, there now, you know, we have like a built-in way to draw samples from an
in-law object.
577
:
00:53:29
So it's very versatile in that way that you can have the samples, but it's not based on
samples.
578
:
00:53:36
It's based on math, I would say.
579
:
00:53:42
Yeah, yeah.
580
:
00:53:45
That all makes sense.
581
:
00:53:47
Jannet.
582
:
00:53:48
Havard, anything you would add here to basically recommend resources to beginners?
583
:
00:53:59
It is, as Jannet said, there are a few books out there that are also open.
584
:
00:54:06
You can read them just on the web.
585
:
00:54:08
They are pretty good.
586
:
00:54:10
We have one.
587
:
00:54:11
This is more target...
588
:
00:54:12
to what is SPD models, this whole book about only these.
589
:
00:54:15
I'll just introduce, yeah, there are some of these books to the background theory.
590
:
00:54:24
They just as Janna said, this is more.
591
:
00:54:29
This less clear because it's a serious, there is a development and there is a series of
key paper.
592
:
00:54:37
have to read one and then you another one and do another one and do a third one.
593
:
00:54:41
And then you do another one that we do a lot of things and there's another, it's like a
sequence of things.
594
:
00:54:49
So that's less.
595
:
00:54:53
That is, I understand this is less clear.
596
:
00:54:55
And the first thing you do is to write, to read the book.
597
:
00:54:59
about ghost marker random fields.
598
:
00:55:02
So that's a harder thing.
599
:
00:55:05
I see that because it contains a lot.
600
:
00:55:09
There is a lot of things going on.
601
:
00:55:12
For us, I don't think we see it anymore.
602
:
00:55:16
But you realize it when, because we have been working on this a lot.
603
:
00:55:22
But I understand there is a lot of concept, there is a lot of things that.
604
:
00:55:28
You just put together, if you put on top of each other is new things.
605
:
00:55:32
It's, it's a lot of details.
606
:
00:55:36
And many of the details are never written down, you because, um, where should you write
them?
607
:
00:55:43
Or journals would have them.
608
:
00:55:46
So it's like, there are these things.
609
:
00:55:48
It's like, it gives a kind of skeleton and then you have to figure out everything in
between yourself.
610
:
00:55:55
But, it's basically there, but it's.
611
:
00:55:58
It's hard.
612
:
00:55:59
I see it's kind of hard to get a complete picture of what is going on.
613
:
00:56:04
Maybe we have to write another book in the end, you know, with all the details.
614
:
00:56:12
That sounds good, yeah.
615
:
00:56:15
And um maybe another question I have that's very practical, but...
616
:
00:56:21
um
617
:
00:56:25
Is there any difference in the way you set appropriate priors when you're using the Inline
Algorithm for inference compared to when you would use Stan or Pimcitor in your models?
618
:
00:56:40
Or is that pretty much the same?
619
:
00:56:42
It's just the inference engine or the hood is changing.
620
:
00:56:50
Yeah, so you can set your priors whatever you want, but of course the latent ones has to
be Gaussian, but you can set like the parameters of the Gaussian if you want.
621
:
00:57:02
But then also we have default priors.
622
:
00:57:06
So you can run your model without putting any prior inside and a lot of patients I've
spoken to says this is really bad because then I don't know what's going on.
623
:
00:57:17
But for practitioners, a lot of them just kind of want to run a model and not make a
decision really about priors.
624
:
00:57:25
So this brought up a whole new field of research, I would say, within the In-Large Group.
625
:
00:57:31
Because if we need to decide on a default prior, we have to make sure that this works.
626
:
00:57:38
And the definition of works is what?
627
:
00:57:44
But generally, I think in the field I see this a lot, maybe not so much in the Bayesian
statistics community, but more in the applied community, where people would choose a prior
628
:
00:57:56
that's kind of the most used prior in the literature.
629
:
00:57:59
And they just use the other papers as motivation.
630
:
00:58:02
I choose this prior because everybody else has chosen it, right?
631
:
00:58:04
And this has caused that we've ended up with a lot of priors that's been used for like
variances and so on.
632
:
00:58:13
That's really bad.
633
:
00:58:14
but has been accepted because everybody else have used them and a lot of the priors we use
for hyperparameters or nuisance parameters I would say like variance components and so on
634
:
00:58:30
their initial motivation often was to be a conjugate prior
635
:
00:58:36
But then if we are anyway doing something like MCMC where we code and we have a computer,
then the idea of conjugacy is a little strange in that sense, right?
636
:
00:58:48
So kind of the motivation for this gamma, let's say gamma prior for this inverse variance,
the motivation was different when it was proposed, but it's still used because it's been
637
:
00:59:02
cited in 5,000 papers, right?
638
:
00:59:04
So this opened up
639
:
00:59:06
the idea of, okay, but what should we then do?
640
:
00:59:09
What should we put as a default prior that works well?
641
:
00:59:12
And this is how the idea of penalizing complexity priors, or in short, it's called PC
priors, were born.
642
:
00:59:22
So how can we derive and propose priors for all these hyperparameters that are not in the
latent part of the model that we know will do a good job?
643
:
00:59:34
in the sense that they will not overfit.
644
:
00:59:37
So for instance, if you include a random effect in your model, let's say just an IID
effect, you use the variance or the estimated variance to see how big is the random
645
:
00:59:46
effect, right?
646
:
00:59:48
But if you have a prior that's never going to be able to estimate a variance that's small,
then you could get a larger variance just even if it's not true.
647
:
00:59:59
But if your prior does not have sufficient mass to take it small,
648
:
01:00:04
then you're going to get a big value for the variance thinking this random effect is
important even when it's not.
649
:
01:00:11
So the PC priors are developed for each type of these parameters.
650
:
01:00:16
It's not like uniform prior on all of it.
651
:
01:00:19
Like you have to derive it case by case.
652
:
01:00:23
But essentially what they do is they will shrink the parameter to the value where it means
the model is going to be simpler.
653
:
01:00:35
So, and this value depends on what the parameter is.
654
:
01:00:37
So, for instance, for the Weibull model, if you have one of the parameters equal to one,
then it's actually the exponential model.
655
:
01:00:46
But then, if you think about it, you will almost never estimate it equal to one for any
data, even if you simulate from exponential data, will not estimate it to be one.
656
:
01:00:55
um So, the penalizing complexity prior will put a lot of mass, then, for instance, at one.
657
:
01:01:03
But...
658
:
01:01:04
it's derived based on the distance between the models.
659
:
01:01:08
So we put a prior on the distance between the models, not directly on the parameter,
because the distance we can understand.
660
:
01:01:16
So the default priors in InLav for the hyperparameters, most of them are PC priors.
661
:
01:01:22
So even if you don't set the prior and use the default, now the defaults are mostly good.
662
:
01:01:28
There is still some development at the moment going on, especially for
663
:
01:01:34
of correlation matrices, um but in general most of them have good default priors now.
664
:
01:01:44
Okay, this is really cool.
665
:
01:01:46
Yeah, I didn't know you were using PC primes for ah that by default, but yeah, like, can
definitely vouch for them.
666
:
01:01:55
ah Also, like, not only because they have these great mathematical properties that you're
talking about, but also they are much easier to set uh because they have a more intuitive
667
:
01:02:11
explanation.
668
:
01:02:12
ah
669
:
01:02:13
So I know they are derived differently for um different parameters, but I use them all the
time now where I'm, as you were saying, setting um standard deviations on varying effects.
670
:
01:02:30
So basically the shrinkage factor of the random effects.
671
:
01:02:35
uh And yeah, so that's an exponential.
672
:
01:02:39
And then there is a formula that at least I hard-coded.
673
:
01:02:43
in Python that's coming from the paper.
674
:
01:02:46
guess that's what you did in Inla.
675
:
01:02:48
um And also I use that for the amplitude of Gaussian processes, which are basically a
standard deviation of Gaussian processes.
676
:
01:03:00
So yeah, that's the same.
677
:
01:03:03
And I really love that because you can think about them on the data scale.
678
:
01:03:09
So it'd be like...
679
:
01:03:11
I think there is a 60 % chance that the amplitude of the GP is higher than X.
680
:
01:03:20
X being uh a number that you can interpret on the data scale.
681
:
01:03:27
And that makes it much, much, much easier uh to think about for me.
682
:
01:03:33
So yeah, this is awesome that you guys do that.
683
:
01:03:36
um Actually, do you have any...
684
:
01:03:38
um
685
:
01:03:40
Any readings we can link to in the show notes because the only thing I know about are some
papers who talk about penalized complexity priors, but they are not super digestible for
686
:
01:03:54
people.
687
:
01:03:54
So yeah, I don't know if you know about code demonstrations that are a bit clearer and
also which penalized complexity priors are appropriate for different kinds of parameters.
688
:
01:04:11
Yeah, we have some works like for specific things so I can link them in the show notes for
sure.
689
:
01:04:18
Yeah, yeah, yeah, for sure.
690
:
01:04:19
Because I think, I mean, that's not always still quite new research, but yeah, it's still
not, it still hasn't distilled yet a lot in practice.
691
:
01:04:31
But I think it should be faster because these are extremely helpful and practical for
people.
692
:
01:04:40
So yeah, that's awesome that you guys do that.
693
:
01:04:42
I really like that.
694
:
01:04:43
And I think it's also something we want to try doing ah on the Pimcee side to be able to
make it easier for people to just use PC Prize.
695
:
01:04:59
Awesome.
696
:
01:05:00
Damn.
697
:
01:05:01
Have you added anything to add on these priors in general or PC priors in particular?
698
:
01:05:09
No, but it is exactly what Janette saying.
699
:
01:05:13
It's like at some point you realize you need some kind of structured framework to think
about priors and to derive priors.
700
:
01:05:22
And this is not guessing a prior.
701
:
01:05:25
This is like putting up
702
:
01:05:28
All
703
:
01:05:32
like principles, how to think about them, and then you can derive them.
704
:
01:05:36
It's just, that has just become math.
705
:
01:05:38
And you know, they work in the same way all the time because it is the same thing.
706
:
01:05:43
It's about distance between distribution and whether this is parameterized by a parameter
in standard deviation or log precision.
707
:
01:05:53
It's the same prior because in distance scale is the same.
708
:
01:05:56
It's just materialized differently for different parameters.
709
:
01:06:02
But the scary part is that at least I realized that I don't understand parameters.
710
:
01:06:10
If I see a parameter and see a prior, how does this effectively impact the distributions?
711
:
01:06:20
I can try, but I'm not very good at it.
712
:
01:06:23
So I don't trust myself.
713
:
01:06:24
I trust the distance-based thing because it's doing the same thing all the time.
714
:
01:06:29
And this was also derived.
715
:
01:06:32
How many years?
716
:
01:06:33
15 years?
717
:
01:06:33
No, 10 years ago?
718
:
01:06:35
More than 10 years ago?
719
:
01:06:38
Yes.
720
:
01:06:40
That was in Trondheim.
721
:
01:06:41
A key person there was also Daniel Simpson.
722
:
01:06:45
He was part of the group.
723
:
01:06:49
A great guy.
724
:
01:06:51
Yeah.
725
:
01:06:52
So this really solved it, because before that point, I think our kind of stance was: OK,
the prior is not my problem.
726
:
01:07:02
If you want this prior, it's your problem.
727
:
01:07:05
But at some point it actually becomes your problem, because you realize that a lot of
the problems come from bad priors.
728
:
01:07:15
And then you cannot think about these kinds of standard priors that are supposed to be
asymptotically best.
729
:
01:07:29
They often define it that way.
730
:
01:07:30
So you have to do the math and you have to let something go to infinity, right?
731
:
01:07:34
To make sense of this thing.
732
:
01:07:35
And then you want a prior that doesn't do anything.
733
:
01:07:38
You know, you have all these things that are usually the standard, but we want the prior to
do something.
734
:
01:07:45
We just want to prevent it from doing bad things.
735
:
01:07:49
So these priors are derived not to be the best.
736
:
01:07:54
They are
737
:
01:07:56
priors you can use when you don't know what else to do, and they are never bad.
738
:
01:08:02
So if you check how they perform against other things, they are always in the top three,
you know; they are never bad.
739
:
01:08:13
They just do the same thing because they are derived from this distance-based way of
thinking.
740
:
01:08:19
And it's the same no matter how you do the parameterization, whether you are looking at,
say,
741
:
01:08:27
the overdispersion in a negative binomial.
742
:
01:08:30
Or you're looking for the variance in something or other.
743
:
01:08:34
You don't need to understand things.
744
:
01:08:36
You can just understand the concept of distance.
745
:
01:08:39
And as you said, Alex, you can connect it to a property of the data.
746
:
01:08:44
That's what you do.
747
:
01:08:45
You have to do some kind of calibration.
748
:
01:08:47
You have to set some kind of scale.
749
:
01:08:50
And this is what you do.
750
:
01:08:51
The rest, the math does for us.
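To make the "same prior, different materialization" point concrete, here is a small numerical check; it is my own sketch with hypothetical calibration numbers, not code from the conversation. The exponential PC prior written on the standard deviation sigma, and the density it induces on the precision tau = 1 / sigma^2 (a type-2 Gumbel), assign the same probability to equivalent events.

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical calibration: P(sigma > 1) = 0.1 on the data scale.
u, alpha = 1.0, 0.1
lam = -np.log(alpha) / u

# PC prior materialized on the standard-deviation scale: sigma ~ Exponential(lam).
def pdf_sigma(s):
    return lam * np.exp(-lam * s)

# The same prior materialized on the precision scale, tau = 1 / sigma**2,
# via the usual change of variables (a type-2 Gumbel density).
def pdf_tau(t):
    return 0.5 * lam * t ** (-1.5) * np.exp(-lam / np.sqrt(t))

# "sigma > u" is the same event as "tau < 1 / u**2", and both forms
# of the prior give it the same probability (~0.1 here).
p_sigma, _ = quad(pdf_sigma, u, np.inf)
p_tau, _ = quad(pdf_tau, 0.0, 1.0 / u**2)
print(p_sigma, p_tau)
```

The calibration step is the only user input: pick a data-scale threshold and a tail probability, and the math handles whichever parameterization the software happens to use.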
751
:
01:08:54
No, it's really nice.
752
:
01:08:59
But some.
753
:
01:09:03
There are still parameters that have bad default priors because there is no general good
one and we don't want to change whatever was there.
754
:
01:09:13
But usually, most parameters have some kind of PC prior you can use.
755
:
01:09:22
Yes.
756
:
01:09:24
Which makes life easier.
757
:
01:09:25
Yes.
758
:
01:09:28
Yeah.
759
:
01:09:29
I mean, I completely second everything you just said here.
760
:
01:09:33
Um, Havard, and I think it's also very telling that somebody as mathematically
inclined and experienced as you are still has difficulties thinking about how the
761
:
01:09:47
different parameters interact in complex models.
762
:
01:09:50
And that means everybody has that problem.
763
:
01:09:54
I definitely have it and it's like, yeah, sure.
764
:
01:09:58
I can always put a prior on
765
:
01:10:01
that standard deviation or that covariance matrix, but once you get deep enough
into the layers of a model, you don't really know how that impacts the model.
766
:
01:10:13
And the main way you can figure that out is by painstakingly going through prior
predictive checks.
767
:
01:10:20
um So that's still possible, but that's a bit inefficient.
768
:
01:10:25
Sometimes there is no other way, but I'm sure there are a lot of
769
:
01:10:30
faster and better ways most of the time, and PC priors give you the Pareto effect on that.
So yeah, folks, let's all try to start using them instead of
770
:
01:10:46
blindly choosing our priors. I think it's a good way to start closing out the show,
actually, before I ask you the last two questions.
771
:
01:10:57
Is there anything ah you wanted to talk about or mention that I didn't get to ask you?
772
:
01:11:09
No, I would just say that honestly, I think if you have a latent Gaussian model, there's
nothing better you can do to infer your model.
773
:
01:11:20
You basically achieve the accuracy of MCMC, but really in almost real time.
774
:
01:11:29
So really, if you have a latent Gaussian model, just try it, and if you need any help, there
is our INLA discussion group.
775
:
01:11:39
I also linked it in the show notes, and Havard is very good at replying almost instantly, and
there are also others who reply. So if you try INLA and there's anything that comes up
776
:
01:11:51
that you're unsure about, just send an email to that group.
777
:
01:11:57
Yeah.
778
:
01:11:59
Nice.
779
:
01:11:59
Yeah.
780
:
01:12:01
Great.
781
:
01:12:02
uh Fantastic.
782
:
01:12:03
Thank you so much, folks, for taking the time.
783
:
01:12:05
That's really wonderful.
784
:
01:12:08
Before you get to go to bed, because it's very late for you.
785
:
01:12:12
So again, thank you so much for that.
786
:
01:12:15
ah I will ask you the last two questions I ask every guest at the end of the show.
787
:
01:12:21
So, first one is: if you had unlimited time and resources, which problem
788
:
01:12:26
would you try to solve?
789
:
01:12:28
So, Havard, you wanna start?
790
:
01:12:37
I think I would try to solve the problems I'm working on now.
791
:
01:12:40
It's like we are in a situation here where we have quite good resources, and we are able to
try to solve the kinds of problems that we have.
792
:
01:12:52
So I think I would just stick to those, you know.
793
:
01:12:55
I'm good.
794
:
01:13:00
Yeah, that's great.
795
:
01:13:02
I mean, I'm sure people can hear you're passionate about what you're doing, so I'm not that
surprised.
796
:
01:13:07
Yanet?
797
:
01:13:11
Yeah, I think we have a lot of interesting problems already, and KAUST makes it very easy to
work on big problems.
798
:
01:13:21
We have a very uh nice academic environment so I don't know if there is any big problem I
can think of to solve.
799
:
01:13:34
Awesome.
800
:
01:13:35
Yeah, that's cool, folks.
801
:
01:13:37
And well, yeah, since you have the floor, let's continue with you.
802
:
01:13:42
um If you could have dinner with any great scientific mind, dead, alive or fictional, who
would it be?
803
:
01:13:52
Okay, this is quite hard because I have had dinner with a lot.
804
:
01:13:56
So, thinking about someone I've not had dinner with and who's also a South African, I
would love to have dinner with either Trevor Hastie, who's still alive and is a South
805
:
01:14:09
African, or Danie Krige, who was the inventor of kriging and was also a South
African.
806
:
01:14:18
So one of these two would work for me.
807
:
01:14:23
That sounds good.
808
:
01:14:24
And Havard, what about you?
809
:
01:14:29
If I could choose, it would be nice to meet Isaac Newton.
810
:
01:14:34
But he's been away for a long time.
811
:
01:14:36
I think he must have been quite special.
812
:
01:14:40
I'm not sure if it would be a very pleasant experience.
813
:
01:14:44
But still, it would be nice to meet someone like him.
814
:
01:14:50
Just meeting him, yeah. I don't think it would be pleasant.
815
:
01:14:54
It would be an experience.
816
:
01:14:56
Yeah, for sure.
817
:
01:15:02
That sounds very interesting.
818
:
01:15:04
At least you could talk English to him.
819
:
01:15:08
Maybe there would be some vocabulary difficulties, but at least you would have English in
common.
820
:
01:15:14
uh I um
821
:
01:15:19
Well, I think we can call it a show.
822
:
01:15:22
I'm super happy because I had a lot of questions for you, but we could cover everything.
823
:
01:15:25
So thank you so much for that.
824
:
01:15:30
um Thank you so much also to Hans Monchow for putting me in contact with you guys.
825
:
01:15:37
He was like, you should talk to these groups.
826
:
01:15:40
They are doing amazing work.
827
:
01:15:42
So thank you so much, Hans, for the recommendation and also for...
828
:
01:15:46
listening to the show; you obviously have good taste.
829
:
01:15:49
And well, on that note, thanks again, Yanet and Havard, for taking the time and being
on this show. And as usual, we'll put a lot of links, folks, in the show notes, so if you
830
:
01:16:07
are interested and want to dig deeper, make sure to check those out.
831
:
01:16:12
Yanet, Havard.
832
:
01:16:14
Thank you again for being on the show.
833
:
01:16:16
Thank you.
834
:
01:16:17
Thank you.
835
:
01:16:18
It's been very nice.
836
:
01:16:19
Thank you.
837
:
01:16:25
This has been another episode of Learning Bayesian Statistics.
838
:
01:16:28
Be sure to rate, review, and follow the show on your favorite podcatcher, and visit
learnbayesstats.com for more resources about today's topics, as well as access to more
839
:
01:16:39
episodes to help you reach true Bayesian state of mind.
840
:
01:16:43
That's learnbayesstats.com.
841
:
01:16:45
Our theme music is "Good Bayesian" by Baba Brinkman, featuring MC Lars and Mega Ran.
842
:
01:16:50
Check out his awesome work at bababrinkman.com.
843
:
01:16:53
I'm your host.
844
:
01:16:54
Alex Andorra.
845
:
01:16:55
You can follow me on Twitter at Alex underscore Andorra, like the country.
846
:
01:16:59
You can support the show and unlock exclusive benefits by visiting Patreon.com slash
LearnBayesStats.
847
:
01:17:06
Thank you so much for listening and for your support.
848
:
01:17:09
You're truly a good Bayesian.
849
:
01:17:11
Change your predictions after taking information in and if you're thinking I'll be less
than amazing.
850
:
01:17:18
Let's adjust those expectations.
851
:
01:17:21
Let me show you how to be.
852
:
01:17:31
Let's get them on a solid foundation