Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!
Visit our Patreon page to unlock exclusive Bayesian swag ;)
Takeaways:
Chapters:
00:00 Introduction to Bayesian Modeling in Insurance
13:00 Time Series Models and Their Applications
30:51 Bayesian Model Averaging Explained
56:20 Impact of External Factors on Forecasting
01:25:03 Future of Bayesian Modeling and AI
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.
Links from the show:
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.
In this episode, I am thrilled to host Nate Haines, the head of data science research at Ledger Investing and a PhD from Ohio State University. Nate's expertise in generative Bayesian modeling helps tackle the challenges in insurance-linked securities, especially with issues like measurement errors and small data sets. He delves into his use of state-space and traditional time series models to effectively predict loss ratios and discusses the importance of informed priors in these models.

Nate also introduces the BayesBlend package, designed to enhance predictive performance by integrating diverse model predictions through model stacking. He also explains how they assess model performance using both traditional metrics like RMSE and innovative methods like simulation-based calibration, one of my favorites, to ensure accuracy and robustness in their forecasts.

So join us as Nate unpacks the complexities of Bayesian modeling in the insurance sector, revealing how advanced statistical techniques can lead to more informed decision-making.

This is Learning Bayesian Statistics, episode 115, recorded June 25, 2024.
Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. For any info about the show, learnbayesstats.com is Laplace to be. Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on Patreon, everything is in there. That's learnbayesstats.com. If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all.

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com.
Nate Haines, welcome to Learning Bayesian Statistics.

Thanks for having me, very excited to be here.

Same. Very, very excited to have you here. Also because a lot of patrons of the show have requested you to be here. One of the most convincing was Stefan Lorentz. I'm pronouncing that the German way because I think he's from somewhere around there. Maybe he's Austrian or Swiss, and then he hates me right now. But yeah, Stefan, thank you so much for recommending Nate on the show. And I hope you'll appreciate the episode. If you don't, this is entirely my fault, and not Nate's at all.
Yeah, well, I appreciate the shoutout.

Yeah, no, he was really... I'll tell you what he told me. It was: "For a while now, I've been thinking about an interesting hook to recommend Nathaniel Haines, someone who is not, like so many of my previous recommendations, currently in academia."

Yeah.

"And today it seems to have presented itself. He just released a Python library for Bayesian model averaging, a very practical topic that hasn't been discussed yet in any episode." You know, he was really, really happy about your work.

Very cool. Yeah, that's all I wanted to hear.
Yeah, and we're definitely going to talk about that today, model averaging and a lot of cool stuff on the deck. But first, can you tell us basically what you're doing nowadays, if you're not in academia, especially since you live in Columbus, which I think is mostly known for Ohio State University?

Right, yeah. We think of ourselves as a flyover state. Well, others think of us as that. We like to think that we're much cooler and hipper and all that.
Yeah, yeah. So for the last few years, I've been working as a data scientist remotely, and I've been at two different startups during my time the last few years after graduating from my PhD. My PhD focused on clinical mathematical psychology, which is really where I did a lot of Bayesian modeling, where I learned a lot of Bayesian modeling, which led me to where I am today at my current company. I'm with Ledger Investing. And we are sort of in between what I would call the insurance industry and finance, in a way. We are not an insurance company, but that's the data that we deal with. And so a lot of our modeling is focused on that.

And at Ledger, I'm the manager of data science research. So a lot of my work focuses on building new models, productionizing those models, and finding different ways to incorporate models into our production workflow. And yeah, I'll be happy to dive into more detail about that, or how I got here, because I know it's something I've talked to a lot of people about too, especially on the academic side. The transition to industry itself can be something that's a little bit opaque, but then also, Bayes in industry, it's like, I didn't know there were people doing a lot of that. And so yeah, excited to talk in more detail about all of that.

For sure.
Actually, how did you end up working on these topics? Because Bayes is already a niche, and then specializing in something Bayesian is even more niche. So I'm really curious about how you ended up doing that.
79
:So I actually got exposed to BASE pretty early on.
80
:I guess you could say I have a weird background as a data scientist because I did my
undergrad degree in psychology and I just did a BA, so I didn't really take...
81
:much math in undergrad.
82
:But I got involved in a mathematical psychology lab, research lab, later on in my
undergraduate degree.
83
:And this was run by Tricia Van Zandt at Ohio State.
84
:So actually, I grew up in Columbus and I've been here ever since.
85
:But I started to work with her and some grad students in her lab.
86
:And they were all, yeah, the best way to put it, were hardcore Bayesians.
87
:So they did a lot of mathematical modeling of sort of more simple decision making tasks.
88
:And by simple, guess I just mean like the, you know, like response time types of tasks.
89
:And so they did a lot of reaction time modeling, which has a pretty deep history in
psychology.
90
:And so they were all Bayesians.
91
:That was the first time I saw the word.
92
:I remember seeing that word on like a grad students poster one time.
93
:Like, what is that?
94
:And so I got exposed to it a bit in undergrad.
And then, as I was going through undergrad, I knew I wanted to go to grad school. I wanted to do a clinical psychology program. I was really interested in cognitive mechanisms, things involved with mental health. And I got really lucky because there was an incoming faculty member at Ohio State, Woo-Young Ahn, that's his name, and he was the one who brought a lot of that to Ohio State. He didn't end up being there for too long; he's now at Seoul National University. But I worked with him for a year as a lab manager. And in that first year, he really wanted to build some open source software to allow other people to do decision-making modeling with psychological data. And the way to do that was to use hierarchical Bayes. And so I kind of got exposed to all of that through my work with Young. Yeah, we did a lot of that work in Stan. And so that was kind of the first time I really worked on it myself. But I'd kind of known about it and knew about some of the benefits that Bayes can offer over other philosophies of statistics. And that started pretty early on in grad school.
So I think I'm probably a weird case because I didn't really have traditional stats training before I got the Bayes training. And so a lot of my perspective is very much, I think a lot in terms of generative models, and I didn't have to unlearn a lot of frequentist stuff, because my understanding by the time I really started diving into Bayes was pretty rudimentary on the frequentist side. And so, yeah, that kind of naturally... I got really involved in some of that open source work during graduate school on the package we released called hBayesDM, which is a mouthful, but really good for search, because nothing else pops up if you search hBayesDM. So that was kind of my first foray into open source Bayesian modeling types of software.

Eventually, I decided that, you know, I really like to do this methods stuff. I was more focused on the modeling side of my work than I was on the domain per se. And I had a really interesting kind of track into industry. I wasn't actually originally pursuing that. That wasn't my intention, but I just got a cold email one day, the summer I graduated, from a co-founder at my previous company, which is called AVO Health. And they were looking for someone who did something very particular, which was hierarchical Bayesian modeling, and who was familiar with psychological data. And so I kind of fit the bill for that. And I decided that it'd be worth the shot to try that out. And I've been in industry ever since.

So, yeah, I think really what got me into it originally was just kind of being in a context surrounded by people doing it, which I don't think most people get that experience, because Bayes is still rather niche, like you said. But at least in the circles that I was in, in undergrad and grad school and things like that, it was just kind of the way to do things. And so I think that's colored my perspective of it quite a bit and definitely played a big role in why I ended up at Ledger today.
Yeah, super cool. Yeah, and I definitely can relate to the background, in the sense that I too was introduced to stats mainly through the Bayesian framework. So thankfully, I mean, that was hard, but that was not as hard as having to forget everything again.

Right, right.

It was great. I remember being very afraid when I opened a classic statistics book and was like, my God, how many tests are there? It's just terrible.

No, exactly. And it's hard to see how things connect together, yeah.

No, I was not liking stats at all at that point. And then thankfully, I did electoral forecasting, and you kind of have to do Bayes in these realms. You know, that was really cool. One of the best things that ever happened to me.

Exactly. So you're forced into it from the start. It doesn't give you much choice. And then you look back and you're happy that that ended up happening. Exactly. Yeah.
And actually, you do quite a lot of time series models, if I understood correctly. So yeah, could you talk a bit about that? I'm always very interested in time series and forecasting models, and how useful they are in your work.
Yeah, yeah. So I think, maybe first to start, I can give a bit of context on the core problem we're trying to solve at Ledger, and that'll help frame what we do with the time series models. So basically, what we provide is an alternative source of capital for insurance companies. It's like, you know, if I wanted to start an insurance company, I'd have to have a ton of money in the bank so that, if something goes wrong and I write a bunch of policies for private auto, for example, for car insurance, I have to be able to make people whole when, you know, an accident happens. And so when insurers are trying to fund different books of business, they often need to raise lots of capital for that. And traditionally, one of the methods that they have used to accomplish this is to approach reinsurers, which I didn't know anything about before I joined Ledger. I'm kind of an insurance newbie at Ledger. Well, now it's been a couple years, so I can't say that anymore. But basically, you go to someone with even more money to provide the capital and allow you to write your business.

And so we basically work with insurers or other similar entities, and then investors, and allow the investors access to this insurance risk as an asset class. And then from the perspective of the insurance side, they're getting the capital that they need to fund their programs. And so the insurance companies like it because it's the source of capital they need to do business, and the investors like it because they get to invest in how these portfolios of insurance programs are performing, as opposed to, say, investing in an insurance company's stock or something like that. And so it's a little bit more uncorrelated with the market, in terms of the other types of assets that investors might have access to. And so that's kind of the core problem, the context that our data science team is baked within.

And what we're actually modeling, the thing we're trying to solve, is, you know, say an insurance company approaches us and they have a commercial auto or a private auto program or a workers' compensation program. A lot of times they'll have been writing that kind of program, they've been in that business for, you know, five, ten years or something. And so they have historic data. And the way we look at the data is you have different accident years. So if we're looking at it today, in year 2024, maybe they have 10 years of business that they've been writing. And so we look back all the way to, like, 2014, 2015, 2016, and we see how much they have lost, like how much in claims they have paid out versus premium they have taken in.
And so this quantity, the loss ratio, is really important. And in a lot of areas of business, it's around 60%. And this is before you're paying salaries and all of that, just the pure insurance side; around 60% might be pretty typical or pretty reasonable for a good-ish program. So it's an overgeneralization, but just to keep some numbers in mind.
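[Editor's note: to make the arithmetic concrete, the loss ratio is simply losses paid out divided by premium taken in. So a program that collects $10M of premium and pays out $6M in claims has a loss ratio of 6/10 = 60%. These numbers are illustrative, not from the episode.]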
And the interesting thing about this, though, is that, you know, we look back at 2014 and we have 10 years of history. So we kind of know what the loss is for 2014 for a program that comes to us today. But what about, say, 2023? There's only been a year since then. And the way that insurance often works, if you've ever had to file a claim for homeowners or car insurance, something like this, you're probably familiar: it can take quite a while for you to get paid out. Sometimes there are lawsuits; sometimes people don't file a claim until years later. There can be a lot of different reasons that the information we have today about losses in any given year isn't complete, or, the way that we think about it, the loss ratio isn't developed.
And so the data that we have kind of takes the shape of this triangle, and we call them loss triangles. You can think of a matrix where the Y axis would be the accident years, so the different years that accidents are occurring, and the X axis would be how much time has passed since we're looking at that accident year. We call that the development period or something similar. And so for 2014, we have 10 cells, 10 years of data that we can look back on; for 2015, we have nine, and so on and so forth. And so it kind of forms this triangle.
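[Editor's note: for readers who want to see the data structure Nate describes, here is a minimal Python sketch of a toy loss triangle; the numbers are made up for illustration.]

```python
import numpy as np
import pandas as pd

accident_years = range(2015, 2020)  # rows: the year accidents occurred
dev_periods = range(1, 6)           # columns: years of development since then

# Cumulative loss ratios (made-up numbers). NaN marks cells that haven't
# been observed yet, which is what gives the data its triangle shape.
triangle = pd.DataFrame(
    [[0.30, 0.45, 0.55, 0.58, 0.60],
     [0.28, 0.43, 0.52, 0.56, np.nan],
     [0.33, 0.48, 0.57, np.nan, np.nan],
     [0.29, 0.44, np.nan, np.nan, np.nan],
     [0.31, np.nan, np.nan, np.nan, np.nan]],
    index=pd.Index(accident_years, name="accident_year"),
    columns=pd.Index(dev_periods, name="development_period"),
)
print(triangle)  # older accident years are more fully developed than recent ones
```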
And so basically, for us to price these deals, what we end up needing to do is two things, and there are kind of two basic modeling steps involved. The first is to find out, you know, where do we think the loss ratio is going to end up for all of these accident years, like if we were looking back, you know, a hundred years from now. And so we want to know what that ultimate state of the loss ratio is. And so that's the first part where some time series models come into play. We have this kind of weirdly shaped data and we want to extrapolate out from the historical data. You can think of the year that we're looking at as being static, but the information that we have on it is dynamic, and we can learn more about it as time goes on. And so our first task is to predict that ultimate state. And that just gives us a more accurate representation of what we think the history will look like. And so that's our first step.

And then the second step is where we use much more traditional time series models. The second step is to say, okay, given that history of the ultimate state for each previous year, what are the next two, three years going to look like? And that's where we have more traditional forecasting models. But because we have this multi-stage process, there's uncertainty from one model's output that we need to account for in that second stage. And so we do some measurement error modeling and things like that. And that, at the end of the day, is really why Bayes ends up being such a useful tool for this problem, just because there are lots of sources of uncertainty.
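[Editor's note: as an illustration of the measurement-error idea, here is a minimal PyMC sketch, not Ledger's actual model. The stage-one posterior means and standard deviations of each accident year's ultimate loss ratio (hypothetical numbers below) are treated as noisy observations of a latent series in the stage-two model.]

```python
import numpy as np
import pymc as pm

# Hypothetical stage-one output: posterior mean and sd of the ultimate
# loss ratio for each historical accident year (recent years are noisier).
ult_mean = np.array([0.62, 0.58, 0.65, 0.60, 0.70])
ult_sd = np.array([0.01, 0.02, 0.03, 0.06, 0.12])

with pm.Model() as stage_two:
    drift_sd = pm.HalfNormal("drift_sd", 0.05)
    # Latent "true" loss-ratio series, here a simple Gaussian random walk.
    latent = pm.GaussianRandomWalk(
        "latent", sigma=drift_sd,
        init_dist=pm.Normal.dist(0.6, 0.1), shape=len(ult_mean),
    )
    # Measurement-error likelihood: stage-one estimates are noisy readings
    # of the latent series, with known (stage-one) uncertainty.
    pm.Normal("obs", mu=latent, sigma=ult_sd, observed=ult_mean)
    idata = pm.sample()
```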
There's a rich history of actuarial science where actuaries have developed models to solve similar problems to this. And so there are theory-informed models and historic data that we can use. And so we get to use really everything in the Bayes toolbox: we get to use priors, we get to use very theory-informed generative models, and then we also get to do some fun things like measurement error modeling and things of that nature, kind of between the various stages of the modeling workflow that we follow. I know this is a bit of a long explanation, but I think the context is helpful to understand why we approach it that way and why we think Bayes is a useful way to do so.
Yeah. Thanks a lot for that context. I think it's very useful, because then I want to ask you, you know, which kind of time series models do you mostly use for these use cases, and what are some of the most significant challenges you face when dealing with that kind of time series data?
Yeah, no, it's a good question. So I'd say, for the time series models, we do a lot of state-space modeling. We do a lot of research on exploring different forms of models, but for the stuff that we end up using in production, in that first stage where we do our development process, those models are more similar to the classic actuarial science models. So they technically are time series models, but they're kind of these non-parametric models where we're just estimating, you know, say your loss is 20% during the first development period: can we estimate some parameters such that, if you multiply that by some factor, it gives us the next period? And so there are these link ratio style models that we use in that context. They're a little less traditional, but more in line with what actuaries have historically done.
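[Editor's note: the link ratio idea Nate describes is the core of the classical chain-ladder method. Here is a minimal sketch, with made-up numbers, of estimating age-to-age factors from a toy triangle and developing the latest accident year to ultimate; this is the textbook deterministic version, not Ledger's Bayesian implementation.]

```python
import numpy as np

# Cumulative losses by accident year (rows) and development period (columns);
# np.nan marks future, not-yet-observed cells.
tri = np.array([
    [100.0, 150.0, 165.0, 170.0],
    [110.0, 160.0, 178.0, np.nan],
    [105.0, 158.0, np.nan, np.nan],
    [120.0, np.nan, np.nan, np.nan],
])

# Volume-weighted link ratios (age-to-age factors): for each adjacent pair
# of development periods, use the accident years observed in both.
factors = []
for j in range(tri.shape[1] - 1):
    both = ~np.isnan(tri[:, j]) & ~np.isnan(tri[:, j + 1])
    factors.append(tri[both, j + 1].sum() / tri[both, j].sum())

# Develop the newest accident year (one period observed) to ultimate by
# chaining the remaining factors together.
ultimate = tri[3, 0] * np.prod(factors)
print(np.round(factors, 3), round(ultimate, 1))
```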
And then for the forecasting piece, that's where we use more modern, or not classical, but more what you would imagine when you think of time series models today: things like autoregressive styles of models. We do, like I said, state-space models, where we assume that these loss ratios are really this latent, drifting parameter over time. And then we have the latent dynamics paired with some observational model of how those losses are distributed. And a lot of times in the investment or finance world, people talk about whether they think some sort of asset is mean-reverting, or whether it shows some sort of momentum in the underlying trends. And so we have different models that capture some of those different assumptions.
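[Editor's note: to make the momentum versus mean reversion distinction concrete, here is a minimal PyMC sketch of the two kinds of dynamics, written for simplicity as regressions on the previous observed value; it illustrates the general idea rather than Ledger's production models.]

```python
import numpy as np
import pymc as pm

y = np.array([0.55, 0.58, 0.63, 0.61, 0.67, 0.72])  # toy loss-ratio history

# Mean-reverting dynamics: an AR(1) that pulls the series back toward mu.
with pm.Model() as mean_reverting:
    mu = pm.Normal("mu", 0.6, 0.1)
    phi = pm.Beta("phi", 2, 2)            # 0 < phi < 1 implies reversion
    sigma = pm.HalfNormal("sigma", 0.05)
    pm.Normal("obs", mu=mu + phi * (y[:-1] - mu), sigma=sigma, observed=y[1:])

# Momentum-style dynamics: a random walk with a persistent drift term.
with pm.Model() as momentum:
    drift = pm.Normal("drift", 0.0, 0.05)
    sigma = pm.HalfNormal("sigma", 0.05)
    pm.Normal("obs", mu=y[:-1] + drift, sigma=sigma, observed=y[1:])
```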
It's actually pretty interesting: people all across the business tend to be pretty interested in whether a model has mean reversion in it or momentum in it. And that becomes a question that a lot of investors and insurance companies alike are interested in, because they might have disagreements about whether or not that component should be in the model. But so, I'd say those types of time series models are what we use most regularly in production, like your traditional state-space models.

In terms of the challenges we face, I think the big challenge is that, based on the context that I was providing, you know, we might have 10 years of history on a program, and that would be a good outcome. And so, if our time series is 10 previous data points, where some of the more recent ones are highly uncertain because they're actually predictions from a previous model, I think you might start to imagine where the issues can arise there. And so I would say that's probably our biggest challenge: the data that we work with from a given program. The numbers are big, because we're talking about investment amounts of dollars; a program might write 10 to 100 million dollars of premium, and so the loss values are pretty high themselves. So there's a lot of information there, but the history that we have to build a time series model on is pretty short.
In a lot of classical time series approaches, there's quite a bit more data that people are working with. You'll hear about things with seasonality and other types of things where you're decomposing a time series. And we don't really have the ability to do any of those classical modeling approaches, mostly just because we don't have the history for it. And so one of the ways that we approach that problem, to help solve it at least to some extent, is that we do have information on many, many different insurance companies and their losses historically. Even if the history may not be very long, we might have at maximum 30, 40 years of history, 50 years of history sometimes, so basically 50 individual data points in a time series model. Typically we have much less. But one of the things that happens in the insurance industry is that all of these companies need to publicly release certain information each year. And so we're able to take that information and use it to help us obtain informed, like data-informed, priors. So that when a smaller program comes our way and we are using our time series models on it, we have priors that have been pretty fine-tuned to the problem, priors that are fine-tuned to that particular line of business, whether it's commercial auto or workers' compensation or something like that. I'd say that's our biggest challenge, that small-data kind of problem, and then Bayes with the informed priors is a way that we're able to tackle it in a more principled way.
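[Editor's note: one common way to operationalize what Nate describes, offered as a hedged sketch rather than Ledger's actual pipeline, is to fit a hierarchical model to the public industry data and reuse the estimated population-level distribution as the prior for a new, small program.]

```python
import numpy as np
import pymc as pm

# Hypothetical industry data: average loss ratios for many companies in one
# line of business (e.g., commercial auto).
rng = np.random.default_rng(1)
industry_lr = rng.normal(0.62, 0.08, size=200)

with pm.Model() as industry_model:
    mu = pm.Normal("mu", 0.6, 0.2)
    tau = pm.HalfNormal("tau", 0.2)
    pm.Normal("lr", mu=mu, sigma=tau, observed=industry_lr)
    industry_idata = pm.sample()

mu_hat = float(industry_idata.posterior["mu"].mean())
tau_hat = float(industry_idata.posterior["tau"].mean())

# New small program: its loss-ratio level gets the industry-informed prior.
new_program_lr = np.array([0.55, 0.70, 0.64])
with pm.Model() as program_model:
    level = pm.Normal("level", mu_hat, tau_hat)  # data-informed prior
    sigma = pm.HalfNormal("sigma", 0.1)
    pm.Normal("obs", mu=level, sigma=sigma, observed=new_program_lr)
    program_idata = pm.sample()
```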
Yeah, yeah. That makes, like, a ton of sense. And those sound like very fun models to work on.

Yeah.

Yeah, I really love that. So, state-space models: are you mainly talking about HMMs, things like that?
Yeah, of the same form. We've done some experimenting with Gaussian processes. Actually, my colleague is about to submit a paper doing some work with HMMs, hidden Markov models, for anyone who's listening who might not know what that acronym stands for. But typically, the models that we use end up being even simpler than that for our forecasting problem, mostly because of the fact that we have such small data to work with. Oftentimes, the functional form of the model can't be too complex. And so they end up being more similar to your typical ARIMA-style models, which you can write in a state-space fashion. And so it tends to be more closely related to those than to models that do regime switching, things like that, because oftentimes we just don't have enough information to be able to fit those types of models; even with informed priors, it might not be as believable.

That being said, if we do think that something might work well, like if we think that adding a more complicated mechanism on the momentum piece of the model, or adding in different assumptions about mean reversion and things like that, we typically do explore those types of things. But it's surprisingly hard to beat simple time series models with the small n in our context. And so we do quite a lot of cross validation to determine which types of models we should be using in production. And oftentimes it's a mix of evaluating those models based on their performance, but also how well calibrated they are and things of that nature, so that we know that the models we're using are interpretable and we can defend them if something ends up going sideways. We want to be able to go to the investor and say, you know, we did our due diligence, and here's why we think this was still a good choice at the time. I'm not sure if that gets at the question, but let me know if I can expand on the models in particular.
No, I mean, it's funny you say that, because it's definitely hard to beat, it's surprisingly hard to beat regression in a lot of contexts. If you do a generalized regression, that's already a very good baseline, and that's pretty hard to beat. So I'm not surprised to hear that it's the same here.

Right. And I think part of the issue, too, with our data is that the more recent observations in the time series carry this high uncertainty along with them. So with the measurement error component in there, it's difficult to choose between different model configurations. And the more complicated your forecasting model is, that uncertainty ends up making it even harder for a more complex model to win out in our tests. And so that's one of the things that we've observed, and something that I think probably anyone who's been involved in similar contexts would say they've run into as well.

Yeah.
And well, actually, I want to make sure we get to model averaging and comparison. I still have a few questions for you on time series and these kinds of models, but let's switch gears a bit here. So, tell us how you use Bayesian model averaging in your projects, and what advantages do you see in this approach over single model predictions?
Yeah, no, it's a good question. So I hadn't done a ton of work with Bayesian model averaging, or model averaging in general, before I joined Ledger. And so I was really excited to get to work on some of that stuff. It comes up in multiple parts of our workflow now, but one of the first use cases was for our forecasting models. As I was describing a bit earlier, we have different models that make different assumptions about the underlying losses and how they might change over time. One example is, does the process have momentum or not? Like, if a loss ratio is trending upward, is there going to be some component of the model that keeps it trending upward over time, versus do we have something in there where it functions more like a random walk?

And this is something that a lot of industry experts might debate. If you're an actuary or the CEO of some insurance company and you're trying to explain why your losses are trending in a certain direction, people talk about these things pretty normally, like momentum or reversion, things like that. And so, because people have varying opinions about this, one approach would be to try different models that make those different assumptions, then do some model comparison and just select one. But because there are often certain contexts where it might make sense to assume momentum, and other contexts where it might make sense to assume reversion, model averaging became a very natural thing to do and try out in that context. And so that was really what inspired it: just this idea that we don't have to necessarily choose a model. If we think both are reasonable, we can allow the data to make that decision for us.

And so that's really where it came into our workflow. When we're doing our forecasts, we'll have these different models that we fit and make predictions with. And then we have our model averaging models, which, now talking about models of models, gets a little bit fun terminology-wise. But that's the stage where we bring those in, and we say, okay, we might have some covariates that we can use to build those averaging models. So things like, we know what line of business it is, commercial auto, workers' compensation, something like that. We know how big the volume is, like how much premium the program brings in. We know the locations of these different businesses. And so all of those can then be used as covariates in, say, a stacking model. And we can train those models to rely more on the assumptions of one model over the other, depending on the context.
the other, depending on the context.
373
:And that was the motivation and that's where we still do that work today is mostly at that
forecasting step.
374
:But yeah, I think Bayesian model averaging is really nice because if you have the capacity
to be able to fit the models that you want to blend together, we found through our
375
:research, if we do that and compare it to like a single model in isolation,
376
:Not always, but oftentimes it will end up performing better.
377
:And so it's sort of like, why not take the best of both worlds as opposed to having to
worry about model selection?
378
:And especially when the underlying models that we're blending together are both like
equally theoretically motivated and it's hard to really make a decision, even if the data
379
:were to suggest one over the other.
380
:Yeah, I mean, that definitely makes sense if you have a bunch of good models.
381
:That's really cool to be able to average them.
382
:I remember when I started learning Bayesian stanza, I was really blown away by the fact
that this is even possible.
383
:that's just incredible.
384
:Can you...
385
:So actually, can you contrast model averaging with Bayesian model comparison to make sure
listeners understand both concepts and how they fit together, and then talk about how you
386
:implement
387
:these techniques in your modeling workflow?
388
:Yeah, no, great question.
389
:think so when I think of Bayesian model comparison, I often think of different types of
metrics that we might have, whether it's approximations or done by brute force, we might
390
:like we might have some sort of cross validation metrics that we evaluate the models on.
391
:So like in our forecasting case, you know, we might have actual historical
392
:you know, maybe we look, have actual data from like 2000.
393
:And so we actually have like 10 years of history on it.
394
:We know what the ultimate state is.
395
:We know what like the forecast should predict.
396
:In those cases, you know, we can train our models.
397
:We can have them do the out of sample predictions.
398
:And then we can score on those out of sample predictions, like how well they're
performing.
399
:So, you know, we often actually do the brute force.
400
:as opposed to doing something like the, I know in the Stan community, you might have like
the Pareto smooth importance sampling, leave one out approximations, things like that is
401
:another way to approach the problem.
402
:But basically a lot of times when you're doing Bayesian model comparison, you'll have some
out of sample metric or approximation to it.
403
:And then you like, you might have that for a bunch of out of sample data points.
404
:And then those data points, can
405
:do some statistical tests or even just look at sort of absolute values of how much better
one model is predicting now to sample performance metrics versus another.
406
:And in the STAND community, and well, PMC as well, think like the expected log point -wise
predictive density or the ELPD is a quantity that's often used, which is sort of a log
407
:likelihood based metric that we can use on.
408
:out of sample data to compute like expected predictive performance.
409
:And typically for Bayesian model comparison, the practice almost stops after you get that
ELPD value or something similar.
410
:might be, you might do some test of like how different they are, like some standard error
on the difference of the ELPD between two models.
411
:But at the end of the day, like once you have that metric,
412
:that's sort of the inference that you might have at the end is that, okay, this model is
performing better per this metric.
413
:with stacking, what you're doing, and I guess there's different forms of model averaging.
414
:have like Bayesian model averaging, which is slightly different than stacking and things
like that.
415
:But what would all of them follow the same basic principle is that you have your out of
sample performance metrics.
416
:And then what you do is instead of just choosing one model based on the model that has
better out of sample performance metrics, you build a model on those performance metrics
417
:to kind of tell you when you should rely on, you know, model A versus model B.
418
:And so, so the stacking or averaging models we can think of as just like a different model
themselves that are trained instead of on your outcome.
419
:measure in your actual substantive or your candidate model that you care about.
420
:It's trained on the performance metrics, the auto sample performance metrics that you are
using to, in this case, you wouldn't be doing model selection.
421
:You'd be blending together the predictions from each of your candidate models according to
the model and how it thinks you should weight both based on the auto sample performance.
422
:And so.
423
:So kind of going that route does require a bit more, you have to think a little bit more
about like how you're using your data because if you want to evaluate how well a stacking
424
:model is performing, for example, you have to leave out a little bit more validation data.
425
:So you don't want to do any double dipping.
426
:so you'll have your candidate models that you'll make out of sample predictions on.
427
:Those predictions become
428
:that your performance on those predictions become the basis for training your stacking
model.
429
:And then at the end of the day, you might train your stacking model on some other third
validation set of data.
430
:So I think that's really the only big limitation, I would say, of using those approaches
over just like your traditional model comparison, where you're kind of done once you
431
:select your model.
432
:That being said, think, yeah, being able to combine the predictions from the candidate
models ends up oftentimes being well worth, well worth kind of dividing your data up that
433
:way.
434
:Yeah.
435
:Yeah, yeah, definitely.
436
:That's, that's an extremely, extremely good point and also very useful method.
437
:I know in, in PIMC, for instance, with RVs, you can do that very easily where you
438
:basically do your model comparison with all these, it's gonna give weights to the models
and then those weights are used by a plan C with the PMW sample, post -hera predictive W
439
:where we weight each models, predictions, each models, post -hera predictive samples,
according to the weights from the model comparison.
440
:So is that.
441
:how you usually end up implementing that or using Stan?
442
:How do you do that?
443
:I think it's going to be interesting for the listeners who want to give that a try.
444
:Yeah, no, it's a great question.
445
:So that's good plug for the paper we just wrote.
446
:So yeah, we've been actually using some internal software to do a lot of this.
447
:like actually, all of our software historically has been like we have our own kind of
448
:BRMS is not the right term for it, but we have our own language that we use to write Stan
models.
449
:And then we do a lot of our stacking and stuff.
450
:had our own internal code that we would use to do all of this.
451
:But we decided recently that this was something that, yeah, I think we were talking about
before the show started today, that it's not something that
452
:has gotten a lot of attention, like in terms of like making this easy to do in a generic
way with like a bunch of different types of stacking models.
453
:And so we've actually just released and wrote a paper on this package in Python called
Bayes blend.
454
:And what this package, the intent and what we hope that it will allow users to do, what
allows us to do it at least.
455
:hopefully other users as well is
456
:like given, you know, I might have a model that I fit in Stan or PMC or, you know,
whatever probabilistic programming language of choice.
457
:We built the package such that you can kind of initialize a stacking model, one of a
various different ones.
458
:So we have like the pseudo Bayesian model averaging types of models, the pseudo BMA plus
models, which are things that are based on the
459
:ELPD and they blend based on that.
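[Editor's note: pseudo-BMA weighting is simple enough to show inline: each model's weight is proportional to the exponential of its total ELPD. A sketch with made-up ELPDs; the pseudo-BMA+ variant additionally stabilizes these weights with a Bayesian bootstrap over the pointwise ELPDs.]

```python
import numpy as np

elpd = np.array([-150.3, -152.1, -158.9])  # total out-of-sample ELPD per model

# Pseudo-BMA: softmax of the ELPDs (subtract the max for numerical stability).
rel = np.exp(elpd - elpd.max())
weights = rel / rel.sum()
print(weights.round(3))  # the first model dominates, the third gets ~0
```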
And then we also have proper Bayesian stacking and hierarchical stacking models that you can use with BayesBlend, where, given the out-of-sample likelihood metrics that you can get by training your model on one set of data and making out-of-sample predictions on another test set, given those as input, you can fit a variety of these different stacking models and then easily blend your models together and evaluate performance and things like that. And we've built that in Python just because that's the stack we use for our day-to-day work and in our production setting. And then we've been building integrations, so right now it's really easy to interface with CmdStan, because that's what we use; we built it from that perspective first. But it does interface with ArviZ as well. So if you're using PyMC, for example, you can create that ArviZ InferenceData object and then use that as input for BayesBlend. And then, yeah, what you will get at the end of that workflow, if you use BayesBlend, is the blended predictions from your candidate models, as well as the blended likelihood, like the posterior likelihood, which you can then use to evaluate performance and things like that.

And so, yeah, we're really excited about this. I'm really excited to get people outside of Ledger to use it and tell us what they think, make some complaints, open some issues. There's a discussion board on the GitHub page as well. And we have a preprint on arXiv, and we've submitted the paper to a journal as well; we'll see how that goes. But it's something that we use regularly, and so it's something that we plan to keep contributing to. If there are quality-of-life or convenience things to make it easier for other folks to use, we'd love to hear about it. Because I think there's a lot of work that can still be done with stacking. There's a lot of really cool methods out there. I think hierarchical stacking in particular is something that I haven't really seen used much in the wild. It's something we use every day at Ledger, so I'm hoping BayesBlend will allow other people to see that benefit and apply it in their own work easily, in a reproducible way.
Yeah, this is super cool. And so of course I'll put the paper and the documentation website for BayesBlend in the show notes, for people who want to dig deeper, which I definitely encourage you to do. And when you're using BayesBlend, let's say I'm using BayesBlend from a PyMC model, so I'm going to give it an InferenceData object. Do I get back an InferenceData object with my weighted posterior predictive samples? In which format am I going to get back the predictions?
Yeah, that's a great question. I think, if I remember correctly, and I don't want to give you the wrong information, I'm pretty sure that when you create the object that does the stacking, so the model object, there's a from-ArviZ method. And then I think we have a to-ArviZ method; it will use its own internal representation of the predictions and things like that, just for the sake of fitting the stacking model, but then I think you can return it back to an ArviZ InferenceData object at the end. And one of the things that I want to do, we haven't had the bandwidth for it quite yet, but it's not that many steps: we should just have, say, a from-PyMC method, for example. And I think implementing something like that would be pretty straightforward. So I think we'll probably get to it eventually, but if anyone else wants to contribute, we have a doc on how to contribute as well on the GitHub page. But yeah, our intention is to make it as seamless as possible. And so to the extent that there are ways we can make it easier to use, we're definitely open to adding those features or taking recommendations on how we should approach it. But yeah, I think probably right now, working through the ArviZ InferenceData object is the way you can interface with most things other than CmdStan.
Yeah. Yeah. I mean, for people listening, in the PyMC world, and even the Python world and even the Stan world, investing in understanding the InferenceData object and xarray better is definitely a great investment of your time, because, I know it sounds a bit frustrating, but basically it's like pandas. It's the pandas of our world. And if you become proficient at that format, it's going to help you tremendously in your Bayesian modeling workflow, because you may only want to interact with the model, but actually a huge part of your time is going to be making plots. And making plots is done with prior predictive or posterior predictive samples. And that means they live in the InferenceData object. I know it can be a bit frustrating, because you have yet another thing to learn, but it is actually extremely powerful, because it's a multi-dimensional pandas data frame, basically. So instead of only having 2D pandas data frames, you can do a lot of things with a lot more dimensions, which is basically always the case in Bayesian modeling.

No, I totally agree with that.
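[Editor's note: a tiny illustration of the point: an InferenceData object holds labeled, multi-dimensional arrays that you can slice by chain, draw, and your own named coordinates.]

```python
import arviz as az
import numpy as np

# Build an InferenceData object from raw draws: 4 chains, 500 draws,
# and a 3-element parameter with named coordinates.
idata = az.from_dict(
    posterior={"mu": np.random.normal(size=(4, 500, 3))},
    coords={"region": ["north", "south", "west"]},
    dims={"mu": ["region"]},
)

# xarray-style selection: all draws of mu for one region in chain 0.
print(idata.posterior["mu"].sel(region="north").isel(chain=0).shape)  # (500,)
```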
And I think the other thing that's nice about it is that there are integrations in ArviZ to work with a whole host of different PPLs. So whether you're using Stan or PyMC or whatever else, the ArviZ InferenceData object is always the commonality. It makes it easy if, say, I'm in a different setting and I'm using some other PPL in this case; having to learn a bunch of different tools to do plotting and deal with the data can be quite annoying. So it's nice to have one place to do most of it. And I think we're going to lean on that pretty heavily in developing BayesBlend. I think there's more we could probably do to integrate with the InferenceData structure, in terms of making it easier to plot things and stuff like that. It's something I'm learning more and more about myself, and I would definitely also recommend others to do the same.

Yeah, that's what I tell my students almost all the time: time spent learning how the InferenceData object works is time well spent.

Yeah, agreed.

Because you're going to have to do that anyway, so you might as well start early.

Right. Yeah. You'll encounter it at some point.

Yeah, yeah.
And I'm wondering, so you talked about model stacking too. I'm not familiar with that term. Is that just the same as model averaging, or is that different?

Yeah, I mean, there are technically some differences. I think historically the term Bayesian model averaging has meant something pretty specific in the literature. And I hope not to get this wrong, because sometimes I mix things up in my head when thinking about them; the names make that easy to do. But I'm pretty sure that, historically, Bayesian model averaging was done on in-sample fit statistics and not out-of-sample, which, it's a small thing, but it can make a big difference in terms of results and how the problem is approached and things like that.
And so when talking about model averaging, I'd say stacking is one form of model averaging. There are many ways that one could perform model averaging, whereas stacking is one specific variant of it. The way that we actually implement stacking, there are a couple of different ways that you can do it, but you're basically optimizing... if you have the out-of-sample log likelihood statistics, you can compute a pointwise ELPD, if you will. So it's not the sum of the log predictive density across all of your data points; each data point has its own LPD. And then what you're essentially doing with stacking is fitting a model to optimize combining all of those points across your different models. So maybe for certain data points, model A has a higher out-of-sample likelihood than model B, and for others it has a lower one. And the goal of the stacking model is to fit to those, with those as outcome measures. And then the weights that you derive from that are basically just optimizing how to combine those likelihood values. The way that stacking is actually implemented, after you estimate those weights, is to sample from the posteriors. So if, for a given data point, I have a 50% weight on one model and a 50% weight on another, I'm kind of blending together the posteriors by drawing samples in proportion to the weights.
So that's kind of how stacking is approached and how we've implemented it in BayesBlend. With pseudo-BMA, I think Yuling Yao, who has done a lot of work with stacking and pseudo-BMA and stuff, we've had some talks with him, as well as Aki Vehtari and some other folks who have done work on these methods. I think they're moving away from the pseudo Bayesian model averaging terminology, to start to call it something that is less suggestive of what classical Bayesian model averaging has typically referred to. And so for folks interested in exploring more of that today, I mean, you can read the preprint; it has some definitions that do a pretty good job, I'd say, of defining some of these different ideas, and it gives you the math you can look at to see how it's actually done mathematically. But if you're searching for resources, I think focusing on the stacking terminology should probably be more helpful than Bayesian model averaging. That's my two cents, at least.
Okay, yeah. So what I get from that is that it's basically trying to do the same thing, but using different approaches.

Yeah, right, right. And that's my impression. I'm sure other people will have different reads on the literature. Like I said, it's something I've only really begun to explore in the last couple of years, and I'm sure there are many other people out there who know much more.

Okay, yeah, yeah, for sure. If we've made a big mistake here and someone knows about it, please get in touch with me. You can be outraged in your message. That's fine, I'll have learned something from it.

That's right. Me as well.
And so, actually, I'd like to get back a bit to the previous models we talked about, you know, now that we've talked about your model averaging work. I'm curious: how do external factors like economic downturns or global health crises, for instance, affect your forecasting models, and what strategies do you employ to adjust models in response to such events?
Yeah, no, it's a great question. So, economic factors definitely can influence the performance of these portfolios. But a lot of times, these loss ratios are actually surprisingly robust to a lot of these economic factors. And partly, it's just because of the way that insurance generally works. A good example of this is COVID times: for example, if you're thinking about insuring commercial auto or private auto insurance policies, when COVID happened, people stopped driving. And so people got into a lot fewer accidents. And so in that case, loss ratios went really far down for auto policies, for auto programs. And in some cases, insurance companies actually paid back to policyholders some portion of the premium, just because the loss ratios were so low. So there are examples of things like that happening.

But just due to the nature of how policies are written out and how they're paid out, like, I pay my insurance upfront, and then the insurer only loses money when claims are made, the things that we think about have to be mostly things that would influence claims; that, I would say, is the primary factor. So if there's something economic that we believe is going to affect how many claims are made, whether we think it will make them go up or down, that's going to be the primary force through which economic conditions could affect these models, mostly because the premium that is written is pretty stable. Generally, regardless of what's going on economically, the same types of insurance policies are often either required or things like that. So unless management is changing a lot about the program, in terms of how they're pricing things or something of that nature, you don't tend to get huge swings in the premium that's coming in. And so that's what we focus on: mostly, it would be things that affect claims.
And when we do look at that, one of the things that we've implemented, that we've looked into, is modeling overall industry-level trends and using that as input to our program-level models. And so it's not quite like deriving priors from industry. It's more that we actually know, across all of the commercial auto programs, for example, what the industry-level loss ratio is. And that's where we might have some general idea of how economic factors might influence something at that high of a scale, like the interest rate environment or the location of the industry and other things like that. We've built some models of industry-level trends that are then used as input: given that we can predict an industry loss ratio for the next so many accident years, we can use that information in our program-level models and say, how much do we think we need to weight where the industry is trending versus what we see in this program? That's kind of how we've approached that problem historically.
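[Editor's note: one simple way to picture "weighting where the industry is trending versus what we see in this program", again a hedged sketch rather than Ledger's model: treat the program-level loss ratio as partially pooled toward an industry-level trend, with the pooling strength learned from the data.]

```python
import numpy as np
import pymc as pm

# Hypothetical inputs: a predicted industry loss-ratio trend and the observed
# history of one small program over the same accident years.
industry_trend = np.array([0.60, 0.62, 0.63, 0.65, 0.66])
program_lr = np.array([0.55, 0.66, 0.60, 0.71, 0.64])

with pm.Model() as pooled:
    # How tightly the program tracks the industry: small tau leans on the
    # industry trend, large tau trusts the program's own signal.
    tau = pm.HalfNormal("tau", 0.1)
    program_level = pm.Normal("program_level", mu=industry_trend,
                              sigma=tau, shape=len(industry_trend))
    sigma = pm.HalfNormal("sigma", 0.1)
    pm.Normal("obs", mu=program_level, sigma=sigma, observed=program_lr)
    idata = pm.sample()
```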
I'd say we approach it that way mostly because it's really hard at the level of granularity of a lot of the programs that we deal with; they're pretty small relative to the industry at large. And so it's often hard to observe general industry trends in the data, especially when we have relatively few historic data points. It's hard to do it in a data-driven way. So that's the big way that we've approached that problem: if we can better understand the industry and understand how economic factors influence where the industry is trending, we can then use that information in our program-level analysis. And so we do have some models that do that.
634
:Yeah, fascinating.
635
:Fascinating.
636
:I really love that.
637
:That's super interesting because, by definition, these events are extremely low frequency.
638
:But at the same time, can have a huge magnitude.
639
:you would be tempted to just forget about them because of their low frequency.
640
:But the magnitude means you can't really forget about them.
641
:So that's really tricky.
642
:I think also having that innovation framework is actually very helpful, because you can
actually accommodate that in the model.
643
:Yeah.
644
:And I think, too, another thing that
645
:is kind of interesting about the kind of insurance portfolios that we deal with is that
some of this is actually on the underwriters or the management team who are actually
646
:like writing the insurance policies.
647
:And so a lot of times, those folks are the ones who are way ahead of the game in
terms of spotting what we might call a long-tail risk or
648
:something like that, historically.
649
:In workers' compensation, asbestos was an example of this, where it was something that
was installed in a bunch of houses.
650
:It was used everywhere as an insulator, and, you know, decades down the road, come to
find out that this stuff is causing cancer and doing horrible things.
651
:And those long-tail risks are pretty rare.
652
:You don't come by them often.
653
:But it's something that a lot of times the underwriters who are
654
:kind of pricing and writing the insurance policies, they are sort of the
frontline defense for that, because they're on the lookout for all of these long-tail
655
:risks, and taking them into account when pricing the policies, for example, or when
writing the policies themselves to exclude certain things if they think they shouldn't apply.
656
:So oftentimes,
657
:that kind of makes its way into the perspective we have on modeling because when we're
modeling a loss ratio, for example, our perspective is that we're almost trying to
658
:evaluate the performance of the management team because they're responsible for actually
writing the policies and marketing their insurance product and all of that.
659
:And we view the historic information we're looking at as just their track
record.
660
:So, I mean, that doesn't stop big economic things
661
:from changing that track record.
662
:But that's something that's kind of influenced how we think about our models, at least
from a generative perspective.
663
:Yeah, I think it's definitely important to have that perspective when you're in such a
case, where the data that you're getting kind of arises in a rather
664
:complicated way.
665
:Yeah, fantastic points, I completely agree with that.
666
:And are there some metrics or, well, techniques, we've talked about some of them, but are
there any metrics that you find most effective for evaluating the performance of these
667
:Bayesian time series models?
668
:Yeah, I think historically we've done a lot of the sort of log-likelihood-based
metrics.
669
:So we use ELPD for a lot of our decision making.
670
:So if we're exploring different models and we're doing our stacking workflow and all of
that, at the end of the day, if we're deciding whether it's worth including another
671
:candidate model in the stacking model in production, we'll often compare like what we
currently have in production to the new proposed thing, which could be a single model.
672
:It could be some stacked models or what have you.
673
:Typically we're using ELPD and we also look at things like
674
:RMSE and mean absolute error.
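For context, a comparison like the one described might look as follows with ArviZ. This is a minimal sketch, assuming each model's InferenceData carries pointwise log-likelihoods; it is not their internal tooling.

```python
import arviz as az

def compare_candidates(idatas):
    """Rank models by PSIS-LOO ELPD.

    `idatas` maps model names to InferenceData objects that contain
    pointwise log-likelihoods, e.g. {"production": ..., "candidate": ...}.
    """
    comp = az.compare(idatas, ic="loo")
    # elpd_diff / dse: difference to the top-ranked model and its standard error
    return comp[["elpd_loo", "se", "elpd_diff", "dse"]]
```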
675
:We tend to not rely on any single metric, just because, especially with ELPD and the
types of models we work with, there are times where you can get
676
:pretty wild values for ELPD that can really bias the comparison.
677
:Like at the end of the day, I guess this gets a little technical, but you might have an
ELPD score for each data point.
678
:And if one of those data points is quite off, when you take the sum to get your total
model performance metric, it acts like any outlier and can
679
:kind of make the sum go in one direction quite a bit.
680
:So sometimes the ELPD might be very sensitive to
681
:outlier data points compared to things like RMSE.
682
:And the reason is just because your prediction might be pretty close, like on an absolute
scale, but if your uncertainty
683
:is really low in your prediction... What ELPD is really measuring is the height of
your posterior predictive density at where the data point is.
684
:And so if you're too certain, your data point is like way out in the tail of some
distribution,
685
:and it ends up getting this crazy value even though RMSE might be pretty good because on
average you're pretty close actually.
686
:So we have had to do some forays into more robust ways to compute or to estimate ELPD.
687
:We've done some research on that, and sometimes we'll use those metrics in production,
where we will say,
688
:instead of taking a sum of the ELPD values across all your out-of-sample data points,
we'll fit like a t-distribution to all of those points.
689
:And that way, the expectation of that t-distribution is not going to be as
influenced by some extreme outliers.
690
:You also get the benefit of a degrees-of-freedom parameter estimated from
the t-distribution that way.
691
:And that can be a sort of diagnostic because if it's too low, then you're approaching like
a Cauchy distribution that doesn't have an expectation.
692
:It doesn't have a variance.
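As a minimal sketch of that robust summary, assuming you already have the pointwise out-of-sample ELPD values as an array: the helper name is made up, and this illustrates the idea rather than reproducing the team's production code.

```python
import numpy as np
from scipy import stats

def robust_elpd(pointwise_elpd):
    """Summarize pointwise ELPD values with a Student-t fit instead of a raw sum."""
    x = np.asarray(pointwise_elpd)
    df, loc, scale = stats.t.fit(x)  # maximum-likelihood fit of a t-distribution
    return {
        "naive_sum": x.sum(),          # classic ELPD total, sensitive to outliers
        "robust_mean": loc,            # t location, resistant to extreme points
        "robust_total": loc * x.size,  # back on the "sum over data points" scale
        "df": df,  # df near 1 means Cauchy-like tails: treat the summary with caution
    }
```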
693
:So we've explored methods like that, which we'll sometimes use in production
just because we do so many tests.
694
:It's a shame to not be able to do a comparison because there are a few data points
out of the thousands of data points that we have in our historic database that kind of
695
:throw everything off and make it such that
696
:there's no consensus on which model is performing better.
697
:And so yeah, that's a long way of saying we mostly focus on ELPD, but use some other,
more absolute metrics that are easily interpretable, and then also what we kind of
698
:think of as these more robust variants of ELPD, which at some point I think we'll
try to write a paper on, to see what other people think. Because it's one of those things that
699
:comes up: you come up with a solution to something that you think is a pretty big problem,
and then you're very curious what other people might actually think about it, or if they see any
700
:big holes in the approach.
701
:So, yeah, maybe at some point we'll have a paper out on that.
702
:We'll see.
703
:Yeah, that sounds like fun.
704
:But actually, I think it's a good illustration of something I always tell my students
who come from the frequentist framework,
705
:and they tend to be much more oriented towards metrics and tests.
706
:And that's always weird to me because I'm like, you have posterior samples, you have
distributions for everything.
707
:Why do you want just one number?
708
:And actually, you worked hard to get all those posterior samples and distributions.
709
:So why do you want to throw them out the window as soon as you have them?
710
:I'm curious.
711
:So yeah. You need to make a decision, right?
712
:Yeah.
713
:And often they ask something related to that, like, but what's the metric to know that
the model is basically good?
714
:You know, so how do I compute R squared, for instance?
715
:Right.
716
:And I always give an answer that must be very annoying, which is: I understand that you
want a statistic, you know, a metric.
717
:That's great.
718
:But it's just a summary.
719
:It's nothing magic.
720
:So what you should probably do in all of these cases is use a lot of different metrics.
721
:And that's just what you answered here: you don't have one go-to metric that's
supposed to be a magic number, and then you're good.
722
:It's like, no, you're looking at different metrics because each metric gives you an
estimation of a different angle of the same model.
723
:And a model is going to be good at some things
724
:but not at others, right?
725
:It's a bit like an athlete.
726
:An athlete is rarely extremely complete, because they have to be extremely specialized.
727
:So that means you have trade-offs to make.
728
:So for your model, often you have to choose: well, I want my model to be really good at this.
729
:I don't really care about it being really good at that.
730
:But then if your metric is measuring the second option, your model is gonna appear really
bad, but you don't really care about that.
731
:So what you end up doing as the modeler is looking at the model from different perspectives
and angles.
732
:And that will also give you insights about your model, because often the models are huge
and multi-dimensional, and you just have a small Homo sapiens brain that cannot see beyond
733
:three dimensions, right?
734
:So you have to pare everything down and basically...
735
:I'm always saying, look at different metrics.
736
:Don't always look at the same one.
737
:And maybe also sometimes invent your own metric, because often that's something you're
interested in.
738
:You're interested in a very particular thing.
739
:Well, just invent your own metric, because you can always compute it, because it's just
posterior samples.
740
:And in the end, posterior samples, you just count them, and you see how it goes.
741
:That's not that hard.
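As a quick illustration of "inventing your own metric" by counting posterior samples: the draws and the 80% threshold below are hypothetical stand-ins.

```python
import numpy as np

# Stand-in for posterior draws of next year's loss ratio
rng = np.random.default_rng(42)
loss_ratio_draws = rng.normal(0.70, 0.08, size=4_000)

# A custom "metric" is just counting draws that satisfy the event you care about
prob_above_80 = (loss_ratio_draws > 0.80).mean()
print(f"P(loss ratio > 80%) = {prob_above_80:.2f}")
```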
742
:No, I think that's great.
743
:And it's great to get that.
744
:in folks' heads when they're starting to learn about this stuff.
745
:It's like, I can't even count how many, you know, classic machine learning
papers I've seen where you just have tables of bolded metrics, where the model with the
746
:lowest RMSE is the best, right?
747
:And so therefore it's chosen.
748
:You know, I think that perspective can, it can make it a little harder to actually
understand your models for sure.
749
:And yeah, there are even other things. It reminds me that we do simulation-based
calibration, prior sensitivity analysis, all
750
:of these things that aren't necessarily tied to a performance metric, but they're tied to
how well you can interpret your model and how much you can trust the parameters that
751
:it's outputting.
752
:And so I think like all of those should also definitely be considered.
753
:And, you know, another thing that we encounter quite a lot is like,
754
:there's a cost to productionizing these models.
755
:Like if we have a new model and it performs better technically by a small amount, is
it really worth it if it's very complicated, hard to interpret, and hard to
756
:maintain?
757
:And I think sometimes the answer is no, actually what we have is good enough.
758
:And so we don't actually need this more complicated, you know, hard-to-work-with model.
759
:And that's something that I feel is
760
:probably more common in industry settings, where you're expected to maintain and reuse
these models repeatedly, versus academic work, where research
761
:is the primary objective, and maybe you don't need to think as much about model
maintainability or productionization, things like that.
762
:And so I feel like having a holistic perspective on how your model is evaluated is very
important,
763
:and definitely not something that any single metric is going to allow you to do.
764
:Yeah, definitely.
765
:I'm really happy to hear that you guys use simulation-based calibration a lot, because
that's quite new, but it's so useful.
766
:It's very useful.
767
:Yeah.
768
:It's nice to figure out if you have problems with your model before you fit it to real
data.
769
:Yeah.
770
:Yeah, I'm curious about how you do that.
771
:But first, for folks who want some detail about that, you can go back and listen to
episode 107 with Marvin Schmitt.
772
:We talked about amortized Bayesian inference and why that's super useful for
simulation-based calibration.
773
:And also episode 109 with Sonja Winter, where we actually go into how you would implement
simulation-based calibration
774
:and why that's useful.
775
:So that's a bit of context here if you don't know what that is about.
776
:And now there are chapters to the episodes.
777
:So if you go to the website, learnbayesstats.com, and you go to the episode page, you'll
have the chapters of the episode, and you can directly click on the timestamp,
778
:and you'll be able to jump directly to the
779
:part of the episode where we talk about that stuff in particular. And of
course you have that also on the YouTube channel.
780
:So now if you go to any YouTube episode, you click on the timestamp you're interested in,
and the video will jump right there.
781
:So that's pretty cool to reference back to something when you're like, I've heard
that somewhere in the episodes, but I don't remember exactly where. So yeah.
782
:It's something I actually use quite a lot.
783
:You go to LearnBayesStats.
784
:I'm going to be using this now.
785
:This is a good tip.
786
:You can do Control-F.
787
:You do Control-F and look for the terms you're interested in, and it will show up in
the transcript, because you now have the transcript also on each episode page on
788
:LearnBayesStats.com.
789
:You look at the timestamp, and with the timestamp you can infer which chapter it's
talked about in, and then you get back
790
:to the part of the episode you're interested in much faster.
791
:yeah.
792
:Yeah, very helpful because searching in podcasts has historically been a challenging
problem.
793
:yeah.
794
:Now that's getting much better.
795
:So yeah, definitely use that.
796
:I do that all the time because I'm like, wait, we talked about that.
797
:I remember it's in that episode, but I don't remember when.
798
:So I use that all the time.
799
:So yeah, maybe, I know we're running long, we're already at
800
:one hour fifteen, but can you talk a bit about SBC, simulation-based calibration?
801
:How do you guys use that in the trenches?
802
:I'm very curious about that.
803
:That's a good question.
804
:Yeah.
805
:So for pretty much everything we do, we have pretty custom models and pretty custom
software.
806
:We have our own internal software that we've written to make it easy to work with
:the style of models that we use.
808
:Yeah, for any model that we do research on or any model that we end up deploying in
production, typically we start with simulation-based calibration and prior
809
:sensitivity analysis and that sort of thing.
810
:And so with simulation-based calibration, typically... I described our workflow earlier,
where it's kind of multi-stage,
811
:and there are multiple different models.
812
:Like there are a couple of different models along that pipeline.
813
:And so typically we will do SBC for each of those individual component models, just to see
if there are any clear issues that arise from any sub-component of our whole
814
:pipeline.
815
:And then we will also try to do it at least for...
816
:In some cases we might be able to do it for the whole pipeline, but typically you
look at it for each individual model, and then, for the whole pipeline, we
817
:might look more at the calibration of the posterior predictions, for example,
against the true holdout points.
818
:But yeah, SBC is really nice because, I mean, oftentimes we do want to be able to
interpret the parameters in our models.
819
:Let's say, if you really don't care about interpretation,
820
:it's maybe not as motivating to go through the whole SBC process.
821
:But in our case, oftentimes we'll have parameters that represent how much momentum a
program has, how much reversion it has, where the average program-level loss ratio
822
:that gets reverted back to is sitting, which is an important quantity.
823
:And we want to know, when we get those numbers and parameter values out of our models,
that we can actually interpret them in a meaningful way.
824
:And so SBC is, yeah, a great way to be able to look and see, like, okay, are we able to
actually estimate parameters in our models in an unbiased and interpretable way?
825
:But then also like, have we programmed the models correctly?
826
:I think that's another big thing
827
:SBC helps you resolve, because often
828
:the only time you really know if your model is actually coded up the right way is if you
simulate some fake data with known parameter values and then try to recover them with the
829
:same model.
830
:And SBC is sort of just like a comprehensive way to do that.
831
:I remember, before I read the SBC paper, back in my PhD years, we would, you know,
pick a random set of parameter values for these models and simulate some
832
:data, and refit just that single set of parameters, and see,
833
:like, okay, are they falling in the posteriors?
834
:And I feel like SBC is just a much more principled way to do that.
835
:And so I would definitely encourage regular use of SBC, even though, yeah, it takes a
little bit more time, but it saves you more headaches later on down the road.
836
:Yeah, I mean, SBC is definitely just an industrialized way of doing
837
:what you described, right?
838
:Just fixing the parameters, simulating data from the model, and then seeing if the model
was able to recover the parameters we used to simulate it, basically.
839
:yeah.
840
:From these parameters, you sample prior predictive samples, which you use as the data to
fit the model on.
841
:And yeah, amortized Bayesian inference is super useful for that, because then once you've
trained the neural network, it's free to get posterior samples.
842
:But two things.
843
:First, it'd be great if you could add that SBC paper you mentioned to the show notes,
because I think it's going to be interesting to listeners.
844
:And second, how do you do that concretely?
845
:So when you do SBC, you're going to fit the model 50 or 100 times in a loop.
846
:That's basically how you do it right now.
847
:Yeah, usually we'll fit probably like 100 to 500 times, closer to 500.
848
:Usually a lot of our models actually don't take too long to fit.
849
:Most of the fitting time that's involved is in the R&D process, like
backtesting, training the stacking models, and all that.
850
:But the individual models fit to data pretty quickly most
of the time.
851
:So we do a decent number of simulations, usually.
852
:And yeah, it's sort of like, well,
853
:The way that it's programmed in our internal software is that we'll
refit the model, basically, to save out all the sort of rank statistics that we need, or
854
:the location of the percentiles of the true values in the posterior predicted
values for those parameters, and just store all of those, going over in a loop.
855
:I think we've
856
:We might have some stuff where we've messed around with parallelizing that, but it
usually ends up just being faster to parallelize the MCMC chains instead.
857
:So a lot of times we just run this stuff locally, because the models fit so quickly.
858
:That's usually how we approach it.
859
:But yeah, if you can set a single set of parameter values and do a parameter-recovery
simulation, SBC is basically just a loop on top of that.
860
:So it's not
861
:too much overhead, really.
862
:The overhead is in the added time it takes, not necessarily the added amount of code that
it takes.
863
:It's just kind of nice, I think.
864
:Yeah, the code is trivial.
865
:It's just that you need to let the computer basically run a whole night.
866
:Yeah, go make some tea, get a coffee.
867
:Yeah, to do all the simulations.
868
:I mean, that's fine.
869
:It'll keep your house warm in the winter.
870
:Usually the model fits quite fast, because you're using the prior predictive samples
871
:as the data, so it's not too weird.
872
:In general, you have the proper structure in your generative model.
:So yeah, definitely.
874
:Yeah, no, that's a good point.
875
:It pays to do that.
876
:Yeah, sure.
877
:Yeah.
878
:And if you don't have completely stupid priors... Yeah, you will find that you probably do.
879
:Yeah.
880
:If you're using extremely wide priors, then yeah, your data is gonna look very weird
881
:a good portion of the time.
882
:So yeah, then the model fitting is going to take longer.
883
:Yeah.
884
:No, that's a good point.
885
:Yeah.
886
:We didn't bring it up, but yeah, for helping come up with reasonable priors, SBC is
another good way to do that, because if it works, then that's a good sign.
887
:If it's going off the rails, then your priors would probably be the first place to look,
other than perhaps a bug in the code.
888
:Yeah.
889
:Yeah.
890
:No, exactly.
891
:And that's why SBC is really cool.
892
:I think it's like, as you were saying,
893
:Because then you also have much more confidence in your model when you actually start
fitting it on real data.
894
:So maybe one last question for you, Nate.
895
:I have so many questions for you.
896
:It's like, you do so many things. But we're closing in on the one-hour-and-a-half mark.
897
:So I want to be respectful of your time.
898
:But I'm curious where you see the future of
899
:Bayesian modeling, especially in your field, so insurance and financial markets,
particularly with respect to new technologies like, you know, the new machine learning
900
:methods and especially generative AI.
901
:Yeah, that's a great question.
902
:I
903
:I'm of two minds, I think.
904
:Part of me, from doing some Bayesian modeling in, I guess, healthcare before this, and
now more on the finance and insurance side... I think there's what you see in the
905
:press about all the advances in generative AI and all of that.
906
:And then there's like the reality of the actual data structures and organization that you
see in the wild.
907
:And I think...
908
:I think there's still like a lot of room for more of what you might think of as like the
classical kind of workflow where people are, you know, not really necessarily relying on
909
:any really complicated infrastructure or modeling techniques, but more following your
traditional principled Bayesian workflow.
910
:And I think especially in the insurance industry. The insurance industry is very
heavily regulated.
911
:And like if you're doing any pricing for insurance, for example,
912
:you basically have to use a linear model.
913
:There's really very little deviation you can get from there.
914
:And so, yeah, you could do Bayes there, but not really, I think, the kinds of things
we might think of when we think of AI types of technologies.
915
:I think there's potentially room for that within organizations, but for some of the
core modeling work that influences the decisions that are made, I think
916
:there's still a ton of room for more of these classical statistics types of approaches.
917
:That being said, I think there's a lot of interest in Bayes at scale and in more modern
machine learning types of contexts.
918
:And I think there's a lot of work going on with INLA and variational Bayes, and
like
919
:the Stan team just released Pathfinder, which is kind of a new algorithm, like on the
variational side of things.
920
:And when I think of Bayes at scale and Bayesian machine learning types of
applications, I think there's probably a lot of interesting work that can be done in
921
:that area.
922
:I think there's a lot of interesting future potential for those methods.
923
:I have less experience with them myself, so I can't really speak to
924
:them in too much detail.
925
:But I also think there are a lot of interesting things to explore with full Bayes.
926
:As we have more compute power, it's easier to, for example, run a model with many chains
with relatively few samples.
927
:And so with distributed computing, I think it would be great to have a future
where we can still do full Bayes,
928
:like, you know, get our full posteriors with some variant of MCMC, but in a faster way,
just with more compute.
929
:And so I think, yeah.
930
:So I guess, all that to say, I think it's going to be a long time before, you
know, the classical statistical modeling workflow becomes obsolete.
931
:I don't see that happening anytime soon, but I think in terms of like using Bayes and
other things at scale, there's a lot of
932
:really exciting methods being explored that I haven't actually myself
had any real exposure to in a work or applied setting, because the problems that
933
:I've worked on have kind of remained there.
934
:I can still mostly fit the models on my laptop, or on some EC2 instance, some
computer that doesn't require too much compute.
935
:So yeah, I guess
936
:That's my current perspective.
937
:We'll see how it changes.
938
:Yeah, yeah.
939
:I mean, these are good points for sure.
940
:I mean, you're still going to need to understand the models and make sure the assumptions
make sense, and understand the edge cases, the different dimensions of the models and
941
:angles as we talked about a bit earlier.
942
:But yeah, I think it's a really tremendous asset
943
:and a kick-ass sidekick.
944
:So for sure, that's extremely useful right now.
945
:I can't wait to see how much progress we're gonna make on that front.
946
:I really dream about having a Jarvis, like Iron Man, like Tony Stark has Jarvis. That
would be extreme.
947
:That'd be perfect, because you'd basically outsource a lot of the stuff that you're not
very good at, and you'd focus on the things you're really extremely good at and efficient and
948
:productive.
949
:Yeah, no, I definitely think that a lot of the generative AI types of tools can aid
with productivity, for sure.
950
:Like, I can't tell you how many times I've just been like, hey, tell me how to do this
with Pandas because I don't want to figure it out.
951
:Similarly with like Plotly or stuff like that.
952
:I feel there are certain parts of the workflow where Google or Stack Overflow is no longer
the first line of defense, right?
953
:And I think a lot of that stuff that I don't like to spend time on sometimes can be sped
up by a lot of these tools, which is really nice to have, I would say.
954
:But yeah, I'll definitely be curious to see
955
:if Bayes underlies some of these methods going forward.
956
:I know there's an interest in it, but the scalability concerns have so far maybe made that
a little challenging.
957
:Although, I don't know in your case, but in my case, I've never had a project where we
were like, no, actually we can't use Bayes here because the data is too big.
958
:No, I agree.
959
:I think it's been similar for me.
960
:Usually there's a way.
961
:And I think, yeah, I think there are definitely problems where that gets challenging, but
at the same time, like if it's getting challenging for Bayes, it's probably gonna be
962
:challenging for other methods as well, I think.
963
:And then you deal with other issues in these cases too.
964
:And so I think, yeah, I've also been kind of biased by this, because a lot of times I'm
working with rather small datasets.
965
:at least in terms of how much memory they're taking up on my computer or something like
that.
966
:They're small enough that we can do some fun modeling and not have to worry too much.
967
:Yeah.
968
:That's a good point.
969
:But yeah, definitely I'm still waiting for a case where we'll be like, yeah, no, actually
we cannot use Bayes here.
970
:Right.
971
:Yeah.
972
:That's actually interesting.
973
:Hopefully it never happens.
974
:Right.
975
:Yeah.
976
:That's the dream.
977
:And so, Nate, let's call it a show.
978
:I've already taken a lot of your time and energy.
979
:I'm guessing you need a coffee.
980
:yeah.
981
:I'll probably go get some tea after this.
982
:Maybe a Red Bull.
983
:We'll see how I'm feeling.
984
:This has been a great time.
985
:Yeah.
986
:It was great.
987
:I mean, of course, I'm going to ask you the last two questions.
988
:The ones I ask every guest at the end of the show.
989
:So first one, if you had unlimited time and resources, which problem would you try to
solve?
990
:Yeah.
991
:So I thought a lot about this, because I've listened to the podcast for so long and I've
contemplated it every time.
992
:I feel like I always miss the other guests' responses because I'm lost in my thoughts
about, like, how would I answer this?
993
:And I think probably, I think I would want to do
994
:some work in mental health, which is sort of the field that I grew up in, right?
995
:Like, there are a lot of open problems in psychiatry, clinical psychology, both in
terms of how we measure what a person is experiencing, mental illness in
996
:general, how we dissociate different types of
997
:disorders and things like that, but then also in terms of treatment selection, as well as
automated treatments that are maybe scaffolded through
998
:apps first, or that have different types of delivery models rather than just the
face-to-face therapy model.
999
:And I think, yeah, if I had unlimited resources, unlimited time and funding, I would be
:
01:34:30,069 --> 01:34:43,733
exploring kind of solutions to that, I guess solutions isn't even the right word, ways to
approach the mental health crisis, and how to both better measure and better
:
01:34:43,733 --> 01:34:46,534
get people into the treatments that they need.
:
01:34:46,534 --> 01:34:54,996
And some of the work I was doing at AVO before was related to this, but it's a
surprisingly hard field to get funding for.
:
01:34:55,056 --> 01:34:58,117
Just because there are a lot of barriers.
:
01:34:59,637 --> 01:35:02,838
Working in healthcare is a hard thing to navigate.
:
01:35:04,138 --> 01:35:11,500
And there are a lot of snake oil treatments out there that seem to suck up a lot of the
interest and funding.
:
01:35:11,640 --> 01:35:16,512
And so I think, you know, if I didn't have to worry about that, there'd be a lot of
interesting things to do.
:
01:35:16,512 --> 01:35:23,384
But yeah, I think that would be what I would focus my energy on.
:
01:35:23,464 --> 01:35:27,765
Yeah, definitely a worthwhile quest.
:
01:35:28,005 --> 01:35:36,872
And since you never hear the guests' answers, I think you're the first one to answer that.
:
01:35:37,774 --> 01:35:41,887
It goes with my Freudian Sips mug here.
:
01:35:41,887 --> 01:35:45,880
It's my favorite.
:
01:35:47,762 --> 01:35:55,879
And second question, if you could have dinner with any great scientific mind, dead, alive
or fictional, who would it be?
:
01:35:56,071 --> 01:35:59,863
Yeah, this one was a lot harder for me to think about.
:
01:36:00,604 --> 01:36:04,306
But I came to what I think would be the answer.
:
01:36:05,007 --> 01:36:08,009
And so I'm going to pick a fictional person.
:
01:36:08,350 --> 01:36:22,749
And I don't know if you've read the Foundation series or watched the television series,
but Hari Seldon is the architect of what he calls psychohistory, which is essentially
:
01:36:22,749 --> 01:36:25,020
this science of
:
01:36:26,017 --> 01:36:32,349
predicting mass behaviors, like population-level behaviors, and he's developed
this mathematical model.
:
01:36:32,349 --> 01:36:46,083
It allows him to predict, you know, thousands of years into the future how people will be
interacting, and, you know, he saves the galaxy and all that. Spoiler alert, but sort of.
:
01:36:46,403 --> 01:36:48,423
I'll leave some ambiguity, right?
:
01:36:49,064 --> 01:36:54,475
But I think that would be the person just because it's kind of an interesting concept.
:
01:36:54,661 --> 01:37:06,238
I think he's an interesting character, and psychohistory, given my background, is just
a whole concept I'm kind of interested in.
:
01:37:06,238 --> 01:37:13,632
And so if someone were able to do that, I'd sure like to better understand how exactly
they would be doing it.
:
01:37:13,632 --> 01:37:15,693
Maybe there's Bayes involved, we'll see.
:
01:37:20,194 --> 01:37:21,854
Yeah, great answer.
:
01:37:22,094 --> 01:37:24,945
And here again, it's the first time I hear that on the show.
:
01:37:24,945 --> 01:37:26,075
That's awesome.
:
01:37:26,075 --> 01:37:29,987
You should definitely put a reference to that show in the show notes.
:
01:37:29,987 --> 01:37:32,097
That sounds like fun.
:
01:37:32,097 --> 01:37:34,718
I'm definitely going to check that out.
:
01:37:36,579 --> 01:37:39,899
And I'm sure a lot of listeners also will.
:
01:37:40,420 --> 01:37:41,780
yeah, definitely.
:
01:37:42,280 --> 01:37:43,481
Well, awesome.
:
01:37:43,481 --> 01:37:44,811
Thanks again.
:
01:37:44,811 --> 01:37:46,052
Nate, that was a blast.
:
01:37:46,052 --> 01:37:47,783
I really learned a lot.
:
01:37:48,063 --> 01:37:49,895
And that was great.
:
01:37:49,895 --> 01:37:55,548
I think we now have an updated episode about model averaging and model comparison.
:
01:37:55,569 --> 01:37:59,361
I hope, Stéphane, you were happy with how it turned out.
:
01:37:59,361 --> 01:38:01,653
Me too.
:
01:38:01,653 --> 01:38:09,359
Well, as usual, I'll put resources and a link to your website in the show notes for those
who want to dig deeper.
:
01:38:09,359 --> 01:38:13,331
Thank you again, Nate, for taking the time and being on this show.
:
01:38:14,165 --> 01:38:15,556
Awesome, yeah, thanks a lot for having me.
:
01:38:15,556 --> 01:38:20,208
I had a blast, and yeah, I look forward to being a continued listener.
:
01:38:20,769 --> 01:38:27,413
Yeah, thank you so much for listening to the show for so many years.
:
01:38:27,413 --> 01:38:33,676
It definitely means a lot to me. And you're welcome back anytime on the show, of course.
:
01:38:34,297 --> 01:38:35,637
Yeah, just let me know.
:
01:38:39,117 --> 01:38:42,820
This has been another episode of Learning Bayesian Statistics.
:
01:38:42,820 --> 01:38:53,309
Be sure to rate, review, and follow the show on your favorite podcatcher, and visit
learnbayesstats.com for more resources about today's topics, as well as access to more
:
01:38:53,309 --> 01:38:57,392
episodes to help you reach a true Bayesian state of mind.
:
01:38:57,392 --> 01:38:59,354
That's learnbayesstats.com.
:
01:38:59,354 --> 01:39:02,216
Our theme music is Good Bayesian by Baba Brinkman.
:
01:39:02,216 --> 01:39:04,198
Featuring MC Lars and Mega Ran.
:
01:39:04,198 --> 01:39:07,360
Check out his awesome work at bababrinkman.com.
:
01:39:07,360 --> 01:39:08,555
I'm your host,
:
01:39:08,555 --> 01:39:09,606
Alex Andorra.
:
01:39:09,606 --> 01:39:13,709
You can follow me on Twitter at alex_andorra, like the country.
:
01:39:13,709 --> 01:39:21,014
You can support the show and unlock exclusive benefits by visiting patreon.com slash
LearnBayesStats.
:
01:39:21,014 --> 01:39:23,396
Thank you so much for listening and for your support.
:
01:39:23,396 --> 01:39:25,688
You're truly a good Bayesian.
:
01:39:25,688 --> 01:39:29,200
Change your predictions after taking information in.
:
01:39:29,200 --> 01:39:35,873
And if you're thinking I'll be less than amazing, let's adjust those expectations.
:
01:39:35,873 --> 01:39:49,029
Let me show you how to be a good Bayesian
Change calculations after taking fresh data in
Those predictions that your brain is making
Let's get them on a solid foundation