#113 A Deep Dive into Bayesian Stats, with Alex Andorra, ft. the Super Data Science Podcast
Episode 113 • 22nd August 2024 • Learning Bayesian Statistics • Alexandre Andorra
Duration: 01:30:51


Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways:

  • Bayesian statistics is a powerful framework for handling complex problems, making use of prior knowledge, and excelling with limited data.
  • Bayesian statistics provides a framework for updating beliefs and making predictions based on prior knowledge and observed data.
  • Bayesian methods allow for the explicit incorporation of prior assumptions, which can provide structure and improve the reliability of the analysis.
  • There are several Bayesian frameworks available, such as PyMC, Stan, and Bambi, each with its own strengths and features.
  • PyMC is a powerful library for Bayesian modeling that allows for flexible and efficient computation.
  • For beginners, it is recommended to start with introductory courses or resources that provide a step-by-step approach to learning Bayesian statistics.
  • PyTensor leverages GPU acceleration and complex graph optimizations to improve the performance and scalability of Bayesian models.
  • ArviZ is a library for post-modeling workflows in Bayesian statistics, providing tools for model diagnostics and result visualization.
  • Gaussian processes are versatile non-parametric models that can be used for spatial and temporal data analysis in Bayesian statistics.

Chapters:

00:00 Introduction to Bayesian Statistics

07:32 Advantages of Bayesian Methods

16:22 Incorporating Priors in Models

23:26 Modeling Causal Relationships

30:03 Introduction to PyMC, Stan, and Bambi

34:30 Choosing the Right Bayesian Framework

39:20 Getting Started with Bayesian Statistics

44:39 Understanding Bayesian Statistics and PyMC

49:01 Leveraging PyTensor for Improved Performance and Scalability

01:02:37 Exploring Post-Modeling Workflows with ArviZ

01:08:30 The Power of Gaussian Processes in Bayesian Modeling

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.

Transcripts

Speaker:

In this special episode, the roles are reversed.

2

:

as I step into the guest seat to explore the intriguing world of Bayesian stats.

3

:

Originally aired as episode 793 on the fantastic Super Data Science podcast hosted by Jon Krohn, this conversation is too good not to share with all of you here on Learning

4

:

Bayesian Statistics.

5

:

So join us as we delve into how Bayesian methods elegantly handle complex problems, make

efficient use of prior knowledge and excel with limited data.

6

:

We explore the foundational concepts of Bayesian statistics, highlighting their distinct advantages over traditional methods, particularly in scenarios fraught

7

:

with uncertainty and sparse data.

8

:

A highlight of our discussion is the application of Gaussian processes where I explain

their versatility in modeling complex, non-linear relationships in data.

9

:

I share a fascinating case study involving an NGO in Estonia illustrating how Bayesian

approaches can transform limited polling data into profound insights.

10

:

So whether you're a seasoned statistician or just starting out, this episode is packed

with practical advice on embracing Bayesian stats and of course

11

:

I strongly recommend you follow the Super Data Science Podcast.

12

:

It's really a can't-miss resource for anyone passionate about the power of data.

13

:

This is Learning Bayesian Statistics, episode 113, originally aired on the Super Data

Science Podcast.

14

:

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods,

the projects, and the people who make it possible.

15

:

I'm your host, Alex Andorra.

16

:

You can follow me on Twitter at alex_andorra,

17

:

like the country.

18

:

For any info about the show, learnbayesstats.com is Laplace to be.

19

:

Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on

Patreon, everything is in there.

20

:

That's learnbayesstats.com.

21

:

If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io slash alex underscore

22

:

andorra.

23

:

See you around, folks.

24

:

and best Bayesian wishes to you all.

25

:

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

26

:

Check us out at pymc-labs.com.

27

:

Hello my dear Bayesians!

28

:

A quick note before today's episode, STANCON 2024 is approaching!

29

:

It's in Oxford, UK this year from September 9 to 13 and it's shaping up to be an incredible event for anybody interested in statistical modeling and Bayesian inference.

30

:

Actually, we're currently looking for sponsors to help us offer more scholarships and make

STANCON more accessible to everyone and we also encourage you

31

:

to buy your tickets as soon as possible.

32

:

Not only will this help with making a better conference, but this will also support our

scholarship fund.

33

:

For more details on tickets, sponsorships, or community involvement, you'll find the

StanCon website in the show notes.

34

:

We're counting on you.

35

:

Okay, on to the show now.

36

:

Alex, welcome to the Super Data Science podcast.

37

:

I'm delighted to have you here.

38

:

Such an experienced podcaster.

39

:

It's going to be probably fun for you to get to be the guest on the show today.

40

:

Yeah.

41

:

Thank you, John.

42

:

First, thanks a lot for having me on.

43

:

I knew about your podcast.

44

:

I was both honored and delighted when I got your email to come on the show.

45

:

I know you have had very...

46

:

Honorable guests before, like Thomas Wiecki.

47

:

so I will try to be on par, but I know that it's going to be hard.

48

:

Yeah.

49

:

Thomas, your co-founder at PyMC Labs, was indeed a guest.

50

:

He was on episode number 585.

51

:

but that is not what brought you here.

52

:

Interestingly, the connection.

53

:

So you asked me before we started recording how I knew about you.

54

:

And so a listener actually suggested you as a guest.

55

:

So.

56

:

Doug McLean.

57

:

Thank you for the suggestion.

58

:

Doug is lead data scientist at Tesco bank in the UK.

59

:

And he reached out to me and said, can I make a suggestion for a guest?

60

:

Alex Andorra, like the country, I guess you say that, you say that.

61

:

Cause he put it in quotes.

62

:

He's like, Andorra, like the country, hosts the Learning Bayesian Statistics podcast.

63

:

It's my other all time favorite podcast.

64

:

So there you go.

65

:

my God.

66

:

Doug, I'm blushing.

67

:

He says he'd be a fab guest for your show, and not least because he moans from time to time about not getting invited onto other podcasts.

68

:

Did I?

69

:

my God.

70

:

I don't remember.

71

:

But maybe that was part of a secret plan, Maybe a secret marketing LBS plan and well.

72

:

That works perfectly.

73

:

When I read that, I immediately reached out to you to see if you'd want to come on, but that was so funny.

74

:

And he does say, he says, seriously though, he'd make a fab guest with his wealth of knowledge

on data science and on Bayesian statistics.

75

:

And so, yes, we will be digging deep into Bayesian statistics with you today.

76

:

you're the co-founder and principal data scientist of the popular Bayesian statistical modeling platform, PyMC, as we already talked about with your co-founder Thomas Wiecki.

77

:

That is an excellent episode.

78

:

If you want to go back to that and get a

79

:

different perspective, and obviously different questions, we've made sure.

80

:

But so if you're really interested in Bayesian statistics, that is a great one to go back

to.

81

:

Yeah, in addition to that, you obviously also have the Learning Bayesian Stats podcast,

which we just talked about, and you're an instructor on the educational site, Intuitive

82

:

Bayes.

83

:

So tons of Bayesian experience.

84

:

Alex, through this work, tell us about what

85

:

Bayesian methods are and what makes them so powerful and versatile?

86

:

Yeah.

87

:

so first, thanks a lot.

88

:

Thanks a lot, Doug, for the recommendation and for listening to the show.

89

:

I am, I am absolutely honored.

90

:

and, yeah, go and listen again to Thomas's episode.

91

:

Thomas is always a great guest.

92

:

So I definitely recommend anybody to, to go and listen to him.

93

:

now what about Bayes?

94

:

Yeah.

95

:

You know, it's been a long time since someone has asked me that, because I have a Bayesian

podcast.

96

:

Usually it's quite clear I'm doing that.

97

:

people are like afraid to ask it at some point.

98

:

So instead of giving you kind of like a, because there are two avenues here, usually I could give you the philosophical answer and why epistemologically Bayes stats makes more sense.

99

:

but I'm not going to do that.

100

:

That sounds so interesting.

101

:

Yeah, it is, we can go into that.

102

:

I think a better introduction is just a practical one.

103

:

And that's the one that most people get to know at some point, which is like you're

working on something and you're interested in uncertainty estimation and not only in the

104

:

point estimates and your data are crap and you don't have a lot of them and they are not

reliable.

105

:

What do you do?

106

:

And that happens to a lot of PhD students.

107

:

That happened to me when I started trying to do electoral forecasting.

108

:

I was at the time working at the French Central Bank doing something completely different

from what I'm doing today.

109

:

But I was writing a book about the US at the time, 2016 it was, and it was a pretty

consequential election for the US.

110

:

I was following it

111

:

really, really closely.

112

:

And I remember it was July 2016 when I discovered FiveThirtyEight's models.

113

:

And then the nerd in me was awoken.

114

:

It was like, my God, this is what I need to do.

115

:

You know, that's my way of putting more science into political science, which was my

background at the time.

116

:

And when you do electoral forecasting, polls are extremely noisy.

117

:

They are not a good representation of what people think, but they are the best ones we

have.

118

:

There are not a lot of them, at least in France; in the US, much more.

119

:

It's limited.

120

:

It's not a reliable source of data basically.

121

:

And you also have a lot of domain knowledge, which in the Bayesian realm, we call

prior information.

122

:

And so that's a perfect setup for Bayesian stats.

123

:

So that's basically, I would say what Bayesian stats is.

124

:

And that's the powerful, the power of it.

125

:

You don't have to rely only on the data because sure you can let the data speak for

themselves, but what if the data are unreliable?

126

:

Then you need something to guard against that, and Bayesian stats are a great way of doing that.

127

:

And the cool thing is that it's a method.

128

:

It's like you can apply that to any topic you want, any field you want.

129

:

that's what...

130

:

I've done at PyMC Labs for a few years now with all the brilliant guys who are over there.

131

:

You can do that for marketing, for electoral forecasting, of course.

132

:

Agriculture, that was quite ironic when we got some agricultural clients because historically, agriculture is like the field of frequentist statistics.

133

:

That's how Ronald Fisher developed the p-value, the famous one.

134

:

So when we had that, we're like, yes, we got our revenge.

135

:

And of course, it's also used a lot in sports, sports modeling, things like that.

136

:

So yeah, it's like that's the practical introduction.

137

:

Nice.

138

:

Yeah.

139

:

A little bit of interesting history there is that Bayesian statistics is an older approach than the frequentist statistics that is so common and is the standard that is

140

:

taught in college, so much so

141

:

that it is just called statistics.

142

:

You do an entire undergrad in statistics and not even hear the word Bayesian because

Fisher so decidedly created this monopoly of this one kind of approach, which for me,

143

:

learning frequentist statistics, I think, I guess it was first year undergrad in science that I studied, and

144

:

in that first year course, that idea of a P value always seemed odd to me.

145

:

Like how is it that there's this art?

146

:

This is such an arbitrary threshold of significance to have it be, you know, that this is

a one in 20 chance or less that this would be observed by chance alone.

147

:

And this means that therefore we should rely on it, especially as we are in this era of

large data sets and larger and larger and larger data sets.

148

:

With very large data sets like we typically deal with today, no matter what, you're always going to get a significant P value, because the slightest tiny

149

:

change, if you take, you know, web scale data, everything's going to be statistically

significant.

150

:

Nothing won't be.
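That point about web-scale data can be sketched numerically. This is a minimal illustration with made-up numbers (a fixed effect of 0.01 standard deviations), not anything from the episode:

```python
import math

def two_sided_p(effect, sd, n):
    """Two-sided p-value of a one-sample z-test of a mean against zero."""
    z = effect / (sd / math.sqrt(n))
    # Standard-normal tail probability via the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))

# The same tiny effect (0.01 standard deviations) at two sample sizes:
p_small = two_sided_p(effect=0.01, sd=1.0, n=100)        # ~0.92: nowhere near significant
p_huge = two_sided_p(effect=0.01, sd=1.0, n=10_000_000)  # vanishingly small: "significant"
```

The effect is identical in both calls; only the sample size changed, and that alone flips the "significance" verdict.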

151

:

so it's such a weird paradigm.

152

:

And so that was discovering Bayesian statistics and machine learning as well.

153

:

And seeing how

154

:

Those areas didn't have P values, which interested me in both of those things.

155

:

it's a, yeah, Fisher.

156

:

It's interesting.

157

:

I mean, I guess with small data sets, eight, 16, that kind of scale, I guess it kind of made

some sense.

158

:

And you know, you pointed out there, I think it's this prior that makes Bayesian

statistics so powerful being able to incorporate prior knowledge, but simultaneously

159

:

that's also what makes frequentists uncomfortable.

160

:

They're like, we want only the data.

161

:

As though, you know, the particular data that you collect and the experimental design,

there are so many ways that you as the human are influencing, you know, there's no purity

162

:

of data anyway.

163

:

And so priors are a really elegant way to be able to adjust the model in order to point it

in the right direction.

164

:

And so a really good example that I like to come to with Bayesian statistics is that you

can

165

:

You can allow some of your variables in the model to tend towards wider variance or

narrower variance.

166

:

So if there are some attributes of your model where you're very confident, where you know

this is like, you know, this is like a physical fact of the universe.

167

:

Let's just have a really narrow variance on this and the model won't be able to diverge

much there.

168

:

But that then gives a strong focal point within the model.

169

:

around which the other data can make more sense, the other features can make more sense,

and you can allow those other features to have wider variance.

170

:

And so, I don't know, this is just one example that I try to give people when they're not

sure about being able to incorporate prior knowledge into a model.
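The narrow-versus-wide variance idea described above can be sketched with a simple conjugate normal update. The numbers below are made up for illustration: a tight prior barely moves under new data, while a diffuse prior lets the data dominate.

```python
import math

def posterior_normal(prior_mean, prior_sd, data_mean, data_sd, n):
    """Conjugate update for a normal mean with known observation noise."""
    prior_prec = 1.0 / prior_sd**2
    data_prec = n / data_sd**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * data_mean)
    return post_mean, math.sqrt(post_var)

# Hypothetical data: 10 observations averaging 5.0, with noise sd 2.0.
# A narrow prior centered at 0 barely budges; a wide one follows the data.
narrow_mean, _ = posterior_normal(0.0, 0.1, 5.0, 2.0, 10)   # stays near 0
wide_mean, _ = posterior_normal(0.0, 10.0, 5.0, 2.0, 10)    # moves near 5
```

Same data, two priors, two very different posteriors; the narrow prior acts exactly like the "strong focal point" in the example above.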

171

:

Yeah, yeah, no, these are fantastic points, John.

172

:

So, yeah, to build on that, I'm...

173

:

I'm a bit, of course, I'm a nerd.

174

:

So I love the history of science.

175

:

I love the epistemological side.

176

:

A very good book on that is Bernoulli's Fallacy by Aubrey Clayton.

177

:

Definitely recommend his book.

178

:

He was on my podcast, episode 51.

179

:

So if people want to give that a listen.

180

:

Did you just pull that 51 out from memory?

181

:

Yeah, yeah, I kind of know like, but I have less episodes than you.

182

:

So it's like, you know, each episode is like kind of my baby.

183

:

So I'm like, yeah, 51 is Aubrey Clayton.

184

:

Yeah.

185

:

my goodness.

186

:

That's crazy.

187

:

That's also how my brain works.

188

:

numbers.

189

:

But yeah.

190

:

And actually episode 50 was with Sir David Spiegelhalter.

191

:

I think the only knight we got on the podcast. And David Spiegelhalter is an

192

:

exceptional guest, very, very good pedagogically.

193

:

Definitely recommend listening to that episode too, which is very epistemologically heavy, for people who like that, the history of science, how we got there.

194

:

Because as you were saying, Bayes is actually older than frequentist stats, but people discovered it later.

195

:

So it's not because it's older that it's better, right?

196

:

But it is way older actually by a few centuries.

197

:

So yeah, fun stories here.

198

:

I could talk about that still, but to get back to what you were saying, also as you were very eloquently saying, data can definitely be biased.

199

:

Because that idea of like, no, we only want the data to speak for themselves.

200

:

as I was saying, yeah, what if the data are unreliable?

201

:

But as you were saying, what if the data are biased?

202

:

And that happens all the time.

203

:

And worse.

204

:

I would say these biases are most of the time implicit in the sense that either they are

hidden or most of the time they just like you don't even know you are biased in some

205

:

direction most of the time because it's a result of your education and your environment.

206

:

So the good thing of priors is that it forces your assumptions, your hidden assumptions to

be explicit.

207

:

And that I think is very interesting also, especially when you work on models which are

supposed to have a causal explanation and which are not physical models, but more social

208

:

models or political scientific models.

209

:

Well, then it's really interesting to see how two people can have different conclusions

based on the same data.

210

:

It's because they have different priors.

211

:

And if you force them to make these priors explicit in their models, they would definitely have different priors.

212

:

then...

213

:

then you can have a more interesting discussion actually, I think.

214

:

So there's that.

215

:

And then I think the last point that's interesting also in that, like why would you be

interested in this framework is that also, causes are not in the data.

216

:

Causes are outside of the data.

217

:

The causal relation between X and Y, you're not going to see it in the data because if you

do a regression of

218

:

education on income, you're going to see an effect of education on income.

219

:

But you as a human, you know that if you're looking at one person, the effect has to be that education has an impact on income.

220

:

But the computer could just as well do the other regression and regress income on education and tell you, income causes education.

221

:

But no, it's not going that way.

222

:

the statistical relationship goes both ways, but the causal one

223

:

only goes one direction.

224

:

And that's a hidden reference to my favorite music band.

225

:

But yeah, it only goes one direction, and it's not in the data.
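A tiny sketch of that point, with made-up numbers: an ordinary least-squares fit will happily give you a positive slope in either direction, so the data alone cannot tell you which way the causal arrow points.

```python
def ols_slope(xs, ys):
    """Slope of an ordinary least-squares fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Hypothetical toy data: years of education vs. income (in thousands)
education = [10, 12, 12, 14, 16, 16, 18, 20]
income = [25, 30, 32, 38, 45, 44, 52, 60]

b_income_on_edu = ols_slope(education, income)  # positive slope this way...
b_edu_on_income = ols_slope(income, education)  # ...and equally positive the other way
```

Both regressions "work"; the choice of which one is causal is a modeling assumption you bring from outside the data.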

226

:

And you have to have a model for that.

227

:

And a model is just a simplification of reality.

228

:

We try to get a simple enough model; it's usually not simple, but it's a simplification.

229

:

If you say it's a construction and simplification, that's already a prior in a way.

230

:

You might as well just go all the way and make all your priors explicit.

231

:

Well said.

232

:

Very interesting discussion there.

233

:

You used a term a number of times already in today's podcast, which maybe is not known to

all of our listeners.

234

:

What is epistemology?

235

:

What does that mean?

236

:

right.

237

:

Yeah, very good question.

238

:

Yeah.

239

:

So epistemology, in a sense, is the science of science.

240

:

It's understanding

241

:

how we know what we say we know.

242

:

So, for instance, how do we know the earth is round?

243

:

How do we know about relativity?

244

:

Things like that.

245

:

It's the scientific discipline that's actually very close to philosophy.

246

:

That's, I think, actually a branch of philosophy.

247

:

And that's trying to...

248

:

come up with methods to understand how we can come up with new scientific knowledge.

249

:

And by scientific here, we usually mean reliable and reproducible, but also falsifiable.

250

:

Because for a hypothesis to be scientific, it has to be falsifiable.

251

:

so yeah, basically that's that.

252

:

Lots of extremely interesting things here, but yeah, that's like basically how do we know?

253

:

what we know, that's the whole trying to define the scientific method and things like

that.

254

:

Going off on a little bit of a tangent here, but it's interesting to me how I think among non-scientists, lay people in the public.

255

:

Science is often seen to be infallible, as though science is real.

256

:

Science is the truth.

257

:

There's a lot of, since that 2016 election, lots of people have lawn signs in the US that basically have a list of liberal values, most of which

258

:

I'm a huge fan of.

259

:

And of course I like the sentiment, this idea, you know, that they're supporting science

on this sign, on the sign as well.

260

:

But on the sign, the way that they phrase it is: science is real.

261

:

And the implication there for me, every time I see the sign is that, you know, and I think

that could be, for example, related to vaccines, I think, you know, around, you know,

262

:

there was a lot of conflict around vaccines and what their real purpose is and, you know,

and then so.

263

:

the lay liberal person is like, you know, this is science, you know, trust science, it's real.

264

:

Whereas from the inside, you pointed it out already there, but it's this interesting irony that the whole point of science is that we're saying, I'm

265

:

never confident of anything.

266

:

I'm always open to this being wrong.

267

:

Yeah.

268

:

Yeah.

269

:

No, exactly.

270

:

and I think that's, that's kind of the distinction.

271

:

That's often made in epistemology, actually, between science on one hand and research on the other hand, where research is science in the making.

272

:

Science is like the collective knowledge that we've accumulated since basically the

beginning of modern science, at least in the Western hemisphere, so more or less during

273

:

the Renaissance.

274

:

Then research is people making that science because...

275

:

people have to do that and how do we come up with that?

276

:

so, yeah, like definitely I'm one who always emphasizes the fact that, yeah, now we know

the Earth is round.

277

:

We know how to fly planes, but there was a moment we didn't.

278

:

And so how do we come up with that?

279

:

And actually, maybe one day we'll discover that we were doing it kind of the wrong way,

you know, flying planes, but it's just like, for now it works.

280

:

We have the

281

:

best model that we can have right now with our knowledge.

282

:

But maybe one day we'll discover that there is a way better way to fly.

283

:

And it was just there staring at us and it took years for us to understand how to do that.

284

:

yeah, like as you were saying, that's a really hard line to walk, because you have to say, yeah.

285

:

Like, this knowledge, these facts are really trustworthy, but you can never trust something 100% because otherwise, mathematically, if you go back to Bayes' formula, you

286

:

actually cannot update your knowledge.

287

:

If you have a 0% prior or a 100% prior, like mathematically, you cannot apply Bayes' formula, which tells you, well, based on new data that you just observed, the most

288

:

rational way of updating your belief is to believe that with that certainty.

289

:

But if you have zero or 100%, it's never going to be updated.
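That mathematical point is easy to check directly with Bayes' formula for a binary hypothesis; the likelihoods below are made-up illustrative numbers. A prior of exactly 0 or 1 is immune to any evidence:

```python
def bayes_update(prior, lik_if_true, lik_if_false):
    """One application of Bayes' formula for a binary hypothesis."""
    numerator = prior * lik_if_true
    return numerator / (numerator + (1 - prior) * lik_if_false)

# Strong evidence: the observed data are 9x more likely if the hypothesis is true.
open_minded = bayes_update(0.5, 0.9, 0.1)   # 0.5 -> 0.9: belief moves with the data
dogmatic_no = bayes_update(0.0, 0.9, 0.1)   # stays exactly 0.0, no matter the evidence
dogmatic_yes = bayes_update(1.0, 0.9, 0.1)  # stays exactly 1.0, no matter the evidence
```

The extreme priors multiply every piece of evidence by zero (or drown it out entirely), so no amount of data can ever move them.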

290

:

So you can say 99.9999% that what we're doing right now by flying is really good.

291

:

But maybe, like, you never know.

292

:

There is something that will appear.

293

:

And physics is a real...

294

:

We've all seen UFOs, Alex.

295

:

We know that there's better ways to fly.

296

:

Yeah.

297

:

Exactly.

298

:

Yeah, but yeah, I think physics is actually a really good field for that because it's always evolving and it's always coming up with really completely crazy paradigm-shifting

299

:

explanations like relativity, special relativity, then general relativity, which just a century ago didn't exist.

300

:

And now we start to understand a bit better, but even now we don't really understand how

to

301

:

blend relativity and gravity.

302

:

so that's extremely interesting to me.

303

:

But yeah, I understand that politically from a marketing standpoint, it's hard to sell,

but I think it's shooting yourself in the foot if you're saying, yeah, is always like,

304

:

science works, I agree, science works, but it doesn't have to be 100% true and sure

305

:

for it to work.

306

:

That's why placebo works.

307

:

Placebos work, right?

308

:

It's just something that works even though it doesn't have any actual concrete evidence

that it's adding something, but it works.

309

:

yeah, like I think it's really shooting yourself in the foot by saying that no, that's

100%.

310

:

Like if you question science, then you're anti -science.

311

:

No.

312

:

Actually, it's the whole scientific method, to be able to ask questions all the time.

313

:

The question is how do you do that?

314

:

Do you apply the scientific method to your questions or do you just question anything like

that without any method?

315

:

And just because you fancy questioning that because it goes against your belief to begin

with.

316

:

So yeah, that's one thing.

317

:

And then I think another thing that you said I think is very interesting is,

unfortunately, I think the way of teaching science and communicating around it,

318

:

is not very incarnated.

319

:

It's quite dry.

320

:

You just learn equations and you just learn that stuff.

321

:

Whereas science was made by people and is made by people who have their biases, who have

extremely violent conflicts.

322

:

Like you were saying, Fisher was just a huge jerk to everybody around him.

323

:

I think it would be interesting to

324

:

get back to a bit of that human side to make science less dry and also less intimidating

thanks to that.

325

:

Because most of the time when I tell people what I do for a living, they get super

intimidated and they're like, my God, yeah, I hate math, I hate stats and stuff.

326

:

But it's just numbers.

327

:

It's just a language.

328

:

it's a bit dry.

329

:

For instance, if there is someone who is into movies, who makes movies, in your audience.

330

:

I want to know why there is no movie about Albert Einstein.

331

:

There has to be a movie about Albert Einstein.

332

:

Like not only huge genius, but like extremely interesting life.

333

:

Like honestly, it makes for a great movie.

334

:

Like in a dramatized biopic, you

335

:

mean?

336

:

Yeah.

337

:

Yeah.

338

:

I mean, it's like his life is super interesting.

339

:

Like he revolutionized two fields of physics, and actually chemistry.

340

:

In 1905, it's like his big year, and he came up with the ideas for relativity while

working at the patent bureau in Bern in Switzerland, which was an extremely boring job.

341

:

In his words, it was an extremely boring job.

342

:

Basically, having that boring job allowed him to do that being completely outside of the

academic circles and so on.

343

:

It's like he makes for a perfect movie.

344

:

I don't understand why it's not there.

345

:

And then, icing on the cake.

346

:

He had a lot of women in his life.

347

:

So it's like, you know, it's perfect.

348

:

Like, you have the sex, you have the drama, you have revolutionizing the field, you have Nobel prizes.

349

:

And then he became, like, a pop icon.

350

:

I don't know where the movies are.

351

:

Yeah, it is wild.

352

:

Actually, now that you pointed out, it's kind of surprising that there aren't movies about

him all the time.

353

:

Like Spider-Man.

354

:

Yeah, I agree.

355

:

Well, there was one about Oppenheimer last year.

356

:

Maybe that started to trend.

357

:

We'll see.

358

:

Yeah.

359

:

So in addition to the podcast, you also, I mentioned this at the outset, I said that you're the co-founder and principal data scientist of

360

:

the popular Bayesian stats modeling platform, PyMC.

361

:

So like many things in data science, it's uppercase P, lowercase y for Python.

362

:

What's the MC? PyMC is one word, and the M and C are capitalized.

363

:

Yeah.

364

:

So it's very confusing because it stands for Python and then MC is Monte Carlo.

365

:

So I understand.

366

:

But why Monte Carlo?

367

:

It's because it comes from Markov chain Monte Carlo.

368

:

So actually it should be PyMCMC, or PyMC squared, which is what I've been saying since the beginning.

369

:

Anyways, yeah, it's actually, it's actually PyMC squared.

370

:

so, for Markov chain Monte Carlo. And Markov chain Monte Carlo is one of the main ways; there are other algorithms now, new ones, but like the blockbuster algorithm to run Bayesian

371

:

models is to use MCMC.

372

:

Yeah.

373

:

So in the same way that stochastic gradient descent is like the de facto standard for

finding your model weights in machine learning, Markov chain Monte Carlo is kind of the

374

:

standard way of doing it with a Bayesian network.
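The family of algorithms being described here can be illustrated with its simplest member, random-walk Metropolis. This toy sketch (targeting a standard normal "posterior" chosen just for the demo) is nothing like the tuned Hamiltonian Monte Carlo samplers PyMC or Stan actually run, but the accept/reject core is the same idea.

```python
import math
import random

def metropolis(log_post, start, step, n_samples, seed=0):
    """Random-walk Metropolis: propose a jitter, accept it with
    probability min(1, posterior ratio), otherwise stay put."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        if rng.random() < math.exp(min(0.0, log_post(proposal) - log_post(x))):
            x = proposal
        samples.append(x)
    return samples

def log_post(x):
    # Toy target for the demo: log-density of a standard normal, up to a constant
    return -0.5 * x * x

draws = metropolis(log_post, start=3.0, step=1.0, n_samples=20_000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

The chain starts far from the target at 3.0, wanders in, and its running mean and variance recover the target's moments (near 0 and 1).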

375

:

Yeah.

376

:

Yeah.

377

:

Yeah.

378

:

And so now there are newer versions, more efficient versions. That's basically the name of the game, right? Making the algorithm more and more efficient. But the first algorithm dates back, I think it was actually invented during the Manhattan Project, during World War Two.

Back in the day.
Yeah. And lots of physicists, actually. Statistical physics is a field that's contributed a lot to MCMC. So yeah, physicists came to the field of statistics and tried to make the algorithms more efficient for their models. And yeah, they have contributed a lot. The field of physics has contributed a lot of big names and great leaps into the realm of more efficient algorithms. I don't know who your audience is, but that may sound boring.

Yeah, the algorithm, it's like the workhorse.
But it's extremely powerful. And that's also one of the main reasons why Bayesian statistics is increasing in popularity lately. I'm going to argue that it's always been the best framework to do statistics, to do science, but it was hard to do with pen and paper, because the problem is that you have a huge, nasty integral on the numerator, on the denominator, sorry. And this integral is not computable by pen and paper. So for a long, long time, Bayesian statistics was confined to the fringes, you know. It was relegated to the margins because it was just super hard to do. So for problems other than very trivial ones, it was not very applicable.
But now, with the advent of personal computing, you have these incredible algorithms. So now most of the time it's HMC, Hamiltonian Monte Carlo. That's what we use under the hood with PyMC. But if you use Stan, if you use NumPyro, it's the same. And thanks to these algorithms, now we can make extremely powerful models, because we can approximate the posterior distributions thanks to, well, computing power.

A computer is very good at computing. I think that's why it's called that.
Yes.
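To make the MCMC idea concrete, here is a toy random-walk Metropolis sampler in plain Python. It only illustrates the general technique discussed above; it is not how PyMC, Stan, or NumPyro actually implement their samplers (those use HMC), and all names and numbers here are invented for the sketch.

```python
import math
import random

def metropolis(log_post, start, n_steps=20000, step=0.5, seed=42):
    """Toy random-walk Metropolis sampler (illustration only)."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, p(proposal) / p(x)),
        # computed on the log scale for numerical stability.
        if math.log(rng.random()) < log_post(proposal) - log_post(x):
            x = proposal
        samples.append(x)
    return samples

# Unnormalized log-posterior of a Normal(mu=3, sigma=1) target:
# MCMC only needs the density up to a constant, which is exactly
# why it sidesteps the "huge nasty integral" in the denominator.
log_post = lambda x: -0.5 * (x - 3.0) ** 2

draws = metropolis(log_post, start=0.0)
burned = draws[5000:]            # discard warm-up draws
mean = sum(burned) / len(burned) # posterior mean estimate, near 3
```

The point of the sketch is that the sampler never computes the normalizing integral: it only ever compares the posterior density at two points, which is what made MCMC practical once computers arrived.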


And so that reminds me of deep learning. It's a similar kind of thing, where the applications we have today, like ChatGPT or whatever your favorite large language model is, or these amazing video-generation models like Sora, all of this is happening thanks to deep learning, which is an approach we've had since the fifties. Certainly not as old as Bayesian statistics, but similarly, it has been able to take off with much larger data sets and much more compute.
Yeah, very good point. And I think that's even more the point in deep learning, for sure. Because Bayesian stats doesn't need the scale, but the way we're doing deep learning for now definitely needs the scale.
Yeah, yeah. The scale of data.

Yeah, exactly. Well, the scale, because there are two scales: data and...

Yeah, you're right. And model parameters.
And so that has actually, I mean, tying back to something you said near the beginning of this episode, that's actually one of the advantages of Bayesian statistics: you can do it with very few data.

Yeah.

Maybe fewer data than with a frequentist approach or a machine learning approach. Because you can bake in your prior assumptions, and those prior assumptions give some kind of structure, some kind of framework for your data to make an impact through.

Yeah, completely.
So for our listeners who are listening right now, if they are keen to try out Bayesian statistics for the first time, why should they reach for PyMC? Which, as far as I know, is the most used Bayesian framework, period, and certainly in Python. And then the second, I'm sure, is Stan.

Yeah, yeah.

And so, yeah, why should somebody use PyMC? And maybe even more generally, how can they get started if they haven't done any Bayesian statistics before at all?
Yeah. Fantastic question. I think it's a very good one, because that can also be very intimidating. And actually there can be a paradox of choice, you know, where now we're lucky to live in a world where we actually have a lot of probabilistic programming languages. So you'll see that sometimes that's called a PPL. And what's a PPL? It's basically what PyMC is: software that enables you to write down Bayesian models and sample from them.

Okay. So it's just a fancy word for that.
Yeah, my main advice is: don't overthink it. If you're proficient in R, then I would definitely recommend trying brms first, because it's built on top of Stan, and Stan is extremely good. It's built by extremely good modelers and statisticians. Lots of them have been on my podcast. So if you're curious, just go on the website, look for Stan, and you'll get a lot of them. The best one, most of the time, is Andrew Gelman, absolutely amazing to have him on the show. He always explains stuff extremely clearly. But I also had Bob Carpenter, for instance, and Matt Hoffman. So anyways...
Yeah. Have you ever had Rob Trangucci on the show? Or do you know who he is?

I know of him, but I have never had him on the show. But I'd be happy to.

Yeah. Well, I'll make an introduction for you. He was on our show in episode number 507.
And that was our first ever Bayesian episode. And it was the most popular episode of that year, 2021, the most popular episode. And it was interesting, because up until that time, at least with me hosting, 2021 was my first year hosting the show, it was by far our longest episode. That was kind of concerning for me. I was like, this was a super technical episode, super long. How is this going to resonate? It turns out that's what our audience loves. And that's something we've been leaning into a bit in 2024: more technical, longer episodes.
Well, that's good to know. Yeah, I'll make an intro for Rob.

Anyway, you were saying. I could do an intro for you.

Yeah, I know. But yeah, great interruption, for sure. I'm happy to have that introduction made. Thanks a lot.
Yeah, so I was saying: if you're proficient in R, definitely give brms a try. It's built on top of Stan. Then, when you outgrow brms, go to Stan. If you love Stan but you're using Python, there is PyStan. I've never used that personally, but I'm pretty sure it's good. But I would say, if you're proficient in Python and don't really want to go to R, then yeah, you probably want to give a try to PyMC or to NumPyro.
You know, give them a try and see what resonates most with you, the API most of the time, because if you're going to make models like that, you're going to spend a lot of time on your code and on your models. And as most of your audience probably knows, the model always fails, unless it's the last one. So yeah, you really have to love the framework you're using and find it intuitive. Otherwise, it's going to be hard to keep going.
If you're really, really a beginner, I would also recommend, in the Python realm, giving a try to Bambi, which is the equivalent of brms, but in Python. So Bambi is built on top of PyMC. And what it does is it makes a lot of the choices for you under the hood, so priors, stuff like that, which can be a bit overwhelming for beginners at the beginning. But then, when you outgrow Bambi and you want to make more complicated models, go to PyMC.
Bambi, that's a really cute name for a model that's just like, it just drops out of its mother and can barely stand up straight.

Yeah. And the guys working on Bambi, Tommy Capretto, Osvaldo Martin, they are really great guys. Both Argentinians, actually. And yeah, they are fun guys. I think the website for Bambi is bambinos.github.io. These guys are fun. But yeah, it's definitely a great framework.
And actually, this week, with Tommy Capretto and Ravin Kumar, we released an online course, our second online course, which we've been working on for two years. So we are very happy to have released it. But we're also very happy with the course; that's why it took so long. It's a very big course. And that's exactly what we do: we take you from beginner, we teach you Bambi, we teach you PyMC, and you go up until advanced. It's called Advanced Regression. So we teach you all things regression.
What's the course called?

Advanced Regression. Yeah, Advanced Regression, on the Intuitive Bayes platform that you were kind enough to mention at the beginning.

Nice. Yeah. I'll be sure to include that in the show notes.
And so even though it's called Advanced Regression, you start us off with an introduction to Bayesian statistics, and we start getting our feet wet with Bambi before moving on to PyMC, yeah?
Yeah, yeah. So you have a regression refresher at the beginning. If you're a complete, complete beginner, then I would recommend taking our intro course first; that one is really from the ground up. The Advanced Regression course, ideally, you would do after the intro course. But if you're already there in your learning curve, then you can start directly with the Advanced Regression course.
It makes a bit more assumptions on the student's part, like, yeah, they have heard about Bayesian stats; they are aware of the ideas of priors, likelihoods, posteriors. But we give you a refresher on classic regression, like when you have a normal likelihood. And then we teach you how to generalize that framework to data that's not normally distributed. And we start with Bambi, we show you how to do the equivalent models in PyMC, and then at the end, when the models become much more complicated, we just show them in PyMC.
Nice. That is super, super cool. I hope to be able to find time to dig into that myself soon. It's one of those things...

Yeah.

You and I were lamenting this before the show: podcasting in itself can take up so much time on top of, in both of our cases, full-time jobs. This is something that we're doing as a hobby. And it means that I'm constantly talking to amazingly interesting people like you, who have developed fascinating courses that I want to be able to study.
And it's like, when am I going to do that? Book recommendations alone, I barely get to read books anymore. That's been the case since basically the pandemic hit. And it's so embarrassing for me, because I identify in my mind as a book reader. And sometimes I even splurge. I'm like, wow, I've got to get these books that I absolutely must read. And they just collect in stacks around my apartment.

Yeah.
Yeah. I mean, that's hard, for sure. Yeah, it's something I've also been trying to get under control a bit. So, a guy who does good work on that, I find, is Cal Newport.

Yes, Cal Newport, of course. I've been collecting his books too.
Yeah, that's the irony.

So he's got a podcast. I don't know about you, but me, I listen to tons of podcasts. So the audio format is really something I love. So: podcasts and audiobooks. Yeah, that can be your entrance here. Maybe you can listen to more books if you don't have time to read.
Yeah, it's an interesting... I don't really have a commute. And I often use, like, you know, when I'm traveling to the airport or something, I use that as an opportunity to do catch-up calls and that kind of thing. So it's interesting.
I almost listen to no other podcasts. The only show I listen to is Last Week in AI. I don't know if you know that show.
Yeah. Great show. I like them a lot. They put a lot of work into it; Jeremie and Andrey do a lot of work to get kind of all of the last week's news condensed in there. So it's impressive.
It allowed me to flip from being this person, prior to finding that show. And I found it because Jeremie was a guest on my show. He was an amazing guest, by the way. I don't know if he'd have much to say about Bayesian statistics, but he's an incredibly brilliant person, so enjoyable to listen to. And someone else that I'd love to make an intro to for you; he's become a friend over the years.
Yeah, for sure.

But yeah, Last Week in AI. I don't know why I'm talking about it so much, but I went from being somebody who would kind of have this attitude, when somebody would say, "Have you heard about this release or that?", I'd say, you know, just because I work in AI, I can't stay on top of every little thing that comes out. And now, since I started listening to Last Week in AI about a year ago, I don't think anybody's caught me off guard with some new release. I'm like, I know.
Yeah, well done. Yeah, that's good.

Yeah, but that makes your life hard.

Yeah, for sure.

If you don't have a commute, come on.
But I'd love to be able to completely submerge myself in Bayesian statistics. It is a life goal of mine, because while I have done some Bayesian stuff, and in my PhD I did some Markov chain Monte Carlo work, there's just obviously so much flexibility and nuance to this space. You can do such beautiful things. I'm a huge fan of Bayesian stats. And so yeah, it's really great to have you on the show talking about it.
So, PyMC, which we've been talking about now, kind of going back to our thread. PyMC uses something called PyTensor to leverage GPU acceleration and complex graph optimizations. Tell us about PyTensor and how this impacts the performance and scalability of Bayesian models.
Yeah, great question. Basically, the way PyMC is built is: we need a backend. And historically this has been a complicated topic, because the backend is where we have to do the computation. Otherwise you have to do the computations in Python, and that's slower than doing it in C, for instance. And so we still have that C backend; that's kind of a historical remnant, but more and more we're using...
When I say "we": I don't write a lot of PyTensor code, to be honest, I mean contributions to PyTensor. I mainly contribute to PyMC. PyTensor is spearheaded a lot by Ricardo Vieira. Great guy, extremely good modeler. Basically, the idea of PyTensor is to kind of outsource the computation that PyMC is doing.
:

And then, especially when you're doing the sampling, PyTensor is going to delegate that to

some other backends.

636

:

And so now instead of having just the C backend,

637

:

you can actually sample your PIMC models with the number backend.

638

:

How do you do that?

639

:

You use another package that's called nutpy that's been built by Adrian Seybold, extremely

brilliant guy again.

640

:

I'm like surrounded by guys who are much more brilliant than me.

641

:

And that's how I learned basically.

642

:

I just ask them questions.

That's what I feel like in my day job at Nebula, my software company. I'm just like... Yeah, sorry, I'm just completely interrupting you.

Yeah, no, same. And so, yeah, Adrian basically re-implemented HMC in nutpie, but using Numba and Rust. And so that goes way faster than just using Python, or even just using C.
And then you can also sample your models with two other backends that we have. It's enabled by PyTensor, which basically compiles the graph of the model and then delegates these computational operations to the sampler. And then the sampler, as I was saying, can be the one from nutpie, which is in Rust and Numba. And otherwise, it can be the one from NumPyro. Actually, you can call the NumPyro sampler with a PyMC model. And it's just super simple: in pm.sample, there's a keyword argument that's nuts_sampler, and you just say "nutpie" or "numpyro".
And I tend to use NumPyro a lot when I'm doing Gaussian processes. Most of the time I'm using nutpie, but when I'm doing Gaussian processes somewhere in the model, I tend to use NumPyro, because for some reason, in their routine, in their algorithm, there is some efficiency in the way they compute the matrices. And GPs are basically huge matrices and dot products. So yeah, NumPyro is usually very efficient for that. And you can also use JAX now to sample your model.
So we have these different backends, and it's enabled because PyTensor is that backend that nobody sees. Most of the time you're not implementing a PyTensor operation in your models. Sometimes we do that at PyMC Labs when we're working on a very custom operation, but usually it's done under the hood for you. And then PyTensor compiles the graph, the symbolic graph, and can dispatch that afterwards to whatever the best way of computing the posterior distribution is.
Nice. You alluded there to something that I've been meaning to ask you about, which is the PyMC Labs team. So you have PyMC, the open-source library that anybody listening can download, and they can get rolling on doing their Bayesian stats right now, whether or not it's already something they have expertise in. PyMC Labs...
It sounds like you're responsible, and just fill us in, but I'm kind of gathering that the team there is responsible both for developing PyMC, but also for consulting, because you mentioned there, you know, sometimes we might do some kind of custom implementation. So first of all, yeah, tell us a little bit about PyMC Labs. And then it'd be really interesting to hear one or more interesting examples of how Bayesian statistics allows some client, some use case, to do something that they wouldn't be able to do with another approach.
Yeah. So first: go install PyMC, go on GitHub, and open PRs and stuff like that. We always love that. And second, yeah, exactly: PyMC Labs is kind of an offspring of PyMC, in the sense that everybody on the team is a PyMC developer. So we contribute to PyMC. This is open source. This is free, and always will be.
But then, on top of that, we do consulting. And what's that about? Well, most of the time, these are clients who want to do something with PyMC, or even more generally with Bayesian statistics. And they know we do that, and they don't know how to do it themselves, either because they don't have the time to train themselves, or they don't want to, or they don't have the money to hire a Bayesian modeler full-time. Various reasons.
But basically, yeah, at some point in the modeling workflow, they are stuck. It can be at the very beginning. It can be, "Well, I've tried a bunch of stuff, I can't make the model converge, and I don't know why." So it can be a very wide array of situations. Most of the time, people know us, me for the podcast or for PyMC, most of the other guys for PyMC or for other technical writing that they do.
So basically, that's not really a real company, but just a bunch of nerds, if you want. No, it is a real company, but we like to define ourselves as a bunch of nerds, because that's how it really started.

And in the sense of you guys actually consulting with companies and making an impact, in that sense it is certainly a company.
Yeah. So yeah.

So tell us a bit about projects. I mean, you don't need to go into detail with client names or whatever, if that's inappropriate, but it would be interesting to hear some examples of use cases of Bayesian statistics in the wild, enabling capabilities that other kinds of modeling approaches wouldn't.
Yeah, no, definitely. So of course I cannot go into all the details, but I can definitely give you some ideas. One where I actually can go into the details is a project we did for an NGO in Estonia, where they were getting polling data. So every month they do a poll of Estonian citizens about various questions. These can be horse-race polls...
But these can also be, you know, news questions, like: do you think Estonia should ramp up the number of soldiers at the border with Russia? Do you think same-sex marriage should be legal? Things like that.

I hear an Overton window coming on.
No, that's what I thought. I thought we might go there. Yeah, now I'm completely taking you off on a sidetrack, but Serg Masís, our researcher, came up with a great question for you, because you had Allen Downey on your show. He's an incredible guest. I absolutely loved having him on our program. He was on here in episode number 715. And in that episode, we talked about the Overton window, which is related to what you were just talking about.
So, kind of, you know: how does society think about, say, same-sex marriage? If you looked a hundred years ago, or a thousand years ago, or 10,000 years ago, or a thousand years into the future, or ten years into the future, at each of those different time points there's a, well, maybe not completely different, but a varying range of what people think is acceptable or not acceptable.
And this ties into what we were talking about earlier in the episode: bias. You might have your idea, as a listener to the show, you might be a scientist or an engineer, and you think, "I am unbiased, you know, I know the real thing." But you don't, because you are a product of your times. And the Overton window is kind of a way of describing this: on any given issue, there is some range.
And it would fit a probability distribution, where there are some people on a far extreme one way and some people on a far extreme the other way, but in general, all of society is moving in one direction, typically in a liberal direction on a given social issue. And this varies by region; it varies by age. Anyway, I think Overton windows are really fascinating. So, I've completely derailed your conversation, but I have a feeling you're going to have something interesting to say.
Yeah, no, I mean, that's related, for sure. Basically, and that's relevant because I also had Allen Downey on the show for his latest book, and that was also definitely about that. Probably Overthinking It was the book.

Yeah. Great, great book.
And so basically, the NGO had this survey data, right? And their clients have questions, and their clients are usually media or politicians. It's like: yeah, but I'd like to know, on a geographical basis, you know, in this electoral district, what do people think about that? Or, in this electoral district, educated women of that age, what do they think about same-sex marriage?
That's hard to do, because polling at that scale is almost impossible. It costs a ton of money. Also, polling is harder and harder, because people respond less and less to polls. So at the same time that polling data becomes less available and less reliable, you have people who get more and more interested in what the polls have to say. It's hard.
But there is a great method to do that. What we did for them is come up with a hierarchical model of the population, because hierarchical models allow you to share information between groups. Here the groups could be the age groups, for instance. Basically, what a hierarchical model says is: well, age groups are different, but they are not infinitely different. So learning what someone aged 16 to 24 thinks about same-sex marriage actually already tells you something about what someone aged 25 to 34 thinks about it.
And the degree of similarity between these responses is estimated by the model.
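The partial-pooling idea behind such hierarchical models can be sketched numerically. This is a deliberately simplified, non-Bayesian shrinkage formula (precision weighting), not the actual model discussed here, and all group names and numbers are invented:

```python
def partial_pool(group_means, group_sizes, sigma2_within, tau2_between):
    """Shrink each group's raw mean toward the grand mean.

    Weight w = tau^2 / (tau^2 + sigma^2 / n): large groups mostly keep
    their own mean, small groups borrow strength from the population.
    tau^2 is the between-group variance, sigma^2 the within-group variance.
    """
    total = sum(group_sizes)
    grand = sum(m * n for m, n in zip(group_means, group_sizes)) / total
    pooled = []
    for m, n in zip(group_means, group_sizes):
        w = tau2_between / (tau2_between + sigma2_within / n)
        pooled.append(w * m + (1 - w) * grand)
    return pooled

# Invented example: support for an issue by age group (proportions).
means = [0.70, 0.55, 0.40]   # 16-24, 25-34, 35-44
sizes = [30, 300, 3000]      # the small group gets shrunk the most
est = partial_pool(means, sizes, sigma2_within=0.25, tau2_between=0.01)
```

In a full hierarchical model, the amount of sharing (tau squared here) is not fixed by hand but estimated from the data itself, which is the "degree of similarity is estimated by the model" point.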

777

:

So these models are extremely powerful. I love them; I teach them a lot. And actually, in the Advanced Regression course, the last lesson is all about hierarchical models. I actually walk you through a simplified version of the model we did at PyMC Labs for that NGO in Estonia, called SALK.
So it's a model that's used in industry for real, and you learn it. It's a hard model, but it's a real model.
Then, once you've done that, you do something called post-stratification. And post-stratification is basically a way of debiasing your estimates, your predictions from the model, and you use census data to do that. So you need good data, and you need census data. But if you have good census data, then you're going to be able to basically reweight the predictions from your model.
And that way, if you combine post-stratification and a hierarchical model, you're going to be able to give actually good estimates of what, say, educated women aged 25 to 34 in this electoral district think about that issue. And when I say good, I mean that the confidence intervals are not going to be ridiculous. It's not going to tell you: well, this population is opposed to gay marriage with a probability of 20 to 80%, which covers basically everything.
So that's not very actionable.
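The reweighting step itself is simple arithmetic: average the model's per-cell estimates, weighted by each cell's census count. A toy sketch of that step, where the cells, counts, and function name are all invented for illustration:

```python
def poststratify(cell_estimates, census_counts):
    """Population estimate = census-weighted average of per-cell estimates."""
    total = sum(census_counts[cell] for cell in cell_estimates)
    weighted = sum(cell_estimates[cell] * census_counts[cell]
                   for cell in cell_estimates)
    return weighted / total

# Invented cells: (gender, age band) -> model's estimated support.
cell_estimates = {("f", "25-34"): 0.62, ("m", "25-34"): 0.48,
                  ("f", "65+"): 0.35, ("m", "65+"): 0.28}
# Invented census counts for one electoral district.
census_counts = {("f", "25-34"): 4000, ("m", "25-34"): 4200,
                 ("f", "65+"): 3000, ("m", "65+"): 2400}

support = poststratify(cell_estimates, census_counts)
```

The census weights are what correct for the poll's non-representative sample: even if, say, older women were over-represented among respondents, each cell only counts in proportion to its true share of the district's population. In the full Bayesian version, you would apply this weighting to every posterior draw, giving a whole distribution of population-level support rather than a single number.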


No, the model is more uncertain, of course, but it has a really good way of giving you something actually actionable. So that was a big project. I can dive into some others if you want, but that takes some time, and I don't want to derail the interview.
That's great, and highly illustrative. It gives a sense of how, with a Bayesian model, you can be so specific about how different parts of the data interrelate. So in this case, for example, you're describing having different demographic groups that have some commonality, like all the women, but with different age groups of women as sub-nodes of women in general. That way you're able to use the data from each of the subgroups to inform your higher-level group.
And actually, something that might be interesting to you, Alex, is that my introduction to both R programming and, I guess, hierarchical modeling was Gelman and Hill's book. Obviously, Andrew Gelman you've already talked about on the show. Jennifer Hill, also a brilliant causal modeler, has also been on the Super Data Science podcast; that was episode number 607. Anyway, there's lots of listening for people to do out there between your show and mine, based on the guests we've talked about on the program. Hopefully lots of people with long commutes.
So yeah, fantastic. That's a great example, Alex.

Another open-source library, in addition to PyMC, that you've developed is ArviZ, which has nothing to do with the programming language R. So it's A-R-V-I-Z, or Zed: ArviZ.
And this is for post-modeling workflows in Bayesian stats. So tell us: what are post-modeling workflows? Why do they matter? And how does ArviZ solve problems for us there?

Yeah, yeah, great questions.
And I'll make sure, before that, related to your previous question, to send you some links to other projects that could be interesting to people, like media mix models. I've interviewed Luciano Paz on the show; we've worked with HelloFresh, for instance, to come up with a media mix marketing model for them, and Luciano talks about that in that episode. I'll also send you a blog post about spatial data with Gaussian processes; that's something we've done for an agricultural client. And I already sent you a link to a video webinar we did with that NGO, that client in Estonia, where we go a bit deeper into the project. And I'll of course also send you...
...the Learning Bayesian Statistics episode, because Tarmo, the president of that NGO, was on the show.

Nice. Yeah. I'll be sure, of course, to include all of those links in the show notes.
Yeah.

837

:

Because I guess people come from different backgrounds and so someone is going to be more

interested in marketing, another one more in social science, another one more in spatial

838

:

data.

839

:

So that way people can pick and choose what they are most curious about.

840

:

So obvious.

Yeah. What is it? It's basically your friend for any post-model, post-sampling graph. And why is that important? Because models steal the show — they are the star of the show — but a model is just one part of what we call the Bayesian workflow. The Bayesian workflow has just one step that is the modeling itself; all the other steps don't have anything to do with writing the model. There are a lot of steps before sampling the model, and then there are a lot of steps afterwards. And I would argue that these steps afterwards are almost as important as the model. Why? Because they're what's going to face the customer of the model. Your model is going to be consumed by people who most of the time don't know about models, and also often don't care about models. That's a shame, because I love models, but you know, lots of the time they don't really care about the model. They care about the results.

And so a big part of your job as the modeler is to be able to convey that information in a way that someone who is not a stats person, a math person, can understand and use in their work. Whether that is a football coach, or a data scientist, or someone working in HelloFresh's marketing department, you have to adapt the way you talk to those people and the way you present the results of the model. And the way you do that is with amazing graphs. So a lot of your time as a modeler is spent figuring out how to decipher what the model can tell, what the model cannot tell — also very important — and with which confidence. And since we're humans, we use our eyes a lot, and the way to convey that is with plots. So you spend a lot of time plotting stuff as a Bayesian modeler, especially because Bayesian models don't give you one point estimate. They give you full distributions for all the parameters. So you get distributions all the way down. That's a bit more complex to wrap your head around at the beginning, but once your brain is used to that gymnastics, that's really cool, because that gives you opportunities for amazing plots. And yeah, ArviZ is here for you for that.

It has a lot of the plots that we use all the time in the Bayesian workflow. One, to diagnose your model — to understand if there is any red flag in the convergence of the model. And then, once you're sure about the quality of your results, how do you present that to the customer of the model? ArviZ also has a lot of plots for you here. And the cool thing about ArviZ is that it's platform-agnostic. What do I mean by that? You can run your model in PyMC, in Pyro, in Stan, and then use ArviZ, because ArviZ expects a special format of data that all these PPLs can give you, which is called the InferenceData object. Once you have that, ArviZ doesn't care where the model was run. And that's super cool. Also, it's available in Julia — ArviZ itself is a Python package, but there is a Julia equivalent for people who use Julia. So yeah, it's a very good way of starting that part of the workflow, which is extremely important.
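To make "red flag in the convergence" concrete: one standard diagnostic is the split R-hat statistic, which ArviZ computes for you. As a hedged sketch of the underlying idea in plain NumPy — the function name and the thresholds below are illustrative, not ArviZ's API:

```python
import numpy as np

def split_rhat(chains: np.ndarray) -> float:
    """Split R-hat convergence diagnostic for one parameter.

    `chains` has shape (n_chains, n_draws). Values near 1.0 suggest the
    chains mixed well; values clearly above 1 are a red flag.
    """
    # Split each chain in half so within-chain drift is also detected.
    n_chains, n_draws = chains.shape
    half = n_draws // 2
    halves = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    m, n = halves.shape
    chain_means = halves.mean(axis=1)
    between = n * chain_means.var(ddof=1)       # between-chain variance B
    within = halves.var(axis=1, ddof=1).mean()  # within-chain variance W
    var_hat = (n - 1) / n * within + between / n
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(0)
good = rng.normal(size=(4, 1000))       # four well-mixed chains
bad = good + np.arange(4)[:, None]      # chains stuck at different levels
print(split_rhat(good))  # close to 1.0
print(split_rhat(bad))   # clearly above 1.0
```

ArviZ's own rank-normalized version is more robust than this sketch, but the structure — comparing between-chain and within-chain variance — is the same.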

Nice. That was a great tour. And of course, I will again have a link to ArviZ in the show notes for people who want to use it for their post-modeling needs with their Bayesian models — including diagnostics, like looking for red flags, and being able to visualize results and pass them off to whoever the end client is. I think it might be in the same panel discussion with the head of that NGO, Tarmo Jüristo.

Yes — that's my Spanish. I'm in Argentina right now, so the Spanish is automatic.

Actually, I'm relieved to know that you're in Argentina, because I was worried that I was keeping you up way too late.

No, no, no, no, no.

Nice. So yeah, in that interview, Tarmo talks about adding components like Gaussian processes to make models — Bayesian models — time-aware. What does that mean? And what are the advantages and potential pitfalls of incorporating advanced features like time awareness into Bayesian models?

Yeah, great research — I can see that. Great research, Serg Masís, really.

Yeah.

No, that's impressive. I had a call with the people from Google Gemini today. So they're very much near the cutting edge, developing Google Gemini alongside Claude 3 from Anthropic and, of course, GPT-4, GPT-4o, whatever, from OpenAI. These are the frontier of LLMs. So I'm on a call with half a dozen people from the Google Gemini team, and they were insinuating, kind of near the end, some of the new capabilities they have. And there are some cool things in there which I need to spend more time playing around with, like Gems. I don't know if you've seen this, but Gems in Google Gemini allow you to have a context for different kinds of tasks. Like, for example, there are some parts of my podcast production workflow where I have different contexts, different needs, at each of those steps. And so it's very helpful with these Google Gemini Gems to be able to just click on that and be like, okay, now I'm in this kind of context; I'm expecting the LLM to output in this particular way. And the Google Gemini people said, well, maybe you'll be able to use these Gems to kind of replace, within the workflow, other people working on your podcast. They gave the example of research, and I was like, I hope that our researcher is using generative AI tools to assist his work. But I think, with all of the amazing things that LLMs can do, we're still quite a ways from the kind of quality of research that Serg Masís can do for this show. Yeah. We're still a ways away.

Yeah, yeah, no, no, for sure. But that sounds like fun.

Yeah. Anyway, sorry, I derailed you again. Time awareness.

Yeah, indeed. I love that question, because I love GPs — so thanks a lot. And that was not at all a setup. For the audience: GPs, Gaussian processes. Yeah, I love Gaussian processes.

And I actually just sent you a blog post we have on the PyMC Labs website, by Luciano Paz, about how to use Gaussian processes with spatial data. Why am I telling you that? Because Gaussian processes are awesome — they are extremely versatile. It's what's called a non-parametric method: it allows you to do non-parametric models. What does that mean? It means that instead of having, for instance, a linear regression, where you have a functional form that you're telling the model — I expect the relationship between x and y to be of a linear form, y equals a plus b times x — the Gaussian process is saying: I don't know the functional form between x and y; I want you to discover it for me. So that's one level up, if you want, in the abstraction. That's saying y equals f of x — find which f it is. So you don't want to do that all the time, because that's very hard.
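To make "y equals f of x, find f" concrete, here is a minimal NumPy sketch of a GP prior: a kernel function turns a grid of inputs into a covariance matrix, and each multivariate-normal draw from that matrix is one candidate function f — smooth and nonlinear, with no fixed formula. The RBF kernel and the jitter value are illustrative choices, not something from the episode:

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two sets of 1D inputs."""
    sq_dists = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)

# Covariance matrix over the input grid, plus a tiny jitter for stability.
K = rbf_kernel(x, x) + 1e-9 * np.eye(len(x))

# Each column is one sample function f drawn from the GP prior.
L = np.linalg.cholesky(K)
f_draws = L @ rng.normal(size=(len(x), 3))  # three sample functions, shape (50, 3)
print(f_draws.shape)
```

In PyMC you would roughly build the same thing with a covariance function and a gp object instead of writing the matrix algebra by hand; the sketch just exposes what that abstraction computes.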

And actually, you need to use quite a lot of domain knowledge on some of the parameters of the GPs. I won't go into the details here, but I'll give you some links for the show notes. Something that's very interesting to apply GPs on is, well, spatial data, as I just mentioned. Because in a plot — not a plot as in a graph, but a field plot — there are some interactions between where you are in the plot and the crops that you're going to plant there. But you don't really know what those interactions are. They interact with the weather and with a lot of other things, and you don't really know what the functional form of that is. And that's where a GP is going to be extremely interesting, because it's going to allow you, in 2D, to try and find out what those correlations between the x and y coordinates are and take that into account in your model.

That's very abstract, but I'm going to link afterwards to a tutorial. We actually just released today in PyMC a tutorial that I've been working on with Bill Engels, who is also a GP expert. I've been working with him on this tutorial for an approximation, a new approximation of GPs, and I'll get back to that in a few minutes. But first: why GPs in time? So you can apply GPs on spatial data, on space, but then you can also apply GPs on time. Time is 1D most of the time — one-dimensional — and space is usually 2D. And you can actually do GPs in 3D: you can do spatio-temporal GPs. That exists, but it's even more complicated. But 1D GPs, that's really awesome.

Because most of the time when you have a time dependency, it's non-linear. For instance, that could be the way the performance of a baseball player evolves within the season. You can definitely see the performance of a baseball player fluctuate with time during the season, and that would be nonlinear, very probably. The thing is, you don't know what the form of that function is. And that's what the GP is here for: it's going to come and try to discover what the functional form is for you.

And that's why I find GPs... like, they are really magical mathematical beasts. First, they're really beautiful mathematically, and a lot of things are actually a special case of GPs. Neural networks are actually Gaussian processes — a special case of Gaussian processes. Gaussian random walks are a special case of Gaussian processes. So they are a very beautiful mathematical object, but also very practical. Now, as Uncle Ben said, with great power comes great responsibility. And GPs are hard to wield. It's a powerful weapon, but it's hard to wield — it's like Excalibur: you have to be worthy to wield it. And so it takes training and time to use them, but it's worth it. And so we used that with Tarmo Jüristo from that Estonian NGO, but I use that almost all the time.

Right now I'm working more and more on sports data — I'm actually working on some football data right now. And, well, you want to take into account these within-season effects from players. I don't know what the functional form is. Right now, the first model I did taking time into account was just a linear trend. So it's just saying: as time passes, you expect a linear change. The change from one to two is going to be the same as the change from nine to ten. But usually that's not the case with time.

It's very non-linear. And so here, you definitely want to apply a GP on that. You could apply other stuff, like random walks, autoregressive stuff and so on. I personally don't really like those models — I find that you have to apply that structure to the model, but at the same time, they're not that much easier to use than GPs. So, you know, might as well use a GP.

And I'll end this very long answer with a third point, which is that it's actually easier to use GPs now, because there is this new decomposition of GPs: the Hilbert space decomposition, so HSGP. And that's basically a decomposition of GPs that's like a dot product — so kind of a linear regression, but that gives you a GP. And that's amazing, because GPs are known to be extremely slow to sample — it's a lot of matrix operations, as I was saying, at some point. But with HSGP, it becomes way, way faster and way more efficient. Now, you cannot always use HSGP — there are caveats and so on.
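The "decomposition that's like a dot product" can be sketched directly: the Hilbert space approximation expands the GP in a fixed set of sine basis functions and weights each one by the kernel's spectral density, so the GP becomes a plain linear model in those features. A hedged NumPy illustration for a 1D RBF kernel — the basis size m, domain half-width L, and lengthscale below are arbitrary choices, not values from the PyMC tutorial:

```python
import numpy as np

def hsgp_basis(x, m=200, L=5.0, lengthscale=1.0, variance=1.0):
    """Hilbert space approximation of an RBF-kernel GP on [-L, L].

    Returns the basis matrix Phi (n, m) and spectral weights s (m,) such
    that Phi @ np.diag(s) @ Phi.T approximates the RBF covariance matrix.
    """
    j = np.arange(1, m + 1)
    sqrt_eig = j * np.pi / (2 * L)  # square roots of the Laplacian eigenvalues
    phi = np.sqrt(1 / L) * np.sin(sqrt_eig * (x[:, None] + L))  # eigenfunctions
    # Spectral density of the 1D RBF kernel at the eigenfrequencies.
    s = variance * np.sqrt(2 * np.pi) * lengthscale * np.exp(
        -0.5 * (lengthscale * sqrt_eig) ** 2
    )
    return phi, s

x = np.linspace(-1, 1, 40)
phi, s = hsgp_basis(x)

# Exact RBF covariance for comparison.
K_exact = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
K_approx = (phi * s) @ phi.T  # the "dot product" form: linear in the basis

print(np.abs(K_exact - K_approx).max())  # small approximation error
```

Sampling a function is then just `phi @ (np.sqrt(s) * beta)` with standard-normal `beta` — a linear regression on fixed features, which is what makes HSGP so much faster than the full GP. The caveats Alex mentions include choosing L and m large enough for your data range and lengthscale.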

But Bill and I have been working on this tutorial. It's going to be in two parts. The first part was out today, and I'm going to send you the links for the show notes here in the chat we have. That's up on the PyMC website — it's HSGP First Steps and Reference. We go through why you would use HSGP, how you would use it in PyMC, and the basic use cases. And the second part is going to be the more advanced use cases. Bill and I have started working on that, but it always takes time to develop good content on that front. And yeah, we're getting there — it's open source, so we're doing that in our free time, unpaid, and that always takes a bit more time. But we'll get there. And finally, another resource that I think your listeners are going to appreciate: I'm doing a webinar series on HSGP, where we have a modeler who comes on the show, shares their screen, and does live coding. The first part is out already — I'm going to send you that for the show notes. I had Juan Orduz on the show, and he went into the first part of how to do HSGPs, and what HSGPs even are from a mathematical point of view, because Juan is a mathematician.

So yeah, I'll end my very long, passionate rant about GPs here. But long story short, GPs are amazing, and it's a good investment of your time to become skillful with GPs.

Fantastic. Another area that I would love to be able to dig deep into. And so our lucky listeners out there who have the time will now be able to dig into that resource, and many of the others that you have suggested in this episode, which we've got for you in the show notes.

Thank you so much. Alex, this has been an amazing episode. Before I let my guests go, I always ask for a book recommendation, and you've had some already for us in this episode, but I wonder if there's anything else. The recommendation you already had was Bernoulli — something about Bernoulli. Right?

Right — Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science, I think.

Yeah. I'll send you that actually — the episodes with Aubrey Clayton and David Spiegelhalter — because those are really good, especially for less technical people who are curious about science and how that works. I think it's a very good entry point. Yeah, so this book is amazing.

My God, this is an extremely hard question. I love so many books and I read so many books that I'm taken aback.

So, books which I find extremely good and which have influenced me — because a book is also... it's not only the book, right? It's also the moment when you read the book. If you like a book and you come back to it later, you'll have a different experience, because you are a different person and you have different skills. So I'm going to cheat and give you several recommendations, because I have too many of them. For technical books, I would say The Logic of Science by E. T. Jaynes. E. T. Jaynes is an old mathematician and scientist — in the Bayesian world, E. T. Jaynes is like a rock star. I definitely recommend his masterpiece, The Logic of Science. That's a technical book, but it's actually a very readable book, and it's also very epistemological. So that one is awesome. Much more applied, if you want to learn Bayesian stats...

A great book to do that is Statistical Rethinking by Richard McElreath. Really great book — I've read it several times. Any book by Andrew Gelman, as you were saying — I definitely recommend them. They tend to be a bit more advanced. If you want a real beginner's one, his latest, Active Statistics, is a really good one. I just had him on the show, episode 106. It's for people who like numbers, let's say it like that.

And I remember that when I was studying political science, Barack Obama's book from before he was president — I don't remember the name. I think it's The Audacity of Hope, I'm not sure. But his first book, before he became president, was actually very interesting.

Dreams from My Father?

Yes, yeah, this one — Dreams from My Father. Very interesting one. The other ones were a bit more political; I found them a bit less interesting. But this one was really interesting to me.

And another one for people who are very nerdy. So, I'm a very nerdy person — I love going to the gym, for instance. I can do my own training plan, my own nutrition plan; I've dug into that research. I love that, because I love sports also. Another very good book I definitely recommend, to develop good habits, is Katy Milkman's How to Change: The Science of Getting from Where You Are to Where You Want to Be. Extremely good book, full of very practical tips. Yeah, that's an extremely good one.

And then a last one that I read recently — no, actually, two last ones.

All right. Yeah. One last two — penultimate, I think you say.

How Minds Change by David McRaney, for people who are interested in how beliefs are formed. Extremely interesting. He's a journalist, and he's got a fantastic podcast called You Are Not So Smart. I definitely recommend that one. And yeah, that's how people change their minds, basically — because I'm very interested in that.

And in the end, this book is a trove of wisdom.

And the very last one — promise. I'm also extremely passionate about Stoicism, Stoic philosophy. That's a philosophy I find extremely helpful to live my life and navigate the difficulties that we all have in life. And a very iconic book in this is Meditations by Marcus Aurelius — reading the thoughts of a Roman emperor, one of the best Roman emperors there was. It's really fascinating, because he didn't write that to be published. It was his journal, basically. It's absolutely fascinating to read that and to see that they kind of had the same issues we still have. You know, so that's... yeah, fantastic. I read it very often.

Yeah.

I haven't actually read Meditations, but I read Ryan Holiday's The Daily Stoic. Yeah, and that's really good — it's 366 daily meditations on wisdom, perseverance, and the art of living, based on Stoic philosophy. And so there is a lot from Marcus Aurelius in there — he's probably the plurality of the content. And wow, it is...

It is mind-blowing to me how somebody two millennia ago is the same as me. I mean — and that's flattering myself; I'm not a Roman emperor, and the things I write will not be studied two millennia from now. But how much I can feel with this individual from two millennia ago — the problems that he's facing, and how similar they are to the problems that I face every day — it's staggering.

Yeah, yeah. No, that's incredible.

Something that really spoke to me — well, that I remember — is that at some point he's saying to himself that it's no use to go to the countryside to escape everything, because the real retreat is in yourself. If you're not able to be calm and find equanimity in your daily life, it's not because you're going to get away from the city — and Rome was like the megalopolis at the time — that you're going to find tranquility over there. You have to find tranquility inside. And then, yeah, you go to the countryside and that is going to be even more awesome. But it's not because you go there that you find tranquility. And that was super interesting to me, because I was like, wait — I definitely feel that when I'm in a big, big metropolis, at some point I want to get away. But they were living that already at that time, when they didn't have the internet, they didn't have cars and so on. For them, it was already too many people, too much noise. I found that super interesting.

For sure. Wild.

Well, this has been an amazing episode, Alex. I really am glad that Doug suggested you for the show, because this has been fantastic. I've really enjoyed every minute of this. I wish it could go on forever, but sadly all good things must come to an end. And so before I let you go, the very last thing: do you have other places where we should be following you? We're going to have a library of links in the show notes for this episode. And of course we know about your podcast, Learning Bayesian Statistics. We've got the Intuitive Bayes educational platform, and open source libraries like PyMC and ArviZ. In addition to those, is there any other social media platform, or other way that people should be following you or getting in touch with you after the program?

Well, yeah, thanks for mentioning that. So yeah — Intuitive Bayes, Learning Bayesian Statistics, PyMC Labs, you mentioned them. And I'm always available on Twitter: alex_andorra, like the country — that's where it comes from. Because it has two Rs and not only one, and when I say it in another language than Spanish, people write it with just one R. Otherwise, LinkedIn also — I'm over there, so you can always reach out to me there, LinkedIn or Twitter. Also, yes, send me podcast suggestions, stuff like that — I'm always on the lookout for something cool. And again, yeah, thanks a lot for having me on. Thanks a lot, Doug, for the recommendation. Yeah, that was a blast. I enjoyed it a lot. So thank you so much.

This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com. Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/learnbayesstats. Thank you so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.
