#109 Prior Sensitivity Analysis, Overfitting & Model Selection, with Sonja Winter
Episode 109 • 25th June 2024 • Learning Bayesian Statistics • Alexandre Andorra
00:00:00 01:10:49

Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways

  • Bayesian methods align better with researchers' intuitive understanding of research questions and provide more tools to evaluate and understand models.
  • Prior sensitivity analysis is crucial for understanding the robustness of findings to changes in priors and helps in contextualizing research findings.
  • Bayesian methods offer an elegant and efficient way to handle missing data in longitudinal studies, providing more flexibility and information for researchers.
  • Fit indices in Bayesian model selection are effective in detecting underfitting but may struggle to detect overfitting, highlighting the need for caution in model complexity.
  • Bayesian methods have the potential to revolutionize educational research by addressing the challenges of small samples, complex nesting structures, and longitudinal data. 
  • Posterior predictive checks are valuable for model evaluation and selection.

Chapters

00:00 The Power and Importance of Priors

09:29 Updating Beliefs and Choosing Reasonable Priors

16:08 Assessing Robustness with Prior Sensitivity Analysis

34:53 Aligning Bayesian Methods with Researchers' Thinking

37:10 Detecting Overfitting in SEM

43:48 Evaluating Model Fit with Posterior Predictive Checks

47:44 Teaching Bayesian Methods

54:07 Future Developments in Bayesian Statistics

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.

Links from the show

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.

Alex Andorra:

Priors represent a crucial part of the Bayesian workflow, and actually a big reason for its power and usefulness. But why is that? How do you choose reasonable priors in your models? What even is a reasonable prior? These are deep questions that today's guest, Sonja Winter, will guide us through.

An assistant professor in the College of Education and Human Development of the University of Missouri, Sonja focuses her research on the development and application of Bayesian approaches to the analysis of educational and developmental psychological data, with a specific emphasis on the role of priors. What a coincidence!

In this episode, she shares insights on the selection of priors, prior sensitivity analysis, and the challenges of working with longitudinal data. She also explores the implications of Bayesian methods for model selection and fit indices in structural equation modeling, as well as the challenges of detecting overfitting in models. When she's not working, you'll find Sonja baking delicious treats, gardening, or watching beautiful birds.

This is Learning Bayesian Statistics, episode 109.

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. For any info about the show, learnbayesstats.com is Laplace to be. Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on Patreon, everything is in there. That's learnbayesstats.com. If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all.

Hello my dear Bayesians, today I want to thank the fantastic Jonathan Morgan and Francesco Madrisotti for supporting the show on Patreon. Your support is invaluable and literally makes this show possible. I can't wait to talk with you guys in the Slack channel.

Second, with my friends and fellow PyMC core developers, Ravin Kumar and Tommy Capretto, we've just released our new online course, Advanced Regression with Bambi and PyMC. And honestly, after two years of development, it feels really great to get this out into the world, not only because it was, well, long and intense, but mainly because I am so proud of the level of detail, teachings, and exercises that we've packed into this one. It's basically the course I wish I had once I had gone through the beginner's phase when learning Bayesian stats, that moment when you're like: okay, I know how to do basic models, but where do I go from here? I remember feeling quite lost, so we wanted to give you a one-stop shop for such intermediate models with the most content possible, as evergreen as it gets.

If that sounds interesting, go to intuitivebayes.com and check out the full syllabus. We're enrolling the first cohort as we speak! Of course, you get a 10% discount if you're a patron of the show. Go to the Patreon page or the Slack channel to get the code. Okay, back to the show now, and looking forward to seeing you in the Intuitive Bayes Discourse.

Sonja Winter, welcome to Learning Bayesian Statistics.

Sonja Winter:

Thank you. Thanks for having me. I'm really excited to talk to you today.

Alex Andorra:

Same, same. That's a treat. I have a lot of questions. I really love everything you're doing in your research. We're going to talk a lot about priors today, folks, so get ready. But first, can you provide a brief overview of your research interests and how Bayesian methods play a role in your work?

Sonja Winter:

Yeah, sure. So my background is actually in developmental psychology. I did a bachelor's and master's degree at Utrecht University. And during that time, I really realized that a lot of work needed to be done on the analysis part of social science research. And so I switched and got really into structural equation models, which are these big multivariate models that include latent variables. I'm sure we'll talk more about that later. But those models can be hard to estimate and there are all these issues.

And so I was introduced to Bayesian statistics right after my master's degree, when I was working with Rens van de Schoot, also at Utrecht University. He asked me to do this big literature review about it with him, and that really introduced me. And so now I focus a lot on Bayesian estimation and how it can help us estimate these structural equation models. And then, more recently, I've really been focusing on how those priors can help us, both with estimation and also just with understanding our models a little bit better. So yeah, I'm really excited about all of that.

Alex Andorra:

Yeah, I can guess. That sounds awesome. So structural equation modeling, we already talked about it on the show. So today we're going to focus a bit more on priors and how that fits into the SEM framework. For people who don't know about SEM, I definitely recommend episode 102 with Ed Merkle. We talked exactly about structural equation modeling and causal inference in psychometrics, so that will be a very good introduction to these topics, I think.

And what I'm curious about, Sonja, is that you work a lot on priors and things like that, but how did you end up working on that? Was it something that you were always curious about, or something that appeared later on in your PhD studies?

Sonja Winter:

I would say it's definitely something that piqued my interest a little bit later. After I first got familiarized with Bayesian methods, I was excited mostly by how priors could help us estimate, like avoid negative variances and those types of things. But I saw them more as a pragmatic tool to help with that, and I didn't really focus so much on them. I feel like I also was a little bit afraid at the time of, you know, those researchers who talk a lot about, well, we shouldn't really make our priors informative because that's subjective and that's bad. And so I typically used uninformative priors or software defaults for a lot of my work in the beginning.

But then during my PhD studies, I worked with another researcher, Sanne Smid, who was also a PhD student at the time. She was really intrigued by something she found: these software defaults can really cause issues, especially when your data set is very small. It can make your results look wild. And so we worked on this paper together and created a Shiny app to demonstrate all of that. And that made me realize that maybe uninformative priors are not always the best way to go. And also, a prior that looks informative in one scenario might be relatively uninformative in another.

And so I really started shifting my perspective on priors and focusing more on how ignoring them is kind of like ignoring the best part of Bayesian, in my opinion, at this point. So now I really want to look at how they can help us and how we can be thoughtful. We don't want to drive our science by priors, right? We want to learn something new from our data, but finding that balance is really what I'm looking for now.

Alex Andorra:

Yeah, well, what a fantastic application of updating your beliefs, right? From a meta standpoint, you just updated your priors pretty aggressively and also very rationally. That's really impressive. Well done, because that's hard to do. It's not something we like to do. So that's great.

And actually, now that you're on the other side, how do you approach the selection of priors in your research, and what advice do you have for people new to Bayesian methods?

Sonja Winter:

Yeah, great question. I think at least within structural equation modeling, we as applied researchers are helped somewhat because the distributions, at least for priors, are sort of clear. You don't have to think too much about them. And so you can immediately jump into thinking about, okay, what level of information do I want to convey in those priors?

Whenever I'm working with applied researchers, I try to strike a balance with them, because I know they are not typically comfortable using super informative priors that are really narrow. And so I just ask them to think about, well, what would be a reasonable range? If we are estimating a linear regression parameter, what would that effect size look like? It might be zero or it might be two, but it's probably not going to be 20. And so we can shape our prior to align with those expectations about how probable certain values are versus others. It's a really, I don't know, interactive process between me and the researcher to get this right, especially for those types of parameters that they are really interested in.

I think another type of prior that is more challenging for applied researchers is the kind placed on residual variances, for example. People typically don't think that much about the part of the outcome that they can't explain. And so that's where I do rely a bit more on, I don't know, industry standard choices that are typically not super informative.

But then once we pick our target priors, I always advise the researcher to follow it up with a sensitivity analysis to see how robust their findings are to changes in those priors, either making them more informative or less informative. And so yeah, that's really the approach I take. Of course, if someone wants to go full Bayes and fully informative, and they have this wealth of previous research to draw from, then I'm all for going that route as well. It's just not as common.
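
To illustrate the "zero or two, but probably not 20" reasoning, here is a minimal sketch that was not part of the conversation. It assumes a normal prior on a regression slope and uses three hypothetical candidate priors (the names and standard deviations are invented for illustration) to show how much prior mass each one puts on plausible versus extreme effects.

```python
from statistics import NormalDist

# Hypothetical candidate priors for a regression slope b: Normal(mean, sd).
candidate_priors = {
    "Normal(0, 1)": NormalDist(0.0, 1.0),
    "Normal(0, 2)": NormalDist(0.0, 2.0),
    "Normal(0, 100)": NormalDist(0.0, 100.0),  # stands in for a "vague" default
}

def mass_within(prior, bound):
    """Prior probability that the slope lies in [-bound, +bound]."""
    return prior.cdf(bound) - prior.cdf(-bound)

summary = {}
for name, prior in candidate_priors.items():
    plausible = mass_within(prior, 2.0)       # effects between -2 and 2
    extreme = 1.0 - mass_within(prior, 20.0)  # effects beyond +/-20
    summary[name] = (plausible, extreme)
    print(f"{name:>15}: P(|b| <= 2) = {plausible:.3f}, P(|b| > 20) = {extreme:.3f}")
```

Under these assumptions, the very wide prior places most of its mass on slopes larger than 20 in absolute value, which is exactly the kind of value the researcher said was implausible.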

Alex Andorra:

Hmm. Yeah, I see. And what are the main difficulties you see from people that you advise like that? Where do you see them having more defenses up, or just more difficulties because they have a hard time wrapping their head around a specific concept?

Sonja Winter:

I think just all over. If anyone has ever tried to do a power analysis working with researchers, it's sort of a similar concept, because, at least in my field, the people I work with are typically not already thinking about the exact parameter estimates that they are expecting to see, right? They just go with the hypothesis, "I think these two things are correlated," and they might not even go as far as to think, is it positive or negative? So then once you ask them those questions, it really forces them to go much deeper on their theory and really consider: what am I expecting? What is reasonable based on what I know from previous studies or just experience?

And that can be kind of challenging. I think sometimes the researchers might feel like I'm criticizing them for not knowing, but I think that's perfectly normal to not know. We already have so many other things to think about. But it definitely is kind of a hurdle.

Also the time commitment, I think, to really consider the priors, especially if you're coming from a frequentist realm where you just say, okay, maximum likelihood, go. Not only do you not have to think about the estimation, but then also your results are almost instant. And so that's always kind of a challenge as well.

Alex Andorra:

I see. Yeah, that's definitely something I've also seen with beginners. It really depends on where they are coming from, as you were saying. Your advice will depend a lot on that.

And actually, you also work a lot on prior sensitivity analysis. Can you tell people what that is, the importance of it in your modeling workflow, and how you incorporate it into your research?

Sonja Winter:

Yeah. So a sensitivity analysis for priors is something that you typically do after you run your main analysis. You come up with your target set of priors for all your parameters, estimate the model, look at the results, look at the posteriors. And then in the next step, you think about, well, how can I change these priors in meaningful ways: either making them more informative, perhaps making them represent some other theory, or making them less informative, so making the influence of the prior weaker in your results. And then you rerun your analysis for all of those different prior scenarios, and compare those results to the ones that you actually obtained with your target analysis and your target priors.

The idea here is to see how much your results actually depend on those prior beliefs that you came into the analysis with. If you don't find any differences, then you can say, well, my results are mostly influenced by my data, by the new evidence that I obtained. They are robust to changes in prior beliefs, right? It doesn't really matter what beliefs you came into the analysis with. The results are going to be the same, which is great.

In other cases, you might find that your results do change meaningfully. So for example, an effect that was significant with your priors is no longer significant (using a frequentist term here, but hopefully people will understand) once you change your priors. And that, of course, is a little bit more difficult to handle, because what do you do? I want to say that the goal is not to use the sensitivity analysis to then go back and change your priors and run the analysis again and report that in your paper. That would be sort of akin to p-hacking.

Instead, I think it just contextualizes your findings. It's showing that the knowledge you came into the analysis with is partially driving your results. And that probably means that the evidence in your new data is not super strong. So it may indicate some issues with your theory or some issues with your data, and you have to collect more data to figure out which of those it is, basically. So it's kind of helping you also figure out the next steps in your research, I feel, which is helpful. But it can be frustrating, of course, and it may be harder to convince co-authors and reviewers to move forward with a paper like that. But to me, these results from sensitivity analyses are very interesting.
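
As an illustration of the sensitivity analysis described above, here is a minimal sketch that was not part of the conversation. It assumes the simplest possible setting, a normal mean with known residual standard deviation, so that the posterior is available in closed form and no MCMC is needed. The data and the four prior scenarios (target, more informative, less informative, alternative theory) are made up for illustration.

```python
import math

def normal_posterior(data, prior_mean, prior_sd, sigma=1.0):
    """Posterior mean and SD for mu, with y_i ~ Normal(mu, sigma) and mu ~ Normal(prior_mean, prior_sd)."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / sigma**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sum(data) / n)
    return post_mean, math.sqrt(post_var)

# Made-up small sample (n = 8).
data = [0.9, 0.2, 1.4, -0.3, 0.8, 1.1, 0.5, 0.4]

# Hypothetical prior scenarios: (prior mean, prior SD).
scenarios = {
    "target": (0.0, 1.0),
    "more informative": (0.0, 0.3),
    "less informative": (0.0, 10.0),
    "alternative theory": (1.0, 0.5),
}

results = {}
for name, (prior_mean, prior_sd) in scenarios.items():
    post_mean, post_sd = normal_posterior(data, prior_mean, prior_sd)
    low, high = post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd
    results[name] = (post_mean, low, high)
    excludes_zero = low > 0 or high < 0
    print(f"{name:>18}: mean = {post_mean:.2f}, 95% interval = [{low:.2f}, {high:.2f}], "
          f"excludes zero: {excludes_zero}")
```

With this toy sample, the interval excludes zero only under the alternative-theory prior. In the terms used above, that suggests the conclusion is partly driven by the prior rather than by strong evidence in the data, and it would be reported as context, not used to pick the prior after the fact.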

Alex Andorra:

Yeah, yeah, completely agree on that. It's very interesting to see whether the results differ depending on the priors, and that can also help, you know, settle any argument on the choice of prior. If people are really in disagreement about which priors to choose, well, then you can run the model with both sets of priors, and if the results don't change, it's like, well, let's stop arguing. It's kind of silly. We just lost time. So let's just focus on the results then. I think it's a very interesting framework.

So that entails running the model, running MCMC on the model. But there are some checks that you do before that to ensure the robustness of your Bayesian models. One of those steps is very crucial and is called prior predictive checks. Can you talk about that a bit, Sonja?

Sonja Winter:

Yeah, so as you said, these checks happen before you do any actual analysis. So you can do them before you collect any data. In fact, one reason for using them is to figure out whether the priors you came up with result in sensible ranges of possible parameter estimates, right? In some cases, especially with these complex multivariate models, your priors may interact in unexpected ways and then result in predictions that are not in line with what your theory is actually telling you you should expect.

And so prior predictive checks basically entail that you specify your priors for all your parameters. Then you generate parameter values from those priors and combine them with your model specification. Those combinations of parameter values are used to generate what are called prior predictive samples. So these are samples of some pre-specified size that represent possible observations that align with what your priors are conveying, combined with your model. And so ideally, those prior predictive samples look kind of like what you would expect your data to look like.

And sometimes for researchers, it is easier to think about what the data should look like compared to what the parameter estimates can be. So in that sense, prior predictive checks can be really helpful in checking not just the priors, but checking the researcher and making sure that they actually conveyed their knowledge to me, for example, correctly. Yeah, did that answer your question?

Alex Andorra:

Yeah, yeah, I think that's a great definition, and I definitely encourage any Bayesian practitioner to include prior predictive checks in their workflow. Once you have written a model, that should be the first thing you do. Do not run MCMC before doing prior predictive checks.
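
The prior predictive check described above can be sketched in a few lines. This example was not part of the conversation: it assumes a simple linear regression on a standardized outcome, with made-up priors, and uses only the Python standard library rather than any particular Bayesian package. The function names and the plausibility limit of 5 are hypothetical choices for illustration.

```python
import random

def prior_predictive(x, n_draws, slope_sd, seed=1):
    """Simulate data sets from the priors alone (no observed data involved)."""
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_draws):
        intercept = rng.gauss(0.0, 1.0)    # prior on the intercept
        slope = rng.gauss(0.0, slope_sd)   # prior on the slope
        sigma = abs(rng.gauss(0.0, 1.0))   # half-normal prior on the residual SD
        # Push the sampled parameters through the model to get fake observations.
        datasets.append([rng.gauss(intercept + slope * xi, sigma) for xi in x])
    return datasets

def share_implausible(datasets, limit=5.0):
    """Share of simulated observations outside [-limit, +limit]."""
    values = [y for dataset in datasets for y in dataset]
    return sum(abs(y) > limit for y in values) / len(values)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]  # standardized predictor values

narrow = share_implausible(prior_predictive(x, 1000, slope_sd=1.0))
wide = share_implausible(prior_predictive(x, 1000, slope_sd=20.0))
print(f"slope ~ Normal(0, 1):  {narrow:.1%} of simulated outcomes beyond +/-5")
print(f"slope ~ Normal(0, 20): {wide:.1%} of simulated outcomes beyond +/-5")
```

If the outcome is standardized, values beyond five standard deviations should be rare, so a large share of such values in the simulated data signals that the priors imply data the researcher would not consider realistic.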

Sonja Winter:

And recently, I feel like a lot of the software packages for Bayesian methods have included very simple ways of running these checks. When I first started looking at them, it was more of a niche step in the workflow, so it required a few more steps and some more coding. But now it's as easy as just switching a toggle to get those prior predictive samples. So that's great.

Alex Andorra:

Yeah, yeah, yeah, completely agree. It's definitely something that's been more and more popular in the different classes and courses that I teach. Whether it's online courses or live workshops, I show prior predictive checks almost all the time. So yeah, it's becoming way, way more popular and widespread. That's really good, because I can tell you that when I work on a real model for clients, the first thing I always do before running MCMC is prior predictive checks.

And actually, there is a fantastic way of doing prior predictive checks in a kind of industrialized way, and that's called simulation-based calibration. Have you heard of that?

Sonja Winter:

No. I mean, maybe the term, but I have no idea what it is.

Alex Andorra:

So that's just like making prior predictive checks on an industrialized scale. Basically, instead of just running the model forward, as you explained, and generating prior predictive samples, what you're doing with SBC, so simulation-based calibration, is generating not only prior predictive samples, but also prior samples of the parameters of the model. You store these parameters in some object, but you don't give them to the model; you keep them somewhere safe.

And then the prior predictive samples, so the plausible observations generated by the model based on the prior samples that you just kept in the fridge, you're now going to consider as data. You're going to tell the model, well, run MCMC on these data as if we had observed these prior predictive samples in the wild, because that's what prior predictive samples are: possible samples we could observe before we know anything about real data. So you feed that to the model and make it run MCMC on that. That means backward inference: now the model is going to find out about the plausible parameter values which could have generated this data.

And then what you're going to do is compare the posterior distribution that the model inferred for the parameter values to the true parameter values that you kept in the fridge before. These parameter values are true, and you have just one of them, because it's just one sample from the prior. You're going to compare this value to the distribution of posterior parameters that you just got from the model. And based on that, and how far the model is from the true parameter, you can find out if your model is biased or if it's well calibrated. That's a really great way to be much more certain that the model is able to recover what you want it to recover. You're basically playing God, and then you're trying to see if the model is able to recover the parameters that you used to generate the data.

And you don't want to do that just once; you want to do that many, many times, because then you enter a kind of frequentist realm, right? You just repeat the experiment a lot. That's how you're going to see how calibrated the model is, because then you can do some calibration plots. There are a lot of metrics around that. It's kind of a developing area of research, but one of them is basically just plotting the true parameter values against, for instance, the mean posterior value of the parameter. If this mean is most of the time along the line x equals y, well, that means you are in pretty good shape. But I mean, it's only the mean here, so you have to look at the whole distribution. That's just to give you an idea.
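
The simulation-based calibration loop described above can be sketched as follows. This example was not part of the conversation and makes two loudly stated assumptions: the model is a toy one (a normal mean with a Normal(0, 1) prior and unit residual SD), and the exact conjugate posterior stands in for MCMC so the loop runs instantly. It uses the rank of the true parameter among the posterior draws, a common SBC diagnostic, instead of the posterior-mean plot mentioned above. The `post_sd_scale` argument is a hypothetical knob that mimics a faulty, overconfident sampler.

```python
import math
import random

def sbc_ranks(n_sims=1000, n_obs=10, n_post=99, post_sd_scale=1.0, seed=42):
    """SBC ranks for mu in the toy model: mu ~ Normal(0, 1), y_i ~ Normal(mu, 1)."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        # 1. Draw a "true" parameter from the prior and keep it aside.
        mu_true = rng.gauss(0.0, 1.0)
        # 2. Generate one prior predictive data set from that parameter.
        y = [rng.gauss(mu_true, 1.0) for _ in range(n_obs)]
        # 3. Fit the model to the fake data. Here the exact conjugate
        #    posterior replaces MCMC (an assumption of this sketch).
        post_var = 1.0 / (1.0 + n_obs)
        post_mean = post_var * sum(y)
        post_sd = math.sqrt(post_var) * post_sd_scale
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_post)]
        # 4. Record where the true value falls among the posterior draws.
        ranks.append(sum(d < mu_true for d in draws))
    return ranks

def tail_share(ranks, n_post=99):
    """Share of ranks in the lowest and highest 10% of possible rank values."""
    cut = (n_post + 1) // 10
    return sum(r < cut or r > n_post - cut for r in ranks) / len(ranks)

good = tail_share(sbc_ranks())                   # correct posterior
bad = tail_share(sbc_ranks(post_sd_scale=0.5))   # posterior that is too narrow
print(f"calibrated:    {good:.2f} of ranks in the tails (about 0.20 expected)")
print(f"overconfident: {bad:.2f} of ranks in the tails")
```

When inference is well calibrated, the ranks are uniformly distributed, so roughly 20% of them land in the two outer tenths. A posterior that is too narrow piles ranks up at the extremes, which is the kind of miscalibration SBC is meant to reveal.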

And so the bottleneck is that you want to do that a lot of times, so you have to run MCMC a lot of times. Most of the time, if you're just doing a regression, that should be okay. But sometimes it's going to take a lot of time to run MCMC, and it can be hard. In these cases, you have new algorithms that can be efficient, because there is a method called amortized Bayesian inference. We just covered that in episode 107 with Marvin Schmitt.

And basically that's exactly a use case for amortized Bayesian inference, because the model doesn't change, but the data changes in each iteration of the loop. What amortized Bayesian inference does is train a deep neural network on the model as a first step. Then the second step is the inference, but the inference is just instantaneous because you've trained the deep neural network. That means you can get almost as many posterior samples as you want once you have trained the deep neural network. And so that's why it's amortized Bayesian inference. And that's a perfect use case for SBC, because then you get new samples for free.

So I definitely encourage people to look at that. It's still developing. Right now you cannot, for instance, use BayesFlow, which is the Python package that Marvin talked about in episode 107, with PyMC, but it's something we're working on, and the goal is that it's completely compatible. But yeah, I'll link to the tutorial notebook in the show notes for people who want to get an idea of what SBC is, because even if you're not applying it right now, at least you have that in mind, you know what that means, and you can work your way up to that.

Sonja Winter:

Yeah, that's amazing. I feel like one of the biggest hurdles in the structural equation modeling approach with using Bayesian is just the time commitment. There is one analysis I was running, and I think it takes almost a week to run, because it's a big sample and it's a complicated model. And so if I had to rerun that model a thousand times, it would not be fun. So knowing that there are maybe some options on the horizon to help us speed along that process, I think that would change our field for sure. So that's very exciting.

Alex Andorra:

Yeah, yeah, yeah. That's really super exciting. And that's why I'm also super enthusiastic about this amortized Bayesian inference stuff, because I discovered that in episode 107, so it's not a long time ago. But as soon as I heard about that, I dug into it, because that's super interesting.

Sonja Winter:

Yeah. I'm going to read about it after we finish recording this.

Alex Andorra:

Yeah, yeah, for sure. And feel free to send me any questions. I find it's also a very elegant way to marry the Bayesian framework and the deep neural network methods. So I really love that. It's really elegant and promising, as you were saying.

Talking about SEM, so structural equation modeling, do you find that Bayesian methods help for these kinds of models, especially when it comes to educational research, which is one of your fields?

449

:

Yes, I think Bayesian methods can sort of

help on both ends of the spectrum that we

450

:

see with educational data which is either

we have very small samples and so

451

:

researchers still have these ambitious

theoretical models that they want to test.

452

:

but it's just not doable with frequentist

estimators.

453

:

And so Bayes, with the priors, can help

a little bit to boost the information that

454

:

we have, which is really nice.

455

:

And then on the other side, ever since

starting this position and moving into a

456

:

college of education, I've been given

access to many large data sets that have

457

:

very complicated nesting structures.

458

:

That's something you see all the time in

education.

459

:

You have

460

:

schools and then teachers and students and

the students they change teachers because

461

:

it's also longitudinal so there's a time

component and all of these different

462

:

nested structures can be very hard to

model using estimators like maximum

463

:

likelihood, and Bayesian methods, not

necessarily structural equation modeling

464

:

but maybe more a hierarchical linear model

or some other multilevel approach, it can

465

:

be super flexible to handle all of those

466

:

structures and still give people results

that they can use to inform policy.

467

:

Because that's something in education that

I didn't really see when I was still in

468

:

the department of psychology before is

that a lot of the research here is really

469

:

directly informing what is actually going

to happen in schools.

470

:

And so it's really neat that these

Bayesian methods are allowing them to

471

:

answer much more complicated research

questions and really make use of all of

472

:

the data that they have.

473

:

So that's been really exciting.

474

:

And actually, I wanted to ask you

precisely what the challenges you face

475

:

with longitudinal data and how do you

address these challenges because I know

476

:

that can be pretty hard.

477

:

I think with longitudinal data, the

biggest challenge actually doesn't have

478

:

anything to do with the estimator.

479

:

It is more just inherent in longitudinal

data, which is that we will always...

480

:

unless you have a really special sample,

but we will always have missing data.

481

:

Participants will always drop out at some

point or just skip a measurement.

482

:

And of course, other estimation methods

also have options for accommodating

483

:

missing data, such as full information

maximum likelihood.

484

:

But I find that the Bayesian approach

where you can do imputation while you're

485

:

estimating, so you're just imputing the

data at every posterior sample, is very

486

:

elegant, efficient,

487

:

and easy for researchers to wrap their

minds around.
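
The idea of imputing the missing values at every posterior sample can be sketched with a toy Gibbs sampler. This is a minimal illustration under stated assumptions, not the algorithm used in Blimp or any other package: the model (`y ~ Normal(mu, 1)` with a flat prior on `mu`), the function name `gibbs_with_imputation`, and the example data are all hypothetical.

```python
import random

def gibbs_with_imputation(y, n_iter=3000, burn_in=500, seed=0):
    """Toy Gibbs sampler for y_i ~ Normal(mu, 1) with a flat prior on mu.

    Missing entries (None) are re-imputed from the model at every iteration,
    then mu is redrawn given the completed data.
    """
    rng = random.Random(seed)
    observed = [v for v in y if v is not None]
    n = len(y)
    mu = sum(observed) / len(observed)  # starting value
    mu_draws = []
    for it in range(n_iter):
        # Step 1: impute each missing value from Normal(mu, 1).
        completed = [v if v is not None else rng.gauss(mu, 1.0) for v in y]
        # Step 2: draw mu from its conditional posterior given completed data.
        mu = rng.gauss(sum(completed) / n, n ** -0.5)
        if it >= burn_in:
            mu_draws.append(mu)
    return mu_draws

# Hypothetical repeated measurements with three skipped occasions.
y = [1.2, None, 0.7, 1.9, None, 1.1, 0.4, None]
draws = gibbs_with_imputation(y)
posterior_mean = sum(draws) / len(draws)
```

In this simple setting the imputations add no information about `mu`, so the posterior mean should sit close to the mean of the observed values. The approach becomes useful once auxiliary variables that explain the missingness are added to the model.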

488

:

And it still allows you just like with

other multiple imputation methods to

489

:

include a sort of auxiliary model

explaining the missingness, which helps

490

:

with the missing-at-random type of data

that we deal with a lot.

491

:

And so I feel that that is especially

exciting.

492

:

I honestly started thinking about this

more deeply when I started my position

493

:

here and I met my new colleague.

494

:

Dr.

495

:

Brian Keller, he is working on some

software, it's called BLIMP, which I think

496

:

it stands for Bayesian Latent Interaction

Modeling Program, I want to say.

497

:

So it's actually created for modeling

interactions between latent variables,

498

:

which is a whole other issue.

499

:

But within that software, they actually

also created a really powerful method for

500

:

dealing with missing data, or not

necessarily the method, but just the

501

:

application of it.

502

:

And so...

503

:

Now that I've met him and he's always

talking about it, it makes me think about

504

:

it more.

505

:

So that's very exciting.

506

:

Yeah, for sure.

507

:

And feel free to add a link to this

project to Blimp in the show notes,

508

:

because I think that's going to be very

interesting to listeners.

509

:

And I'm wondering if Bayesian methods

improve...

510

:

the measurement and the evaluation

processes in educational settings, because

511

:

I know it's a challenge.

512

:

Is that something that you're working on

actively right now, or you've done any

513

:

projects on that that you want to talk

about?

514

:

Well, I teach measurement to grad

students.

515

:

So it's not necessarily that I get to talk

about Bayes a lot in there.

516

:

But what I'm realizing is that

517

:

When we talk about measurement from a

frequentist standpoint, we typically start

518

:

with asking students a bunch of questions.

519

:

Let's say we're trying to measure math

ability.

520

:

So we ask them a bunch of math questions.

521

:

Then if we use frequentist estimation, we

can use those item responses to generate

522

:

some sort of probability of those

responses given some underlying level of

523

:

math ability.

524

:

So how probable is it that they gave these

answers given this level of math?

525

:

But actually what we want to know is what

is the student's math ability, given the

526

:

patterns of observed responses.

527

:

And so Bayes' theorem gives us a really

elegant way of answering exactly that

528

:

question, right.

529

:

Instead of the opposite way.
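
The two directions described here, the probability of the responses given ability versus the ability given the responses, can be made concrete with a small grid approximation. This is only a sketch under assumptions that are not from the episode: a Rasch (one-parameter logistic) item model, a standard normal prior on ability, and made-up item difficulties and responses.

```python
import math

def rasch_prob(theta, difficulty):
    """P(correct answer | ability theta) under a Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def likelihood(theta, responses, difficulties):
    """P(responses | theta): how probable the answers are at a given ability."""
    like = 1.0
    for answer, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        like *= p if answer == 1 else (1.0 - p)
    return like

def posterior_on_grid(responses, difficulties, grid):
    """P(theta | responses) on a grid, using a standard normal prior on theta."""
    unnormalized = [
        math.exp(-0.5 * t * t) * likelihood(t, responses, difficulties)
        for t in grid
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

grid = [i / 10.0 for i in range(-40, 41)]   # ability values from -4 to 4
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical math items
responses = [1, 1, 1, 0, 1]                 # 1 = correct, 0 = incorrect
posterior = posterior_on_grid(responses, difficulties, grid)
posterior_mean = sum(t * p for t, p in zip(grid, posterior))
```

The `likelihood` function answers "how probable are these answers at this ability level", while `posterior` answers the question researchers usually care about: which ability levels are plausible given the answers.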

530

:

And so I think in a big way, Bayesian

methods just align better with how people

531

:

already think about the research that

they're doing or the thing, the questions

532

:

that they're, they want to answer.

533

:

I think.

534

:

This is also a reason why a lot of

researchers struggle with getting the

535

:

interpretation of things like a confidence

interval correct, right?

536

:

It's just not intuitive.

537

:

Whereas Bayesian methods, they are

intuitive.

538

:

And so in that sense, I think not so much

like estimation wise, but just

539

:

interpretation wise, Bayesian methods can

help a lot in our field.

540

:

And then in addition to that, I think when

we do use Bayesian estimation,

541

:

those posterior distributions, they can

give us so much more information about the

542

:

parameters of interest that we are

interested in.

543

:

And they can also help us understand what

future data would look like given those

544

:

posteriors, right?

545

:

If we move from, like, prior predictives to

posterior predictives, which are these

546

:

samples generated from the posteriors,

that should look like our data should look

547

:

like that data, right?

548

:

If our model is doing a good job of

representing our data.

549

:

And so,

550

:

I think that's an exciting extension of

Bayes as well.

551

:

It gives us more tools to evaluate our

model and to make sure that it's actually

552

:

doing a good job of representing our data,

which is especially important in

553

:

structural equation modeling, where we

rely very heavily on global measures of

554

:

fit.

555

:

And so this is a really nice new tool for

people to use.

556

:

I see.

557

:

Okay.

558

:

Yeah.

559

:

I am.

560

:

I need to know about that in particular.

561

:

That's...

562

:

That's very interesting.

563

:

Yeah.

564

:

So I mean, I would have more questions on

that, but I want to ask you in particular

565

:

on a publication you have about

underfitting and overfitting.

566

:

And you've looked at the performance of

Bayesian model selection in SEM.

567

:

I find that super interesting.

568

:

So can you summarize the key findings of

this paper and...

569

:

their application, their implications for

researchers using SEM?

570

:

Yeah, for sure.

571

:

This is a really fun project for me to

work on, kind of an extension of my

572

:

dissertation.

573

:

So it made me feel like, I'm really moving

on, creating a program of research.

574

:

So yeah, thanks for asking about the

paper.

575

:

So yeah, as I already kind of mentioned,

within structural equation modeling,

576

:

Researchers rely really heavily on these

model selection and fit indices to make

577

:

choices about what model they're going to

keep in the end.

578

:

A lot of the times, researchers come in

with some idea of what the model would

579

:

look like, but they are always tinkering a

little bit.

580

:

They're ready to know that they're wrong

and they want to get to a better model.

581

:

And so the same is true when we use

Bayesian estimation and we have sort of a

582

:

similar set of indices to look at.

583

:

in terms of the fit of a single model or

comparing multiple models and selecting

584

:

the best one.

585

:

And so very typically those indices are

tested in terms of how well they can

586

:

identify underfit.

587

:

And so underfit occurs when you forgot to

include a certain parameter.

588

:

So your model is too simple for the

underlying data generating mechanism.

589

:

You forgot something.

590

:

And so all of these indices generally

work

591

:

pretty well, and that's also what we found

in our study in terms of selecting the

592

:

correct model when there are some

alternatives that have fewer parameters or

593

:

picking up on the correct model fitting

well by itself versus models that forget

594

:

these parameters.

595

:

But what we were really interested in is

looking at, OK, how well do these indices

596

:

actually detect overfitting?

597

:

So that's where you add parameters that

you don't really need.

598

:

So you're making your model overly

complex.

599

:

And when we have models that are too

complex, they tend not to generalize to

600

:

new samples, right?

601

:

They're optimized for our specific sample

and that's not really useful in science.

602

:

So we want to make sure that we don't keep

going and like adding paths and making our

603

:

models super complicated.

604

:

And so surprisingly what we found across

like a range of overfitting scenarios is

605

:

that they do not really do a good job of

detecting any of this.

606

:

Most indices, if anything, just make the

model look better and better and better.

607

:

Even some of these indices, like model

selection indices, will have a penalty

608

:

term in their formula that's supposed to

penalize for having too many parameters,

609

:

right?

610

:

For making your model too complex.

611

:

And even those were just like, yeah, this

is fine.

612

:

Keep going, keep going.

613

:

And so that's a little bit worrisome.

614

:

And I think...

615

:

We really need to think about developing

some new ways of detecting when we go too

616

:

far, right?

617

:

Figuring out at what point we need to stop

in our model modification, which is

618

:

something that researchers really love to

do, especially in structural equation

619

:

modeling.

620

:

I won't speak for any other areas.

621

:

And so, yeah, I think there's a lot of

work to be done.

622

:

And I was very surprised that these

indices that are supposed to help us

623

:

detect overfitting also didn't really do

624

:

a good job.

625

:

And so I'm excited to work more on this.

626

:

I would say in general, if people want an

actionable takeaway, it is always helpful

627

:

when you have multiple models to compare

versus just your one model of interest.

628

:

It will help you tease, sort of figure out

better, which one is the correct one

629

:

versus just is your model good enough?

630

:

And so that would be my, my advice for

researchers.

631

:

Yeah.

632

:

Yeah, definitely.

633

:

I always like having a very basic and dumb

looking linear regression to compare to

634

:

that and build my way on top of that

because you can already do really cool

635

:

stuff with plain simple linear regression

and why make it harder if you cannot

636

:

prove, you cannot discern a particular

effect of...

637

:

of the new method you're applying.

638

:

Yeah.

639

:

And so do you have then from from your

dive into these, do you have some fit

640

:

indices that you recommend?

641

:

And how do they compare to traditional fit

indices?

642

:

So I think for model

643

:

fit of a single model within structural

equation modeling.

644

:

The most popular ones are called

comparative fit index, the Tucker Lewis

645

:

index, and then the root mean square error

of approximation.

646

:

You'll see these in like every single

paper published.

647

:

And so there are Bayesian versions of

those indices, but based on all my

648

:

research using those so far,

649

:

I would actually not recommend those at

all for evaluating the fit of your

650

:

specific model.

651

:

It seems from at least my research that

they are very sensitive to your sample

652

:

size, which means that as you get a larger

and larger sample, your model will just

653

:

keep looking better and better and better

and better, even if it's wrong.

654

:

So something that would be flagged as like

a...

655

:

a misspecified model with a small sample

might look perfectly fine with a large

656

:

sample.

657

:

And so that's not what you want, right?

658

:

You want the fit index to reflect the

misspecification, not your sample size.

659

:

And so I was really excited when these

were first introduced, but I think we need

660

:

a lot more knowledge about how to actually

use them before they are really useful.

661

:

And so my advice for researchers who want

to know something about their fit is

662

:

really to look at

663

:

the posterior predictive checks.

664

:

And within structural equation modeling,

I'm not sure how widespread this is for

665

:

other methods, but we have something

called a posterior predictive p-value,

666

:

where we basically take our observed data

and evaluate the fit of that data to our

667

:

model at each posterior iteration.

668

:

For example, using a likelihood ratio test

or like a chi-square or something.

669

:

And then we do the same for a posterior

predicted sample,

670

:

using this within each of those samples

as well.

671

:

And the idea is that if your model fits

your data well, then about half of the

672

:

predictive samples should fit better and

the other half should fit worse, right?

673

:

Yours should be nicely cozy in the middle.

674

:

If all of your posterior predictive

samples fit worse than your actual data,

675

:

then it's an indication that you are

overfitting, right?

676

:

Like,

677

:

the model will never fit as well as it

does for your specific data.

678

:

And so I think in that sense, that index

could potentially give some idea of

679

:

overfitting, although again, in our study,

we didn't really see that happening.

680

:

But I think it's a more informative method

of looking at fit within Bayesian

681

:

structural equation modeling.

682

:

And so even though it's kind of old

school, I think it's still probably the...

683

:

the best option for researchers to look

at.
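
The posterior predictive p-value described above can be sketched for a deliberately simple model. This is an illustration under assumptions, not the SEM implementation discussed in the episode: the model is `y ~ Normal(mu, 1)` with a normal prior on `mu`, the discrepancy is a chi-square-style sum of squared deviations, and the function name and simulated data are hypothetical.

```python
import random

def posterior_predictive_p(y, n_draws=2000, prior_sd=10.0, seed=0):
    """Posterior predictive p-value for y_i ~ Normal(mu, 1), mu ~ Normal(0, prior_sd^2).

    Returns the share of posterior draws in which replicated data fit
    worse (larger discrepancy) than the observed data.
    """
    rng = random.Random(seed)
    n = len(y)
    # Conjugate normal posterior for mu (the data sd is fixed at 1 here).
    post_prec = n + 1.0 / prior_sd ** 2
    post_mean = sum(y) / post_prec
    post_sd = post_prec ** -0.5
    worse = 0
    for _ in range(n_draws):
        mu = rng.gauss(post_mean, post_sd)              # one posterior draw
        y_rep = [rng.gauss(mu, 1.0) for _ in range(n)]  # replicated data set
        t_obs = sum((v - mu) ** 2 for v in y)           # misfit of observed data
        t_rep = sum((v - mu) ** 2 for v in y_rep)       # misfit of replicated data
        if t_rep >= t_obs:
            worse += 1
    return worse / n_draws

data_rng = random.Random(1)
well_specified = [data_rng.gauss(0.5, 1.0) for _ in range(100)]
too_spread_out = [data_rng.gauss(0.5, 3.0) for _ in range(100)]  # violates sd = 1
ppp_good = posterior_predictive_p(well_specified)
ppp_bad = posterior_predictive_p(too_spread_out)
```

A value near 0.5 means the observed data sit in the middle of the replicated data sets. A value near 0 means the observed data fit worse than almost every replicate, which signals misfit. A value near 1 means almost every replicate fits worse than the observed data, which is the pattern described above as a possible sign of overfitting.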

684

:

Okay, yeah, thanks.

685

:

That's like, I love that.

686

:

That's very practical.

687

:

And I think listeners really appreciate

that.

688

:

I have like, I was wondering about SEMs

again, and if you have an example from

689

:

your research where Bayesian SEM provided

significant insights that

690

:

traditional methods might have missed.

691

:

Yeah, so some work I'm working on right

now is with a group of researchers who are

692

:

really interested in figuring out how

strong the evidence is that there is no

693

:

effect, right?

694

:

That some path is zero within a bigger

structural model.

695

:

And with frequentist analysis, all we can

really do is fail to reject the null,

696

:

right?

697

:

We have an absence of evidence.

698

:

but that doesn't mean that there's

evidence of absence.

699

:

And so we can't really quantify like how

strong or how convinced we should be that

700

:

that null is really a null effect.

701

:

But with Bayesian methods, we have Bayes

factors, right?

702

:

And we can actually explicitly test the

evidence in favor of the estimate being

703

:

zero versus the estimate being not zero,

right?

704

:

Either smaller or larger than zero.
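
One common way to quantify evidence for a path being exactly zero is the Savage-Dickey density ratio: the Bayes factor in favor of the null equals the posterior density at zero divided by the prior density at zero. The sketch below is an assumption-laden illustration and not necessarily the approach used in the project mentioned here. It treats the path estimate as normally distributed with a known standard error and puts a normal prior centered at zero on the path; the function names and numbers are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def savage_dickey_bf01(estimate, se, prior_sd):
    """Bayes factor for H0: path = 0 versus H1: path ~ Normal(0, prior_sd^2).

    Assumes the estimate is normally distributed around the true path
    with standard error se, so the posterior under H1 is also normal.
    """
    post_var = 1.0 / (1.0 / se ** 2 + 1.0 / prior_sd ** 2)
    post_mean = post_var * estimate / se ** 2
    posterior_at_zero = normal_pdf(0.0, post_mean, math.sqrt(post_var))
    prior_at_zero = normal_pdf(0.0, 0.0, prior_sd)
    return posterior_at_zero / prior_at_zero

bf01_null_like = savage_dickey_bf01(estimate=0.0, se=0.1, prior_sd=1.0)
bf01_clear_effect = savage_dickey_bf01(estimate=0.5, se=0.1, prior_sd=1.0)
```

Values above 1 favor the null and values below 1 favor an effect. The result depends on `prior_sd`, which is one reason to pair this kind of analysis with a prior sensitivity check.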

705

:

And so that's really...

706

:

When I talked to the applied researchers,

once they came to me with this problem,

707

:

which started as just like a structural

equation modeling problem, but then I was

708

:

like, well, have you ever considered using

Bayesian methods?

709

:

Because I feel like it could really help

you get at that question.

710

:

Like how strong is that evidence relative

to the evidence for an effect, right?

711

:

And so we've been working on that right

now and it is very interesting to see the

712

:

results and then also to communicate that

with them and see.

713

:

They get so excited about it.

714

:

So that's been fun.

715

:

Yeah, for sure.

716

:

That's super cool.

717

:

And you don't have anything to share in

the show notes yet, right?

718

:

Not yet.

719

:

No, I'll keep you posted.

720

:

Yeah, for sure.

721

:

Because maybe by the time of publication,

you'll have something for us.

722

:

Yes.

723

:

And now I'd like to talk a bit about

your...

724

:

your teaching because you teach a lot of

classes.

725

:

You've talked a bit about that already at

the beginning of the show, but how do you

726

:

approach teaching Bayesian methods to

students in your program, which is the

727

:

statistics, measurement, and evaluation in

education program?

728

:

Yeah, so I got to be honest and say I have

never taught an entire class on Bayesian

729

:

methods yet.

730

:

I'm very excited that I just talked with

my colleagues and I got the okay to

731

:

develop it and put it on the schedule.

732

:

So it's coming.

733

:

But I did recently join a panel

discussion, which was about teaching

734

:

Bayesian methods.

735

:

It was organized by the Bayesian Education

Research and Practice Section of the ISBA

736

:

Association.

737

:

And so the other two panelists, I was

really starstruck.

738

:

to be honest, were E.J.

739

:

Wagenmakers and Joachim Vandekerckhove,

which are like, to me, those are really

740

:

big names.

741

:

And so talking to them, I really learned a

lot during that panel.

742

:

I felt like I was more on the panel as

an audience member, but it was great

743

:

for me.

744

:

And so from that, I think if I do get

to teach a class on Bayesian methods,

745

:

which hopefully will be soon.

746

:

I think I really want to focus on showing

students the entire Bayesian workflow,

747

:

right?

748

:

Just as we were talking about, starting

with figuring out priors, prior predictive

749

:

checks, maybe some of that fancy

calibration.

750

:

And then also doing sensitivity analyses,

looking at the fit with the posterior

751

:

predictive samples, all of that stuff.

752

:

I think...

753

:

For me, I wouldn't necessarily combine

that with structural equation models

754

:

because those are already pretty

complicated models.

755

:

And so I think within a class that's

really focused on Bayesian methods, I

756

:

would probably stick to a simple but

general model, such as a linear regression

757

:

model, for example, to illustrate all of

those steps.

758

:

Yeah, I've been just buying, like I have a

whole bookshelf now of books on Bayesian

759

:

and teaching Bayesian.

760

:

And so I'm excited to start reading those.

761

:

developing my class soon. Yeah, that's super

exciting, well done, congrats on that. I'm

762

:

glad to hear that. So first, E.J. Wagenmakers

was on the show. I don't remember which

763

:

episode, but I will definitely link to it

in the show notes. And second, yeah, which

764

:

books are you gonna use? Well,

765

:

Good question.

766

:

So there's one that I kind of like, but it

is very broad, which is written by David

767

:

Kaplan, who's at the University of

Wisconsin-Madison.

768

:

And it's called, I think, Bayesian

Statistics for the Social Sciences.

769

:

And so what I like about it is that many

of the examples that are used throughout

770

:

the book are very relevant to the students

that I would be teaching.

771

:

And it also covers a wide range.

772

:

of models, which would be nice.

773

:

But now that I've like philosophically

switched more to this workflow

774

:

perspective, it's actually a little bit

difficult to find a textbook that covers

775

:

all of those.

776

:

And so I may have to rely a lot on some of

the online resources.

777

:

I know there's some really great posts by,

I'm so bad with names.

778

:

I want to say his name is Michael

something.

779

:

Where he talks about workflow.

780

:

Yes, probably.

781

:

Yes, that sounds familiar.

782

:

His posts are really informative and so I

would probably rely on those a lot as

783

:

well.

784

:

Especially because they also use

relatively simpler models.

785

:

I think, yeah, for some of the components

of the workflow that they just haven't

786

:

been covered in textbooks as much yet.

787

:

So if anyone is writing a book right now,

please add some chapters on those lesser

788

:

known.

789

:

components, that would be great.

790

:

Yeah.

791

:

Yeah, so there is definitely Michael

Betancourt's blog.

792

:

And I know Andrew Gelman is writing a book

right now about the Bayesian workflow.

793

:

So the Bayesian workflow paper.

794

:

Yeah, that's a good paper.

795

:

Yeah, I'll put it in the show notes.

796

:

But basically, he's turning that into a

book right now.

797

:

Amazing.

798

:

Yeah, so it's gonna be perfect for you.

799

:

And have you taken a look at his latest

book, Active Statistics?

800

:

Because that's exactly for preparing

teachers to teach Bayesian stats.

801

:

Yes, he has, I feel like, an older

book as well where he has these

802

:

activities, but it's really nice that he

came out with this newer, more recent one.

803

:

I haven't read it yet, but it's on my

804

:

on my to buy list.

805

:

I have to buy these books through the

department, so it takes a while.

806

:

Yeah, well, and you can already listen to

episode 106 if you want.

807

:

He was on the show and talked exactly

about these books.

808

:

Amazing.

809

:

I'll put it in the show notes.

810

:

And what did we talk about?

811

:

There was also Michael Betancourt, E.J.

812

:

Wagenmakers,

813

:

Active Statistics, Michael Betancourt,

yeah, and the Bayesian workflow paper.

814

:

Yeah, thanks for reminding me about that

paper.

815

:

Yeah, it's a really good one.

816

:

I think it's going to be helpful.

817

:

I'm not sure they cover SBC already, but

that's possible.

818

:

But SBC, in any case, you'll have it in

the BayesFlow tutorial that I already

819

:

linked to in the show notes.

820

:

So I'll put out that.

821

:

And actually, what are future developments

in Bayesian stats that excite you the

822

:

most, especially in the context of

educational research?

823

:

Well, what you just talked about, and this

amortized estimation thing is very

824

:

exciting to me.

825

:

I think, as I mentioned, one of the

biggest hurdles for people switching to

826

:

Bayesian methods is just the time

commitment, especially with structural

827

:

equation models.

828

:

And so knowing that people are working on

algorithms that will speed that up, even

829

:

for a single analysis, it's just really

exciting to me.

830

:

And in addition to that, sort of in a

similar vein, I think a lot of smart

831

:

people are working on software, which is

lowering barriers to entry.

832

:

People in education, they know a lot about

education, right?

833

:

That's their field, but they don't have

time to really dive into.

834

:

Bayesian statistics.

835

:

And so for a long time, it was very

inaccessible.

836

:

But now, for example, as you already

mentioned, Ed Merkle, he has his package

837

:

blavaan, which is great for people who are

interested in structural equation modeling

838

:

and Bayesian methods.

839

:

And sort of similarly, Paul

Bürkner has that brms package.

840

:

And then if you want to go even more

accessible, there's JASP.

841

:

which is that point and click sort of

alternative to SPSS, which I really enjoy

842

:

showing people to let them know that they

don't need to be afraid that they'll lose

843

:

access to SPSS at some point in their

life.

844

:

So I think those are all great things.

845

:

And in a similar vein, there are so many

more online resources now

846

:

than when I first started learning about

Bayes. Like, when people have questions or

847

:

they want to get started, I have so many

links to send them of like papers, online

848

:

courses, YouTube videos, podcasts like

this one.

849

:

And so I think that's what's really

exciting to me, not so much what we're

850

:

doing behind the scenes, right?

851

:

The actual method itself, although that's

also very exciting, but for working with

852

:

people.

853

:

in education or other applied fields.

854

:

I'm glad that we are all working on making

it easier.

855

:

So, yeah.

856

:

Yeah.

857

:

So first, thanks a lot for recommending

the show to people.

858

:

I appreciate it.

859

:

And yeah, completely resonate with what

you just told.

860

:

Happy to hear that the educational efforts

are.

861

:

useful for sure that's something that's

very dear to my heart and I spend a lot of

862

:

time doing that so my people and yeah as

you are saying it's already hard enough to

863

:

know a lot about educational research but

if you have to learn a whole new

864

:

statistical framework from scratch it's

very hard and more than that it's not

865

:

really valued and incentivized in the

academic realm so like why would you even

866

:

spend time doing that?

867

:

You'd much rather write a paper.

868

:

So that's like, that's for sure that's an

issue.

869

:

So yeah, definitely working together on

that is definitely helping.

870

:

And on that note, I put all the links in

the show notes and also Paul Burkner was

871

:

on the show episode 35.

872

:

So for people who want to dig deeper about

Paul's work, especially BRMS, as you

873

:

mentioned, Sonja,

874

:

definitely give a

listen to that episode. And also,

875

:

for people who are using Python more than

R but really like the formula syntax

876

:

that brms has, you can do that in Python.

You can use a package called Bambi, and it's

877

:

basically brms in Python.

878

:

It's built on top of PyMC, and it's

built by a lot of very smart and cool

879

:

people, like my friend Tomás Capretto.

880

:

He's one of the main core developers.

881

:

I just released actually an online course

with him about advanced regression in

882

:

Bambi and Python.

883

:

So it was a fun course.

884

:

We've been developing that for the last

two years and we released that this week.

885

:

So I have to say I'm quite relieved.

886

:

Congratulations.

887

:

Yeah, that's exciting.

888

:

Yeah, that was a very fun one.

889

:

It's just, I mean, it took so much time

that because we wanted something that was

890

:

really comprehensive and as evergreen as

it gets.

891

:

So we didn't want to do something, you

know, quick and then having to do it all

892

:

over again one year later.

893

:

So I wanted to take our time and basically

take people from normal linear regression

894

:

and then okay, how do you generalize that?

895

:

How do you handle?

896

:

non-normal likelihoods, how do you handle

several categories?

897

:

Because most of the examples out there in

the internet are somewhat introductory.

898

:

How do you do Poisson regression and

binomial regression most of the time?

899

:

But what about the most complex cases?

900

:

What happens if you have zero inflated

data?

901

:

What happens if you have data that's very

dispersed that a binomial or a Poisson

902

:

cannot handle?

903

:

What happens if you have multi-categorical

data?

904

:

More than two categories.

905

:

You cannot use the binomial.

906

:

You have to use the categorical or

the multinomial distributions.

907

:

And these ones are harder to handle.

908

:

You need another link function than the

inverse logit.
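
The point about needing a different link function can be illustrated with the softmax, the usual generalization of the inverse logit to more than two categories. This is a plain-Python sketch for intuition only and is not taken from the course or from any library; the function names and example values are illustrative.

```python
import math

def inverse_logit(eta):
    """Maps one linear predictor to a probability (two categories)."""
    return 1.0 / (1.0 + math.exp(-eta))

def softmax(etas):
    """Maps one linear predictor per category to probabilities that sum to 1."""
    largest = max(etas)  # subtract the max for numerical stability
    exps = [math.exp(e - largest) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]

# Three categories: the first linear predictor is fixed at 0 as the reference.
probs = softmax([0.0, 1.2, -0.4])

# With two categories and a zero reference, softmax reduces to the inverse logit.
two_category = softmax([0.0, 0.7])[1]
binary = inverse_logit(0.7)
```

Fixing one category's linear predictor at zero is a common way to make the model identifiable, since adding a constant to every predictor leaves the softmax probabilities unchanged.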

909

:

So it's a lot of stuff.

910

:

But the cool thing is that then you can do

really powerful models.

911

:

And if you marry that with hierarchical

models, that is really powerful stuff that

912

:

you can do.

913

:

So yeah, that's what the whole course is

about.

914

:

I'll have Tomi actually on the show to

talk about that with him.

915

:

So that's going to be a fun one.

916

:

Yeah, I'm looking forward to hearing more

about it.

917

:

Sounds like something I might recommend to

some people that I know.

918

:

Yeah, yeah.

919

:

that's exciting.

920

:

Yeah, yeah, for sure.

921

:

Happy to.

922

:

Happy to.

923

:

Like, send you the link. I put the

link in the show notes anyway, so that

924

:

people who are interested can take a

look. And of course, patrons of the show

925

:

have a 10% discount because they

are the best listeners in the world, so, you

926

:

know, they deserve a gift. Yes, they are. Well,

Sonja, I've already taken quite a lot of

927

:

your time, so we're gonna start

closing up, but

928

:

I'm wondering if you have any advice to

give to aspiring researchers who are

929

:

interested in incorporating Bayesian

methods into their own work and who are

930

:

working in your field, so educational

research?

931

:

Yeah, I think the first thing I would say

is don't be scared, which I say a lot when

932

:

I talk about statistics.

933

:

Don't be scared and take your time.

934

:

I think...

935

:

A lot of people may come into Bayesian

methods after hearing about frequentist

936

:

methods for years and years and years.

937

:

And so it's going to take more than a week

or two to learn everything you need to

938

:

know about Bayes, right?

939

:

That's normal.

940

:

We don't expect to familiarize ourselves

with a whole new field in a day or a week.

941

:

And that's fine.

942

:

Don't feel like a failure.

943

:

Then.

944

:

I don't know, I would also try and look

for papers in your field, right?

945

:

Like if you're studying school climate, go

online and search for school climate Bayes

946

:

and see if anyone else has done any work

on your topic of interest using this new

947

:

estimation method.

948

:

It's always great to see examples of how

other people are using it within a context

949

:

that you are familiar with, right?

950

:

You don't have to start reading all these

technical papers.

951

:

You can stay within your realm of

knowledge, within your realm of expertise,

952

:

and then just eke out a little bit.

953

:

And then after that, I mean, as we just

talked about, there are so many resources

954

:

available that you can look for, and a lot

of them are starting to become super

955

:

specific as well.

956

:

So if you are interested in structural

equation models, go look for resources

957

:

about Bayesian structural equation

modeling.

958

:

But if you're interested in some other

model, try and find resources specific to

959

:

those.

960

:

And as you're going through this process,

a nice little side benefit that's going to

961

:

happen is that you're going to get really

good at Googling because you've got to

962

:

find all this information.

963

:

But it's out there and it's there to find.

964

:

So, yeah, that would really be my advice.

965

:

Don't be scared.

966

:

Yeah, it's a good one.

967

:

That's definitely a good one because

then...

968

:

Like if you're not scared to be

embarrassed or fail, you're gonna ask a

969

:

lot of questions, you're gonna meet

interesting people, you're gonna learn way

970

:

faster than you thought.

971

:

So yeah, definitely great advice.

972

:

Thanks, Sonja.

973

:

And people in our field, in Bayesian methods,

they are so nice.

974

:

I feel like they are just so excited

when...

975

:

I'm so excited when anyone shows any

interest in what I do.

976

:

Yeah, don't be scared to reach out to

people either because they're going to be

977

:

really happy that you did.

978

:

True, true.

979

:

Yeah, very good point.

980

:

Yeah, I find that community is extremely

welcoming, extremely ready to help.

981

:

And honestly, I have yet to find trolls

in that community.

982

:

That's really super valuable.

983

:

I feel like it helps that a lot of us came

into this area through also kind of like a

984

:

roundabout way, right?

985

:

I don't think anyone is born thinking

they're going to be a Bayesian statistician

986

:

and so we understand.

987

:

Yeah, yeah.

988

:

Yeah, well, I did.

989

:

I think my first word was prior.

990

:

So, you know, okay.

991

:

Well, you're the exception to the rule.

992

:

Yeah, yeah.

993

:

But you know, that's life.

994

:

I'm used to being the black sheep.

995

:

That's fine.

996

:

No, I think I wanted to be a football

player or something like that.

997

:

No, also I wanted to fly planes.

998

:

I wanted to be a fighter pilot at some

point later after I had outgrown football.

999

:

You're a thrill seeker.

:

01:04:36,493 --> 01:04:43,353

I wanted to be a vet or something, but

then I had to take my pets to the vet and

:

01:04:43,353 --> 01:04:44,013

they were

:

01:04:44,013 --> 01:04:47,493

bleeding and I was like, no, I don't want

to be a vet anymore.

:

01:04:48,293 --> 01:04:56,733

Well, it depends on the kind of animals

you treat, but being a veterinarian can be a

:

01:04:56,733 --> 01:05:00,053

thrill-seeking experience too.

:

01:05:00,053 --> 01:05:07,513

You know, like if you're specialized in

snakes or grizzlies or lions, I'm guessing

:

01:05:07,513 --> 01:05:12,827

it's not all the time, you know, super,

super easy and tranquil.

:

01:05:13,645 --> 01:05:14,825

No.

:

01:05:16,545 --> 01:05:17,005

Awesome.

:

01:05:17,005 --> 01:05:19,865

Well, Sonja, that was really great to have

you on the show.

:

01:05:19,865 --> 01:05:22,285

Of course, I'm going to ask you the last

two questions.

:

01:05:22,285 --> 01:05:24,625

I ask every guest at the end of the show.

:

01:05:24,625 --> 01:05:29,725

So if you had unlimited time and

resources, which problem would you try to

:

01:05:29,725 --> 01:05:30,341

solve?

:

01:05:32,045 --> 01:05:37,185

I thought about this a lot because I

wanted to solve many problems.

:

01:05:37,525 --> 01:05:41,805

So when I give this answer, I'm hoping

that other people are taking care of all

:

01:05:41,805 --> 01:05:43,005

those other problems.

:

01:05:43,005 --> 01:05:48,505

But I think something that I've noticed

recently is that a lot of people seem to

:

01:05:48,505 --> 01:05:56,505

have lost the ability or the interest in

critical thinking and being curious and

:

01:05:56,505 --> 01:05:59,245

trying to figure out things by yourself.

:

01:05:59,245 --> 01:06:02,093

And so that's something that I would like

to

:

01:06:02,093 --> 01:06:04,533

solve or improve somehow.

:

01:06:04,533 --> 01:06:10,193

Don't ask me how, but I think being a

critical thinker and being curious are two

:

01:06:10,193 --> 01:06:17,173

really important skills to have to succeed

in our society right now.

:

01:06:17,173 --> 01:06:23,513

I mean, there's so much information being

thrown at us that it's really up to you to

:

01:06:23,513 --> 01:06:26,633

figure out what to focus on and what to

ignore.

:

01:06:26,633 --> 01:06:30,061

And for that, you really need this

critical thinking skill and...

:

01:06:30,061 --> 01:06:33,461

and also the curiosity to actually look

for information.

:

01:06:33,461 --> 01:06:38,681

And so I think that's, it's also a very

educational problem, I feel.

:

01:06:38,681 --> 01:06:43,161

So it fits where I am right now in my

career, but yeah, that would be something

:

01:06:43,161 --> 01:06:44,521

to solve.

:

01:06:44,641 --> 01:06:45,601

Yeah.

:

01:06:45,861 --> 01:06:49,181

Completely understand. That was actually my

answer also.

:

01:06:49,181 --> 01:06:50,511

So I like, really?

:

01:06:50,511 --> 01:06:51,261

Yeah.

:

01:06:51,261 --> 01:06:51,641

Yeah.

:

01:06:51,641 --> 01:06:51,921

Yeah.

:

01:06:51,921 --> 01:06:53,261

I completely agree with you.

:

01:06:53,261 --> 01:06:53,621

Yeah.

:

01:06:53,621 --> 01:06:55,041

These are topics I found.

:

01:06:55,041 --> 01:06:56,161

I find them.

:

01:06:56,161 --> 01:06:57,241

I find super interesting.

:

01:06:57,241 --> 01:06:58,181

How do you,

:

01:06:58,221 --> 01:07:02,581

do we teach critical thinking, how do we

teach the scientific methods, things like

:

01:07:02,581 --> 01:07:02,821

that.

:

01:07:02,821 --> 01:07:07,401

It's always something I'm super excited to

talk about.

:

01:07:07,561 --> 01:07:13,941

Yeah, I also hope it will have some sort

of trickle down effect on all the other

:

01:07:13,941 --> 01:07:14,721

problems, right?

:

01:07:14,721 --> 01:07:18,661

Once the whole world is very skilled at

critical thinking, all the other issues

:

01:07:18,661 --> 01:07:22,001

will be resolved pretty quickly.

:

01:07:22,101 --> 01:07:27,149

Yeah, not only because it's directly

solved, but...

:

01:07:27,149 --> 01:07:34,729

I would say mainly because then you have

maybe less barriers.

:

01:07:35,809 --> 01:07:39,789

And so yeah, probably coming from that.

:

01:07:40,429 --> 01:07:46,709

And then second question, if you could

have dinner with any great scientific

:

01:07:46,709 --> 01:07:50,869

mind, dead, alive or fictional, who would it be?

:

01:07:51,929 --> 01:07:54,285

So I ended up

:

01:07:54,285 --> 01:07:59,485

choosing Ada Lovelace, who's like one of

the first or maybe the first woman who

:

01:07:59,485 --> 01:08:02,525

ever worked in the computer programming area.

:

01:08:02,525 --> 01:08:06,785

I think she's very interesting. I also

recently found out that she passed away

:

01:08:06,785 --> 01:08:12,505

when she was only like 36, which is like,

I'm getting to that age, and she

:

01:08:12,505 --> 01:08:16,665

already accomplished all these things by

the time she passed away. And so now I'm

:

01:08:16,665 --> 01:08:22,145

like, okay, I gotta step it up. But

I would really love to talk to her about

:

01:08:22,145 --> 01:08:24,027

just her experience

:

01:08:24,077 --> 01:08:30,657

being so unique in that very manly world

and in that very manly time in general, I

:

01:08:30,657 --> 01:08:36,017

think it would be very interesting to hear

the challenges and also maybe some

:

01:08:36,017 --> 01:08:40,577

advantages or like benefits that she saw,

like why did she go through all this

:

01:08:40,577 --> 01:08:42,597

trouble to begin with?

:

01:08:42,597 --> 01:08:47,237

Yeah, I think it would be an interesting

conversation to have for sure.

:

01:08:48,017 --> 01:08:49,757

Yeah, yeah, definitely.

:

01:08:49,757 --> 01:08:50,757

Yeah, great choice.

:

01:08:50,757 --> 01:08:53,549

I think somebody already

:

01:08:53,549 --> 01:08:54,289

had answered.

:

01:08:54,289 --> 01:08:59,269

I don't remember who, but yeah, it's not a

very common choice.

:

01:08:59,269 --> 01:09:02,709

We can have a dinner party together.

:

01:09:03,049 --> 01:09:04,129

Yeah, exactly.

:

01:09:04,129 --> 01:09:05,329

That's perfect.

:

01:09:05,449 --> 01:09:06,649

Fantastic.

:

01:09:06,929 --> 01:09:07,689

Great.

:

01:09:07,689 --> 01:09:08,289

Thank you.

:

01:09:08,289 --> 01:09:09,989

Thank you so much, Sonja.

:

01:09:09,989 --> 01:09:11,829

That was a blast.

:

01:09:11,889 --> 01:09:13,889

I learned so much.

:

01:09:14,609 --> 01:09:15,909

Me too.

:

01:09:16,509 --> 01:09:18,049

You're welcome.

:

01:09:18,849 --> 01:09:23,245

And well, as usual, I put resources and a

link to your website

:

01:09:23,245 --> 01:09:26,685

in the show notes for those who want to

dig deeper.

:

01:09:26,685 --> 01:09:30,165

Thank you again, Sonja, for taking the

time and being on this show.

:

01:09:30,465 --> 01:09:31,765

Yeah, thank you.

:

01:09:31,765 --> 01:09:33,701

It was so much fun.

:

01:09:37,869 --> 01:09:41,609

This has been another episode of Learning

Bayesian Statistics.

:

01:09:41,609 --> 01:09:46,569

Be sure to rate, review, and follow the

show on your favorite podcatcher, and

:

01:09:46,569 --> 01:09:51,489

visit learnbayesstats.com for more

resources about today's topics, as well as

:

01:09:51,489 --> 01:09:56,229

access to more episodes to help you reach

a true Bayesian state of mind.

:

01:09:56,229 --> 01:09:58,149

That's learnbayesstats.com.

:

01:09:58,149 --> 01:10:02,989

Our theme music is Good Bayesian by Baba

Brinkman, feat MC Lars and Mega Ran.

:

01:10:02,989 --> 01:10:06,149

Check out his awesome work at

bababrinkman.com.

:

01:10:06,149 --> 01:10:07,309

I'm your host.

:

01:10:07,309 --> 01:10:08,289

Alex Andorra.

:

01:10:08,289 --> 01:10:12,549

You can follow me on Twitter at Alex

underscore Andorra, like the country.

:

01:10:12,549 --> 01:10:17,629

You can support the show and unlock

exclusive benefits by visiting Patreon

:

01:10:17,629 --> 01:10:19,809

.com slash LearnBayesStats.

:

01:10:19,809 --> 01:10:22,249

Thank you so much for listening and for

your support.

:

01:10:22,249 --> 01:10:24,489

You're truly a good Bayesian.

:

01:10:24,489 --> 01:10:28,039

Change your predictions after taking

information in.

:

01:10:28,039 --> 01:10:34,149

And if you're thinking I'll be less than

amazing, let's adjust those expectations.

:

01:10:34,605 --> 01:10:40,065

Let me show you how to be a good Bayesian.

Change calculations after taking fresh

:

01:10:40,065 --> 01:10:46,045

data in. Those predictions that your brain

is making, let's get them on a solid

:

01:10:46,045 --> 01:10:47,845

foundation
