Artwork for podcast Learning Bayesian Statistics
#121 Exploring Bayesian Structural Equation Modeling, with Nathaniel Forde
Behavioral & Social Sciences • Episode 121 • 11th December 2024 • Learning Bayesian Statistics • Alexandre Andorra
Duration: 01:08:12


Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways:

  • CFA is commonly used in psychometrics to validate theoretical constructs.
  • Theoretical structure is crucial in confirmatory factor analysis.
  • Bayesian approaches offer flexibility in modeling complex relationships.
  • Model validation involves both global and local fit measures.
  • Sensitivity analysis is vital in Bayesian modeling to avoid skewed results.
  • Complex models should be justified by their ability to answer specific questions.
  • The choice of model complexity should balance fit and theoretical relevance.
  • Fitting models to real data builds confidence in their validity.
  • Divergences in model fitting indicate potential issues with model specification.
  • Factor analysis can help clarify causal relationships between variables.
  • Survey data is a valuable resource for understanding complex phenomena.
  • Philosophical training enhances logical reasoning in data science.
  • Causal inference is increasingly recognized in industry applications.
  • Effective communication is essential for data scientists.
  • Understanding confounding is crucial for accurate modeling.

Chapters:

10:11 Understanding Structural Equation Modeling (SEM) and Confirmatory Factor Analysis (CFA)

20:11 Application of SEM and CFA in HR Analytics

30:10 Challenges and Advantages of Bayesian Approaches in SEM and CFA

33:58 Evaluating Bayesian Models

39:50 Challenges in Model Building

44:15 Causal Relationships in SEM and CFA

49:01 Practical Applications of SEM and CFA

51:47 Influence of Philosophy on Data Science

54:51 Designing Models with Confounding in Mind

57:39 Future Trends in Causal Inference

01:00:03 Advice for Aspiring Data Scientists

01:02:48 Future Research Directions

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan, Francesco Madrisotti, Ivy Huang, Gary Clarke, Robert Flannery, Rasmus Hindström and Stefan.

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.



Today I am thrilled to host Nathaniel Forde, a staff data scientist at Personio, where he works on people analytics for one of the leading HR intelligence platforms. With more than a decade of experience spanning insurance, gaming and e-commerce, Nathaniel brings a wealth of knowledge to the table. He's also an active contributor to the PyMC ecosystem, with a particular focus on causal inference.

In this episode, Nathaniel takes us on a deep dive into the world of structural equation modeling, or SEM, and confirmatory factor analysis, or CFA. We explore the advantages of Bayesian approaches in handling these models, from greater flexibility to enhanced model validation through sensitivity analysis. Whether you're curious about fitting complex models to real-world data, the intersection of SEM and causal inference, or the growing role of Bayesian methods in industry applications, this conversation offers insights for both beginners and seasoned practitioners.

This is Learning Bayesian Statistics, episode 121, recorded October 11, 2024.

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. For any info about the show, learnbayesstats.com is Laplace to be. Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on Patreon, everything is in there. That's learnbayesstats.com. If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all.

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com.

And before we start the show, I wanted to particularly welcome our new members in our small Bayesian family on Patreon. Thank you so much to the mysterious and drilled 9830, and to Alex, for joining the Full Posterior tier or higher on Patreon. I hope that you will enjoy your merch. And Alex, I want to tell you that not only do I love your first name, of course, but you also joined on November 13, which is my birthday, so thank you so much for the birthday present, Alex. Can't wait to see you guys in the Slack channel.

And now, let's hear from Nathaniel.

Nathaniel Forde, welcome to Learning Bayesian Statistics.

Thanks for having me, Alex.

Yeah, thanks for taking the time. It's always a pleasure to have you on the show, even though it's the first time. I mean, on the main format; you've already been on the show, actually, to do a modeling webinar. I think it's the first time that happens. So you first appeared doing a great modeling webinar about Bayesian non-parametric causal inference. You demoed how to do that with PyMC, and I think there is even some BART model in there. So it's a very in-depth tutorial that I definitely recommend people interested in that to check out. I will add it to the show notes, of course, as well as the video of the webinar, because, yeah, I encourage people to follow along with the video: you're demoing that live with the tutorial, and also answering some of the live questions.

And you're a specialist of these in-depth guides and tutorials, mainly about causal inference, because it's one of your favorite topics. And that's also why I thought it was interesting to have you on the show today, because you have a new tutorial out on the PyMC website. I was thinking, okay, now it's time to have you on the show and also talk about what you do and your background. So thanks for taking the time, and, well, let's start as we always do. Can you tell us what you're doing nowadays and how you ended up working on this?

Yeah, certainly. So I'm a data scientist for Personio at the moment. It's a sort of HR intelligence platform, a kind of one-stop shop for all your HR workflows and insights. I've been working there for about three years, and I've been working in data science or data-science-adjacent roles in industry for about 10 years. So yeah, I think I got a little bit lucky with my first job in industry, which kind of set me up well to move into data science more generally, which I can speak to as well.

I kind of graduated from university into the teeth of the global financial crisis, so I took the first job I could get. But it was fortunate, because I got a job with Marsh & McLennan, which is a huge reinsurance and insurance kind of company; they've got Oliver Wyman, Guy Carpenter and Marsh underneath them. And they were spinning up an innovation center in Dublin. My first job was working for Marsh on their new data quality function, which was kind of a neat first job for anyone who was data-curious, because the flow of that job was: every quarter you'd get a new data set from Marsh and you'd have to try and evaluate it for data quality: missing data, poor data entry, this kind of thing. It basically highlighted the risks of poor data quality for all the use cases that Marsh had in the company. So it was a good first training ground for a budding data scientist, we'll say.

Yeah, for sure.

And how did you end up then in the... I can see causal inference coming in here, but how did Bayes come up? Do you remember when you were first introduced to Bayesian stats?

Yeah, I was thinking about this the other day. So I think I had to be introduced to Bayesian stats twice, and the first time it didn't really take for me. Like, in university I did philosophy and kind of went into mathematical logic after that. And as a sort of tangent from studying mathematical logic, I was working on different logics of dependence, like justification logics and dependence structures and independence structures. And I kind of came across Judea Pearl's work on independence structures and the relationships between probabilistic independence and graphs, like directed acyclic graphs. And that sort of backed me into thinking about conditional probabilities and Bayesian probabilities. I even worked on trying to replicate Pearl's completeness proof of the relationships between probabilistic algebra and the independence relationships on those graphs. So I came across Bayes from a highly theoretical perspective, and it didn't stick. It didn't resonate with me. I worked through some of the mathematics, but I didn't get it, I don't think.

And it really took exposure in industry. In my second job, I think, so I worked in Guy Carpenter for about a year, and it was kind of a nice role because I worked with these catastrophic risk modelers. They were building risk models for portfolios of property that were at risk from natural disasters like earthquakes or floods or fires. And they would try and simulate, basically Monte Carlo kind of modeling, to estimate the risk to each portfolio of properties. And seeing that application was my first exposure to this as a practical tool that can be very useful and very informative for making massively important decisions about the nature of catastrophic risk. Like, countries were buying insurance contracts from, say, Guy Carpenter; Mexico would insure itself against earthquake risk based on simulation data. So that was the second time, and it really stuck with me that this was a practical tool. And it connected the dots in my head a little bit between the theoretical background and the practicalities. And then I just started looking into where I should learn this, and found PyMC and the welcoming community there. And you know the rest of the story.

Yeah.

So actually, today I invited you because you have this new tutorial about structural equation modeling and confirmatory factor analysis. So that's already two acronyms, SEM and CFA, that we should definitely introduce. Could you start by explaining the basics of SEM and CFA for our listeners who might be new to these concepts?

Yeah, so I think maybe the way I kind of came into trying to understand factor analysis, and sorry, confirmatory factor analysis, was probably initially the scikit-learn factor analysis methods. The presentation there, in a typical sort of machine learning workflow, is to think of factor analysis as a dimensionality reduction technique. So you have this wide array of features, your X matrix in your prediction problem or whatever, and you think that there's a bunch of them that are related, and you want to zip them up into one feature that simplifies your modeling workflow. So you take these related features, push them through this factor analysis routine, and it spits out maybe one factor or two factors, depending on what you're aiming at. But it reduces the complexity of your data set by, not exactly aggregating up multiple features, but creating new features from existing features. So that was my first exposure to factor analysis.

Confirmatory factor analysis is less common in machine learning workflows, but more common in psychometrics and the social sciences, educational sciences, learning development sciences, where the factor itself, the thing you wrap up all your features into, is of independent interest. So you might have this kind of construct or notion of, say, mathematical aptitude, which is not itself directly measurable, but you have a bunch of measures that seem related to it, right? Or that should be informed by it. Like historic test results on, say, your junior cycle math exams and your other cycle math exams: how do they all collectively inform a view of your individual mathematical aptitude? So a confirmatory factor analysis model is interested in these abstract constructs, and you might gather a wide data set and want to reduce it in terms of complexity, but also see which are good measures for which of these latent factors.

So yeah, basically it's a dimensionality reduction technique, but the things that you're reducing these multiple indicator variables to are of independent interest themselves. You might want to try and measure this latent construct of mathematical aptitude and see if that itself is predictive of future math scores, for instance. So that's kind of factor analysis. And the confirmatory part is that you're making a statement, by building this model, that these indicator variables, like, say, your historic math scores, load onto this aptitude factor well, and that they make sense collectively as indicators for this abstract notion of aptitude. So what you're trying to do when you fit a confirmatory factor analysis is make sure that the relationships between these abstract constructs and your variables, as specified in your modeling architecture, make sense and can recover aspects of the observed relationships in your data matrix. Primarily, you're interested in recovering the correlations and covariance structures between your observed variables. And so as you build out that architecture of confirmatory relationships, you then evaluate the success of your factor model by seeing how well it can recover the covariance structure in your observed data.
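To make the covariance-recovery idea concrete, here is a minimal numerical sketch of a one-factor model. This is not code from the tutorial; the loadings and residual variances are invented for illustration. It builds the model-implied covariance of the indicators (loadings times factor variance times loadings transposed, plus residual variances) and checks that data simulated from the model reproduces it, which is exactly the sense of "recovery" a CFA aims at.

```python
import numpy as np

# A one-factor CFA for three indicators: each observed variable x_i is
# modeled as x_i = lambda_i * factor + noise_i. All numbers are hypothetical.
lam = np.array([[0.9], [0.8], [0.7]])   # factor loadings (made up)
theta = np.diag([0.19, 0.36, 0.51])     # residual (unique) variances
phi = np.array([[1.0]])                 # latent factor variance fixed to 1

# Model-implied covariance of the indicators: Sigma = Lambda Phi Lambda' + Theta
sigma = lam @ phi @ lam.T + theta

# Simulate from the model and check the sample covariance approaches the
# implied covariance -- the recovery a CFA is evaluated on.
rng = np.random.default_rng(0)
f = rng.normal(size=(100_000, 1))
x = f @ lam.T + rng.normal(size=(100_000, 3)) * np.sqrt(np.diag(theta))
sample_cov = np.cov(x, rowvar=False)

print(np.round(sigma, 2))
print(np.round(sample_cov, 2))
```

With the loadings above, the implied covariance between the first two indicators is 0.9 × 0.8 = 0.72, and the simulated sample covariance lands close to that.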

Okay, so does that mean CFA has more structure in it, because you assume the model has a structure, almost like a DAG?

Yes. Imagine you have a data matrix with six variables. You say that the first three variables are related to life satisfaction, in the case that I was looking at, and the second three variables are related to measures of parental support. So you want to abstract across those six measures to have one score for each individual on how well supported they were by their parents, and another score for how well they report life satisfaction. But you are imposing that structure: you're imposing it to say these three variables load on this factor, those three variables load on that factor, and you're trying to confirm that structural specification by seeing how well that model can then reproduce the observed covariance structures between the six indicator metrics that you actually have in your data set.

And the difference with classic factor analysis here, in that example, for instance, would be that you would not be telling the model that the first three variables relate to life satisfaction and the last three relate to parenting. You would just put everything in the same model and see if the model picks that up itself.

Machine learning approaches are often less theory-driven and more about: does it perform better predictively on the outcome I care about? Confirmatory factor analysis is inherently theory-driven. You're trying to describe what you believe to be the operating theory that drives these outcomes. And by trying to confirm the factor analysis structure that you've specified in your model, you're in some sense trying to validate a theory of what drives these outcomes of interest.

Okay, yeah. So I can clearly see the link then with DAGs and causal inference, for sure.

Yeah.

And so, I mean, that's already seemingly complicated enough, or at least hard to explain in a podcast setting.

And then structural equation modeling, on top of that, is a way, in some sense, to add yet more structure to the theory that you wish to confirm. So now you have all your measures for your indicator variables, your three for life satisfaction and three for parental support, but now you want to say that, of the two latent constructs that you've just defined, there's a regression relationship between those two constructs, for instance. So then, say, we try to predict life satisfaction from parental support. And structural equation modeling is used across a bunch of diverse fields, and maybe means subtly different things in each, but inherently it's just about adding more explicit structure to the relationships in your big multivariate joint distribution. And you can be quite explicit in how you structure those relationships, or those dependency chains. You can have chains of regressions, or chains of functional relationships, between your indicator variables, but also between the latent constructs that you derive via your measurement model, the factor analysis structure that you have baked into your structural equation model.
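The structural regression between the two latent constructs can also be sketched numerically. This is a hypothetical illustration of the six-indicator, two-factor setup described above, with made-up loadings and a made-up structural coefficient; it shows how the covariances between the support indicators and the satisfaction indicators are induced entirely through the latent regression.

```python
import numpy as np

# Sketch of the SEM idea: six indicators, the first three loading on
# "parental support", the last three on "life satisfaction", plus a
# structural regression between the two latents. All numbers invented.
rng = np.random.default_rng(1)
n = 200_000

b = 0.5                                  # structural coefficient: support -> satisfaction
support = rng.normal(size=n)
satisfaction = b * support + rng.normal(scale=np.sqrt(1 - b**2), size=n)

lam_sup = np.array([0.8, 0.7, 0.6])      # loadings for parental-support indicators
lam_sat = np.array([0.9, 0.8, 0.7])      # loadings for life-satisfaction indicators

x_sup = support[:, None] * lam_sup + rng.normal(scale=0.5, size=(n, 3))
x_sat = satisfaction[:, None] * lam_sat + rng.normal(scale=0.5, size=(n, 3))
x = np.hstack([x_sup, x_sat])

# Cross-block covariances are induced only by the latent regression:
# cov(x_sup_i, x_sat_j) = lam_sup_i * b * lam_sat_j
implied_cross = np.outer(lam_sup, b * lam_sat)
sample_cross = np.cov(x, rowvar=False)[:3, 3:]

print(np.round(implied_cross, 2))
print(np.round(sample_cross, 2))
```

The point of the sketch is that the two indicator blocks never touch each other directly; their observed covariance is carried entirely by the regression between the two latent constructs.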

Okay. Yeah, I see. So actually, for people interested in more about structural equation modeling, causal inference and psychometrics, I recommend listening to episode 102 with Ed Merkle. I linked to that in the show notes, because Ed is doing psychometrics, and SEM really seems to be used a lot in psychometrics. So for people interested in that, I definitely recommend that one. Today, we're going to focus on your tutorial about CFA and SEM, Nathaniel. I know you prefer going by that. Actually, I'm curious: what were the main goals of this tutorial, and what prompted you to explore this area?

What prompted me to explore... To be clear, none of the data we used in that tutorial is Personio's data. But I work in Personio, which is an HR platform and intelligence platform, and one of the primary ways in which HR departments attempt to gauge their effect on the employee base in any company is to run regular surveys of different stripes. The main, or canonical, survey that gets run across most industries is an engagement survey, where you're trying to build a model or understand what drives employee engagement or satisfaction in some sense or other. And those surveys can be quite dense: there are many, many questions across different themes, about your work-life balance, or your working conditions, or your autonomy. So it's kind of a psychometric-adjacent question: you're trying to figure out what metrics load on aspects of employee happiness, how that happiness drives their engagement, and, ultimately, questions of how engagement drives productivity and the business's bottom line. And it's a problem that is just well suited to structural equation modeling.

And I didn't know enough about structural equation modeling to be comfortable applying sort of out-of-the-box tools. So I wanted to dive into the modeling framework, understand it better, and build it in a framework I know well, which is PyMC. And the beautiful thing about probabilistic programming languages, and PyMC as well, is that it offers you this language, or freedom, to express those complicated modeling structures in a way that, A, I'm familiar with, but, B, is also very intuitive and powerful given the Bayesian setting.

I see. Yeah. So that was a way for you to learn out in the open, in a way. And so, what are some of the key challenges, actually, that you experienced in applying SEM and CFA?

:

Yeah.

204

:

So there's a lot of, like I think you mentioned, Blavan is like a package that's being

developed for Bayesian structural equation modeling.

205

:

by Ed Merkel.

206

:

Exactly.

207

:

And that is sort of an augmentation of the more traditional Levan package, which is also

used to estimate structural equation models.

208

:

Levan is, I think, tries to fit these models in a more

209

:

as a classical way or frequentist way.

210

:

it often fits these models with maximum likelihood and sort of methods.

211

:

And so I've tried to use those those models and because there's a lot of tooling around

Levan, which makes it easier to interpret these models.

212

:

And it's a powerful package in its own right.

213

:

But these models are sort of inherently quite complex, like you're building rich

structural relationships.

214

:

with many, parameters in a lot of cases across complex and possibly not well understood

relationships in this sort multivariate survey that you're going to be working with.

215

:

And I often found that the model fit either wouldn't converge or if it did converge, it

would be a saturated model, which can be fine, don't get me wrong, but one of the sort of

216

:

difficult balancing acts you have with

217

:

fitting these structural equation models with maximum likelihood is that they, and this is

a detail that I think is relevant for our discussion about Bayesian model fitting, that

218

:

Levin model tries to fit the data by optimizing the fit of the covariance matrix against

the observed covariance matrix.

219

:

So the problem that you have there is that you have a number of degrees of freedom to

spend in your sort of model fitting routine.

220

:

And you can, with these complex models, can quickly exhaust your degrees of freedom and

you get a saturated model very quickly, which then becomes harder to sort of build on and

221

:

sort of ultimately interpret.

222

:

So that was sort of one challenge I had, sort of just working with Levan and working with

SEMS in general.

223

:

And what I kind of found to be compelling about the Bayesian approach to fitting SEM

models is that the...

224

:

The estimation routine is entirely different.

225

:

You're not aiming to fit a structure or optimize your fit to the covariance structure.

226

:

So you're not looking to fit against a aggregate of your observed data.

227

:

You're actually looking to sort of retrodict or predict the observed values of your

survey.

228

:

So it's a kind of a completely different model estimation sort of approach.

229

:

And it also kind of unlocks then for you the sort of more typical posterior predictive

checks that you get out of the Bayesian or contemporary Bayesian modeling workflow.

230

:

You can also, of course, look at the covariance structures that have been derived over

through your Bayesian modeling estimation workflow.

231

:

But it's not as limited in the manner in which it tries to fit data to the model.

232

:

And you obviously get then the

233

:

power of priors to kind of change up and down when you find like hard to fit models that

where the sampler doesn't behave well.

234

:

So yeah, that was the primary challenge and Bayesian approaches to that model, to those

complex models in general helped overcome that challenge.

Hmm, okay, yeah. And why does using the Bayesian framework here help? Is it the classic reason, that using priors actually helps a lot because it adds structure to the model? Or is it mainly something else?

Yeah, so it adds structure to the model, for sure. And you can kind of weigh in, in some sense, where you think that you would otherwise have implausible values, so it helps in that respect. But there's also a detail about how, once you have a saturated MLE fit for a SEM model, it's harder to then add more structure and interpret the outcomes of that saturated model. Whereas, because you're not really working with the same degrees-of-freedom problem when you're estimating the Bayesian model, you can add more structure, and you get different evaluation criteria that help you, I think, at least to my mind, more smoothly interpret the outcomes of your model fit.

Okay, I see. That's interesting. How was using PyMC for that kind of model? Because I don't think, when people pick up PyMC, they think about these kinds of models. Maybe causal inference, they pick it up for that; but for SEM or CFA, I'm not sure PyMC is a software that pops into people's minds easily. So can you discuss how you used PyMC in this tutorial, and what advantages you saw PyMC provide when modeling complex structures like those in your SEM and CFA tutorial?

Yeah. So, I mean, I'm very impressed with the flexibility of probabilistic programming languages in general. And I think even under the hood, blavaan, for instance, fits the complex models using Stan. It's maybe not the historic, natural fit for fitting these models; I don't know enough about the history of psychometrics. I think there have been a couple of recent books, from around 2020 and 2021, on Bayesian psychometrics or Bayesian structural equation modeling, and I would be hopeful that there'll be more Bayesian structural equation modeling done in the future. I think these models are inherently complicated to articulate, and the reason why packages like lavaan and blavaan and, I think, LISREL were so widely used is because they hide a lot of the complexity behind a nice, cleaner UI. And it's a validated UI in some sense, right? These model fit routines have been well justified and battle-tested, and you don't necessarily want to be using a bespoke tool, given the complexity that accrues to these types of modeling routines.

So yeah, I used PyMC because I'm very familiar with PyMC. These models are inherently probabilistic, so I think it is a natural fit for expressing these types of structures. But because of the complexity of these models, if you were working in industry, or if you were working in academia and you don't want to build these models by hand, you should probably use a package that is more battle-tested for SEM modeling than PyMC. That's not to say we couldn't invent an alternative to blavaan on a PyMC base, but it would be a lot of work, and I think perhaps redundant.

I see. Okay. Yeah. Still, to me, it's amazing to know that if you're already comfortable with Stan or PyMC, you can write up these kinds of models in those frameworks. That's really cool. And that's why I really encourage people to take a look at your tutorial, because, well, the code is in there. And that's, at least for me, really how I understand methods: much more by reading the code than by reading the equations for what the model is supposed to do. When I see the code, it usually really helps me understand what we're trying to do.

Yeah, 100%. That was part of the appeal for me: the way to learn the nature of the model was to see if I could build it. And I was pleasantly surprised to see that I could.

Yeah. And how do you go about... well, developing, we talked about that, but a question I often get, and that I also ask myself in my work with the Marlins, is: how do you validate the models? What criteria do you use to assess model fit and model accuracy, to understand whether going a more complex road would pay off or not? Because often, developing a model is a lot of choices, right? It's a bit like a garden of forking paths, right? You can always come up with a more complex method and model, but it's going to take time and resources, and that's an opportunity cost for you as the modeler and then, of course, for the company. So yeah, how do you think about these topics? How do you decide? When do you think that adding complexity is worth it or not?

304

:

Yeah, so these are good questions.

305

:

I think there's probably two parts to this at least.

306

:

So there's just questions of sort of evaluating model fit in general.

307

:

And for a SEM or a CFA model, it breaks into at least two sort of views on the problem.

308

:

One is sort of measures of global model fit.

309

:

So this is where you try to look at the model as a whole and you'll find one summary

statistic that is your measure of performance.

310

:

Like this kind of lens is like, maybe in a typical regression, maybe someone overweights the

importance of R squared as a measure of performance, right?

311

:

There are analogous summary statistics used in sort of traditional SEM models, a ton of

them actually.

312

:

They're sort of like indices of global model fit.

313

:

And then there are also measures of local model fit.

314

:

So what I mean there is like, instead of looking at the overall model, you look at how

well does the model recapture a sort of relationship.

315

:

So imagine your model has a covariance matrix of like kind of five by five or something

like that, right?

316

:

Where you're interested, in particular, in the covariances between the outcome of interest

317

:

and one of the main drivers in your joint distribution.

318

:

And you're primarily interested in preserving a good model fit to that relationship.

319

:

Then you can look at sort of how well your model recovers the covariance or the

correlations between that component of your overall covariance matrix after your model has

320

:

been fit.

321

:

So in that way, you can distinguish model evaluation criteria to be local model evaluation

criteria or

322

:

global model evaluation criteria.

323

:

And so in the tutorial, I tried to pull out kind of two views on this, like because it's a

Bayesian model, first of all, you can do posterior predictive checks in general.

324

:

So you can do posterior predictive checks across this multivariate distribution, right?

325

:

So across the, whatever, 15 or so input variables, how well am I retrodicting all of those

variables after the model has fit?

326

:

You can also then sort of aggregate up the relationships and sort of build a predicted

covariance relationship between each of those output or outcome variables and then compare

327

:

that to the observed covariance relationship in your sample.

328

:

So you can measure how well your model recaptures the observed covariances.

329

:

You can also measure how well your model captures the observed data points.

330

:

And you can then also do like kind of LOO or WAIC measures on the outcome in general.

331

:

And that falls into sort of the typical Bayesian posterior predictive checking sort of

routines.

332

:

So there's kind of multiple lenses on how you evaluate the model, just based purely say on

the statistical metric.

333

:

There's another kind of subtle thing to think about here.

334

:

When you have...

335

:

Like, and this goes back to the fact that confirmatory factor analysis and structural

equation models are sort of articulations of a theory that you want to evaluate.

336

:

So like if your theory is about a dependency relationship between, parental support and

life satisfaction, and you have this mediating structure between like whatever other

337

:

variables that you're interested in, you want to answer questions about that dependency

relationship.

338

:

So articulating the right SEM structure or the model structure to be able to interrogate

those questions in itself will kind of lead you to build a particular model structure,

339

:

right?

340

:

That would be, that could be compared to a less complex structure, but the less complex

structure won't be able to answer that question that you have in mind.

341

:

So like, even though you might then look at these two models,

342

:

on the aggregate statistics of fit and maybe the less complex model does better, but it

fails to answer the question you're interested in.

343

:

And so then when you're weighing up your resources and your time allocation or whatever,

like you build a model that answers the question of interest to you, whether that's like

344

:

the best fitting model or not, this is the closest articulation you can get to your

theory, your theoretical question.

345

:

which presumably is important because it's going to drive a decision for you in work or in

your day to day.

346

:

So, yeah, there's multiple dimensions on which to evaluate fit and complexity tradeoffs.

347

:

But the psychometric discipline seems to be focused on trying to answer the theoretical

question that matters.

348

:

like, yeah.

349

:

That makes sense.

350

:

And that's also usually what I answer people when they ask me about that.

351

:

It's more you have to look at a collection of metrics and comparisons, much more than just

looking at one of them and calling it a day.

352

:

Yes, that's true.

353

:

actually also another element, sorry, element of evaluating these models, especially in

the Bayesian setting, is the real and vital importance of sensitivity checking.

354

:

like sensitivity checking to how you set those priors, right?

355

:

Like if you have articulated the model you want, it has the right structure, you really

need to make sure that you're not like skewing the thing by having too loose a prior.

356

:

So like as part of this workflow with the SEM model at least, you want to be damn sure

you've done good sensitivity analysis to make sure that the results that you're interested

357

:

in hold even after like changing reasonable priors.
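The logic of that sensitivity analysis can be shown on a deliberately tiny conjugate model (a normal mean with known sigma), since refitting a full SEM per prior is expensive. This is an illustrative sketch, not SEM code: "refit" under several reasonable prior scales and check that the conclusion is stable.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.8, 1.0, size=50)  # hypothetical data, known sigma = 1

def posterior_mean(y, prior_sd, prior_mean=0.0, sigma=1.0):
    # Conjugate normal-normal update for a mean with known sigma.
    precision = 1 / prior_sd**2 + len(y) / sigma**2
    return (prior_mean / prior_sd**2 + y.sum() / sigma**2) / precision

# Re-estimate under a range of reasonable prior scales and compare.
for prior_sd in [0.1, 1.0, 10.0]:
    print(f"prior_sd={prior_sd:5.1f} -> posterior mean {posterior_mean(y, prior_sd):.3f}")
```

If the estimate swings wildly between, say, a prior scale of 1 and 10, the conclusion is being driven by the prior rather than the data, which is exactly the skew to guard against.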

358

:

Yeah, that makes sense.

359

:

That makes sense for sure.

360

:

And that's related to simulation-based calibration, I guess.

361

:

So that's also something I really recommend more and more if you have computing power or a

model that's not too long to fit, or if you can use variational inference, is doing

362

:

simulation-based calibration even before

363

:

fitting the model to real data, that's really good because that gives you confidence that

the model is doing what you want it to do.

364

:

So once you start hitting roadblocks when you start fitting the model to real data, you

know it comes either from a model structure or parameterization issue, but at least it's

365

:

not coming from the...

366

:

the DAG, if you want, that you have in mind, because you know that if the DAG you have in mind

corresponds to the data you observed, then you'll be able to recover the parameters.

367

:

So we recommend doing that also.
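A minimal simulation-based calibration loop looks like the following. It uses a conjugate normal model so exact posterior draws are available; in the SEM setting the draws would come from MCMC instead, which is why compute budget matters. The model and sizes here are illustrative, not from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(2)

# SBC for a toy model: mu ~ Normal(0, 1), y | mu ~ Normal(mu, 1), n = 20.
n_sims, n_obs, n_post = 1000, 20, 99
ranks = np.empty(n_sims, dtype=int)
for s in range(n_sims):
    mu_true = rng.normal(0.0, 1.0)                 # draw from the prior
    y = rng.normal(mu_true, 1.0, size=n_obs)       # simulate data
    post_var = 1.0 / (1.0 + n_obs)                 # conjugate update
    post_mean = post_var * y.sum()
    draws = rng.normal(post_mean, np.sqrt(post_var), size=n_post)
    ranks[s] = int((draws < mu_true).sum())        # rank of the true value

# For a well-calibrated model the ranks are uniform on {0, ..., n_post}.
hist, _ = np.histogram(ranks, bins=10, range=(0, n_post + 1))
print(hist)
```

A roughly flat rank histogram is the "doing what you want it to do" signal; a U-shape or a hump indicates the posterior is too narrow or too wide before real data ever enters the picture.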

368

:

Actually, talking about difficulties and roadblocks, are there any technical

insights or interesting findings from writing your tutorials that were particularly

369

:

striking or unexpected to you?

370

:

I mean, the thing that tripped me up most when building these models probably is I started

with this confirmatory factor baseline kind of model, which, I think, is also called

371

:

the measurement model.

372

:

So before you add extra structure amongst the variables, you want to just establish the

relationships between your observed data and the factors of interest.

373

:

So building that went reasonably smoothly.

374

:

And the thing that tripped me up when I was trying to add structure onto that was like I

initially just added regression formulas effectively into the PyMC model

375

:

context.

376

:

And it wasn't working.

377

:

And I realized at some point it was because I had just like fixed formulas.

378

:

I wasn't sampling those kinds of latent structural formulas.

379

:

I wasn't.

380

:

putting them in as a random variable.

381

:

Instead, the formula had to be put inside a normal distribution as the mu parameter.

382

:

We're predicting the center of a normal distribution, and then the model gets to sample

the rest of the random variation associated with that regression equation.

383

:

And I had been too optimistic in guessing that the fixed formula would be enough.

384

:

Rather, there's also variation or random variation that needs to be accounted for as you

build the structural components of your SEM model.

385

:

The regression equations themselves are sort of probabilistic random variables.

386

:

And secondly, when you build these regression components of your SEM model, you are taking

them out of the multivariate structure that is in your simpler

387

:

confirmatory factor analysis model, right?

388

:

But you need to put the possibility back into the model that those regression formulas in

your structure also have covariance structures.

389

:

So you needed to build extra sort of residual covariance structures to ensure that if

there is a sort of co-determining effect between two regression components of your structural

390

:

model, that model was able to articulate that.

391

:

co-determination effect and estimate the degree or strength of the co-determination

effect in your model.

392

:

So it tripped me up getting that extra structure in there.
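The two fixes described above, treating the structural formula as the center of a distribution rather than a fixed equation, and allowing residual covariance between co-determined outcomes, can be illustrated with a generative NumPy sketch. Variable names and coefficients are hypothetical, not taken from the tutorial.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Exogenous latent driver (names and coefficients are made up).
support = rng.normal(size=n)
b1, b2 = 0.7, 0.4

# A fixed formula like `satisfaction = b1 * support` leaves no residual
# variation for the sampler to explain. The structural equation should
# instead be the *center* of a distribution:
#   satisfaction ~ Normal(b1 * support, sd)
# And when two endogenous constructs are co-determined, their structural
# residuals need a joint (multivariate normal) covariance structure:
resid_cov = np.array([[1.0, 0.5],
                      [0.5, 1.0]])
eps = rng.multivariate_normal([0.0, 0.0], resid_cov, size=n)

satisfaction = b1 * support + eps[:, 0]
engagement = b2 * support + eps[:, 1]

# The residual correlation survives after removing the shared driver,
# which is exactly what the extra covariance terms let the model capture.
r_sat = satisfaction - b1 * support
r_eng = engagement - b2 * support
resid_corr = np.corrcoef(r_sat, r_eng)[0, 1]
print(f"residual correlation: {resid_corr:.2f}")
```

A model without the residual covariance term has no way to express that leftover dependence, which is one route to the divergences and convergence failures described next.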

393

:

I see.

394

:

And how did you realize it was an issue and you had to do that?

395

:

The model fits worse.

396

:

The model fits were dramatically worse.

397

:

There were more divergences, sometimes utter failure to converge.

398

:

One of the maybe surprising things is like the model structure there in PyMC

converges really fast for such a complex model.

399

:

It takes like five minutes to fit the data.

400

:

So yeah, so it just wasn't working cleanly.

401

:

you could see that in the model fit statistics, but you could also see it in the sort of

health diagnostics.

402

:

of the sampler, which is kind of a nice, that's a nice aspect of HMC in some sense.

403

:

It's like, if it's not working, it's also an indication that your model is poorly

specified, right?

404

:

So yeah.

405

:

Yeah.

406

:

I mean, divergences are really, you know, the nerdy realization of the idea that the

obstacle is the way, you know, it's just.

407

:

It's always terrible to see the divergences, but in the end, they're always a blessing,

but it really requires adapting your mindset to be like, that's great.

408

:

I have divergences instead of being like, my God, no.

409

:

Yeah.

410

:

There was some banging my head against the wall as I tried to figure it out.

411

:

Actually, so we mentioned SEM, CFA.

412

:

We mentioned causal inference, but we didn't explain yet how.

413

:

these methods intersect together.

414

:

So can you explain the role SEM and CFA play in understanding causal relationships in

data?

415

:

Yeah, so I mean, there's actually a nice paper by Judea Pearl and, I think, Ken Bollen, and

it talks about the nature of the relationship between structural equation

416

:

modeling and causal inference.

417

:

And the sort of argument there is that like it goes back to the fact that these models are

articulations of theory, like they're articulations of a relationship that you think holds

418

:

between like whether it's constructs or independent metrics or not, you're sort of

building a theory that you ultimately want to believe is causal in some sense or other in

419

:

most cases.

420

:

Doesn't give you causation for free.

421

:

You still need to

422

:

include the right variables in the model, structure them appropriately, make sure that you try to

remove as much confounding as possible.

423

:

And in some sense, factor analysis, the fact that you're collecting a bunch of metrics

underneath one factor helps you articulate or disentangle the independence relationships

424

:

between multiple data points and recover conditional independence.

425

:

structures which will give you license to make causal claims about your model if it's well

fit.

426

:

But the discipline is sort of inherently about articulating a claim about the effects of

the relationships between these variables.

427

:

So it's kind of inherently causally keyed, if you get me.

428

:

Yeah.

429

:

I see.

430

:

OK.

431

:

OK, that's interesting.

432

:

And so here the

433

:

That's interesting to me to hear that the factor analysis is helping actually in this

causal, like recovering the causal relationships.

434

:

Because intuitively I would say that it could muddy the waters because you're reducing the

dimensionality of your data, which I really get why you would do that, but I would

435

:

have intuited that that would make causal inference harder.

436

:

There's a nice quote at the end of the tutorial I have from Judea Pearl there where it's

like, you're using these factor like structures to potentially create a new construct, but

437

:

that could be like, like doctors can just create a new syndrome based on different aspects

of behavior or sort of, you know, health response that is hard to sort of think about when

438

:

it's this sort of multivariate presentation.

439

:

If you have 10 different symptoms, you want to take those 10 symptoms, group them together,

call it a syndrome or something like that, which helps you then think about the structure

440

:

between outcomes.

441

:

And so rather than having to work out what's the outcome, like do they live, die or suffer

in pain for 10 years due to this presentation of 10 symptoms, it's like, what is the

442

:

probability of survival based on having this particular syndrome?

443

:

Right?

444

:

it's a sort of, it's not like, not only is it a dimensionality reduction technique, it's also

sort of a theory refinement technique, right?

445

:

Cause you have lots of data around things which are potentially presenting in a very

complex way. By gathering them under a factor, you sort of unify, make more elegant, or

446

:

clean your theory in a way that you hope to be justified theoretically.

447

:

Like it's not just like.

448

:

There's work to be done to make a SEM model a compelling causal account of the

relationships between these variables.

449

:

So it's not for free, but like the structure lends itself to building a theory about

what drives what, what causes what.

450

:

And so that's how you can use it to articulate those, the complexity of those

relationships.

451

:

Hmm.

452

:

Okay.

453

:

Okay.

454

:

Fascinating.

455

:

Yeah.

456

:

And, do you...

457

:

Do you have some practical applications in mind of these methods in real world scenarios,

especially when it comes to understanding risk and causal inference?

458

:

I mean, so like, I kind of want to use these ultimately to look at sort of employee

engagement type survey data.

459

:

like, I want to know what are the sort of themes that are driving

460

:

the successful outcomes for an engaged workforce, for instance, based on their responses

across clusters of thematically grouped questions.

461

:

So those questions might be related to their worker autonomy in their role.

462

:

It might be related to the working conditions of the job.

463

:

It might be related to their view of the company writ large.

464

:

Like what are the decisive factors that have more sway in influencing the outcome, which is

like

465

:

a score of employee engagement.

466

:

And I think like more generally, well-designed surveys will give you

clusters of questions which are sort of mapped to appropriate themes, which can be like

467

:

kind of articulated as a factor analysis model.

468

:

And then you want to kind of figure out what is the sort of relationship between, you

know, the employee's sense of autonomy and their sense of engagement.

469

:

You're not going to ever measure autonomy directly, but you might measure it indirectly

via a well-designed survey.

470

:

I think similar use cases are done across the social sciences.

471

:

These models are built in such a way that they can take these complex presentations of

multivariate survey responses.

472

:

and sort of thematically group or cluster these responses in ways that allow you to better

articulate the theory of what's going on behind the scenes.
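A toy generative version of that measurement idea: a latent factor (call it "autonomy", a purely hypothetical name here) is never observed directly, but it induces the correlations among the survey items that a factor model exploits. Loadings and noise scales are illustrative, not estimated from data.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000

# Latent factor drives three survey items; only the items are observed.
autonomy = rng.normal(size=n)
loadings = np.array([0.9, 0.7, 0.5])   # illustrative loadings
items = autonomy[:, None] * loadings + rng.normal(size=(n, 3)) * 0.5

# The items correlate only through the factor, so their correlation
# matrix carries the information a factor model uses to identify it.
item_corr = np.corrcoef(items, rowvar=False)
print(np.round(item_corr, 2))
```

This is why thematically grouped questions work: the shared correlation across a cluster of items is the indirect measurement of the construct you can never measure directly.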

473

:

Okay.

474

:

Yeah.

475

:

Okay.

476

:

I see.

477

:

So a lot of survey data. That makes sense.

478

:

Yeah, survey data is the use case I kind of have in mind, but I think there's applications

beyond that as well.

479

:

It's like any sort of complicated mess of measurements that you want to sort of...

480

:

add structure to, to make a more compelling account.

481

:

Yeah, that makes sense.

482

:

Something I really like, I said it in the introduction to this episode that you have a

background in philosophy and mathematical logic, which I find super interesting and very

483

:

original for that field.

484

:

I'm wondering how do you see these disciplines influencing your approach to data science

and probabilistic modeling?

485

:

Yeah, I mean, I think it's deeply, sort of deeply important to the way I think about these

problems.

486

:

Maybe like philosophy in general, I think it's a fascinating study and more people

should do it.

487

:

Probably the more direct influence is the sort of study of logic that I did. Like a lot of the

focus when you're studying a logic or various logics is

488

:

The way you want to express different consequences or different inferences between your

assumptions and the predicted outcomes in some sense or other, you have various varieties

489

:

of logic that encode different consequence relationships.

490

:

So there's like relevance logics, there's justification logics, there's dynamic epistemic

logics, there's a whole range of these particular things.

491

:

And they're all kind of just like different little lexicons.

492

:

with different connectors and rules of consequence.

493

:

Like, so how you can derive a conclusion from a set of assumptions when you're working

with a relevance logic versus when you're working with classical logic are different.

494

:

They have different sets of consequences.

495

:

And what I kind of found when I was working with those things is like, is sort of, like

there are natural arguments you want to make or inferences you want to make, which are

496

:

blocked when you're using a particular logic.

497

:

And what I find fascinating about sort of, PyMC and probabilistic programming languages,

it's, it's also a language.

498

:

you, you get to articulate different structures and draw inferences in ways that are sort

of, mathematically defined and structured.

499

:

but it also lets you answer questions which are fuzzier than, does this follow

500

:

necessarily from your assumptions.

501

:

It's like this follows probabilistically with some degree of surety here and there.

502

:

So I think the focus on logic, the structuring of your argument in that also leads me to

think about modeling as structuring an argument in some sense.

503

:

Every model you build to my mind is like an argument about the state of the world.

504

:

You're saying the world is thus and so.

505

:

And here's how I'm measuring that and here's how I'm evaluating that claim.

506

:

And I'm retrodicting, retrodicting against past data to sort of reinforce that my model is

a good fit to the world.

507

:

It's an argument about how the world is or approximates, how we can approximate the world

through this linguistic structure.

508

:

So yeah, I don't know if that answers your question exactly, but yeah.

509

:

Yeah.

510

:

No, for sure.

511

:

And also something I've seen you talk about and express is a keen interest in inference and

measurement in the presence of natural variation and confounding.

512

:

How does this interest of yours shape the way you design your models?

513

:

Yeah, so that's kind of a

514

:

So, yeah, so I think maybe I just have a very suspicious personality and that I think that

the world is constantly trying to fool me or that there will be a confounding relationship

515

:

in the data that I'm working with.

516

:

And that's kind of generally led me to look at the sort of causal inference questions and

causal inference can be just considered like a collection of methods or models.

517

:

that attempt to adjust for suspected patterns of confounding in your data generating

processes.

518

:

A nice kind of overlap between the SEM world and the sort of causal inference type models

is the Bayesian formulation of instrumental variable designs.

519

:

The Bayesian formulation of that model fits a multivariate normal distribution on your

520

:

on your treatment and your instrumental variable, because you want to measure the sort

of correlation and covariance between those two things to adjust for the confounding

521

:

relationship, the confounding influence of a third variable on your treatment data.

522

:

that is kind of like the instrumental variables and path tracing rules.

523

:

All of them were like first envisioned or imagined by Sewall Wright,

524

:

who came up with a lot of these SEM structures and instrumental variable designs.

525

:

And both are kind of like attempts to articulate a model structure that adjusts for the

risk of confounding in your data generating processes.
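That confounding-adjustment idea can be sketched on simulated data. This uses the frequentist Wald ratio rather than the Bayesian multivariate-normal formulation mentioned above, but the data generating process and the bias it exposes are the same; all coefficients are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

# Hypothetical DGP: an unobserved confounder U drives both treatment T and
# outcome Y; the instrument Z moves T but reaches Y only through T.
# True causal effect of T on Y is 0.5.
u = rng.normal(size=n)
z = rng.normal(size=n)
t = 0.8 * z + 1.0 * u + rng.normal(size=n)
y = 0.5 * t + 1.0 * u + rng.normal(size=n)

# Naive regression of Y on T is biased upward by the confounding path.
naive = np.cov(t, y)[0, 1] / np.cov(t, y)[0, 0]

# Wald/IV ratio recovers the causal effect despite the confounder.
iv = np.cov(z, y)[0, 1] / np.cov(z, t)[0, 1]
print(f"naive slope: {naive:.2f}, IV estimate: {iv:.2f} (truth 0.5)")
```

The naive slope absorbs the path through U and lands well above 0.5, while the instrument-based ratio, which only uses variation in T that U cannot touch, recovers the truth.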

526

:

And I find that species of this way of looking at the world that there is a data

generating process.

527

:

there is a risk of confounding if you think your data generating process looks anything

like this.

528

:

And here's a technique or a sophisticated adjustment you can make to your model to account

for that.

529

:

And yeah, so that sort of sits at the heart of the way I think about modeling.

530

:

What do I need to adjust for?

531

:

What are the risks that this is going to go completely wrong?

532

:

How do I sort of validate that I haven't gotten it completely wrong?

533

:

Yeah, with the advancements that

534

:

we see now in generative AI.

535

:

I'm curious what future trends do you anticipate or maybe hope for in the field of causal

inference and structural equation modeling?

536

:

Specifically with respect to the advancement of generative AI.

537

:

Yeah, or just what...

538

:

What future trends do you see in general in these fields or maybe things that you hope

for?

539

:

I think causal inference, I feel like, has been getting a lot of traction, like the

importance of confounding in basic data analysis even, just like that is more and more

540

:

visible.

541

:

Even just in industry like the...

542

:

the prevalence of sort of quasi-experimental designs to like, maybe you're working in

industry and people don't want to run an A/B test for everything, but they're just

543

:

launching a new policy, a new procedure, and they want some sort of guide on how to

understand what the impact of that policy is.

544

:

The proliferation and spread of understanding of causal inference and confounding risk for

the evaluation of those new policies or procedures, I think has

545

:

gained a lot of traction in recent years and I would hope it continues to do so.

546

:

And with that, the increased understanding of risk of bias, waste due to biased

conclusions, increased caution.

547

:

I think poor causal inference is a kind of zero interest rate phenomenon.

548

:

If money's on the table and you're going to waste money by doing something really poorly,

you want to do that inference well.

549

:

And so now we're not quite in a zero.

550

:

I know the interest rates came down recently in Europe, but we're not back to zero

interest rate world yet.

551

:

I would expect causal inference to go from strength to strength over the next couple of

years.

552

:

I see.

553

:

also, I'm wondering if you have advice.

554

:

that you would give to aspiring data scientists interested in specializing in what you're

doing, which is the intersection of probabilistic modeling and causal inference?

555

:

I think it's generally always just like if you're really young and you're on the job market

and you're looking for a portfolio piece or something like that, I always say find a problem

556

:

that you're interested in, not one that exists out there like a

557

:

niche data set, maybe you create your own data set, kind of show the workflow and the

thought that went into your understanding of the data generating process.

558

:

And that you can think about the aspects of that data generating process, which would

support your conclusions, but also threaten the conclusions.

559

:

And the ability to be able to articulate that understanding is more important to me when I'm

like hiring.

560

:

than the ability to say you just deployed the latest fancy model.

561

:

I just want to hear that you've thought about what is the thing you're actually measuring.

562

:

That you can think through the problem in some sense.

563

:

I think that's more impressive on the job market, frankly, than it is to say that you've

played with the latest deep learning phenomenon.

564

:

Yeah, because I think this also shows that you're able to learn.

565

:

And I think that's one of the most important things in our jobs because I think one of the

best job descriptions of our line of work is

566

:

The ability to be uncomfortable all the time with what you think you know, and being able

to always update what you're doing, how you're doing it, and why you're doing it, is extremely

567

:

important. I completely agree with you.

568

:

Yes.

569

:

I would also emphasize good communication is going to be vital in any creative discipline.

570

:

That's true.

571

:

The ability to communicate complex topics to different audiences.

572

:

and break them down is extremely important.

573

:

So that's why also, you know, making the effort to do the communication you're doing with

your written articles and also coming on the podcast, different media.

574

:

That's really part of the job.

575

:

I would say it's not something that you do, you know, on the side, because what counts is not

only the code.

576

:

Because good code that's not used is not very useful.

577

:

Yeah, utterly useless.

578

:

Yeah.

579

:

So to close this out before I ask you the last two questions, are there any future

projects or research areas that you're currently excited about?

580

:

Particularly involving Bayesian methods, of course, but things that you're learning right

now that you're really excited about?

581

:

Yeah, I'm kind of focused on two things at the moment.

582

:

So I'm more and more into sort of synthetic control methods for causal inference.

583

:

And I also, I kind of want to go back to basics a little bit on understanding just survey

methodology and how to think about surveys well, especially stratified survey sampling.

584

:

But yeah, so that's kind of on my radar to do.

585

:

Should we expect a new in-depth tutorial about that?

586

:

Yeah, probably.

587

:

A couple of months down the line, I think.

588

:

Sounds good.

589

:

Can't wait to read that.

590

:

Well, Nathaniel, that was really a pleasure to have you on the show.

591

:

I think we were able to cover a lot of ground.

592

:

So thank you so much.

593

:

for taking the time.

594

:

Of course, before letting you go, I'm going to ask you the last two questions I ask every

guest at the end of the show.

595

:

So first one, if you had unlimited time and resources, which problem would you try to

solve?

596

:

Yeah, so I had a kind of idea for this and I'm not entirely sure how I'd go about it, but

like, I feel like there's like this general tragedy of the commons kind of phenomenon that

597

:

you like have, say for

598

:

climate activism or for even just like working effectively in a big organization with

politics and little kind of kingdoms being built here, there and everywhere.

599

:

I'd love to know like for different organizational structures, how do you mitigate if not

solve these sort of risks for tragedy of the commons like so for more efficient sort of

600

:

I guess more efficient workflows for an organization that can mitigate the risks of tragic

commons effects.

601

:

Yeah.

602

:

So like if there's infinite time and money and like a big research proposal, probably

fine.

603

:

For different organization structures, there's different mitigation strategies and then

which ones to apply where and how, that kind of thing.

604

:

Definitely understand that.

605

:

That's a great answer.

606

:

And second question, if you could have

607

:

dinner with any great scientific mind, dead, alive or fictional, who would it be?

608

:

Yeah, so I was thinking about this one.

609

:

So like, is it a hard requirement that it be a scientific mind?

610

:

Because I was thinking like it would be great to have dinner with Borges, you know, the

Argentinian short story writer.

611

:

He wrote beautifully concise, beautifully concise short stories.

612

:

But if a scientific mind is a hard requirement, I think I would go for the sort of

philosopher Nelson Goodman.

613

:

Okay.

614

:

Yeah, both good choices.

615

:

I will allow Jorge Luis Borges because he's Argentinian and my wife is Argentinian.

616

:

You know, like, I have to accept him.

617

:

And also, definitely.

618

:

I love that choice.

619

:

Both are, I think for both choices, it's a first.

620

:

on the show.

621

:

yeah, I love both.

622

:

And yeah, I would argue also Borges influenced quite a lot of scientists, right?

623

:

So I think he was the one who wrote The Garden of Forking Paths.

624

:

I think one of the stories is called that.

625

:

So you could argue it's almost a scientific story.

626

:

yeah, I think so.

627

:

Also Library of Babel is like this excellent little

628

:

meditation on combinatorics in a short form.

629

:

Nice.

630

:

yeah, definitely.

631

:

Awesome.

632

:

Well, that was really a pleasure, Nathaniel.

633

:

As usual, I'll put resources and a link to your website and socials and the papers and of

course your tutorial in the show notes for those who want to dig deeper.

634

:

Thank you again, Nathaniel, for taking the time and being on this show.

635

:

Thank you, Alex.

636

:

It was great.

637

:

This has been another episode of Learning Bayesian Statistics.

638

:

Be sure to rate, review, and follow the show on your favorite podcatcher, and visit

learnbayesstats.com for more resources about today's topics, as well as access to more

639

:

episodes to help you reach a true Bayesian state of mind.

640

:

That's learnbayesstats.com.

641

:

Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran.

642

:

Check out his awesome work at bababrinkman.com.

643

:

I'm your host.

644

:

Alex Andorra.

645

:

You can follow me on Twitter at Alex underscore Andorra, like the country.

646

:

You can support the show and unlock exclusive benefits by visiting patreon.com slash

LearnBayesStats.

647

:

Thank you so much for listening and for your support.

648

:

You're truly a good Bayesian.

649

:

Change your predictions after taking information.

650

:

And if you're thinking I'll be less than amazing.

651

:

Let's adjust those expectations.

652

:

Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those

predictions that your brain is making, let's get them on a solid foundation.
