Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
If there is one guest I don’t need to introduce, it’s mister Andrew Gelman. So… I won’t! I will refer you back to his two previous appearances on the show though, because learning from Andrew is always a pleasure. So go ahead and listen to episodes 20 and 27.
In this episode, Andrew and I discuss his new book, Active Statistics, which focuses on teaching and learning statistics through active student participation. Like this episode, the book is divided into three parts: 1) The ideas of statistics, regression, and causal inference; 2) The value of storytelling to make statistical concepts more relatable and interesting; 3) The importance of teaching statistics in an active learning environment, where students are engaged in problem-solving and discussion.
And Andrew is so active and knowledgeable that we of course touched on a variety of other topics — but for that, you’ll have to listen ;)
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary and Blake Walters.
Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)
Takeaways:
- Active learning is essential for teaching and learning statistics.
- Storytelling can make statistical concepts more relatable and interesting.
- Teaching statistics in an active learning environment engages students in problem-solving and discussion.
- The book Active Statistics includes 52 stories, class participation activities, computer demonstrations, and homework assignments to facilitate active learning.
- Active learning, where students actively engage with the material through activities and discussions, is an effective approach to teaching statistics.
- The flipped classroom model, where students read and prepare before class and engage in problem-solving activities during class, can enhance learning and understanding.
- Clear organization and fluency in teaching statistics are important for student comprehension and engagement.
- Visualization plays a crucial role in understanding statistical concepts and aids in comprehension.
- The future of statistical education may involve new approaches and technologies, but the challenge lies in finding effective ways to teach basic concepts and make them relevant to real-world problems.
Chapters:
00:00 Introduction and Background
08:09 The Importance of Stories in Statistics Education
30:28 Using 'Two Truths and a Lie' to Teach Logistic Regression
38:08 The Power of Storytelling in Teaching Statistics
57:26 The Importance of Visualization in Understanding Statistics
01:07:03 The Future of Statistical Education
Links from the show:
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.
If there is one guest I don't need to
introduce, it is Mr.
2
:Andrew Gelman.
3
:So I won't.
4
:I will refer you back to his two previous
appearances on the show, though, because
5
:learning from Andrew is always a pleasure.
6
:So go ahead and listen to episodes 20 and
27.
7
:The links are in the show notes.
8
:In this episode, Andrew and I discuss his
new book, Active Statistics,
9
:which focuses on teaching and learning
statistics through active student
10
:participation.
11
:Like this episode, the book is divided
into three parts.
12
:One, the ideas of statistics, regression,
and causal inference.
13
:Two, the value of storytelling to make
statistical concepts more relatable and
14
:interesting.
15
:And three, the importance of teaching
statistics in an active learning
16
:environment where students are engaged in
problem solving and discussion.
17
:And well, Andrew is so active and
knowledgeable,
18
:that we of course touched on a variety of
other topics, but for that, you'll have to
19
:listen.
20
:This is Learning Bayesian Statistics, episode
,:
21
:Welcome to Learning Bayesian Statistics, a
podcast about Bayesian inference, the
22
:methods, the projects, and the people who
make it possible.
23
:I'm your host, Alex Andorra.
24
:You can follow me on Twitter at
alex_andorra, like the country.
25
:For any info about the show,
learnbayesstats.com is Laplace to be.
26
:Show notes, becoming a corporate sponsor,
unlocking Bayesian merch, supporting the
27
:show on Patreon, everything is in there.
28
:That's learnbayesstats.com.
29
:If you're interested in one-on-one
mentorship, online courses, or statistical
30
:consulting, feel free to reach out and
book a call at
31
:topmate.io/alex_andorra.
32
:See you around, folks, and best Bayesian
wishes to you all.
33
:on LBS now, so for curious listeners, I
definitely recommend episode 20, which was
34
:your first one with Andrew Gelman.
35
:Yes, you were here.
36
:And with Aki Vehtari and Jennifer Hill, it
was about your previous book, Regression
37
:and Other Stories.
38
:And then episode 27 with Merlin
Heidemanns, where we talked about the 2020
39
:US presidential elections.
40
:We talked about the model you folks did
for The Economist.
41
:So definitely recommend checking this one
out because I'm guessing this is going to
42
:be interesting also for this year's
election.
43
:Yeah, we're working with them for 2024 as
well.
44
:So we're trying to improve the model.
45
:Perfect.
46
:Yeah.
47
:So it seems like you're releasing a book
every four years just before the US
48
:election.
49
:I hope it won't be four years before the
next book comes out.
50
:We're trying to finish our Bayesian
workflow book.
51
:So we're hoping that will be done by the
end of the year.
52
:Well, yeah, definitely curious to check
this one out.
53
:I think I also saw that you're working on
an MRP update book.
54
:Is that still the case?
55
:Yeah, I think Yajuan Si and Lauren
56
:Kennedy and some other people are
organizing this edited MRP book
57
:we're putting together.
58
:Yeah.
59
:Um, I will definitely check these out.
60
:Well, writing books is a lot of fun
because you can write whatever you want
61
:because you're trying to communicate with
the audience.
62
:When you write an article, you're trying
to communicate with the reviewers who
63
:aren't the readers.
64
:It's a very weird indirect thing.
65
:It's.
66
:I guess similarly, if you're trying to
write a TV show, you have to convince the
67
:TV network to produce the show, but
they're not the people who are watching it
68
:and articles are like that too.
69
:But a book is so simple.
70
:You just write a book and you're just
aiming to reach people.
71
:It's very pleasant.
72
:I recommend it.
73
:Yeah.
74
:I can see that it's something you really
enjoy because you're such a prolific
75
:author.
76
:Yeah.
77
:I am.
78
:Personally, I use MRP quite a lot and
often, so I'm definitely super curious to
79
:see what's going to be in this book.
80
:I'm sure I'm going to learn things
personally, and that's also going to help
81
:me teach MRP, which I'm doing from time to
time.
82
:Thanks a lot.
83
:We have a research project I'm very
excited about now, which is integrating
84
:survey weights into MRP.
85
:So people do it now, though.
86
:They'll run a weighted
regression, or they'll
87
:have the model in Stan and use a power
likelihood, but it's not really quite
88
:right.
89
:So we have what I think is a better
approach, but that's not what you have me
90
:here today, right?
91
:Here I'm supposed to talk about our Active
Statistics book, my new book with Aki.
92
:Yeah, yeah.
93
:Yeah, exactly.
94
:I would, we can put whatever you want, but
yeah, the main focus is going to be your
95
:new book.
96
:Active Statistics with Aki Vehtari.
97
:And yeah, so maybe can you give us an idea
of the genesis of the book and thanks for
98
:showing up the book on the video.
99
:So those watching on YouTube.
100
:So it's for people learning statistics or
teaching statistics.
101
:So the story is that everybody says you
102
:want to do active learning, so students
should be working together, and class
103
:time should be an active time for students
to be thinking about problems, discussing
104
:problems.
105
:I notice so what.
106
:Okay, I teach a class based on Regression
and Other Stories, and it's two semesters and
107
:each semester is 13 weeks and each week
has two classes.
108
:So that's 52 classes.
109
:And we cover the book. Every class is an
hour and a half long, or I guess,
110
:seventy-five minutes long, and in each class
111
:I have a story, a class participation
activity, a computer
112
:demonstration, some quick drills for
students to work on in class, and then the
113
:discussion problem for students to talk
about and think more.
114
:I don't always have time in every class to
do all of these, but sometimes I do and I
115
:can always do most of them.
116
:I found when I had been teaching
statistics, I told stories a lot, but what
117
:happened, it's tricky to tell a story,
partly because not every
118
:teacher has a lot of experience, so they
don't always have a lot of good stories.
119
:So,
120
:So, okay, so our book, it's okay.
121
:Our book is 52 stories, 52 class
participation activities, 52 computer
122
:demonstrations, et cetera, one for each
class.
123
:So, first, these are 52 stories that are
pretty good that I've come up with or that
124
:Aki and I have encountered in our careers.
125
:So, they're high-quality stories, but
also when you tell a story, when I tell a
126
:story in class, sometimes it gets a little
disorganized.
127
:So, it worked good.
128
:It worked well to write the stories down.
129
:And for each story, we very explicitly say
how it connects to the week's topic, the
130
:week's reading, and also how it connects
to the course as a whole.
131
:And I felt that had been missing before.
132
:It wasn't hard for me to tell an
entertaining story with statistical
133
:content, but I wasn't always making that
connection with what was happening in
134
:class.
135
:So I feel that if you're a student and you
want to learn statistics, you can read
136
:these stories and...
137
:They're great little stories.
138
:There aren't a lot of sources for
statistics stories out there.
139
:Textbooks tend to have boring examples.
140
:They want to set it up like here's how to
turn the crank.
141
:Sometimes textbooks tell stories, but they
don't tell them well.
142
:And I'll give you an example of that in a
moment.
143
:There isn't really anything like this.
144
:And so maybe we should have just had a
little book.
145
:Our book is, how long is it?
146
:It's three and two fifty pages long.
147
:Maybe we should have had just a book that
was like 50 or 100 pages long with just
148
:the stories, because that already is
great.
149
:Maybe it should have been several
pamphlets rather than one book.
150
:Then we have class participation
activities.
151
:These are things where the class gets
involved.
152
:They're filling out survey forms or.
153
:they're doing an experiment on each other
or we do an experiment on them or they're
154
:weighing bags of things and trying to get
estimates, they're flipping coins.
155
:I love these.
156
:Deb Nolan and I had a book a few years
ago, Teaching Statistics, A Bag of Tricks,
157
:which had a few activities, but this is a
million times better.
158
:First, we didn't have 52 activities, but
also these are lined up with the course.
159
:So they go in sequence.
160
:So they're not just fun things to do.
161
:They're things that line up with
particular lessons.
162
:And I just love that people tell me
they'll say, Oh, I liked your book and I
163
:used one of your activities in one of my
classes.
164
:And it makes you want to scream and like,
you know, throw something at the TV or
165
:punch the wall or whatever.
166
:I want you to do it in every class, every
class should have an activity or at least
167
:most of the time.
168
:So that was a lot of effort because we had
a bunch, but a bunch of them, like we just
169
:created from scratch.
170
:We need an activity for this.
171
:And that's really great.
172
:So that could have been its own pamphlet,
another 50 pages.
173
:Then we have computer demonstrations.
174
:And I find that live demos are great.
175
:But if you try to do it from scratch, you
get tangled in the code.
176
:So it's good to have pre-written live
demos.
177
:And so that's like to say you should have
a demo.
178
:And it's surprisingly hard.
179
:You create even something simple, simulate
fake data and run a regression.
180
:You have to have good values of the
parameters or else you're not really
181
:demonstrating the point you want to make.
182
:If it has some curvature, how much to
have.
183
:So we tested them out and did them in
class.
184
:And so that way when I teach, I can always
have a live demo, which is everybody's
185
:favorite part of class and so forth with
the others.
186
:And then we have some homework assignments
and we have some chapters at the beginning
187
:where we talk about how to set up the
class and how to learn better.
188
:It's not really just for teachers, as I
said, should be for.
189
:students.
190
:So that's what's in it.
191
:Yeah, well, thanks a lot, Andrew.
192
:I already have a lot of follow-up
questions for you.
193
:But something also you've told me in
preparing the episode is that you have
194
:thought about the book in three distinct
parts.
195
:All right, so first one being the idea of
statistics, regression and causal
196
:inference.
197
:Then another pillar would be like using
stories to explain statistics.
198
:And the third pillar would be the method
of teaching with active student
199
:participation.
200
:So why did you choose these three
different pillars and how do you think
201
:they are helping an active learning of
statistics, which is one of the goals of
202
:your book?
203
:So.
204
:Teaching or learning is like a vector.
205
:It has a magnitude and a direction.
206
:And the magnitude is how hard you work to
figure stuff out.
207
:And the direction is what you're learning.
208
:So yeah, I think applied regression and
causal inference is super important.
209
:The typical audience for this book would
be students who took one statistics class.
210
:Maybe they already took statistics in high
school or at university.
211
:took that one class where they learned
about sampling and experimenting and
212
:estimation, intervals, normal
distribution, stuff like this.
213
:This is all about using it, about going
beyond that.
214
:So, yeah, I think applied statistics is
great.
215
:I want to teach regression about, like,
the most important thing is understanding
216
:the model and being able to use it.
217
:Not so much the mathematical theorem
about…
218
:least squares estimation.
219
:That's important too.
220
:There's other places to learn that.
221
:So yeah, the direction is that it's
applied statistics.
222
:I think the magnitude is about how to make
that work, how to get people to learn.
223
:And so most of the learning is not done in
class, but at least if students are doing
224
:these activities,
225
:in class that the hour and a half or the
three hours a week they're spending in
226
:class, they are already heavily thinking
about it.
227
:Which, and you know, I just like, it's
kind of horrible for the students because
228
:you really make them work.
229
:It's like teaching a foreign language
class, right?
230
:If you go and take a usual class in
college, you sit in the back and you zone
231
:out and you're like, oh, this is pleasant.
232
:It's like watching a movie, maybe.
233
:But if you're in a foreign language class,
you're working all the time, right?
234
:The teacher's always making you talk and
listen.
235
:If you lose focus for a second, it's...
236
:difficult. Statistics is a foreign language,
and you can learn by speaking it and
237
:practicing it. So I think it's important in
class to be able to do that, or if you're
238
:studying at home, to have these activities
and stories. I mean,
239
:and of course the computer. I'll say, like,
my computer code is pretty bad.
240
:So that's good, right?
241
:Because that's like student code.
242
:It's all crappy code.
243
:So it's realistic. I know it's not always the
world's cleanest,
244
:I would say, but it runs, but maybe it
doesn't all run either.
245
:It ran when I wrote it.
246
:But it's supposed to be, when I do code
demos in class, what I like to do is
247
:actually type in the code, not copy and
paste it.
248
:So that's modeling how someone might do
it.
249
:So we try to keep them short enough that
you can do that.
250
:Yeah, thanks a lot.
251
:I see what you're doing and I really
appreciate it because that's also helping
252
:me in my own teaching philosophy because I
do have the same experience where the
253
:students who end up learning the most are
usually the most active ones.
254
:but then the main question is, okay, how
do I make them all active?
255
:Or at least give them the opportunity to
all be active.
256
:And that's really one of the things.
257
:Yeah, when I teach, I make them talk.
258
:Like even it could be a class with 50 or
more students, but I'll tell the story and
259
:then I'll pause and then say, well, what
do you think?
260
:Talk to your neighbor about this.
261
:And I look and I make sure they're
talking.
262
:And if they're not talking, then I walk
over and say, you know, I go like this to
263
:them.
264
:and if their computer is out, I look, and
if they're on their social media, I ask
265
:them to close their computer and if their
phone is out, I ask them to close their
266
:phone and so forth.
267
:The funny thing is as a teacher, that's
hard, it's easier as a teacher to just
268
:talk and talk and talk and talk, like I'm
talking now, I'm just talking.
269
:It's easy to talk and you have complete
control over it.
270
:So that's why I really needed to structure
this in this way.
271
:That was my original motivation for all of
this.
272
:was that many years ago I was teaching a
class and I couldn't make it, so I had
273
:my co-teacher, another faculty member in
the department who was teaching the same-level
274
:class, teach my class, and then I went and
taught hers, and she said, oh, your
275
:students were just dead.
276
:And then I taught her class and they
were so lively and I realized not that she
277
:was lucky, but that they had been in that
habit of participating in class.
278
:She's just a natural great teacher.
279
:I'm naturally not a good teacher.
280
:And so I...
281
:do this shtick in order to get them
involved.
282
:And then I just wanted to do it well.
283
:I want to tell stories, but I want to be
able to make the point, to help them learn
284
:it.
285
:Yeah, that's interesting because me, when
the teachers were doing that to me, it's
286
:because I was talking too much.
287
:That happened quite a lot.
288
:Maybe that's why I have a podcast now.
289
:Apart from these philosophical
considerations.
290
:Yeah, that's very interesting.
291
:I'm going to try that in my own classes.
292
:The thing is I personally teach a lot of
online courses, and so I cannot wander around and
293
:see the screens.
294
:So that's pretty hard.
295
:Yeah, it's tough.
296
:I remember when I was doing the class over
Zoom and you could try to put them in a
297
:little room so they work in pairs, but yet
if you can't see them doing it, I think
298
:there is some online...
299
:conferencing software where you can
actually see the pairs and then then or
300
:the small groups, but I don't I don't know
the full story with that, but I could get
301
:so I gave you an example.
302
:One of the things that's
difficult,
303
:and I don't know
304
:if there's any answer to this, about the
stories is that if they're
305
:too simple, that's boring.
306
:But if they're too complicated, then you
know, that's not good either.
307
:One thing I like to say like I I want to
send the message that.
308
:Statistics, how did I put it in the book?
309
:I had a slogan that statistics is hard.
310
:It should not feel tricky.
311
:So I don't like those.
312
:I don't like this.
313
:I like statistics stories with a twist,
but I don't like the kind of stories where
314
:the message is, this is just hard,
like the Monty Hall problem.
315
:I hate that because it's just so confusing
to people.
316
:Like, what's the lesson that you're
teaching?
317
:Right?
318
:Like, this is really, really confusing.
319
:I don't want to teach that.
320
:But here's an example.
321
:And this is a very standard example used
in United States statistics classes where
322
:we put another twist on it based on the
recent literature.
323
:So this was a survey that was done in 1936
by a magazine called the Literary Digest.
324
:It's a very famous example in statistics
books.
325
:They did a survey for the presidential
election and it was the presidential
326
:election was Franklin Roosevelt running
for reelection against somebody who wasn't
327
:Franklin Roosevelt.
328
:So you kind of know who won that election.
329
:But in their poll, actually, Franklin
Roosevelt was going to get destroyed.
330
:They did a poll where they surveyed 10
million people and two and a half million
331
:of those responded.
332
:And out of that, it looked like Roosevelt
was completely getting smoked.
333
:Well, there were two things happening.
334
:One is the two and a half million
respondents were not a random sample of the
335
:10 million people.
336
:Second, the 10 million people were
themselves not representative of Americans
337
:because it was from lists of people who
own cars and things like richer people.
338
:So it wasn't a representative sample and
usually it just stops there.
339
:But that's not a good place to stop for a
couple of reasons.
340
:One of which is what lesson are you
telling people?
341
:If you don't have a random sample, your
survey is no good.
342
:Well, unfortunately, no surveys are random
samples.
343
:I mean, no surveys of humans, no political
polls are.
344
:So the message would be, oh, you can't
ever trust any political poll.
345
:Well, that would be a mistake because
political polls, even when they're off,
346
:they tend only to be off by a couple of
percentage points.
347
:So what goes on with political?
348
:Well, so let's OK, so let's look at this
survey.
349
:The first thing is that the same magazine
had
350
:done this survey in previous elections and
it had worked well.
351
:So they had some track record.
352
:It wasn't as dumb as it sounds.
353
:Second thing, and this is something that
two statisticians recently looked into, I
354
:was able to take advantage of their work.
355
:So Sharon Lohr and Michael Brick had
written a paper on this 1936 Literary
356
:Digest survey where they realized that
the data from the survey are actually
357
:somewhere, like they're available.
358
:The survey
asked people who they would vote for, but
359
:it also asked who they voted for in the
previous election.
360
:So you can adjust for that because you
know the previous
361
:election outcome.
362
:Well, it's not perfect.
363
:Not everybody voted in the previous
election.
364
:And, but it's pretty good.
365
:And when you do that adjustment, you get,
well, you find that Roosevelt was supposed
366
:to win.
367
:Well, it's not a perfect adjustment.
368
:It's still quite a bit off.
369
:It's.
370
:Even after doing this adjustment, it's
still not a representative sample.
371
:But now we've changed the lesson from,
hey, it's not a random sample, you fool,
372
:blah, blah, blah, to, hey, this sample is
not a representative sample, but
373
:statistics can be used to adjust it.
374
:Look at this.
375
:But the adjustment is imperfect.
376
:So it's a more subtle message.
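A minimal sketch of the kind of adjustment described above, assuming a simple poststratification on recalled previous vote. Every count and share below is invented for illustration and is not the Literary Digest data, and the function names are hypothetical. The idea is only that respondents are reweighted so the mix of previous-election voters matches the known previous result.

```python
# Hypothetical illustration: all numbers are invented, not the 1936 survey data.

# Respondents grouped by who they say they voted for last time:
# group -> (number of respondents, share now intending to vote for the incumbent)
sample = {
    "incumbent": (400, 0.80),
    "challenger": (600, 0.10),
}

# Assumed known vote shares from the previous election (made-up values).
previous_result = {"incumbent": 0.60, "challenger": 0.40}


def raw_estimate(sample):
    """Unadjusted share for the incumbent: every respondent counts equally."""
    total = sum(n for n, _ in sample.values())
    return sum(n * share for n, share in sample.values()) / total


def poststratified_estimate(sample, population_shares):
    """Weight each previous-vote group by its known share of the electorate."""
    return sum(population_shares[g] * share for g, (_, share) in sample.items())


raw = raw_estimate(sample)
adjusted = poststratified_estimate(sample, previous_result)
print(f"raw: {raw:.3f}, adjusted: {adjusted:.3f}")
```

In this made-up case the raw sample over-represents previous challenger voters, so the unadjusted estimate points the wrong way and the adjusted one does not. As the discussion stresses, such an adjustment is still imperfect, for example because not everybody voted in the previous election.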
377
:Well, it's trickier to teach.
378
:That's one reason why I like having the
story written as a story very clearly in
379
:the book, because then the student or the
teacher can read through the whole thing.
380
:If you're a student, you can read it
through.
381
:And if you're a teacher, you can first
read it before trying to teach it.
382
:And there it is.
383
:It's on page 36 and 37 of our book.
384
:There's a copy of the survey form.
385
:And.
386
:It takes it.
387
:Literally, the
description takes up one page of
388
:the book.
389
:Almost all of it is a quote from
Lohr and Brick because they're the ones who
390
:did it and then a little discussion of how
it relates to the class.
391
:But everything is like these stories are
all like that.
392
:Like they're all you have to balance it.
393
:And it's tricky. There almost
should be another
394
:booklet of the really simple stories that
we haven't been including because they're too
395
:boring for me, but maybe still interesting
for the students.
396
:I don't know.
397
:We went back and forth.
398
:It's structured from beginning to end of
the course.
399
:So, well, there's a
couple of introductory chapters and then
400
:there's 13 sections for the first semester
and then 13 sections for the second
401
:semester.
402
:So most of the book is
26 sections.
403
:And in each one we have a story and the
404
:participation activity.
405
:And we went back and forth about whether
to do it that way or whether to put all
406
:the stories in one place and all the
activities in one place.
407
:And I don't know.
408
:Now I'm thinking I wish we had done it
that way.
409
:But Aki and I went around and around on
this a million times.
410
:There's no, you don't need to hear about
this.
411
:I wanted it to look right.
412
:The thing is, if you opened up at random,
you might get a page of homework
413
:assignments and then it might look like a
textbook.
414
:So it's like, that's the...
415
:it all kind of looks the same.
416
:So maybe if we had separately done the
different things, it would have then
417
:there'd be a whole section of stories.
418
:But when you're teaching, it's convenient
that's in order because you just go to the
419
:week of your class and then you can see
what to do that week.
420
:So that's, I used it to teach.
421
:Yeah, and I mean, I really love also your
focus on the stories, right?
422
:I see it's definitely a theme of your work
recently, and I really love that because I
423
:think it also puts an emphasis on the fact
that statistics is not done in a vacuum,
424
:right?
425
:And it's also done by humans.
426
:with their biases and also their
motivations and so on.
427
:And I found that way more interesting, way
more realistic.
428
:And also that captures more the
imagination of the students rather than
429
:teaching them theorems and formula, which
often is quite intimidating to a lot of
430
:them.
431
:So yeah, I have to admit the stories are
like all things that I can personally
432
:relate to.
433
:Like, either it's
research I was involved in
434
:or it's something close enough to what I
do.
435
:Like I'm interested in the question being
asked.
436
:Um, it's yeah, there were, there were, and
the same with the same with the
437
:activities.
438
:The activities have a lot of simulated
data.
439
:I'm a big fan of.
440
:Yeah, you are.
441
:In a lot of your books, you
talk about that.
442
:Do you want to talk a
443
:bit more about that, or do you think we've
already covered the idea of simulated data
444
:and traditional data?
445
:Well, I'll just say briefly that I think
we are, as statisticians or computer
446
:scientists or whatever, we're used to the
idea of here is a data set, let's see what
447
:we can learn.
448
:But science, I mean, sometimes we proceed
that way in learning.
449
:We want to understand the world, you're
curious about something, someone gets a
450
:bunch of data from
451
:Basketball or whatever, and then you play
around and see what you can get.
452
:So that happens, but often things are more
directly motivated.
453
:Like, yes, in a public opinion poll,
you're really starting with the question.
454
:When demonstrating a method in coding
examples, it's super great to have
455
:simulation.
456
:partly because it's like it's the dual
problem, right?
457
:I simulate the data, then I fit
the model.
458
:I can check, I can see if the parameter
estimates are similar to the true value,
459
:but also just the act of simulation is the
time reversal of the act of inference.
460
:So it makes sense to show the forward
process too.
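A minimal sketch of the simulate-then-fit check described above, assuming a straight-line regression with normal errors. The parameter values, sample size, and variable names are arbitrary choices made for this illustration and do not come from the book's demos.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" parameter values, chosen only for this demonstration.
true_intercept, true_slope, sigma = 1.0, 2.0, 0.5
n = 200

# Forward process: simulate fake data from the model.
x = rng.uniform(0, 10, size=n)
y = true_intercept + true_slope * x + rng.normal(0, sigma, size=n)

# Inference: fit a least-squares line to the simulated data.
# np.polyfit returns coefficients from highest degree to lowest.
slope_hat, intercept_hat = np.polyfit(x, y, deg=1)

print(f"slope: true {true_slope}, estimated {slope_hat:.2f}")
print(f"intercept: true {true_intercept}, estimated {intercept_hat:.2f}")
```

Because the data were generated with known parameters, the estimates can be compared directly with the truth, which is not possible with a real data set.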
461
:And I think it's kind of a bit of a power
thing.
462
:As a student, like, I can
simulate data.
463
:I can make fake data myself, right?
464
:That's.
465
:That's something that can be done.
466
:Traditionally, we do simulation when we're
teaching probability, like you'll teach
467
:the central limit theorem by simulating
draws.
468
:But just a lot of examples come up.
469
:Simulation is kind of
like a universal solvent.
470
:Like, for example, I think one of our
discussion problems in class is, I show
471
:them data from some regression, which is
based on real data.
472
:And I don't remember the example, but
something where there's some treatment
473
:effect.
474
:which you maybe expect is positive.
475
:Maybe the estimate is, let's say the
estimate is 0.3 and the standard error is
476
:0.2.
477
:And so then I say, and it's based on 100
data points.
478
:So the estimate is
0.3, the standard error is 0.2.
479
:So I'd say how large a sample would you
need to get a result that's two standard
480
:errors away from zero?
481
:That's statistically significant, a term
that I don't like to use, but of course
482
:they need to know how it gets used.
483
:So you'd say, oh well, the standard error
is 0.2, but really the standard error would
484
:have to be 0.15 for it to be 2
standard errors away from 0.
485
:So the sample size would have to increase
by a factor of (2 divided by 1.5) squared.
486
:So you take (2 over 1.5) squared, and
that's, you know, so you can do that, you
487
:know, and you say here,
488
:(2 over 1.5) squared times 100, and that's
177.
489
:So you'd say, well, you need a sample size
of 177 to get your estimate to be true.
490
:So work that out.
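Written out, the naive calculation looks like this, assuming the standard error scales as $1/\sqrt{n}$ and, crucially, that the point estimate stays fixed at 0.3 when the study is rerun:

```latex
\[
\mathrm{se}_{\text{new}} = \frac{0.3}{2} = 0.15,
\qquad
n_{\text{new}} = n_{\text{old}}
  \left(\frac{\mathrm{se}_{\text{old}}}{\mathrm{se}_{\text{new}}}\right)^{2}
  = 100 \left(\frac{0.2}{0.15}\right)^{2} \approx 177.8 .
\]
```

The figure of 177 quoted in the conversation is this value rounded down.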
491
:That's wrong.
492
:That's not the correct answer.
493
:Because if you redo a study with 177
people, there's no reason to think the
494
:point estimate will be the same.
495
:In fact,
496
:Like the whole point of saying that the
estimate is less than two standard errors
497
:away from zero and you don't know whether
to believe it, somehow the whole point
498
:from a Bayesian point of view, the point
is that it's likely to be closer to zero.
499
:From a classical point of view, the idea
is that you can't rule out zero as an
500
:explanation and zero is like typically a
privileged value there.
501
:So if you're replicating a study or even
doing it longer,
502
:you would have to, the answer depends on
the true treatment effect, not on the
503
:coefficient estimate.
504
:And well, that's harder, right?
505
:But the point is you can show that with a
simulation.
506
:If it's based on real data, it's trickier
to show because what are you doing?
507
:But if I then do a simulation and then I
say, well, look, let me try simulating
508
:100.
509
:with this true treatment effect and then I
see what I get.
510
:I say, well, shoot, I didn't get a
treatment effect of 0 .3.
511
:I was supposed to have to keep doing it.
512
:And then you realize you're selecting just
some.
513
:So to me, it brings it to life.
514
:The applied point gets demonstrated in a
way that's harder to do with just one data
515
:set.
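The argument above can be sketched with a short fake-data simulation. This is only an illustration under stated assumptions: the standard error is treated as known and shrinking with the square root of the sample size, estimates are drawn from a normal distribution, and the "true" effects of 0.3 and 0.1 are hypothetical values chosen for the demo. The function names are made up for this sketch.

```python
import random

def naive_sample_size(se, n, estimate, z=2.0):
    """Sample size at which the SE shrinks to estimate / z, holding the estimate fixed."""
    target_se = estimate / z
    return n * (se / target_se) ** 2

def share_past_two_se(true_effect, n_new, se_ref=0.2, n_ref=100,
                      n_sims=20000, seed=0):
    """Fraction of simulated replications whose estimate exceeds 2 standard errors."""
    rng = random.Random(seed)
    se_new = se_ref * (n_ref / n_new) ** 0.5  # SE scales as 1 / sqrt(n)
    hits = sum(rng.gauss(true_effect, se_new) > 2 * se_new for _ in range(n_sims))
    return hits / n_sims

# The naive answer from the discussion: 100 * (0.2 / 0.15)^2
n_naive = naive_sample_size(se=0.2, n=100, estimate=0.3)

# Even if the true effect really were 0.3, only about half the replications clear 2 SE.
share_if_true_is_03 = share_past_two_se(0.3, n_naive)
# If the true effect is smaller (hypothetically 0.1), far fewer do.
share_if_true_is_01 = share_past_two_se(0.1, n_naive)

print(f"naive sample size: {n_naive:.1f}")
print(f"share past 2 SE, true effect 0.3: {share_if_true_is_03:.2f}")
print(f"share past 2 SE, true effect 0.1: {share_if_true_is_01:.2f}")
```

The point of the sketch is that the required sample size depends on the assumed true effect, which is an input to the simulation, not on the one observed estimate.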
516
:Yeah.
517
:Yeah, yeah, yeah.
518
:I really love that.
519
:I agree.
520
:And that's also something I tend to use.
521
:On a lot of questions people have on, you
know, A/B test settings, things like
522
:that.
523
:There's a lot of questions about these,
the sample size, the iteration, things
524
:like that.
525
:And I find personally, I have to do the
simulated data studies to answer these
526
:kinds of questions.
527
:Like, I'm bad at remembering, you
know, all those rules of thumb.
528
:So let's do those kinds of studies
with simulated data, and that gives me a
529
:way better idea.
530
:So in a completely unrelated topic, I can
tell you about our two truths and a lie
531
:example.
532
:That's a demonstration we do.
533
:I'm mentioning that partly because writing
a book is like writing a hundred articles.
534
:So at one point I thought, well, maybe I
should publish these as a hundred articles
535
:because each story could be, well, that
just takes a lot of work and maybe more
536
:people will read it in book form.
537
:So I didn't do that, but.
538
:I did one of them.
539
:I did one or maybe I did one or two.
540
:It takes a while to publish an article.
541
:And for the bad reason that it's just
formatted in a different way, for the
542
:moderately good reason that you need to
explain more if it's in an article rather
543
:than a book because you need the context,
for the pretty good reason that you're
544
:forced to that, that like you have an
opportunity to expand because you have
545
:more space in the book.
546
:I can't take up too much.
547
:I can't have each thing take too long.
548
:And for the probably the biggest thing is
you get useful reviewer comments and
549
:people point out problems anyway.
550
:So the one of the the activities I did
write up as an article was two truths and
551
:a lie.
552
:And I gave a link to the article version,
which is longer than what's in the book.
553
:But I love the story.
554
:OK, so the story of how it came about is that
there's this game which did not exist when
555
:I was a child.
556
:But I don't know if they do it in Europe.
557
:It's big, it's popular
558
:in the U.S.
559
:Kids do it as an icebreaker in
class, you'll have a group of people and
560
:one person is the storyteller and this
person tells three things about
561
:themselves.
562
:Two of them have to be true and one has to
be a lie and then the other people discuss
563
:and try to figure out which is the truth
or which is the lie.
564
:So it's such a fun activity.
565
:I like to use it as an icebreaker in my
statistics class.
566
:But it has no statistics content.
567
:I mean, it is because there's uncertainty,
but what do you do with it?
568
:So I thought about and thought about and
well, I decided to put it in the second
569
:semester.
570
:I was ready for a good icebreaker and the
second semester started with logistic
571
:regression.
572
:Okay, I can make it logistic regression
problem because you can say, what's the
573
:probability you get it right?
574
:What's the probability you guess correct?
575
:But then you need some predictor.
576
:So, oh, predictor.
577
:Well, you can have when you guess, you
also have to give a certainty score, some
578
:number between zero and 10 representing
how certain you are that you're correct.
579
:Then it has to be done in groups.
580
:So I figured it out.
581
:Each, you divide the class into groups of
four.
582
:Usually we do pairs, but this one, four.
583
:Each group, you have one student is the
storyteller, tells the three statements.
584
:The other three discuss together.
585
:And then,
586
:come up with a guess of which they think
is true, which of them that they think is
587
:a lie, and a certainty score.
588
:So write the certainty score down on a
sheet of paper, then find out whether your
589
:guess was correct and write that down too.
590
:So they find out.
591
:Then there's four of you in the group, so
you rotate.
592
:Then the next person does it.
593
:So as a result, as a group, each group has
four certainty scores and four.
594
:correct or incorrect answers.
595
:So they have four numbers, they have eight
numbers, first four numbers between zero
596
:and 10, and then the four numbers which
are zeros and ones.
597
:And so, by the way, when you do this, I
have a slide prepared, or I write it on
598
:the board, the exact instructions.
599
:You need to give in, you can't just tell
it, people aren't paying attention for one
600
:second.
601
:I'm just doing this for you in that thing,
but actually we have the instructions
602
:there.
603
:Then there's this thing I discovered a couple
of years ago.
604
:It's putting things on Google Forms.
605
:So live in class, I create a Google Form,
I open Google, type it in right there.
606
:So this is also a power thing for
them.
607
:Look at this.
608
:I didn't have to prepare this.
609
:I type the Google Form, I put question
one, certainty score, make it a response
610
:from zero to 10.
611
:Question two, yes or no, did you get it?
612
:Was your guess correct?
613
:So with each group, I want you to go, oh,
and then we use TinyURL to get a URL.
614
:And then for each group, I say, pull out
your phone or your computer, and one
615
:person from the group, enter your four
data points.
616
:So we set it up with four.
617
:So there's actually eight responses, the
first one, the first one.
618
:Then we get the data, it takes them a
minute to type it in.
619
:Then I have it all prepared.
620
:I've done it before, right?
621
:So I have the code ready.
622
:I.
623
:So I go to the Google page, I download it,
I put it on the desktop.
624
:It's not even my laptop, it's just a
computer that's in the classroom.
625
:Then I go, I open R, I read it in, and I
have the code prepared so I can do it.
626
:And then we can make graphs.
627
:So we fit a logit, so, but then I did
something I always like to do.
628
:I set it all up.
629
:Okay, we have the data.
630
:I type in the code for logistic
regression.
631
:Again, I have a pause.
632
:I say, well, write the code with your
neighbor what the logistic regression code
633
:would look like.
634
:So, yeah, and then I do it and then I type
it and I said, then I do display, you
635
:know, of the fitted regression.
636
:And before hitting carriage return, I
said, this is what it's going to look
637
:like.
638
:There's going to be coefficient estimate,
standard error.
639
:What are they going to be?
640
:You and your neighbor have to figure out,
try to guess what the estimate and the
641
:standard error are gonna be.
642
:Well, the standard error is tricky, like
that's hard.
643
:So I said, just figure out, guess what the
estimate will be.
644
:And so then I have them do it, I go around
the room, I make sure they're all drawing
645
:the curve, and then I have someone go on
the board and draw what they had done.
646
:And then I ask people, do you think this
is reasonable?
647
:Do you think this slope is reasonable?
648
:Now what do you think the standard error
will be?
649
:Do you think the slope will be more than
two standard errors away from zero?
650
:Then you fit it.
651
:and you have the scatter plot and they can
see and they've thought about that
652
:committed to it.
653
:So that's logistic regression.
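For readers who want to try the analysis step, here is a hedged sketch on fake data. In the episode the fit is done in R with code that is not shown; the version below is a stand-in written in plain Python with a hand-rolled Newton-Raphson fit. The sample of 400 guesses (larger than a real class, to keep the fit stable) and the generating values of -1.0 and 0.3 are assumptions for the demo only.

```python
import math
import random

def inv_logit(z):
    """Logistic function mapping the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, n_iter=25):
    """Fit Pr(y = 1) = inv_logit(a + b * x) by Newton-Raphson.

    Returns (a, b, se_a, se_b); standard errors come from the inverse of the
    information matrix evaluated in the last iteration.
    """
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = inv_logit(a + b * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to the intercept
            g1 += (yi - p) * xi     # gradient with respect to the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b, math.sqrt(h11 / det), math.sqrt(h00 / det)

# Fake class data: certainty scores from 0 to 10 and whether each guess was correct.
rng = random.Random(2024)
certainty = [rng.randint(0, 10) for _ in range(400)]
correct = [1 if rng.random() < inv_logit(-1.0 + 0.3 * c) else 0 for c in certainty]

intercept, slope, se_intercept, se_slope = fit_logistic(certainty, correct)
print(f"intercept {intercept:.2f} (se {se_intercept:.2f})")
print(f"slope     {slope:.2f} (se {se_slope:.2f})")
```

As in the class exercise, it is worth guessing the slope and its standard error before running the last lines.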
654
:But when I wrote up the article, the
people in the journal said, well, what
655
:about other classes?
656
:And then I realized you can use this to
teach measurement.
657
:You can use it to teach experimentation,
like all sorts of things.
658
:You could do a lot with that.
659
:But I felt so satisfied because just I
felt like it was just created out of
660
:nothing.
661
:I wanted a two truths and a lie activity, and now
there is one.
662
:So it just felt so good
to have created it.
663
:Now I want everyone to do it because now
that I created this this beautiful thing
664
:out of nothing, it did not exist.
665
:Anyway, just I'm very happy about that.
666
:Yeah, I love that.
667
:I'll definitely try that in my own
668
:classes. Seems like a good thing to
do in the first or second class, isn't it?
669
:Right, exactly.
670
:Now the point is that you're killing two
birds there.
671
:Yeah, yeah.
672
:No, that's super cool.
673
:Definitely going to try that for sure.
674
:So, and it's like, I have a commitment
device now.
675
:I have officially publicly committed to do
that.
676
:So I have to do it and then.
677
:Come back to you, Andrew, to tell you how
it went.
678
:The other thing you can do is there are
certain fun psychology experiments from
679
:the literature that can be done in class,
because things that have very large
680
:effects, like some of the classic Tversky,
Kahneman experiments of cognitive
681
:illusions, we have one of those examples
too.
682
:You can do it live in class.
683
:Yeah, that sounds also super cool.
684
:I also saw in preparing the episode that
you have a flipped classroom, like you
685
:emphasize a flipped classroom environment.
686
:I don't think I've ever heard you talk
about that.
687
:Could you explain what this approach is
and how you think that enhances the
688
:learning of applied regression and causal
inference?
689
:I think to me the flipped classroom is
pretty much the same as traditional high
690
:school classes, high school math class.
691
:So if you take math in high school, you
have a book you're supposed to read and
692
:there's homework assignments.
693
:Usually you read just enough of the book
to allow you to do the homework
694
:assignments.
695
:Then in class, the teacher does a couple
things on the board and most of the time
696
:in class you spend working on problems in
pairs or small groups and then people go
697
:up to the board and share their answers.
698
:That's kind of what I think should be.
699
:So that's the model of so it's very
traditional.
700
:The flipping part is, you know, I don't
have videos.
701
:I guess I could, but I don't.
702
:Aki has videos for his classes that I
have.
703
:But the flip part is the reading.
704
:Right.
705
:So I'm not lecturing because they're
supposed to have read the book.
706
:Now, what happens, you know, it works only
if you have a book that you can lean
707
:on.
708
:But I think that's very important.
709
:This semester, I'm teaching in a
710
:statistics class, teaching some multilevel
modeling and some other things.
711
:My book with Aki and Jennifer on advanced
regression and multilevel modeling
712
:doesn't exist yet.
713
:It's supposed to be the updated version of
my book with Jennifer.
714
:I couldn't quite bring myself to teach out
of my book with Jennifer just because the
715
:code is old, but then I don't have a new
book.
716
:And so as a result, the class I'm teaching
this semester,
717
:It's fun.
718
:I think the students are enjoying it, but
it's not going as perfectly as it
719
:could because I can't really do the flipped
thing. I end up spending a
720
:lot of time in class; my computer
demos typically end up being me doing the
721
:homeworks, working out the homeworks
that were just due, which is fine, but
722
:they're a little bit more
elaborate than I'd like.
723
:Ideally, I think computer demos would be
shorter.
724
:They don't have enough to read before, so
I end up spending a lot of time lecturing.
725
:I think I spend most of today's class just
talking.
726
:I felt a little bad about that.
727
:I don't know.
728
:I think it's still fine.
729
:It's still a breath of fresh air compared
to other classes they're taking.
730
:I'm sure if all the classes were like
mine, then that would be horrible.
731
:But an occasional class that's like mine
can be good.
732
:I think in general, students like more
organization.
733
:A book is better.
734
:Even
735
:when I teach out of Regression and Other
Stories, that's super organized, but it's
736
:not always what students want because they
want a set of methods and formulas and
737
:theorems and so forth.
738
:So I'm not always giving people what they
want.
739
:Anyway, I think that they again, I think
they're really looking for very clear.
740
:I don't I have this thing, the goal is to
be fluent in the foreign language, but I
741
:don't think people usually think of it
that way.
742
:I think that they're looking for.
743
:something different.
744
:But what that means is that it puts a
special burden on me to be super organized
745
:because if I'm not super organized, then I
think students will not see the point.
746
:So my class this semester, it doesn't use
the book.
747
:It's not as flipped as it could be.
748
:I still have them talking with each other
in class, but not having the flipped
749
:classroom makes it a little more of a
passive experience for them.
750
:And then when I do have them talking,
they're often just talking to each other
751
:saying, oh, I have no idea what's going on
here.
752
:It's like, oh, good that I know that, I
guess.
753
:That's true.
754
:Yeah.
755
:And I mean, I do relate to this idea of
the, you know, getting fluent in a foreign
756
:language.
757
:That's actually also a metaphor I use
quite a lot to people who are curious
758
:about what the...
759
:work of a statistical modeler is.
760
:And that's funny because there's that
weird human brain bias of just thinking
761
:that someone who is doing something that
looks hard to you, or they must have been
762
:good at it since the beginning.
763
:And at least for me, it couldn't be
further from the truth.
764
:It comes from a lot.
765
:As you were saying, I think you were
saying learning is a
766
:Vector is magnitude and direction, right?
767
:So definitely magnitude is very important
for me each time I learn something.
768
:And often I'm saying, yeah, well, it looks
hard because you have to learn kind of two
769
:languages, the language of stats and the
language, like the actual programming
770
:language that you need to do the stats.
771
:But it's just as any other language, you
need to...
772
:talk to people in that language and with
time you'll see your brain just getting
773
:there.
774
:So it does go through to people, but at
the same time they need to see some
775
:results along the way because otherwise
the motivation is gonna fall down.
776
:So it's always that needle that's a bit
hard to thread in my experience.
777
:Yeah, well, I like this book.
778
:See, I seriously think this book is just
fun to read.
779
:Although, as I said, I kind of I kind of
wish I had separated it out in a different
780
:way because I do feel when people when you
open at random, you end up you might see
781
:some code or you might see a homework
assignment or you might like it's not
782
:always clear what like you're not
necessarily opening into a middle of a
783
:story.
784
:And so like homework assignments don't
look like fun and code doesn't look like
785
:fun.
786
:So I
787
:don't think I realized: you don't see the
book until it's a book. Before that, it's this
788
:PDF on the screen, and it's a
different experience that way. And
789
:Aki's gonna kill me for saying this,
because we went back and forth, but
790
:now I think we really should have,
I really think we made a mistake by not
791
:doing it the other way, because I think it
would look a lot more fun that way, if
792
:like all the stories were in one place and
all the activities were in another place
793
:I'm really feeling bad about that.
794
:I still love it.
795
:It's just, we just have so many fun
things.
796
:Oh, then we have, for the final exam, we
made, it's multiple choice.
797
:So what I do is I have four or more
questions per chapter.
798
:It's like, it's, it's,
799
:The exam has so there's 12 chapters for
the fall and 12 for the spring.
800
:So each chapter, I have four or more
questions.
801
:What I do is I randomly sample one per
chapter and give that to the students as
802
:their practice exam.
803
:Then I randomly sample two per chapter and
give that and make that the final exam.
804
:So therefore, by construction, the
practice exam is representative of the
805
:final exam because they're two random
samples from the same population.
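The exam construction described here can be sketched as follows. The question labels are placeholders, and drawing the two exams independently (so a practice question could reappear on the final) is an assumption; the episode does not say whether overlaps are excluded.

```python
import random

def draw_exam(bank, per_chapter, rng):
    """Pick per_chapter questions at random, without replacement, from each chapter."""
    return {
        chapter: rng.sample(questions, per_chapter)
        for chapter, questions in bank.items()
    }

# Hypothetical question bank: 12 chapters with 4 question labels each.
bank = {ch: [f"ch{ch}-q{q}" for q in range(1, 5)] for ch in range(1, 13)}

rng = random.Random(7)
practice_exam = draw_exam(bank, per_chapter=1, rng=rng)  # one question per chapter
final_exam = draw_exam(bank, per_chapter=2, rng=rng)     # two questions per chapter

print(practice_exam[1], final_exam[1])
```

Because both exams are random draws from the same per-chapter pools, the practice exam is representative of the final by construction, which is the point made above.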
806
:So I think that's that that's great to be
able to do that now.
807
:Of course, all the problems are now in the
book, although without the answers.
808
:So you'd have to figure out which it is.
809
:But in theory, someone could read through
all of those.
810
:But of course, the usual story is if
someone really goes to the trouble of
811
:reading through all of them and figuring
them all out, that's probably good anyway.
812
:So I don't mind if they then do well on
the exam.
813
:But it took a lot of effort to write.
814
:These multiple choice questions are hard
to write, but I think they're easier to
815
:grade.
816
:And I think they're testing something
that's a bit more focused.
817
:It's very easy to write open-ended
questions and not know what you're
818
:testing.
819
:True.
820
:Yeah.
821
:Yeah.
822
:It's a bit more like astrology, where you
always find something you're satisfied
823
:about.
824
:Yeah, yeah, exactly.
825
:And it also encourages a certain behavior
among students to just keep writing and
826
:trying to like touch all the bases.
827
:True.
828
:Yeah, yeah.
829
:As a pure product of the French
educational system, I can tell you open
830
:ended questions are like my bread and
butter.
831
:I've been trained at that a lot.
832
:So if someone have to answer, like I have
a weird feeling of familiarity and that...
833
:At the same time, I like it and I dread
it.
834
:So that's what...
835
:Many years ago, I taught a class in France
and the students are supposed to do
836
:projects and it just happened.
837
:Yeah, everybody's busy.
838
:So one of the groups did, they did
nothing.
839
:They turned something in, which was pretty
much they had just like, it wasn't
840
:plagiarized, but they had just copied
stuff from the internet.
841
:Like, you know, they just literally copied
some images and it was essentially
842
:nothing.
843
:So I talked to the...
844
:The head instructor of the class, I said,
well, I want to give them a two out of 20
845
:on this.
846
:Like, I guess, you know, I, I, maybe I
don't give them zero because they wrote
847
:out a sentence or two, but like, can I, can
I give them a two out of 20?
848
:He said, well, yeah, you're giving the
grade.
849
:I said, in the U.S., if you want to give
someone a low grade, you have to ask for
850
:permission because you're afraid they
might sue you or complain or something.
851
:And, but he said, no, in France, you can
give people, you know, two out of 20.
852
:They might even think it's a good grade.
853
:So it is a different...
854
:French system is a little more rough in
how the grading goes.
855
:I don't remember that.
856
:Yeah.
857
:I mean, it depends.
858
:I don't know at what level you're
teaching, but if you're teaching in the...
859
:especially in the class préparatoire, you
know, so that weird stuff we have in
860
:between high school and universities.
861
:These were graduate students.
862
:Yeah.
863
:So you can definitely do that.
864
:I know I was like my first philosophy...
865
:dissertations when I was in the class,
were absolutely a disaster.
866
:Um, it was, that was, I think I got four
out of 20, something like that.
867
:And that was not even the worst grades.
868
:You know how like in gymnastics, like it's
like 9.8, 9.9, 9.93, like that, like
869
:the grading system did that.
870
:But statistics is, it's really hard.
871
:Like I think real world problems, I
wouldn't give myself.
872
:a 20 out of 20 in my analysis, because if
you're doing an experiment in political
873
:science or psychology or economics or an
observational study, everybody knows about
874
identification being a difficulty, but
there's a lot of other difficulties.
875
:So usually if you're doing a causal study,
you wanna have within-person comparisons,
876
:or in political science or economics, it
would be called panel study.
877
:You wanna have...
878
:Ideally, you do the treatment and the
control on each person.
879
:But if you can't do that, you want to make
comparisons.
880
:That's super important, partly for
statistical efficiency and for balance.
881
:And it's also kind of a measurement issue
because measurements can be biased and
882
:biases can actually look like a treatment
effect.
883
:The treatment can affect the measurement
bias and you can even have treatments that
884
:affect the measurement bias without
affecting the outcome.
885
:Like, it's a naive view that if you just
886
:give randomly assigned treatment and
control that you have a kosher estimate of
887
:the causal effect, that's not really right
in general, because that assumes that the
888
:measurement bias doesn't vary with the
treatment, and that's often a mistake.
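A small fake-data sketch of this point, with made-up numbers: the treatment is randomly assigned and has zero effect on the underlying outcome, but it shifts how the outcome is measured by 0.5. The variable names and all parameter values are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(11)
n = 4000
true_effect = 0.0       # the treatment does not change the real outcome
reporting_shift = 0.5   # but it does change how the outcome gets measured

# Randomly assign treatment, then generate the real and the measured outcomes.
treated = [rng.random() < 0.5 for _ in range(n)]
outcome = [rng.gauss(0, 1) + true_effect * t for t in treated]
measured = [
    o + reporting_shift * t + rng.gauss(0, 0.5)
    for o, t in zip(outcome, treated)
]

# Simple difference in means on the measured outcome.
mean_treated = statistics.fmean(m for m, t in zip(measured, treated) if t)
mean_control = statistics.fmean(m for m, t in zip(measured, treated) if not t)
estimated_effect = mean_treated - mean_control

print(f"true effect: {true_effect:.2f}, estimated effect: {estimated_effect:.2f}")
```

Randomization is intact here, yet the difference in means lands near the measurement shift instead of the true effect of zero.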
889
:So you really want to have panel structure
or repeated measurements, within-person
890
:designs.
891
:That means you want to start fitting
multilevel models.
892
:So if you don't have a lot of observations
or a lot of groups, then your inferences
893
:can depend on the prior, which it really
does.
894
:You can't, you could act really tough and
say, oh, I'm really tough.
895
:I'm not using a prior, but then it just
means your inference is really noisy.
896
:And that's, that's not good either.
897
:It means you can get bad things.
898
:And then what predictors to include in
theory, everything should be interacted
899
:with everything because otherwise that can
induce bias.
900
:But in practice, if you do that, you have
a lot of the coefficients running around.
901
:So even the simplest problems are like,
like there's no right way of doing it.
902
:which gives me a lot of sympathy for
researchers.
903
:And I know here we're not talking about
like the crisis in science, but I'll say
904
:that like sometimes people will say that
you should pre-register your design and
905
:analysis.
906
:And I think that's great, but it's not
gonna solve a lot of problems because if I
907
:don't know the right analysis to do, I
don't know what I'm supposed to be
908
:pre-registering.
909
:It's really difficult.
910
:It's not, we can't just do better science
by just like.
911
:Like there's this phrase, questionable
research practices.
912
:Like it's not like you can just stop doing
questionable research practices and
913
:everything will be okay.
914
:It's not clear.
915
:Doing it right is not just the absence of
making mistakes.
916
:It's very difficult.
917
:And so when we're teaching or when you're
learning, I'll say, cause I really would
918
:like our book to be read by people who are
not necessarily teaching a class, but just
919
:want to learn the stuff that when you're
learning, there is this.
920
:weird thing where you have to learn the
skills and at the same time realize the
921
:limitations.
922
:And it is, it's hard to teach in that way.
923
:It's not like, it's easier to teach
something like physics or chemistry where
924
:you say, here's what we're doing.
925
:And then later on, we're going to tell you
why these ideas aren't correct.
926
:And we're going to do something more
elaborate in statistics.
927
:It's hard to reach that like plateau where
you say, well, here's the basics, learn
928
:the basics.
929
:Once you're learning the basics, you keep
930
:seeing all the problems at the same time.
931
:So it makes it very fun to learn, but also
challenging.
932
:Yeah, true.
933
:Yeah.
934
:And actually that makes me wonder, how do
you think, so for people who are going to
935
:use your book for teaching, so
instructors, how can they adapt the
936
:materials for different educational
settings like...
937
:such as introductory course or more
advanced courses.
938
:So it's set up for this class on applied
regression and causal inference.
939
:So if you're teaching out of regression
and other stories, it's very easy.
940
:It just gives you a whole template for a
two semester class.
941
:I've also taught a one semester version
where I just do one activity and each week
942
:I have two of everything.
943
:So instead I just pick one story, one
activity and so forth.
944
:That's actually what I did
945
:last semester. If it's a more advanced
class, or more basic,
946
:if it's a more basic class, I think it
still pretty much works.
947
:You just have to simplify: the code
demonstrations are going to be way too
948
:complicated for a more basic class.
949
:But I think the stories work and the
activities work.
950
:You just maybe have to change it a little.
951
:So.
952
:In two truths and a lie, you wouldn't do
logistic regression, but for example, you
953
:could still make a scatter plot and you
could still compare the probability, the
954
:proportion of correct guesses for people's
certainty scores higher than five or lower
955
:than five.
956
:You can adapt it.
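The simpler comparison described for a more basic class can be sketched like this. The twelve data points are invented for illustration, and a score of exactly five is left out of both groups, following the "higher than five or lower than five" wording.

```python
def proportion_correct(scores, outcomes, keep):
    """Proportion of correct guesses among the guesses whose score passes keep()."""
    picked = [o for s, o in zip(scores, outcomes) if keep(s)]
    return sum(picked) / len(picked) if picked else float("nan")

# Made-up class data: certainty scores (0 to 10) and guess outcomes (1 = correct).
certainty = [2, 8, 5, 9, 3, 7, 10, 4, 6, 1, 8, 3]
correct = [0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0]

high = proportion_correct(certainty, correct, lambda s: s > 5)
low = proportion_correct(certainty, correct, lambda s: s < 5)

print(f"certainty above 5: {high:.2f} correct")
print(f"certainty below 5: {low:.2f} correct")
```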
957
:I think a lot of the activities are like
that in the stories.
958
:For more advanced class, I think again, it
works in the other direction that this can
959
:be a starting point.
960
:You give the story and...
961
:And also people have their own stories.
962
:So reading my story might help you as a
teacher, think of your own story and tell
963
:it in the same way.
964
:Yeah.
965
:Okay.
966
:Yeah, I see what you mean.
967
:I'm thinking randomly.
968
:It sounds like you would be interested,
Andrew, at some point in writing some
969
:fictional stats-based stories.
970
:something like, I think Carl Sagan, right,
did write some science fiction.
971
:Would you be like, do you see yourself
doing that at some point so that you are
972
:forced to maybe not use any modeling or
things like that in the book and you have
973
:to completely only tell stats through the
stories and all?
974
:Well, well, fake data for sure.
975
:I did have an idea.
976
:I was thinking about having a book where
it's
977
:all like it's learning statistics through
fake data simulation where everything is
978
:just you just start with some very simple
things like everything that's like the
979
:gimmick right the gimmick is here all the
principles of probability and statistics
980
:and you're only you're not allowed to use
any real data you're only allowed to do
981
:fake data simulation, and you can cover a
lot, like all sorts of things: the
982
:attenuation of the coefficient, the
treatment effect, when you have measurement
983
:error in your predictor and
984
:Like anyway, all sorts of things you might
want to cover.
985
:You could do that way.
986
:So I thought that would be fun.
987
:Maybe a fun future book.
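As one example of what such a fake-data exercise might look like, here is a sketch of the attenuation mentioned above. The true slope of 2.0, the unit variances, and the sample size are assumptions chosen so that the expected attenuation factor, var(x) / (var(x) + var(error)), equals one half.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(3)
n = 5000
true_slope = 2.0

x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [true_slope * xi + rng.gauss(0, 1) for xi in x_true]
# The predictor as actually recorded, with measurement error of variance 1.
x_noisy = [xi + rng.gauss(0, 1) for xi in x_true]

slope_clean = ols_slope(x_true, y)    # close to the true slope
slope_noisy = ols_slope(x_noisy, y)   # pulled toward zero by roughly one half

print(f"slope using true predictor:  {slope_clean:.2f}")
print(f"slope using noisy predictor: {slope_noisy:.2f}")
```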
988
:I mean, fiction, you know, Jessica and I
wrote a play, Jessica Hullman and I wrote a
989
:play, Recursion, which is fiction.
990
:It has a computer science theme.
991
:It was performed at a computer science
conference recently.
992
:So, so I guess, yeah, we have written
fiction.
993
:It didn't really have, it had some
statistical principles in there.
994
:There were, there were some, it had some.
995
:Like we, yeah, I think we had some line
where one of the characters talked about
996
:their code being beautiful, and then
somebody else said, code that runs is
997
:beautiful.
998
:And then somebody else says, code that
runs and you know it runs is beautiful.
999
:So that's like some workflow principle.
:
00:56:19,447 --> 00:56:24,797
So we were able to put in some of our
thoughts about statistical workflow in
:
00:56:24,797 --> 00:56:25,777
fiction.
:
00:56:26,377 --> 00:56:27,997
So yeah, it's possible.
:
00:56:28,301 --> 00:56:30,001
I knew it.
:
00:56:30,001 --> 00:56:31,001
I knew it.
:
00:56:31,001 --> 00:56:31,721
Yeah.
:
00:56:31,801 --> 00:56:33,921
I love to hear that.
:
00:56:33,921 --> 00:56:35,401
I love to hear that.
:
00:56:35,401 --> 00:56:37,041
Read that book.
:
00:56:37,041 --> 00:56:43,301
And I was saying here, because I think,
and you could even record the audio
:
00:56:43,301 --> 00:56:44,381
version yourself.
:
00:56:44,381 --> 00:56:45,771
I think that'd be awesome.
:
00:56:45,771 --> 00:56:45,981
Yeah.
:
00:56:45,981 --> 00:56:49,221
Well, that performance apparently went
well, but they didn't video it.
:
00:56:49,221 --> 00:56:52,561
So we want to get it performed somewhere
else.
:
00:56:52,561 --> 00:56:53,741
Yeah.
:
00:56:54,101 --> 00:56:55,461
Well, let's try that.
:
00:56:55,461 --> 00:56:56,749
If there is...
:
00:56:56,749 --> 00:57:05,289
One day if I manage to do a live LBS
dinner, that should definitely be
:
00:57:05,289 --> 00:57:08,369
performed at that dinner.
:
00:57:08,449 --> 00:57:10,929
That's a must.
:
00:57:13,369 --> 00:57:21,769
Now I'd like to ask you something about, I
know a topic that's dear to your heart is
:
00:57:21,769 --> 00:57:25,609
visualization and its tie to
understanding.
:
00:57:25,889 --> 00:57:26,605
Because...
:
00:57:26,605 --> 00:57:32,465
the focus on visualization is a key aspect
of your book, Active Statistics.
:
00:57:32,465 --> 00:57:36,745
It's also a key aspect of almost all your
work.
:
00:57:36,885 --> 00:57:39,205
So I'd like to hear your thought about
that.
:
00:57:39,205 --> 00:57:46,185
How do you think visualization aids in the
comprehension of statistics and causal
:
00:57:46,185 --> 00:57:47,045
models?
:
00:57:47,365 --> 00:57:49,505
Well, so I'll talk about two things.
:
00:57:49,505 --> 00:57:54,381
First, visualization in teaching and
second, visualization in statistical.
:
00:57:54,381 --> 00:57:55,801
Like applied statistics.
:
00:57:55,801 --> 00:58:01,561
So with teaching, I think like I think the
deterministic part is usually the more
:
00:58:01,561 --> 00:58:02,521
important part of the model.
:
00:58:02,521 --> 00:58:05,661
So I want people to be able to visualize
what is the line?
:
00:58:05,661 --> 00:58:07,761
y equals a plus bx.
:
00:58:07,761 --> 00:58:10,241
What does it look like if I have an
interaction?
:
00:58:10,241 --> 00:58:12,501
What would the two lines look like?
:
00:58:12,501 --> 00:58:15,681
What does a logistic curve look like?
:
00:58:16,201 --> 00:58:22,081
I think it's a mistake when
statistics books start with things like a
:
00:58:22,081 --> 00:58:23,021
histogram.
:
00:58:23,021 --> 00:58:25,761
Histogram is not fundamental.
:
00:58:25,761 --> 00:58:27,541
Actually, it's very confusing.
:
00:58:27,541 --> 00:58:35,301
I used to do this assignment where I would
say to students, gather between 30 and 50
:
00:58:35,301 --> 00:58:39,681
data points on anything and make a
histogram of it.
:
00:58:39,681 --> 00:58:42,271
And about half the students would do it.
:
00:58:42,271 --> 00:58:46,221
Like they might gather data on 30
countries or 50 states, or they might take
:
00:58:46,221 --> 00:58:49,541
30 observations of something and make a
histogram.
:
00:58:49,981 --> 00:58:52,127
The other half would.
:
00:58:52,141 --> 00:58:57,641
make a bar chart showing their 30
observations in time order.
:
00:58:57,641 --> 00:59:00,921
So it would be like, basically it was a
time series except it would just be
:
00:59:00,921 --> 00:59:03,241
displayed in bars because it was a
histogram.
:
00:59:03,241 --> 00:59:07,681
And so like you see the problem is that a
histogram is supposed to convey a
:
00:59:07,681 --> 00:59:11,041
distribution, but what people are getting
out of it is it looks like a bunch of bars
:
00:59:11,041 --> 00:59:13,341
and half the students didn't get the
point.
:
00:59:13,341 --> 00:59:16,781
The concept of a distribution is very
abstract because...
:
00:59:16,781 --> 00:59:21,621
The height of the bar represents the
number of cases or the proportion of
:
00:59:21,621 --> 00:59:22,781
cases.
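A small sketch of the difference between the two things students drew, using made-up observations. In the time-order bar chart each bar is one observation; in the histogram each bar's height is the number (or proportion) of cases with that value.

```python
from collections import Counter

# Hypothetical data: 12 observations recorded in time order.
observations = [3, 7, 4, 5, 5, 6, 4, 5, 8, 5, 6, 4]

# What half the students drew: one bar per observation, in time order.
# Bar i has height observations[i]; this is a time series shown as bars.
time_order_bars = list(enumerate(observations))

# What a histogram shows: one bar per value (or bin). The height is the
# number of cases with that value, and the time order is discarded.
counts = Counter(observations)
histogram_bars = sorted(counts.items())
proportions = {value: n / len(observations) for value, n in counts.items()}
```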
:
00:59:22,781 --> 00:59:25,011
It's not like a scatter plot.
:
00:59:25,011 --> 00:59:27,041
A scatter plot, I think, is actually more intuitive.
:
00:59:27,041 --> 00:59:31,781
But I noticed that statistics classes were
always focusing on that because, oh,
:
00:59:31,781 --> 00:59:33,251
histogram is one dimensional.
:
00:59:33,251 --> 00:59:35,011
What could be more simple than that?
:
00:59:35,011 --> 00:59:37,851
I think a time series is really much more
basic.
:
00:59:37,851 --> 00:59:42,941
So when it comes to plotting data, I think
we really have to get a little closer to
:
00:59:42,941 --> 00:59:44,525
what we care about.
:
00:59:44,525 --> 00:59:47,905
Um, a lot of just stupid stuff, like box
plots.
:
00:59:47,905 --> 00:59:48,565
I hate that.
:
00:59:48,565 --> 00:59:49,455
I hate that stuff.
:
00:59:49,455 --> 00:59:51,285
And it's like, I don't see it.
:
00:59:51,285 --> 00:59:55,725
It's just like, people just do things that
are conventional and I think are
:
00:59:55,725 --> 00:59:56,745
absolutely horrible.
:
00:59:56,745 --> 01:00:02,345
But anyway, all this focus on
distributions, I think the linear, the
:
01:00:02,345 --> 01:00:04,535
deterministic part of the model is more
important.
:
01:00:04,535 --> 01:00:07,505
And so that's what I try to convey.
:
01:00:07,565 --> 01:00:08,621
I do.
:
01:00:08,621 --> 01:00:12,841
One thing I noticed is that students will
learn stuff if it's on the homework and on
:
01:00:12,841 --> 01:00:13,411
the exam.
:
01:00:13,411 --> 01:00:17,681
They won't learn it just because it's on
the blackboard in class or in your slides.
:
01:00:17,681 --> 01:00:26,381
So I found that when I do my work, I
often make sketches of graphs.
:
01:00:26,641 --> 01:00:29,521
And so I have homework
assignments where you have to make a
:
01:00:29,521 --> 01:00:32,761
sketch, sketch what you think it's going
to look like, then fit the model.
:
01:00:32,761 --> 01:00:35,521
Because if you don't ask people to do
that, they won't.
:
01:00:35,521 --> 01:00:37,965
So teaching has to be.
:
01:00:37,965 --> 01:00:42,505
Like you want people to actually practice
that kind of workflow.
:
01:00:42,805 --> 01:00:45,985
So that's then I had something else to
say, but I won't.
:
01:00:45,985 --> 01:00:49,765
We can say it another time about
statistical graphics.
:
01:00:49,885 --> 01:00:51,765
It's already kind of going on a little
bit.
:
01:00:51,765 --> 01:00:55,825
So if we ever talk about statistical
graphics again, just ask me to tell you
:
01:00:55,825 --> 01:00:59,925
what I think is this really super
important aspect of statistical graphics
:
01:00:59,925 --> 01:01:01,805
within statistical inference.
:
01:01:01,805 --> 01:01:03,705
And I'll tell you about that.
:
01:01:03,705 --> 01:01:04,725
Okay, perfect.
:
01:01:04,725 --> 01:01:06,669
Well, definitely.
:
01:01:06,669 --> 01:01:08,829
Definitely tell you.
:
01:01:08,829 --> 01:01:12,849
Do you still have time for one or two
questions or should we?
:
01:01:12,849 --> 01:01:13,429
Yeah, sure.
:
01:01:13,429 --> 01:01:15,969
I have time for one or two questions,
sure.
:
01:01:15,969 --> 01:01:16,789
Okay, awesome.
:
01:01:16,789 --> 01:01:18,849
Let's continue.
:
01:01:18,849 --> 01:01:22,809
I'm curious about that.
:
01:01:23,169 --> 01:01:30,749
How do you handle the distinction and/or
the transition from regression analysis to
:
01:01:30,749 --> 01:01:31,709
causal inference?
:
01:01:31,709 --> 01:01:36,429
How do you navigate these two topics in
the classroom setting
:
01:01:36,429 --> 01:01:41,809
to ensure that students grasp both
concepts effectively?
:
01:01:42,109 --> 01:01:43,389
So I overlap.
:
01:01:43,389 --> 01:01:48,469
So I start talking about causal inference
at the very beginning, partly because they
:
01:01:48,469 --> 01:01:49,689
can't avoid it.
:
01:01:49,689 --> 01:01:51,949
So we'll have a regression.
:
01:01:52,408 --> 01:01:57,489
Maybe you fit... one of the examples we use
in Regression and Other Stories is predicting
:
01:01:57,489 --> 01:02:00,949
from some survey, predicting earnings from
height.
:
01:02:00,949 --> 01:02:04,709
Taller people make a little bit more money
than...
:
01:02:05,101 --> 01:02:11,461
shorter people, and then you can also
throw sex into the model, and men make
:
01:02:11,461 --> 01:02:13,061
more money than women taller men.
:
01:02:13,061 --> 01:02:15,621
So you can say how do you interpret the
coefficient of height?
:
01:02:15,621 --> 01:02:19,791
Well, you know, for every
inch taller you make this much more money.
:
01:02:19,791 --> 01:02:20,921
So that's not right.
:
01:02:20,921 --> 01:02:26,701
You have to say comparing two people of
the same sex one of whom is one inch
:
01:02:26,701 --> 01:02:32,921
taller than the other under the model on
average the taller person will be making
:
01:02:32,921 --> 01:02:34,061
this much more money.
:
01:02:34,061 --> 01:02:35,941
So what are the things you need to say?
:
01:02:35,941 --> 01:02:38,801
You have to say comparing, because it's
all comparative.
:
01:02:38,801 --> 01:02:40,701
There's no causal language.
:
01:02:40,701 --> 01:02:46,661
You have to say, on average, you have to
say according to the model.
:
01:02:46,661 --> 01:02:52,981
And you have to say not controlling for
blah, blah, but comparing two people who
:
01:02:52,981 --> 01:02:54,721
are the same in these other predictors.
:
01:02:54,721 --> 01:02:56,621
You're not holding everything else
constant.
:
01:02:56,621 --> 01:02:58,157
You're doing this comparison.
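A minimal sketch of this comparative reading of a coefficient. The coefficient values below are made up for illustration (they are not the fitted values from the book), and `predicted_earnings` is a hypothetical helper.

```python
# Hypothetical fitted model: earnings = a + b*height + c*male.
# All three numbers are invented for this illustration.
INTERCEPT = -26000.0
B_HEIGHT = 650.0   # dollars per inch of height
B_MALE = 10600.0   # dollars, men compared to women of the same height

def predicted_earnings(height_in, male):
    """Average earnings under the model for a given height and sex."""
    return INTERCEPT + B_HEIGHT * height_in + B_MALE * male

# Comparing two people of the same sex who differ by one inch:
diff_women = predicted_earnings(66, 0) - predicted_earnings(65, 0)
diff_men = predicted_earnings(71, 1) - predicted_earnings(70, 1)
```

Both differences equal `B_HEIGHT`. The coefficient describes a comparison between people, on average and under the model; it says nothing about what would happen if one person grew an inch.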
:
01:02:58,157 --> 01:03:01,777
So I do this, I have a drill in class
where they have to do it.
:
01:03:01,777 --> 01:03:02,937
And then they laugh.
:
01:03:02,937 --> 01:03:07,137
It's like a joke as I say, here's a
regression, explain each coefficient in
:
01:03:07,137 --> 01:03:07,637
words.
:
01:03:07,637 --> 01:03:11,117
And they say, like, what's the coefficient
of the intercept of this model?
:
01:03:11,117 --> 01:03:14,337
It's like I'm predicting
something as a function of time.
:
01:03:14,337 --> 01:03:18,497
So this says in the year Jesus was born,
this is well, that's the intercept right
:
01:03:18,497 --> 01:03:19,977
at year zero.
:
01:03:19,981 --> 01:03:22,081
So is that interpretable?
:
01:03:22,081 --> 01:03:23,501
Well, maybe it's interpretable.
:
01:03:23,501 --> 01:03:27,741
If you have a time series going from ... to 2000, maybe we're not particularly
:
01:03:27,741 --> 01:03:30,661
interested in what happened in the year
Jesus was born.
:
01:03:30,661 --> 01:03:34,261
That's a bit of an extrapolation that
implies.
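A small sketch of why the intercept is an extrapolation when the predictor is the calendar year, and how centering the year changes what the intercept means. The years and values are made up; `fit_line` is a hypothetical helper doing ordinary least squares.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical time series (made-up numbers).
years = [1950, 1960, 1970, 1980, 1990, 2000]
values = [10.0, 12.0, 13.0, 15.0, 18.0, 19.0]

# Raw year: the intercept is the prediction at year 0, far outside the data.
a_raw, b_raw = fit_line(years, values)

# Centered year: the intercept is the prediction at 1975, inside the data.
a_centered, b_centered = fit_line([y - 1975 for y in years], values)
```

The slope is the same in both fits; only the meaning of the intercept changes.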
:
01:03:34,381 --> 01:03:35,981
So, but same with the coefficient.
:
01:03:35,981 --> 01:03:37,421
So it's like a joke in class.
:
01:03:37,421 --> 01:03:42,361
It's a fun inside joke we have in class
that I'll ask them to explain the
:
01:03:42,361 --> 01:03:47,381
regression coefficient and they have to
say it without using the wrong language.
:
01:03:47,381 --> 01:03:48,865
And it's like,
:
01:03:48,941 --> 01:03:53,161
It's like the game you play as a kid where
you're
:
01:03:53,161 --> 01:03:54,281
not allowed to say the word no.
:
01:03:54,281 --> 01:03:55,741
If you say the word no, you lose.
:
01:03:55,741 --> 01:03:58,061
You have to figure out a way to decline.
:
01:03:58,061 --> 01:04:00,561
Will you give me your cake?
:
01:04:00,561 --> 01:04:02,921
I choose not to give you my cake.
:
01:04:02,921 --> 01:04:05,661
You know, like I choose to do something
else or whatever.
:
01:04:05,661 --> 01:04:08,441
So similarly, you're not allowed to use
this word.
:
01:04:08,441 --> 01:04:13,901
And so right away, we're introducing the
idea that causation is important.
:
01:04:13,901 --> 01:04:15,021
And.
:
01:04:15,021 --> 01:04:18,961
Then when we get to causal inference,
well, we have regression already.
:
01:04:18,961 --> 01:04:23,541
So we use that not for controlling for
things, but for adjusting for things.
:
01:04:23,541 --> 01:04:27,521
So we've already done non-causal
examples, like the survey example, where
:
01:04:27,521 --> 01:04:31,441
we adjust for differences in order to
poststratify.
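A minimal sketch of poststratification with made-up numbers: a survey that over-represents one group is reweighted to known population shares. Here the cell estimates are raw cell means; in practice they would typically come from a fitted regression.

```python
# Hypothetical survey: the sample over-represents older respondents.
cells = {
    "young": {"n_sample": 20, "mean": 0.40, "pop_share": 0.5},
    "old": {"n_sample": 80, "mean": 0.60, "pop_share": 0.5},
}

n_total = sum(c["n_sample"] for c in cells.values())

# Unadjusted estimate: weights each cell by its share of the sample.
raw_mean = sum(c["n_sample"] * c["mean"] for c in cells.values()) / n_total

# Poststratified estimate: weights each cell by its population share.
poststratified_mean = sum(c["pop_share"] * c["mean"] for c in cells.values())
```

The raw mean is pulled toward the over-sampled group; the poststratified mean adjusts for that difference between sample and population.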
:
01:04:31,441 --> 01:04:34,111
So then it fits in.
:
01:04:34,111 --> 01:04:38,961
So there's a lot of specific things about
causal inference, but the first half is we
:
01:04:38,961 --> 01:04:40,161
don't cheat at the beginning.
:
01:04:40,161 --> 01:04:42,661
We don't pretend to be causal when we're
not.
:
01:04:42,661 --> 01:04:44,749
Then when we get to causal inference,
:
01:04:44,749 --> 01:04:49,009
We make use of what we've already done
rather than treating it as an entirely new
:
01:04:49,009 --> 01:04:49,969
topic.
:
01:04:49,969 --> 01:04:56,309
My little particular pet thing is that the
usual way causal inference is taught is
:
01:04:56,309 --> 01:04:58,469
there's an outcome and a treatment.
:
01:04:58,469 --> 01:05:00,868
And some people get the treatment, some
get the control.
:
01:05:00,868 --> 01:05:05,379
I say the basic is there's a pre-test
measurement, a treatment, and an outcome,
:
01:05:05,379 --> 01:05:06,729
and that's in time order.
:
01:05:06,729 --> 01:05:08,249
So it introduces time.
:
01:05:08,249 --> 01:05:11,169
You don't have to have a pre-test, but
you should.
:
01:05:11,169 --> 01:05:13,805
And so it's good practice, but also it...
:
01:05:13,805 --> 01:05:18,545
It puts you into the regression framework
already, which is helpful.
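A sketch of the pre-test, treatment, outcome setup as a regression, on simulated data. Everything here is an assumption for illustration: the treatment is randomly assigned, the true effect and noise levels are invented, and `ols` is a hypothetical helper that solves the normal equations in plain Python.

```python
import random

def ols(rows, ys):
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    aug = []
    for i in range(k):
        xtx_row = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        xty = sum(r[i] * y for r, y in zip(rows, ys))
        aug.append(xtx_row + [xty])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

rng = random.Random(1)
TRUE_EFFECT = 5.0  # invented treatment effect

rows, ys = [], []
for _ in range(2000):
    pretest = rng.gauss(50, 10)   # measured before treatment
    treated = rng.randint(0, 1)   # randomly assigned
    outcome = 10 + 0.8 * pretest + TRUE_EFFECT * treated + rng.gauss(0, 5)
    rows.append([1.0, pretest, float(treated)])
    ys.append(outcome)

# Regress the outcome on the pre-test and the treatment indicator.
intercept, b_pretest, effect = ols(rows, ys)
```

The coefficient on the treatment indicator estimates the effect after adjusting for the pre-test.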
:
01:05:19,285 --> 01:05:22,505
So sometimes things that are too simple
are harder to understand.
:
01:05:22,505 --> 01:05:25,005
A little context can help.
:
01:05:25,805 --> 01:05:31,265
Yeah, I found so the...
:
01:05:31,265 --> 01:05:40,205
The directed acyclic graphs do help quite a lot
in teaching the causal inference concepts,
:
01:05:40,225 --> 01:05:43,237
especially because you can then...
:
01:05:43,277 --> 01:05:47,237
marry that with the graphical
representation of the Bayesian model that
:
01:05:47,237 --> 01:05:48,557
you can come up with.
:
01:05:48,557 --> 01:05:50,717
And then you use simulated data.
:
01:05:50,717 --> 01:05:56,257
You can come up with the model, then write
the model, and then just simulate data and
:
01:05:56,257 --> 01:05:58,327
see what the model tells you.
:
01:05:58,327 --> 01:06:04,597
And if it's able to recover the true
parameters, I find these fit pretty well
:
01:06:04,597 --> 01:06:06,283
together in the workflow.
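A minimal sketch of that simulate-then-recover check. As a simplification, it fits the simulated data by least squares rather than with a Bayesian model; the true parameter values and the `fit_line` helper are made up for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Step 1: choose "true" parameters (invented values).
TRUE_A, TRUE_B, SIGMA = 2.0, 0.7, 1.0

# Step 2: simulate data from the assumed model.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(2000)]
ys = [TRUE_A + TRUE_B * x + rng.gauss(0, SIGMA) for x in xs]

# Step 3: fit the model and compare the estimates to the truth.
a_hat, b_hat = fit_line(xs, ys)
```

If `a_hat` and `b_hat` land far from `TRUE_A` and `TRUE_B`, the problem is in the model or the fitting code, not in the data.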
:
01:06:08,525 --> 01:06:09,505
Good.
:
01:06:09,505 --> 01:06:14,685
Yeah, I think there's a lot of different
ways of teaching these things and using
:
01:06:14,685 --> 01:06:15,525
these.
:
01:06:15,525 --> 01:06:19,465
There are different frameworks that can
work well.
:
01:06:19,465 --> 01:06:23,005
And I think that's good that that's the
case.
:
01:06:23,005 --> 01:06:28,045
There's more than one way of explaining
things and understanding things.
:
01:06:28,045 --> 01:06:30,025
Yeah, true.
:
01:06:30,345 --> 01:06:37,389
Actually, I'm curious, based on the
methodologies and...
:
01:06:37,389 --> 01:06:43,369
Also, the philosophies that you present in
Active Statistics, how do you see the
:
01:06:43,369 --> 01:06:49,289
future of statistical education evolving,
particularly with the advent of new
:
01:06:49,289 --> 01:06:50,609
technologies?
:
01:06:50,669 --> 01:06:54,429
And how do you see that play out in the
coming years?
:
01:06:54,449 --> 01:06:55,689
I don't know.
:
01:06:55,689 --> 01:07:00,529
I mean, I'm still unhappy with how
statistics is usually taught.
:
01:07:00,529 --> 01:07:03,853
So introductory statistics, it's really
been...
:
01:07:03,853 --> 01:07:08,653
Like the textbooks now are almost all
pretty much the same as the textbooks from
:
01:07:08,653 --> 01:07:09,813
40 years ago.
:
01:07:09,813 --> 01:07:17,113
I mean, they look different, but it's
based on this thing where they teach, like
:
01:07:17,113 --> 01:07:21,753
there is this, they teach these
distributions and it, so it starts by
:
01:07:21,753 --> 01:07:27,013
focusing on variation, which I think is
not even really quite right.
:
01:07:27,013 --> 01:07:30,653
And then, it's not really focusing on the
questions that are being asked, it's
:
01:07:30,653 --> 01:07:32,397
really focused on the error term.
:
01:07:32,397 --> 01:07:37,617
And then there's all this stuff about the
sampling distribution of the sample mean,
:
01:07:37,617 --> 01:07:38,957
which is just kind of weird.
:
01:07:38,957 --> 01:07:43,537
Nobody cares about the sample mean, or
they rarely do.
:
01:07:43,537 --> 01:07:46,597
It becomes very abstract and hard to
follow.
:
01:07:46,597 --> 01:07:50,897
And then there are these like confidence
intervals, like a huge amount of work to
:
01:07:50,897 --> 01:07:55,177
create these little summaries that you
don't really want to be using along with a
:
01:07:55,177 --> 01:07:56,057
bunch of messages.
:
01:07:56,057 --> 01:07:58,297
If you don't have random assignment,
you're screwed.
:
01:07:58,297 --> 01:08:00,653
If you don't have random sampling, you're
screwed.
:
01:08:00,653 --> 01:08:04,413
Then at the end, there's some stuff like
regression and chi-squared tests and
:
01:08:04,413 --> 01:08:06,013
things that people do.
:
01:08:06,013 --> 01:08:08,672
And it's just kind of a disaster.
:
01:08:08,672 --> 01:08:09,853
I really, I really hate it.
:
01:08:09,853 --> 01:08:14,363
And I would like things to be much more
focused on the questions being asked.
:
01:08:14,363 --> 01:08:18,473
It's hard for me to think exactly how to
construct the introductory class to do
:
01:08:18,473 --> 01:08:19,133
this.
:
01:08:19,133 --> 01:08:22,853
But for the second class in statistics,
like the one that we teach on applied
:
01:08:22,853 --> 01:08:26,853
regression and causal inference, I do like
how we do it in Regression and Other
:
01:08:26,853 --> 01:08:27,273
Stories.
:
01:08:27,273 --> 01:08:30,221
I feel like we developed through the
models.
:
01:08:30,221 --> 01:08:32,901
in a way that makes sense.
:
01:08:33,061 --> 01:08:35,441
I try to do that in Active Statistics.
:
01:08:35,441 --> 01:08:40,881
But really, the most important part of
teaching are the most basic classes.
:
01:08:42,221 --> 01:08:47,060
And there, we're still working on how to
do that.
:
01:08:47,121 --> 01:08:51,201
So I don't really know what the future is.
:
01:08:51,201 --> 01:08:57,181
There's a lot of statistics and machine
learning methods out there, but a lot
:
01:08:57,181 --> 01:08:57,925
of...
:
01:08:58,189 --> 01:09:02,369
basic concepts, of course, are still
coming up no matter how you do it, like
:
01:09:02,369 --> 01:09:07,509
issues of adjustment and bias and
variation.
:
01:09:07,829 --> 01:09:12,069
So it is hard to get it all to
feel like it's all in one place.
:
01:09:12,069 --> 01:09:12,889
It's frustrating.
:
01:09:12,889 --> 01:09:13,549
Yeah.
:
01:09:13,549 --> 01:09:14,229
Yeah.
:
01:09:14,229 --> 01:09:14,749
Yeah.
:
01:09:14,749 --> 01:09:15,869
No, I agree with that.
:
01:09:15,869 --> 01:09:20,569
I'm also asking the question because I'm
pretty curious about it because I'm also
:
01:09:20,569 --> 01:09:24,169
personally a bit lost when I start
thinking about these things.
:
01:09:24,169 --> 01:09:25,389
It's so cute.
:
01:09:25,389 --> 01:09:25,901
And, uh,
:
01:09:25,901 --> 01:09:31,921
Like for now, I don't have a clear
organization in my head, you know.
:
01:09:32,281 --> 01:09:38,121
Maybe one last question for you, Andrew,
before I let you go, because you've
:
01:09:38,121 --> 01:09:43,241
already been extremely generous with your
time and you know me, I could really
:
01:09:43,241 --> 01:09:44,941
interview you for like three hours, no
problem.
:
01:09:44,941 --> 01:09:46,781
I have so many questions.
:
01:09:46,901 --> 01:09:50,521
But maybe what's next for you?
:
01:09:50,521 --> 01:09:55,575
What are your upcoming projects, maybe in
this coming year?
:
01:09:56,461 --> 01:09:58,361
Well, we're trying to finish.
:
01:09:58,361 --> 01:10:04,421
Well, Aki and I are trying to finish our
Bayesian workflow book, and we'd like to
:
01:10:04,421 --> 01:10:07,761
do our advanced regression and multilevel
models book.
:
01:10:07,761 --> 01:10:12,281
It would be fun to get Recursion performed
somewhere by some university theater group
:
01:10:12,281 --> 01:10:13,261
somewhere.
:
01:10:13,261 --> 01:10:22,481
Doing this research on combining, you
know, multilevel regression and
:
01:10:22,481 --> 01:10:24,229
poststratification
:
01:10:24,269 --> 01:10:27,529
with sampling weights, which I think is
really important.
:
01:10:27,529 --> 01:10:31,709
And I think also this could be useful for
causal inference too, because people use
:
01:10:31,709 --> 01:10:32,729
weighting there.
:
01:10:32,729 --> 01:10:39,489
So that's probably the one project I'm
most excited about from that direction.
:
01:10:39,889 --> 01:10:42,109
And then we're trying to write.
:
01:10:42,109 --> 01:10:45,149
I have a list.
:
01:10:45,149 --> 01:10:49,089
I have on my web page, I have a list of
published, unpublished, and unwritten
:
01:10:49,089 --> 01:10:50,609
research articles.
:
01:10:50,609 --> 01:10:53,229
So the unwritten is a list of like,
:
01:10:53,229 --> 01:10:56,349
things that I want to do or write up.
:
01:10:56,349 --> 01:10:57,809
So there's a long list of that.
:
01:10:57,809 --> 01:11:00,009
I'm collaborating with an economist.
:
01:11:00,029 --> 01:11:07,809
We're trying to create a unified framework
for causal inference for panel data, which
:
01:11:07,809 --> 01:11:12,749
really includes things like before-after
studies and regression discontinuities and
:
01:11:12,749 --> 01:11:19,789
difference-in-differences and just regular
regression, time series.
:
01:11:19,789 --> 01:11:21,293
I have a...
:
01:11:21,293 --> 01:11:24,973
Like just as a simple example, if you're
doing linear regression, like you have a
:
01:11:24,973 --> 01:11:29,233
pretest, you regress, you condition on the
pretest, you adjust for that, really.
:
01:11:29,233 --> 01:11:34,153
But usually in econ,
things are measured with error.
:
01:11:34,153 --> 01:11:37,343
And so you won't really want to regress on
the pretest.
:
01:11:37,343 --> 01:11:40,833
What you really want to do is regress on
the latent value that the pretest is a
:
01:11:40,833 --> 01:11:41,873
measurement of.
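A small simulation of why regressing on a noisy pre-test differs from regressing on the latent value it measures. This does not fit the latent-variable model itself; it only shows the attenuation that motivates it. All parameter values are invented, and `slope` is a hypothetical helper.

```python
import random

def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(7)
TRUE_SLOPE = 2.0  # invented effect of the latent value on the outcome

latent = [rng.gauss(0, 1) for _ in range(5000)]
# The pre-test is the latent value plus measurement error of equal variance.
pretest = [z + rng.gauss(0, 1) for z in latent]
outcome = [TRUE_SLOPE * z + rng.gauss(0, 1) for z in latent]

slope_on_latent = slope(latent, outcome)    # close to TRUE_SLOPE
slope_on_pretest = slope(pretest, outcome)  # attenuated toward zero
```

With equal latent and error variances, the slope on the observed pre-test is expected to be about half the slope on the latent value.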
:
01:11:41,873 --> 01:11:43,633
Well, you can do that in Stan now.
:
01:11:43,633 --> 01:11:48,393
So now in Stan, you can write these models
and do Bayesian models with latent
:
01:11:48,393 --> 01:11:49,613
variables and.
:
01:11:49,613 --> 01:11:54,773
I think there's some theoretical results
to be done to show how or see how these
:
01:11:54,773 --> 01:11:57,653
things reduce to other things in special
cases.
:
01:11:57,653 --> 01:12:02,773
It's a little related to my chickens paper
that I did a couple of years ago, which I
:
01:12:02,773 --> 01:12:04,233
really enjoyed.
:
01:12:04,233 --> 01:12:06,693
That's another story.
:
01:12:06,853 --> 01:12:11,833
The chicken story is not in the Active
Statistics book.
:
01:12:11,933 --> 01:12:15,973
I don't think it's like there's more
stories.
:
01:12:15,973 --> 01:12:19,349
There's room for another 52 stories, I'm
sure.
:
01:12:19,725 --> 01:12:21,285
in the future.
:
01:12:22,125 --> 01:12:23,525
Yeah, for sure.
:
01:12:23,905 --> 01:12:29,825
And the, yeah, we should link to your
chicken paper, actually, in the show
:
01:12:29,825 --> 01:12:29,855
notes.
:
01:12:29,855 --> 01:12:30,945
I like the chicken paper.
:
01:12:30,945 --> 01:12:33,525
It's not the world's most readable.
:
01:12:33,525 --> 01:12:35,585
I mean, it's technical, but I like it.
:
01:12:35,585 --> 01:12:36,425
It's Bayesian.
:
01:12:36,425 --> 01:12:37,685
It's good.
:
01:12:37,945 --> 01:12:38,845
Yeah.
:
01:12:38,985 --> 01:12:43,665
Is it, are you referencing the one from
::
01:12:44,745 --> 01:12:45,125
Or is that...
:
01:12:45,125 --> 01:12:46,795
Yeah, yeah.
:
01:12:46,795 --> 01:12:48,205
Slamming the sham.
:
01:12:48,205 --> 01:12:52,164
A Bayesian model for adaptive adjustment
with noisy control data.
:
01:12:52,164 --> 01:12:56,045
Yeah, it's published in Statistics in
Medicine, which is like a journal nobody
:
01:12:56,045 --> 01:12:57,125
reads.
:
01:12:57,125 --> 01:12:58,315
But what can you do?
:
01:12:58,315 --> 01:13:00,365
I guess nobody reads any journal anymore.
:
01:13:00,365 --> 01:13:02,305
So that's fine, perhaps.
:
01:13:02,305 --> 01:13:04,665
Nobody reads anything.
:
01:13:05,265 --> 01:13:06,535
Nobody reads anything.
:
01:13:06,535 --> 01:13:08,605
They're too busy reading stuff.
:
01:13:09,605 --> 01:13:14,285
Yeah, I mean, definitely that's why it's
very good that you come on the show.
:
01:13:14,285 --> 01:13:16,845
And also that you write these books.
:
01:13:17,133 --> 01:13:21,192
I think it's extremely important because
definitely the general public doesn't read
:
01:13:21,192 --> 01:13:22,233
papers.
:
01:13:22,493 --> 01:13:27,533
I know I do read papers, but it's mainly
because I have to for my job.
:
01:13:27,533 --> 01:13:34,353
I almost never read a paper for pleasure
because it's just like, yeah, the way it's
:
01:13:34,353 --> 01:13:39,153
written is just like so dry, you know, and
I really love a story, as you were saying.
:
01:13:39,153 --> 01:13:42,973
That's also why I really love your
writings in your books, in your blog,
:
01:13:42,973 --> 01:13:46,605
because it's always wrapped.
:
01:13:46,605 --> 01:13:51,835
in a story and in a context and the papers
are mainly just, okay, this is the result,
:
01:13:51,835 --> 01:13:56,205
this is what we're doing, but it's just
too dry to me and so I'm not reading
:
01:13:56,205 --> 01:14:00,285
that when I'm trying to just read for fun,
you know.
:
01:14:00,445 --> 01:14:04,985
But yeah, awesome, well thanks a lot
Andrew.
:
01:14:04,985 --> 01:14:11,745
I will, that being said, I will link to
this chicken paper in the show notes for
:
01:14:11,745 --> 01:14:13,665
people who want to dig deeper.
:
01:14:14,065 --> 01:14:16,013
Thank you so much Andrew for...
:
01:14:16,013 --> 01:14:20,093
again, taking the time and being on this
show.
:
01:14:20,693 --> 01:14:28,053
Two patrons will have the chance of
receiving for free a hard copy of your
:
01:14:28,053 --> 01:14:29,613
book, thanks to your publisher.
:
01:14:29,613 --> 01:14:33,213
So thank you so much, Cambridge University
Press.
:
01:14:33,693 --> 01:14:39,833
And in the show notes, you will have the
links also to buy the book on the
:
01:14:39,833 --> 01:14:42,633
Cambridge University Press website.
:
01:14:42,873 --> 01:14:43,085
So...
:
01:14:43,085 --> 01:14:44,625
Go ahead and do that.
:
01:14:44,625 --> 01:14:50,605
You have a 20% discount active until July
,::
01:14:51,385 --> 01:14:57,625
The code is in the show notes of this
episode, so definitely go there.
:
01:14:57,905 --> 01:14:59,145
And buy Andrew's book.
:
01:14:59,145 --> 01:15:03,725
This one is really fun and you can read it
on the beach this summer, you know, and
:
01:15:03,725 --> 01:15:09,025
then you'll have a lot of cool stories to
tell your children or at the bar at night,
:
01:15:09,025 --> 01:15:10,945
so definitely do that.
:
01:15:11,405 --> 01:15:16,245
Thanks again, Andrew, and of course,
welcome back on the show anytime you
:
01:15:16,245 --> 01:15:20,645
finish your 15 upcoming books.
:
01:15:21,565 --> 01:15:25,845
Merci encore pour l'opportunité de parler
avec toi. [Thanks again for the opportunity to talk with you.]
:
01:15:25,945 --> 01:15:30,865
Perfect, as you can hear, Andrew speaks
very good French.
:
01:15:34,989 --> 01:15:38,729
This has been another episode of Learning
Bayesian Statistics.
:
01:15:38,729 --> 01:15:43,689
Be sure to rate, review, and follow the
show on your favorite podcatcher, and
:
01:15:43,689 --> 01:15:48,609
visit learnbayesstats.com for more
resources about today's topics, as well as
:
01:15:48,609 --> 01:15:53,349
access to more episodes to help you reach
true Bayesian state of mind.
:
01:15:53,349 --> 01:15:55,259
That's learnbayesstats.com.
:
01:15:55,259 --> 01:16:00,119
Our theme music is Good Bayesian by Baba
Brinkman, feat. MC Lars and Mega Ran.
:
01:16:00,119 --> 01:16:03,279
Check out his awesome work at
bababrinkman.com.
:
01:16:03,279 --> 01:16:04,429
I'm your host,
:
01:16:04,429 --> 01:16:05,429
Alex Andorra.
:
01:16:05,429 --> 01:16:09,669
You can follow me on Twitter at alex
underscore andorra, like the country.
:
01:16:09,669 --> 01:16:14,749
You can support the show and unlock
exclusive benefits by visiting Patreon.com
:
01:16:14,749 --> 01:16:16,929
slash LearnBayesStats.
:
01:16:16,929 --> 01:16:19,389
Thank you so much for listening and for
your support.
:
01:16:19,389 --> 01:16:25,269
You're truly a good Bayesian change your
predictions after taking information and
:
01:16:25,269 --> 01:16:28,569
if you're thinking I'll be less than
amazing.
:
01:16:28,569 --> 01:16:31,725
Let's adjust those expectations.
:
01:16:31,725 --> 01:16:37,145
Let me show you how to be a good Bayesian
Change calculations after taking fresh
:
01:16:37,145 --> 01:16:43,185
data in Those predictions that your brain
is making Let's get them on a solid
:
01:16:43,185 --> 01:16:44,965
foundation