In this episode, Jonathan Templin, Professor of Psychological and Quantitative Foundations at the University of Iowa, shares insights into his journey in the world of psychometrics.
Jonathan’s research focuses on diagnostic classification models — psychometric models that seek to provide multiple reliable scores from educational and psychological assessments. He also studies Bayesian statistics, as applied in psychometrics, broadly. So, naturally, we discuss the significance of psychometrics in psychological sciences, and how Bayesian methods are helpful in this field.
We also talk about challenges in choosing appropriate prior distributions, best practices for model comparison, and how you can use the Multivariate Normal distribution to infer the correlations between the predictors of your linear regressions.
This is a deep-reaching conversation that concludes with the future of Bayesian statistics in psychological, educational, and social sciences — hope you’ll enjoy it!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca and Dante Gates.
Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)
Links from the show:
You have probably unknowingly already been exposed to this episode’s topic - psychometric testing - when taking a test at school or university. Our guest, Professor Jonathan Templin, tries to increase the meaningfulness of these tests by improving the underlying psychometric models, the Bayesian way of course!
Jonathan explains that it is not easy to judge the ability of a student based on exams since they have errors and are only a snapshot. Bayesian statistics helps by naturally propagating this uncertainty to the results.
In the field of psychometric testing, Marginal Maximum Likelihood is commonly used. This approach quickly becomes unfeasible though when trying to marginalise over multidimensional test scores. Luckily, Bayesian probabilistic sampling does not suffer from this.
A further reason to prefer Bayesian statistics is that it provides a lot of information in the posterior. Imagine taking a test that tells you what profession you should pursue at the end of high school. The field with the best fit is of course interesting, but the second best fit may be as well. The posterior distribution can provide this kind of information.
After becoming convinced that Bayes is the right choice for psychometrics, we also talk about practical challenges like choosing a prior for the covariance in a multivariate normal distribution, model selection procedures and more.
In the end we learn about a great Bayesian holiday destination, so make sure to listen till the end!
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.
In this episode, Jonathan Templin,
professor of Psychological and
Quantitative Foundations at the University
of Iowa, shares insight into his journey
in the world of psychometrics.
Jonathan's research focuses on diagnostic
classification models, psychometric models
that seek to provide multiple reliable
scores from educational and psychological assessments.
He also studies Bayesian statistics as
applied in psychometrics, broadly.
So naturally, we discussed the
significance of psychometrics in
psychological sciences and how Bayesian
methods are helpful in this field.
We also talked about challenges in
choosing appropriate prior distributions,
best practices for model comparison, and
how you can use the multivariate normal
distribution to infer the correlations
between the predictors of your linear regressions.
This is a deep-reaching conversation that
concludes with the future of Bayesian
statistics in Psychological, Educational,
and Social Sciences.
Hope you'll enjoy it.
This is Learning Bayesian Statistics.
Hello, my dear Bayesians!
This time, I have the pleasure to welcome
three new members to our Bayesian crew,
Bart Trudeau, Luis Fonseca, and Dante Gates.
Thank you so much for your support, folks.
It's the main way this podcast gets funded.
And Bart and Dante, get ready to receive
your exclusive merch in the coming month.
Send me a picture, of course.
Now let's talk psychometrics and modeling
with Jonathan Templin.
Jonathan Templin, welcome to Learning Bayesian Statistics.
Thank you for having me.
It's a pleasure to be here.
Yeah, thanks a lot.
Quite a few patrons have mentioned you in
the Slack of the show.
So I'm very honored to honor their request
and have you on the show.
And actually, thank you folks for bringing me all of those suggestions and allowing me to discover so many good Bayesians out there in the world doing awesome things in a lot of different fields, using our favorite tool: Bayesian statistics.
So Jonathan, before talking about all of
those good things, let's dive into your background.
How did you come to the world of
psychometrics and psychological sciences
and how sinuous of a path was it?
That's a good question.
So I was an odd student; I dropped out, so I started my college degree at a community college, which would be the only place that would take me.
I happened to be really lucky to do that
though, because I had some really great
professors and I took a, once I discovered
that I probably could do school, I took a
statistics course, you know, typical
undergraduate basic statistics.
I found that I loved it.
I decided that I wanted to do something
with statistics and then in the process, I
took a research methods class in
psychology and I decided somehow I wanted
to do statistics in psychology.
So moved on from community college, went
to my undergraduate for two years at
Sacramento State in Sacramento, California. I also was really lucky because I had a professor there that said, hey, there's this field called quantitative psychology. You should look into it.
If you're interested in statistics and
psychology. Around the same time, he was teaching me something called factor analysis. I now look at it as more principal components analysis, but I wanted to know what was happening underneath the hood of it. And so that's where he said, no, really, you should go to graduate school for this.
And so that's what started me.
I was fortunate enough to be able to go to
the University of Illinois for graduate school. I did a master's and a PhD there, and in the process, that's where I learned all about this field.
So it was a really lucky route, but it all
wouldn't have happened if I didn't go to
community college, so I'm really proud to
say I'm a community college graduate, if you will. So it kind of happened somewhat easily, in a way, right? A good meeting at the right time, and boom.
And the call of the eigenvalue is what
really sent me to graduate school.
So I wanted to figure out what that was
Yes, that is a good point.
And so nowadays,
What are you doing?
How would you define the work you're doing
and what are the topics that you are
particularly interested in?
I would put my work into the field of item
response theory, largely.
I do a lot of multidimensional item response theory. There are derivative fields I think I'm
probably most known for, one of which is
something called cognitive diagnosis or
diagnostic classification modeling.
Basically, it's a classification based
method to try to...
Classify students, or I work in the
College of Education, so most of this is
applied to educational data from
assessments, and our goal is to, whenever
you take a test, not just give you one
score, give you multiple valid scores, try
to maximize the information we can give you.
My particular focus these days is in doing
so in classroom-based assessments, so how
do we understand what a student knows at a
given point in the academic year and try
to help make sure that they make the most
progress they can; not to remove the impact of the teacher, but actually to provide the teacher with the best data to work with the child, to work with the parents, to try to move forward.
But all that boils down to interesting
measurements, psychometric issues, and
interesting ways that we look at test data
that come out of classrooms.
Yeah, that sounds fascinating.
Basically trying to give a distribution of
results instead of just one point estimate.
That's it, and also tests have a lot of error. So making sure that we don't over-deliver
when we have a test score.
Basically understanding what that is and accurately quantifying how much measurement error or lack of reliability there is in the score itself.
Yeah, that's fascinating.
I mean, we can already dive into that.
I have a lot of questions for you, but it
sounds very interesting.
So what does that look like concretely, these measurement errors and the test scores attached to them, and basically how do you try to solve that?
Maybe you can take an example from your
work where you are trying to do that.
Let me start with the classical example.
If this is too much information, I
But to set the stage: for a long time in item response theory, we understand that a latent ability estimate, if you want to call it that, is applied in education. So this latent variable that represents what a person knows, it's put onto the continuum where the items are. So basically, items and people are sort of on the same scale.
However, the properties of the model are such that how much error there might be in a person's point estimate of their score depends on where the score is located on that continuum. So this is what, you know, gave rise to our modern computerized adaptive assessments and so forth, that sort of pick an item that would minimize the error, if you will; there are different ways of describing what we pick an item for. But that's basically the idea.
And so, from the perspective of where I'm at with what I do, a complicating factor in this is that the architecture I just described, the historic version of adaptive assessments, has really been built on large-scale assessments. So, thousands of students, and really what happens in a classical sense is you would take a marginal maximum likelihood estimate of certain parameter values, the item parameters, from the data. You'd fix those values as if you knew them with certainty, and then you would go and estimate a person's parameter value along with their standard error, a conditional standard error of measurement.
The situations I work in don't have large sample sizes, so in addition to a problem with sort of the asymptotic convergence, if you will, of those models, we also have multiple scores effectively, multiple latent traits, which we can't possibly marginalize over.
So when you look at the same problem from a Bayesian lens, sort of an interesting feature happens that we don't often see in a frequentist or classical framework: that process of fixing the parameters of the model, the item parameters, to a value, you know, disregards any error in those estimates as well.
Whereas if you're doing a simultaneous estimation, for instance in a Markov chain where you're sampling these values from a posterior in addition to sampling students, it turns out that the error around those parameters can propagate to the students and provide a wider interval around them, which I think is a bit more accurate, particularly in smaller samples. So I hope that's the answer to your question. I may have taken a path that might have been a little different there, but that's where I see the value, at least, in using Bayesian statistics in what I do.
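For listeners who want to see what that joint estimation looks like in practice, here is a minimal sketch, my illustration rather than code from the episode, of a Bayesian two-parameter logistic (2PL) IRT model in PyMC; the simulated data, priors, and variable names are all assumptions made just for this example. Because item discriminations, difficulties, and person abilities are sampled together, the uncertainty in the item parameters propagates into the posterior for each person's score.

import numpy as np
import pymc as pm

# Fake binary response matrix: 150 persons x 10 items (illustrative only).
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=(150, 10))

with pm.Model() as irt_2pl:
    theta = pm.Normal("theta", 0.0, 1.0, shape=150)   # person abilities
    a = pm.LogNormal("a", 0.0, 0.5, shape=10)          # discriminations, kept positive
    b = pm.Normal("b", 0.0, 1.0, shape=10)             # difficulties
    # 2PL item response function: P(correct) = logistic(a * (theta - b))
    p = pm.math.sigmoid(a * (theta[:, None] - b))
    pm.Bernoulli("obs", p=p, observed=y)
    idata = pm.sample()  # theta's posterior reflects uncertainty in a and b too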
Yeah, no, I love it.
Don't shy away from technical explanation
on these podcasts.
That's the good thing of the podcast.
Don't have to shy away from it.
It came at a good time.
I've been working on this, some problems
like this all day, so I'm probably in the
weeds a little bit.
Forgive me if I go at the deep end of it.
No, that's great.
And we already mentioned item response
theory on the show.
So hopefully people will refer back to
these episodes and that will give them a
Well, actually you mentioned it, but do
you remember how you first got introduced
to Bayesian methods and why did they stick with you?
Very, very much.
I was introduced because in graduate
school, I had the opportunity to work for
a lab run by Bill Stout at the University
of Illinois with other very notable people
in my career, at least Jeff Douglas, Louis
Roussos, among others.
And I was hired as a graduate research assistant. And my job was to take a program that was a Metropolis-Hastings algorithm and to make it run. And it was written in Fortran.
So basically, I
It was Metropolis-Hastings, Bayesian, and it was written in a language that I didn't know, with methods I didn't know.
And so I was hired and said, yeah, figure
it out with good luck.
Thankfully, I had colleagues that could
help actually probably figure it out more
than I did.
But I was very fortunate to be there
because it's like a trial by fire.
I was basically going line by line through the code. This was a little bit in the later part of, I think it was the year 2001, early 2002.
But something instrumental to me at the time were a couple of papers by a couple of scholars in education at least: Rich Patz and Brian Junker had a paper, actually two papers, in 1999, in the Journal of Educational and Behavioral Statistics. It's like I have that memorized.
But in their algorithm, they had written
down the algorithm itself and it was a
matter of translating that to the
diagnostic models that we were working on.
But that is why it stuck with me because
it was my job, but then it was also
It was not like a lot of the research that
I was reading and not like a lot of the
work I was doing in a lot of the classes I was taking. So I found it really mentally stimulating. It took the whole of my brain to figure it out. And even then, I don't know that I figured it all out. So that helps answer that question.
So basically it sounds like you were thrown into the Bayesian pool.
Like you didn't have any choice.
And once I was Bayesian, it was nice because at the time, you know, this is the early 2000s, in education, in measurement, it wasn't common. You know, we knew of Bayes certainly, you know, there are some great papers from the nineties that were around, but it wasn't prominent.
It wasn't, you know, I was in graduate
school, but at the same time I wasn't
learning it, I mean, I knew the textbook
Bayes, like the introductory Bayes, but
not, definitely not.
Like the estimation side.
And so, timing-wise, you know, people would look back now and say, okay, why didn't I go grab Stan or grab, well, at the time, JAGS didn't exist, there was BUGS. And it was basically, you have to, you know, roll your own to do anything. So it was, it was good.
No, for sure.
Like, yeah, no, it's like asking Christopher Columbus why he didn't just take the plane. It's a lot more direct, just hop on the plane and... It wasn't an option.
But actually, nowadays, what are you using? Are you still writing your own samplers like that in Fortran, or are you using some open-source software?
I can hopefully say I retired from Fortran
as much as possible.
Most of what I do is in Stan these days, a little bit of JAGS, but then occasionally trying to write my own here or there. The latter part I'd love to do more of, because you can get highly specialized. I just lack, I feel like, the time to really deeply do the development work in a way that doesn't just end up as an R package or some package in Python that would just break all the time.
So I'm sort of stuck right now with that,
but it is something that I'm grateful for
having the contributions of others to be
able to rely upon to do estimation.
Yeah, no, exactly.
So first, Stan, I've heard it's quite good. Of course, it's amazing.
A lot of Stan developers have been on this
show, and they do absolutely tremendous
And yeah, as you were saying, why code
your own sampler when you can rely on
samplers that are actually waterproof,
that are developed by a bunch of very
smart people who do a lot of math.
and who do all the heavy lifting for you,
well, just do that.
And thanks to that, Bayesian computing and
statistics are much more accessible
because you don't have to actually know
how to code your own MCMC sampler to do
You can stand on the shoulders of giants
and just use that and superpower your own
So it's definitely something we tell
people, don't code your own samplers now.
You don't need to do that unless you
really, really have to do it.
But usually, when you have to do that, you
know what you're doing.
Otherwise, people have figured that out
Just use the automatic samplers from Stan
or PyMC or NumPyro or whatever you're using.
It's usually extremely robust and checked
by a lot of different pairs of eyes and
having that team and like you said, full
of people who are experts in not only just
mathematics, but also computer science
makes a big difference.
I mean, I would not be able to use Bayesian statistics nowadays if these samplers didn't exist, right? Because I'm not a mathematician. So if I had to write my own sampler each time, I would just be discouraged even before starting. It's just a challenge in and of itself.
I remember the old days where that would be it: that's my dissertation, that was what I had to do. So it was like six months' work on just the sampler. And even then it wasn't very good. And then you might actually do the research.
I mean, to me, really, probabilistic programming is one of the superpowers of the Bayesian community, because that really allows almost anybody who can code in R or Python or Julia to just use what's being done by very competent and smart people, and for free.
What a great community.
I'm really, really impressed with the size
and the scope and how things have
progressed in just 20 years.
It's really something.
And so actually...
Do you know why, well, do you have an idea why Bayesian statistics is useful in your field? What does it bring that you don't get with the classical framework?
Yeah, in particular, we have a really tough estimation problem. If we were to do a classical framework, typically the gold standard in the field I work in is sort of a marginal maximum likelihood. Marginal meaning we get rid of the latent variable to estimate the models. So that process of marginalization is done numerically: we numerically integrate across the likelihood in most cases; there are some special-case models where we don't have to, but they really are too simplistic to use for what we do. So if we want to do multidimensional models, think about numeric integration: for one dimension you have this sort of discretized set of the likelihood, and you take sums across different, what we call quadrature points, of some type of curve. In the multidimensional sense now, going from one to two, you've effectively squared the number of points you have, and that's just two latent variables.
So if you want two bits of information
from an assessment from somebody, now
you've just made your
marginalization process exponentially more
difficult, more time-consuming.
But really, having two scores is very little compared to having more. So if we wanted to do five or six or 300 scores, that marginalization process becomes really difficult.
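To give a quick sense of that blow-up, here is a tiny back-of-the-envelope calculation, my own illustration with an assumed grid of 21 quadrature points per dimension, not figures quoted in the episode:

points_per_dim = 21
for dims in [1, 2, 3, 5, 10]:
    print(dims, "dimensions:", points_per_dim ** dims, "grid points per person")
# 1 -> 21, 2 -> 441, 3 -> 9,261, 5 -> about 4.1 million, 10 -> about 1.7e13

A sampler, by contrast, only adds parameters as dimensions grow, which is the linear increase in computation described just below.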
So from a brute-force perspective, if we take a Bayesian sampler perspective, there is not the exponential increase of computation; it's a linear increase in the number of dimensions. And so the number of steps the process has to take for the calculation is much smaller. Now, of course, Markov chains have a lot of iterations. So, you know, maybe overall the process is longer, but I found Bayesian statistics to be a necessity to estimate these models. Bayes in some form shows up in this multidimensional likelihood work: people have basically created sort of hybrid versions of EM algorithms where the E-step is replaced with a Bayesian-type method.
But for me, I like the full Bayesian
approach to everything.
So I would say, just in summary though, what Bayes brings from a brute-force perspective is the ability to estimate our models in a reasonable amount of time with a reasonable amount of resources. There's the added benefit of what I mentioned previously, which is the small-sample-size case: I think, a proper accounting, or allowing, of error to propagate in the right way if you're going to report scores and so forth. I think that's an added benefit.
But from a primary perspective, I'm here
because I have a really tough integral to
solve and Bayes helps me get around it.
Yeah, that's a good point.
And yeah, like as you were saying, I'm
guessing that having priors
And generative modeling helps for low
sample sizes, which tends to be the case a
lot in your field.
The prior distributions can help. A lot of the frustration with multidimensional models in psychometrics, at least in a practical sense: you get a set of data, you think it's reasonable, and the next step is to estimate a model. In the classic sense, those models sometimes would fail to converge, with very little reason why; oftentimes it just failed to converge.
I had a class I taught four or five years ago where I just asked people to estimate five dimensions; I had a set of data for each person, and not a single person could get it to converge with the default options that you'd see in, like, an IRT package. Um, so having the ability to sort of understand potentially where the non-convergence is or why that's happening, which parameters are having a difficult time, then using priors to sort of aid the estimation, that's one part; but then there's also sort of the idea of Bayesian updating.
If you're trying to understand what a
student knows throughout the year,
Bayesian updating is perfect for such settings.
You know, you can assess a student in
November and update their results that you
have potentially from previous parts in
the year as well, too.
So there's a lot of benefits.
I guess I could keep going.
I'm talking to a Bayes podcast, so probably you already know most of it.
I mean, a lot of people are also listening to understand what Bayes is all about and how that could help them in their own field. So that's definitely useful: if we have some psychometricians in the audience who haven't yet tried some Bayes, well, I'm guessing that would be useful for them.
And actually, could you share an example, if you have one, of a research project where Bayesian stats played a crucial role, ideally in uncovering insights that might have been missed otherwise, especially using traditional methods?
Yeah, I mean, just honestly, a lot of what
we do just estimating the model itself, it
sounds like it should be trivial.
But to do so with a full information
likelihood function is so difficult.
I would say almost every single analysis I've done using a multidimensional model has been made possible because of the Bayesian analyses themselves.
Again, there are shortcut methods you
would call that.
I think there are good methods, but again,
there are people, like I mentioned, doing sort of a hybrid marginal maximum likelihood. There are what we would call limited-information approaches that you might see in programs like Mplus, or there's an R package named lavaan, that do such things. But those only use functions of the data, not the full data themselves. I mean, it's still good, but it's sort of, I have this sense that the full likelihood is what we should be using.
So to me, just a simple example: I was working this morning with a four-dimensional assessment, you know, a 20-item test, kids in schools.
And you know, I would have a difficult
time trying to estimate that with a full
maximum likelihood method.
And so Bayes made that possible.
But beyond that, if we ever want to do
something with the test scores afterwards,
So now we have a bunch of Markov chains of
people's scores themselves.
This makes it easy to be able to then not
forget that these scores are not measured
And take a posterior distribution and use
that in a secondary analysis as well, too.
So I was doing some work with one of the Persian Gulf states, where they were trying out something like a vocational interest survey. And some of the classical methods for this sort of disregarded any error in the scores. And they basically said, oh, you're interested in, I don't know, artistic work or, you know, numeric work of some sort. And they would just tell you, oh, that's it. That's your story.
Like, I don't know if you've ever taken
one of those.
What are you gonna do in a career?
You're in a high school student and you're
trying to figure this out.
But if you propagate, if you allow that
error to sort of propagate,
through the way Bayesian methods make it
very easy to do, you'll see that while
that may be the most likely choice of what
you're interested in or what your sort of
dimensions that may be most salient to you
in your interests, there are many other
choices that may even be close to that as
And that would be informative as well too.
So we sort of forget, we sort of overstate how certain we are in results, and I think a lot of the Bayesian methods keep that uncertainty built in. That was actually one project where I did write my own algorithm for it, to try to estimate these things, because it was just a little more streamlined.
But it seemed that, rather than telling a high school student, hey, you're best at artistic things,
What we could say is, hey, yeah, you may
be best at artistic, but really close to
that is something that's numeric, you
know, like something along those lines.
So while you're strong at art.
You're really strong at math too.
Maybe you should consider one of these two
rather than just go down a path that may
or may not really reflect your interests.
Hope that's a good example.
And I understand how that would be useful
And how does, I'm curious about the role
of priors in all that, because that's
often something that puzzles beginners.
And so you obviously have a lot of
experience in the Bayesian way of life in
So I'm curious, I'm guessing that you kind
of teach the way to do psychometric
analysis in the Bayesian framework to a
lot of people.
And I'm curious, especially on the prior
side, and if there are other interesting
things that you would like to share on
that, feel free.
My question is on the priors.
How do you approach the challenge of
choosing appropriate prior distributions,
especially when you're dealing with complex models? And I'm sure each field does it a little differently.
I mean, as it probably should, because
each field has its own data and models and
already established scientific knowledge.
So that's my way of saying.
This is my approach.
I'm 100% confident that it's the approach
that everybody should take.
But let me back it up a little bit.
So generally speaking, I teach a lot of students who are going into, um, many of our students end up in the industry for educational measurement here in the United States. Um, we usually denote our score parameters with theta. I like to go around saying that, yeah, I'm teaching you how to sell thetas.
That's sort of what they do, you know, in
a lot of these industry settings, they're
selling test scores.
So if you think that that's what you're
trying to do, I think that guides to me a
set of prior choices that try to do the
least amount of speculation.
So what do I mean by that?
So if you look at a measurement model,
like an item response model, you know,
there's a set of parameters to it.
One parameter in particular, in item response theory we call it the discrimination parameter; in factor analysis, we call it a factor loading; and in linear regression, it would be a slope. This parameter tends to govern the extent to which an item relates to the latent variable. So the higher that parameter is, the more that item relates.
Then, when we go and apply Bayes' theorem to get a point estimate of a person's score, or a posterior distribution of that person's score, the contribution of that item is largely reflected by the magnitude of that parameter. The higher that parameter is, the more weight that item has on that distribution, and the more we think we know about a person.
So, in doing that, when I look at setting prior choices, what I try to do for that parameter is to set a prior that would be toward zero, mainly, actually at zero mostly, and to set it so that our data do more of the job than our prior, particularly if this score has a big, uh, meaning to somebody. Think of, um, well, in the United States the assessment culture is a little bit out of control, but, you know, we have to take tests to go to college. We have to take tests to go to graduate school and so forth.
Uh, then of course, if you go and work in
certain industries, there's assessments to
do licensure, right?
So, you know, for instance, I come from a family of nurses, uh, it's a very noble profession, but to be licensed as a nurse in California, you have to pass an exam. We want to make sure, when we provide the score for that exam, that the score reflects as much of the data as possible and not a prior choice. And so there are ways that, you know, people can sort of use priors that are not necessarily to the benefit of empirical science; you can sort of put too much subjective weight onto it.
So when I talk about priors, I try to talk about the ramifications of the choice of prior on certain parameters. For that discrimination parameter, or slope, I tend to want the data to force it to be further away from zero, because then I'm being more conservative, I feel like.
The rest of the parameters, I tend to not
use heavy priors on what I do.
I tend to use some very uninformative
priors unless I have to.
And then the most complicated prior for
what we do, and the one that's caused
historically the biggest challenge,
although it's, I think, relatively in good
place these days thanks to research and
science, is the prior that goes on a
covariance or correlation matrix.
That had been incredibly difficult to try
to estimate back in the day.
But now things are much, much easier in
modern computing, in modern ways of
looking, modern priors actually.
Would you like to walk us a bit through that?
What are you using these days on priors on
correlation or covariance matrices?
Because, yeah, I do teach those also
I love it.
Basically, if you're using, for instance,
a linear regression and want to estimate
not only the correlation of the
parameters, the predictors on the outcome,
but also the correlation between the
predictors themselves and then using that
additional information to make even better
prediction on the outcome, you would, for
instance, use a multivariate normal on the
parameters on your slopes.
of your linear regression, for instance,
what priors do you use on that? What does the multivariate normal need? A multivariate normal needs a covariance matrix. So what priors do you use on the covariance matrix? So that's basically the context for people. Now, Jonathan, basically try and take it from there: what are you using in your field these days?
Yeah, so going with your example, I have
You know, like, if you have a set of
regression coefficients that you say are
multivariate normal, yes, there is a place
for a covariance in the prior.
I never try to speculate what that is.
I don't think I have, like, the human
judgment that it takes to figure out what
the, like, the belief, your prior belief
is for that.
I think you're talking about what would be analogous to sort of, like, the asymptotic covariance matrix. The posterior distribution of these parameters, where you look at the covariance between them, is like the asymptotic covariance matrix in ML, and we just rarely ever speculate off of the diagonal, it seems like, on that.
I mean, there are certainly uses for
linear combinations and whatnot, but
I'm more thinking about, like, when I have a handful of latent variables to estimate; now the problem is I need a covariance matrix between them, and they're likely to be highly correlated. In our field, we tend to see correlations of psychological variables that are 0.7 or higher. These are all academic skills, in my field, that are coming from the same brain. The child has a lot of reasons why those are going to be highly correlated.
And so these days, I love the LKJ prior. It makes it easy to put a prior on a covariance matrix and then, if you want, on a correlation matrix. That's one of the other weird features of the psychometric world: because these variables don't exist, to estimate a covariance matrix we'd have to make certain constraints on some of the item parameters, the measurement model, for identification. If we want a variance of the factor, we have to set one of the discrimination parameters to a value to be able to estimate it. Otherwise, it's not identified.
In a lot of the work that we talk about for calibration, when we're trying to build scores or build assessments and the data for them, we fix that value of the variance of a factor to one. We standardize the factor: mean zero, variance one, very simple idea. The models are equivalent in a classic sense, in that the likelihoods are equivalent whether we do it one way or the other. When we put priors on, the posteriors aren't entirely equivalent, but that's a matter of a typical Bayesian issue with parameterization.
In the case where we want a correlation matrix, prior to the LKJ prior there were all these, what one of my mentors, Rod McDonald, called devices: little hacks or tricks that we would do to sort of keep the covariance matrix valid and sample it, right? I mean, think about it statistically: to sample it, a lot of it was rejection sampling. So if you were to basically propose a covariance or correlation matrix, it has to be positive semi-definite, and that's a hard constraint. You have to make sure that the correlations are bounded and so forth.
But LKJ takes care of almost all of that
for me in a way that allows me to just
model the straight correlation matrix,
which has really made life a lot easier
when it comes to estimation.
Yeah, I mean, I'm not surprised it does. I mean, that is also the kind of prior I tend to use personally and that I teach. In this example of the linear regression, for instance, that's where I'd probably end up using the LKJ prior: on the predictors, on the slopes of the linear regression.
And for people who don't know, who have never used the LKJ prior: LKJ is a prior on the correlation part of the covariance matrix. That way, we can basically sample it; otherwise, it's extremely hard to sample a covariance matrix directly. And the LKJ Cholesky parameterization is basically an algebraic trick that makes use of the Cholesky decomposition of the covariance matrix and allows us to sample the Cholesky factor instead of the full covariance matrix, which helps the sampling.
Thank you for putting that out there.
I'm glad you put that on.
Yeah, so yeah.
And basically, the way you would parameterize that, for instance in PyMC, is with pm.LKJCholeskyCov, and basically you would have to parameterize that with at least three things. First, the number of dimensions: so for instance, if you have three predictors, that would be n equals 3.
The standard deviation that you are
expecting on the predictors on the slopes
of the linear regression, so that's
something you're used to, right?
If you're using a normal prior on the
slope, then the sigma of the slope is just
standard deviation that you're expecting
on that effect for your data and model.
And then you have to specify a prior on
the correlation of these slopes.
And that's where you get into the LKJ distribution itself. And so basically, you can specify a prior parameter, called eta in PyMC, on the LKJ prior. The bigger eta is, the more suspicious of high correlations your prior will be. So if eta equals 1, you're basically expecting a uniform distribution of correlations: that could be minus 1, that could be 1, that could be 0; all of those have the same weight. And then if you go to eta equals 8, for instance, you would put much more prior weight on correlations close to zero: most of the mass would be between minus 0.5 and 0.5, and the prior would be very suspicious of very big correlations, which I guess would make a lot of sense in, for instance, social science. I don't know in your field, but yeah.
I typically use the uniform, the one
setting, at least to start with, but yeah,
I think that's a great description.
Very good description.
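To make the setup we just described concrete, here is a minimal sketch of a linear regression with correlated slopes in PyMC, using pm.LKJCholeskyCov; the data are simulated, and every name and prior value here is an assumption for illustration, not something discussed in the episode.

import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                                      # three predictors
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=1.0, size=200)

with pm.Model() as correlated_slopes:
    # n = number of dimensions, eta = LKJ concentration (1 means uniform over
    # correlation matrices), sd_dist = prior on the slope standard deviations.
    chol, corr, stds = pm.LKJCholeskyCov(
        "slope_cov", n=3, eta=2.0,
        sd_dist=pm.Exponential.dist(1.0), compute_corr=True,
    )
    slopes = pm.MvNormal("slopes", mu=np.zeros(3), chol=chol)
    intercept = pm.Normal("intercept", 0.0, 1.0)
    sigma = pm.Exponential("sigma", 1.0)
    pm.Normal("y_obs", mu=intercept + X @ slopes, sigma=sigma, observed=y)
    idata = pm.sample()

The corr variable returned with compute_corr=True gives you the posterior of the correlation matrix between the slopes directly, which is the extra information the multivariate normal prior buys you.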
Yeah, I really love these kinds of models
because they make linear regression even
To me, linear regression is so powerful
and very underrated.
You can go so far with plain linear regression, and often it's hard to really beat it.
You have to work a lot to do better than a
really good linear regression.
I completely agree with you.
Yeah, I'm 100% right there.
And actually then you get into sort of
quadratic or the nonlinear forms in linear
regression that map onto it that make it
even more powerful.
So yeah, it's absolutely wonderful.
And I mean, as Spider-Man's uncle said, with great power comes great responsibility.
So you have to be very careful about the priors when you have all those features, like inverse link functions, because they distort the parameter space; but same thing, well, if you're using a multivariate normal, I mean, that's more complex. So of course you have to think a bit more about your model structure, about your priors. And also, the more structure you add, if the size of the data is kept equal, well, that means you have more risk of overfitting and you have less informative power per data point, let's say.
That means the priors increase in importance, so you have to think about them more. But you get a much more powerful model afterwards, and the goal is to get much more powerful predictions afterwards.
I do agree.
These weapons are hard to wield.
They require time and effort.
And on my end, I don't know for you.
Jonathan, but on my end, they also require
a lot of caffeine from time to time.
I mean, so that's the key.
You see how I did the segue.
I should have a podcast.
So, it's the first time I do that on the podcast, but I had to.
So I'm a big coffee drinker.
I love coffee.
I'm a big coffee nerd.
But from time to time, I try to decrease
my caffeine usage, you know, also because
you have some habituation effects.
So if I want to keep the caffeine shot
effect, well, I have to sometimes do a
decrease of my usage.
And funnily enough, when I was thinking
about that, a small company called Magic
Mind, they came to me...
They sent me an email and they listened to
the show and they were like, hey, you've
got a cool show.
I would be happy to send you some bottles for you to try and to talk about it on the show.
And I thought that was fun.
So I got some Magic Mind myself. I drank it, but I'm not going to bias Jonathan, because I got Magic Mind to send some samples to him too.
And if you are watching the YouTube video,
Jonathan is going to try the Magic Mind
right now, live.
So yeah, take it away, Jon.
Yeah, this is interesting because you
reached out to me for the podcast and I
had not met you, but you know, it's a
conversation, it's a podcast, you have to
do great work.
Yes, I'll say yes to that.
Then you said, how would you like to try
the Magic Mind?
And I thought...
being a psych major as an undergraduate,
this is an interesting social psychology
experiment where a random person from the internet says, hey, I'll send you something. So I thought there's a little bit of safety in drinking it in front of you while we're talking on the podcast.
But of course, I know you can cut this out
if I hit the floor, but here it comes.
So you're drinking it like, sure.
Yeah, I decided to drink it like a shot,
if you will.
It actually tasted much better than I expected. It came in a little bottle, and it's green. It tasted tangy, so, very good.
And now the question will be, if I get
better at my answers to your questions by
the end of the podcast, therefore we have
now a nice experiment.
But no, I noticed it has a bit of caffeine, certainly less than a cup of coffee. But at the same time, it doesn't seem like too much. Yeah, that's pretty good.
Yeah, I mean, I'm still drinking caffeine,
if that's all right.
But yeah, from time to time, I like to
My habituation, my answer to that is just
Oh yeah, and decaf and stuff like that.
But yeah, I love the idea of the product
I liked it.
So I was like, yeah, I'm going to give it a try. And so the way I drank it was also basically making myself a latte: instead of coffee, I would use the Magic Mind, and then I would put in my milk and the milk foam.
And that is really good.
I have to say.
See how that works.
So it's based on, I mean, the thing you taste most is the matcha, I think, and that's what gives it the green color. And usually I'm not a big fan of matcha, but I have to say, I really appreciated that one.
You and me both, I was feeling the same
When I saw it come in the mail, I was
like, ooh, that added to my skepticism,
I'm trying to be a good scientist.
I'm trying to be like, yeah.
But yeah, it was actually surprisingly,
tasted more like a juice, like a citrus
juice than it was matcha.
So it was much nicer than I expected.
Yeah, I love that because me too, I'm
obviously extremely skeptical about all
I like doing that.
It's way better, way more fun to do it
with you or any other nerd from the
community than doing it with normal people
from the street because I'm way too
skeptical for them.
They wouldn't even understand my
I felt like in a scientific community,
I've seen some of the people you've had on
the podcast, we're all a little bit
skeptical about what we do.
I could bring that skepticism here and I'd
feel like at home, hopefully.
I'm glad that you allowed me to do that.
And that's the way of life.
Thanks for trusting me because I agree
that seeing from a third party observer,
you'd be like, that sounds like a scam.
That guy is just inviting me on to sell me something.
In a week, he's going to send me an email
to tell me he's got some financial
troubles and I have to wire him $10,000.
I'm waiting for that. Or, what level of paranoia do I have this morning? I was like, well, who are my enemies and who really wants to do something bad to me? So, I don't believe I'm at that level, so I don't think I have anything to worry about.
It seems like a reputable company.
So it was, it was amazing.
No, that was good.
Thanks a lot, Magic Mind, for sending me
those samples, that was really fun.
Feel free to give it a try, other people
if you want, if that sounded like
something you'd be interested in.
And if you have any other product to send
me, send them to me, I mean, that sounds
I mean, I'm not gonna say yes to
everything, you know, I have standards on
the show, and especially scientific
But you can always send me something.
And I will always analyze it.
You know, somehow you can work out an
agreement with the World Cup, right?
Some World Cup tickets for the next time.
That would be nice.
Well, what we did is actually kind of related, I think, I would say, to another aspect of your work, and that is model comparison.
So, and it's again, a topic that's asked a
lot by students.
Especially when they come from the
classical machine learning framework where
model comparison is just everywhere.
So often they ask how they can do that in
the Bayesian framework.
Again, as usual, I am always skeptical about just doing model comparison and just picking your model based on some one metric. I always say there is no magic bullet, you know, in the Bayesian framework, where it's just, okay, the model comparison says that, so for sure that's the best model.
I wouldn't say that's how it works.
And you would need a collection of
different indicators, including, for
instance, the LOO, the LOO factor, that
tells you, yeah, that model is better.
But not only that, what about the
What about the model structure?
What about the priors?
What about just the generative story about your data?
But talking about model comparison, what can you tell us, Jonathan, about some best practices for carrying out effective model comparisons?
I'm not sure what the best practice is. I'll just give you what my practice is; I will make no claim that it's best.
I think you hit on all the aspects of it
in introducing the topic.
If you have a set of models that you're considering, the first thing I'd like to think about is not the comparison between them as much as how each model would fit a data set absolutely. Posterior predictive model checking, you know, in a Bayesian sense, is where really a lot of the work for me is focused.
Interestingly, what you choose to check against is a bit of a challenge, particularly, you know, in certain fields in psychometrics, at least the ones I'm in. I do see a lot of, well, first of all, model fit is a well-researched area in psychometrics in general. Really, there are millions of papers, well, not quite that many.
And then another, it's always been
something that people have studied.
I think recently there's been a resurgence
of new ideas in it as well.
So it's well-covered territory from the
It's less well-covered, at least in my
view, in Bayesian psychometrics.
So what I've tried to do with my work, to try to see if a model fits absolutely, is to look at, well, one of the complicating factors is that a lot of my data is discrete. So it's correct and incorrect scored items.
And in that sense, in the last 15, 20
years, there's been some good work in the
non-Bayesian world about how to use what
we call limited information methods to
assess model fit.
So instead of looking at model fit to the entire response pattern distribution: if you have a set of binary data, let's say 10 variables that you've observed, technically you have 1,024 different probabilities, the permutations of ways they could be zeros and ones. And model fit should be built toward that vector of 1,024 probabilities. Good luck with that, right? You're not going to collect enough data to do that.
What a group of scientists, Alberto Maydeu-Olivares, Li Cai, and others, have created are sort of model fit statistics for lower-level margins. So each marginal moment of the data, each mean effectively, and then, like, a two-way table between all pairs of observed variables.
In work that I've done with a couple of
students recently, we've tried to
replicate that idea, but more on a Bayesian footing. So could we come up with an M-like statistic, what's called an M2 in the non-Bayesian literature? Could we come up with a version of a posterior predictive check for what a model says the two-way tables should look like? And then, similar to that, could we create a model that we know saturates that? So for instance, if we have 10 observed variables, we could create a model that has all 10-choose-2 two-way tables estimated perfectly, what we would expect to see.
Now, of course, these are posterior distributions, but you would expect, with, you know, plenty of data and very diffuse priors, that you would get point estimates, EAP estimates, that should be right about where you observe the frequencies of the data.
So, um, the idea then is now we have two models, one of which we know should fit the data absolutely, and one of which we're wondering whether it fits. Now the comparison comes together: we have these two predictive distributions, so how do we compare them? And that's where, you know, we've taken different approaches.
One of those is just simply looking at the overlap. We tried to calculate, we use the Kolmogorov-Smirnov statistic, sort of to see how much the distributions overlap, because if your model's data overlap with what you think the data should look like, you think the model fits well. And if it doesn't, they should be far apart, and it won't fit well. That's how we've been trying to build it.
It's weird because it's a model comparison, but one of the comparing models we know to be what we call saturated: it should fit the data the best, and all the other models should be subsumed within it.
So that's the approach I've taken recently
with posterior predictive checks, but then
a model comparison.
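To make that pairwise check concrete, here is a small sketch, my own illustration rather than Jonathan's code, of comparing observed two-way proportions for binary items against posterior predictive replications; the simulated arrays simply stand in for real observed data and draws from a fitted model.

import itertools
import numpy as np

def pairwise_props(data):
    # data: (n_persons, n_items) array of 0/1 responses;
    # returns the proportion answering both items of each pair correctly.
    pairs = itertools.combinations(range(data.shape[1]), 2)
    return np.array([(data[:, i] * data[:, j]).mean() for i, j in pairs])

rng = np.random.default_rng(0)
y_obs = rng.integers(0, 2, size=(500, 10))        # stand-in for the observed data
y_rep = rng.integers(0, 2, size=(200, 500, 10))   # stand-in for 200 posterior predictive draws

obs = pairwise_props(y_obs)
rep = np.array([pairwise_props(draw) for draw in y_rep])

# Posterior predictive p-value per item pair: how often the replicated pairwise
# proportion is at least as large as the observed one.
# Values near 0 or 1 flag item pairs the model does not reproduce well.
ppp = (rep >= obs).mean(axis=0)
print(ppp.round(2))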
We could have used, as you mentioned, the
LOO factor or the LOO statistic.
And maybe that's something that we should
look into also.
We haven't yet, but one of my recent graduates, a new assistant professor at the University of Arkansas here in the United States, Jihong Zhang, had done a lot of work on this in his dissertation and other studies.
So that's sort of the approach I take.
The other thing I want to mention, though, is that when you're comparing amongst models, you have to establish that absolute fit first. So the way I envision this is you sort of compare your model to this sort of saturated baseline. You do that for multiple versions of your models and then effectively choose amongst the set of models you're comparing that sort of fit.
But what that absolute fit is, like you mentioned, is nearly impossible to pin down. There are a number of ideas that go into what makes for a good-fitting model.
And definitely, I encourage people to go take a look at the LOO paper. I will put a link in the show notes to that one.
And also, if you're using ArviZ, whether in Julia or Python, we do have an implementation of the LOO algorithm. So comparing your models is obviously extremely simple: it's just a call to compare, and then you can even do a plot of the comparison. And yeah, as you were saying, the LOO score doesn't have any meaning by itself; the LOO score of a model doesn't mean anything on its own. It's in comparison to other models.
So yeah, basically having a baseline model
that you think is already good enough.
And then all the other models have to
compare to that one, which basically could
be like the placebo, if you want, or the
already existing solution that there is
And then any model that's more complicated
than that should be in competition with
that one and should have a reason to be
used, because otherwise, why are you using a more complicated model if you could just use a simple linear regression? That's what I use most of the time for my baseline model: just a simple linear regression. Then do all the fancy
modeling you want and compare that to the
linear regression, both in predictions and
with the Loo algorithm.
And well, if there is a good reason to
make your life more difficult, then use
But otherwise, why would you?
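If you want to see what that LOO-based comparison looks like in ArviZ, here is a minimal, runnable sketch; it uses two example posteriors that ship with ArviZ purely as stand-ins for your own baseline and more complex models, so the model names are just placeholders.

import arviz as az

idata_baseline = az.load_arviz_data("centered_eight")    # stand-in for a simple baseline model
idata_fancy = az.load_arviz_data("non_centered_eight")   # stand-in for a more complex model

comparison = az.compare({"baseline": idata_baseline, "fancier": idata_fancy}, ic="loo")
print(comparison)            # elpd_loo, differences, standard errors, model weights
az.plot_compare(comparison)  # visual summary of the ranking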
And yeah, actually, talking about these complexities, something I see is also that many, many people, many practitioners, might be hesitant to adopt Bayesian methods due to the fact that they perceive them as complex. So I'm wondering, yourself, what resources or strategies would you recommend to those who want to learn and apply Bayesian techniques in their research? And especially in your field of psychometrics.
I think, um, starting with an
understanding of sort of just the output,
you know, the basics of if you're, if you
have data and if your responsibility is
providing analysis for it, uh, finding
either a package or somebody else's
program that makes the coding quick.
So, like you've mentioned linear regression: if you use brms in R, you know, which will translate that into Stan, you can quickly go about getting a Bayesian result fast.
And I found that to me, the conceptual
consideration of what a posterior
distribution is actually is less complex
than we think about when we think about
all the things that we're drilled into in
the classical methods, like, you know,
what, where does the standard error come
from and all this other, you know,
asymptotic features in Bayes it's, it's
visible, like you can see a posterior
distribution, you can plot it, you can,
you know, touch it, almost like touch it
and feel it, right?
It's right there in front of you.
So for me, I think the thing I try to get
people to first is just to understand what
the outputs are.
Sort of what are the key parts of it.
And then, you know, hopefully that gives
that mental representation of where that,
where they're moving toward.
And then at that point, start to add in
all the complexities.
Um, but it is, I think it's, it's
incredibly challenging to try to, to teach
Bayesian methods and I actually think the
further along a person goes, not learning
the Bayesian version of things.
Makes it even harder because now you have
all this well-established, um, can we say
routines or statistics that you're used to
seeing that are not Bayesian, uh, that may
or may not have a direct, um, analog in
the Bayes world.
Um, but that may not be a bad thing.
So, um, thinking about it, actually, I'm
going to take a step back here.
Conceptually, I think this is the challenge, um, we face in a program like the one I'm in right here.
I'm working right now.
I work with, um, nine other tenured or tenure-track faculty, which is a very large program.
And we have a long-running curriculum, but
sort of the question I like to ask is,
what do we do with Bayes?
Do we have a parallel track in Bayes?
Do we do Bayes in every class?
Because that's a heavy lift for a lot of
people as well.
Right now, it's, I teach the Bayes
classes, and occasionally some of my
colleagues will put Bayesian statistics in
their classes, but it's tough.
I think if I were to, you know, anoint myself king of how we do all the curriculum,
I don't know the answer I'd come to.
I go back and forth each way.
So, um, I would love to see what a curriculum looks like where they only started with Bayes and only kept it in Bayes, because I think that would be a lot of fun.
00:57:32,723 --> 00:57:35,665
Um, and the quit, the thought question I
asked myself that I don't have an answer
00:57:35,665 --> 00:57:40,488
for is would that be a better mechanism to
get students up to speed on the models
00:57:40,488 --> 00:57:45,251
they're using, then it would be in other
contexts and other classical contexts, I
00:57:45,251 --> 00:57:45,832
don't, I don't know.
00:57:45,832 --> 00:57:47,873
00:57:47,873 --> 00:57:48,398
00:57:48,398 --> 00:57:49,258
00:57:49,859 --> 00:57:51,199
Yeah, two things.
00:57:51,199 --> 00:57:54,742
First, King of Curriculum, amazing title.
00:57:54,822 --> 00:57:59,145
I think it should actually be renamed to
that title in all campuses around the
00:57:59,145 --> 00:57:59,945
00:58:00,466 --> 00:58:03,728
The world's worst kingdom is the
00:58:03,728 --> 00:58:06,170
00:58:06,170 --> 00:58:07,731
I mean, that's really good.
00:58:07,731 --> 00:58:10,593
Like you're going to party, you know, and
so what are we doing on King of
00:58:10,593 --> 00:58:11,613
00:58:12,494 --> 00:58:15,136
So long as the crown is on the head,
that's all that matters, right?
00:58:15,136 --> 00:58:17,477
That would drop some jaws for sure.
00:58:23,191 --> 00:58:29,173
And second, I definitely would like the
theory of the multiverse to be true,
00:58:29,193 --> 00:58:33,735
because that means in one of these
universes, there is at least one where
00:58:33,735 --> 00:58:36,135
Bayesian methods came first.
00:58:36,315 --> 00:58:42,197
And I am definitely curious to see what
that world looks like and see how...
00:58:42,657 --> 00:58:43,550
00:58:43,550 --> 00:58:47,912
What's that world where people were
actually exposed to patient methods first
00:58:47,933 --> 00:58:50,955
and maybe to frequency statistics later?
00:58:50,955 --> 00:58:56,398
Were they actually exposed to frequency
00:58:56,398 --> 00:58:57,619
That's the question.
00:58:57,739 --> 00:59:01,341
No, but yeah, jokes aside, I would be
definitely curious about that.
00:59:02,302 --> 00:59:07,266
Yeah, well, I don't know that I'll have
that experiment in my lifetime, but maybe
00:59:07,266 --> 00:59:09,727
like in a parallel universe somewhere.
00:59:15,010 --> 00:59:22,713
Before we close up the show, I'm wondering
if you have a personal anecdote or example
00:59:22,713 --> 00:59:27,315
of a challenging problem you encountered
in your research or teaching related to
00:59:27,315 --> 00:59:30,817
vision stats and how you were able to
navigate through it?
00:59:30,817 --> 00:59:30,917
00:59:30,917 --> 00:59:40,301
I mean, maybe it's too much in the weeds,
but that first experience was when I was
in graduate school, trying to learn.
It was coding a correlation matrix of...
And that was incredibly difficult.
One day, one of my colleagues, Bob Henson,
figured it out with the likelihood
function and so forth.
But that was the holdup that we had.
And it's incredible, and I say this
because, again, as I mentioned, I don't
do a lot of my own package coding or so forth.
But I think you see a similar phenomenon
if you misspecify something in your model
in general: you get results, and the
results are either all over the place or
span the entire number line.
For me, it was the correlations: the
posterior distribution looked like a uniform
distribution from negative one to one.
That's a bad thing to see.
So the anecdote I have with this is, I
guess, less like, awesome, Bayes did this
and you couldn't have done it otherwise,
and more about the perseverance that goes
into sticking with the Bayesian side,
which is, um, Bayes also provides you the
ability to check a little bit of your work,
to see if it's completely gone sideways.
So, uh, you see a result like that.
You have that healthy dose of skepticism.
You start to investigate more. In my case,
it took years, a couple of years of my
life, uh, working in concert with other
people, as grad students, but, um, once it
was fixed, it was almost obvious that it
was working.
I mean, you went from this uniform
distribution across negative one to one to
something that looked very much like a
posterior distribution that we're used to
seeing, centered around a certain value of
the correlation.
And again, for us it was figuring out what
the likelihood was, but for most packages,
at least, that's not a big deal.
I think it's already specified in your
choice of model and prior.
But at the same time, just remembering...
it's sort of the frustration part of it;
not making it work is actually part of the
process.
You get that, and you can build, and you
can sort of check your work as you go
forward. I mean, not analytically, but by
brute force, the sampling part; that's sort of a check
on your work.
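To make that symptom concrete, here is a minimal sketch in Python (my own toy illustration, not Jonathan's original code; the bivariate-normal setup, the grid, and the true correlation of 0.6 are all assumptions). A correlation whose likelihood never actually touches the data comes back looking exactly like its flat prior on (-1, 1), while the correctly specified likelihood concentrates around the true value:

import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: bivariate normal with unit variances and correlation 0.6
true_rho = 0.6
n = 200
y = rng.multivariate_normal([0.0, 0.0], [[1.0, true_rho], [true_rho, 1.0]], size=n)

# Flat prior on the correlation, evaluated on a grid over (-1, 1)
rhos = np.linspace(-0.99, 0.99, 397)

def loglik(rho, data):
    # Bivariate normal log-likelihood (unit variances), up to an additive constant
    det = 1.0 - rho ** 2
    quad = (data[:, 0] ** 2 - 2 * rho * data[:, 0] * data[:, 1] + data[:, 1] ** 2) / det
    return -0.5 * (len(data) * np.log(det) + quad.sum())

def grid_posterior(logpost_values):
    # Normalize a log-posterior evaluated on the grid into probabilities
    p = np.exp(logpost_values - logpost_values.max())
    return p / p.sum()

# Correct likelihood: the posterior concentrates near the true correlation
post_ok = grid_posterior(np.array([loglik(r, y) for r in rhos]))

# "Broken" likelihood in which rho never enters the data model:
# the posterior simply reproduces the flat prior, the uniform(-1, 1)
# shape described in the anecdote
post_bad = grid_posterior(np.array([loglik(0.0, y) for _ in rhos]))

for label, p in [("correct likelihood", post_ok), ("broken likelihood", post_bad)]:
    mean = (rhos * p).sum()
    sd = np.sqrt(((rhos - mean) ** 2 * p).sum())
    print(f"{label}: posterior mean {mean:.2f}, posterior sd {sd:.2f}")

The same check carries over to MCMC output: a posterior that simply reproduces its prior is a strong hint that the parameter never actually made it into the likelihood.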
What I'm trying to say is, it's not a
great example, not a super inspiring
example, but, um, more that perseverance
pays off, in Bayes and in life.
So it's sort of the analogy that I take
from it.
Yeah, no, for sure.
I mean, um, perseverance is so important
because you're definitely going to
encounter...
I mean, none of your models is going to
work as you thought it would.
So if you don't have that drive and that
passion for the thing that you're
studying, it's going to be extremely hard
to just get it through the finish line,
because it's not going to be easy.
So, you know, it's like choosing a new
sport.
If you don't like what the sport is all
about, you're not going to stick with it,
because it's going to be hard.
So that perseverance, I would say, comes
from your curiosity and your passion for
your field and the methods you're using.
And the other thing I was going to add,
this is tangential, but let me just add
it: if you have the chance to go visit
Bayes' grave in London, take it.
I had to do that last summer.
I was in London, I had my children with
me, and we all picked some spot we wanted
to go to.
And I was like, I'm going to go find and
take a picture in front of Bayes' grave.
And it sort of brought up an interesting
question.
Like, I don't know the etiquette of taking
photographs in front of a deceased
person's grave.
This is at least providing it.
But then, ironically, as I was sitting
there on the tube leaving, I sat next to a
woman and she had Bayes' theorem on her
shirt.
It was the Bayes School of Economics, or
something like that, in London.
I was like, okay, I have reached the Mecca.
Like the perseverance led to, like, a
trip, you know, my own version of the trip
to London.
Uh, but definitely, definitely worth the
time to go.
If you want to be surrounded... once you
reach that level of perseverance, uh,
you're part of the club, and then you can
do things like that.
Find vacations around, you know, holidays
around Bayes, Bayes' grave.
I am definitely gonna do that.
Thank you very much for giving me another
idea of a nerd holiday.
My girlfriend is gonna hate me, but she
always wanted to visit London, so you
know, that's gonna be my bait.
It's not bad to get to, it's off of Old
Street, you know, actually well marked.
I mean, the grave site's a little
weathered, but it's in a good spot, a good
part of town, so you know, not really
heavily touristy, amazingly.
Oh yeah, I'm guessing.
But you know.
I am guessing that's a good thing.
Yeah, no, I already know how I'm gonna ask.
Honey, when we go to London, let's go to
Bayes'.
Let's go check out Bayes' grave.
Yeah, I mean, that's perfect.
So yeah, I mean, you should send me that
picture, and that should be your picture
for this episode.
I always take a picture from guests to
illustrate the episode icon, but you
definitely need that picture for your icon.
I can do that.
I'll be happy to.
So before asking you the last two
questions, I'm just curious how you see
the future of Bayesian stats in the context
of psychological and educational sciences?
And what are some exciting avenues for
research and application that you envision
in the coming years, or that you would
really like to see?
Oh, that's a great question.
So, you know, interestingly, in
psychology, quantitative psychology has
sort of been on a downhill swing for, I
don't know how long; there are fewer and
fewer programs, at least in the United
States, where people are trained.
But despite that, I feel like the use of
Bayesian statistics is up in a lot of
different other areas.
And I think that affords a better
model-based science.
So you have to specify a model, you have a
model in mind, and then you go and do it.
I think that benefit makes the science
better.
You're not just using sort of what's
always been done.
You can sort of push the envelope
methodologically a bit more.
And Bayesian statistics, in one way,
another benefit of them is that now you
can code an algorithm that likely will
work without having to know, like you
said, all of the underpinnings, the
technical side of things; you can use an
existing package to do so.
I like to say that that's going to
continue to make science better.
I think the fear that I have is sort of
the sea of the large language model-based
version of what we're doing in machine
learning, artificial intelligence.
But I will be interested to see how we
incorporate a lot of the Bayesian ideas,
Bayesian methods, into that as well.
I think that there's potential.
Clearly, people are doing this; I mean,
that's what runs a lot of what is out there.
So I look forward to seeing that as well.
So I get a sense that what we're talking
about is really what may be the foundation
for what the future will be.
I mean, maybe we will, maybe instead of
that parallel universe, if we could come
back or go into the future just in our own
universe in 50 years, maybe what we will
see is a curriculum entirely on Bayesian
statistics.
And, you know, I just looked at your topic
list; you had recently been talking about
variational inference and so forth.
The use of that in very large models
themselves, I think, is very important.
So it may just be the thing that crowds
out everything else, but that's
speculative, and I don't make a living
making predictions, unfortunately.
So that's the best I can do.
I mean, that's also more of a wishlist
question.
So that's all good.
Well, John, amazing.
I learned a lot.
We covered a lot of topics.
I'm really happy.
But of course, before letting you go, I'm
going to ask you the last two questions I
ask every guest at the end of the show.
Number one, if you had unlimited time and
resources, which problem would you try to
solve?
Well, I would be trying to figure out how
we know what a student knows every day of
the year so that we can best teach them
where to go next.
That would be it.
Right now, there's not only the problem of
the technical issues of estimation,
there's also the problem of how do we best
assess them, how much time do they spend
doing it, and so forth.
That to me is what I would spend most of
my time on.
That sounds like a good project.
I love it.
And second question, if you could have
dinner with any great scientific mind,
dead, alive or fictional, who would it be?
I got a really obscure choice, right?
It's not like I'm picking Einstein or
anything.
I really, I have like two actually, I've
sort of debated.
One is economist Paul Krugman, who writes
for the New York Times and works at the
City University of New York now.
You know, Nobel laureate.
Loved his work, loved his understanding of
the interplay between model and data; his
understanding is fantastic.
So I would just sit there and just have to
listen to everything he had to say, I think.
The other is, again, an obscure choice.
One of the things I'm fascinated by is
weather and weather forecasting.
Uh, if you know, I'm in education or...
Uh, and there's a guy who started the
company called the Weather Underground.
His name is Jeff Masters.
Uh, you can read his work on a blog at
Yale these days, Climate Connections,
something along those lines.
Anyway, he has since sold the company, but
he's fascinating about modeling, you know.
Right now we're in the peak of hurricane
season in the United States.
We see these storms coming off of Africa
or spinning up everywhere, and sort of the
interplay between, unfortunately, climate
change and then other atmospheric factors.
This just makes for an incredibly complex
system that's just fascinating, and so is
how science approaches prediction there.
So I find that to be great.
But those are the two.
I had to think a lot about that because
there's so many choices, but those two
people are the ones I read the most,
certainly when it's not just in my field.
Yeah, sounds fascinating.
And weather forecasting is definitely
fascinating.
Also, because the great thing is you have
feedback every day.
So that's really cool.
You can improve your predictions.
Like the missing data problem.
You can't sample every part of the
atmosphere.
So how do you incorporate that into your
analysis as well?
No, that's incredible.
Model averaging and stuff.
Yeah, that's also a testimony to the power
of modeling and parsimony, you know.
Because I worked a lot on electoral
forecasting models, and, you know, a
classic way people dismiss models in that
field is:
Well, you cannot really predict what
people are going to do at an individual
level, which is true.
I mean, you cannot; people have free will,
you know, so you cannot predict at an
individual level what they are going to
do, but you can quite reliably predict
what the masses are going to do.
Yeah, basically, with the aggregation of
individual points, you can actually kind
of reliably do it.
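A quick toy simulation in Python makes that aggregation effect visible (my own illustration, not a model from the episode; the 52% support level, the electorate size, and the number of simulated elections are all made-up assumptions):

import numpy as np

rng = np.random.default_rng(42)

p = 0.52            # assumed underlying support for a candidate
n_voters = 100_000  # assumed electorate size

# Individual level: any single simulated vote is close to a coin flip,
# so it is essentially unpredictable
single_votes = rng.random(5) < p
print("five individual votes:", single_votes.astype(int))

# Aggregate level: the vote share across 500 simulated elections stays
# tightly concentrated around p, with spread about sqrt(p * (1 - p) / n_voters)
shares = rng.binomial(n_voters, p, size=500) / n_voters
print(f"mean share {shares.mean():.3f}, standard deviation {shares.std():.4f}")

With a hundred thousand voters the aggregate share barely moves from run to run, which is why forecasts of the aggregate can be reliable even though no individual decision is predictable.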
And so the power of modeling here is where
you get something that, yeah, you know,
it's not good. It's, you know, the model
is wrong, but it works because it
simplifies things, but doesn't simplify
them to a point where it doesn't make
sense anymore.
Kind of like the Standard Model in
physics, where we know it doesn't work, it
breaks at some point, but it does a pretty
good job of predicting a lot of the
phenomena we observe.
So, do you prefer that?
Is it free will or is it random error?
Well, you have to come back for another
episode on that because otherwise, yes.
That's a good one.
Well, Jonathan, thank you so much for your
time.
As usual, I will put resources and a link
to your website in the show notes for
those who want to dig deeper.
Thank you again, Jonathan, for taking the
time and being on this show.
Happy to be here.
Thanks for the opportunity.
It was a pleasure to speak with you and I
hope it makes sense for a lot of people.
Appreciate your time.