#117 Unveiling the Power of Bayesian Experimental Design, with Desi Ivanova
Episode 117 • 15th October 2024 • Learning Bayesian Statistics • Alexandre Andorra
Duration: 01:13:11


Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways:

  • Designing experiments is about optimal data gathering.
  • The optimal design maximizes the amount of information.
  • The best experiment reduces uncertainty the most.
  • Computational challenges limit the feasibility of BED in practice.
  • Amortized Bayesian inference can speed up computations.
  • A good underlying model is crucial for effective BED.
  • Adaptive experiments are more complex than static ones.
  • The future of BED is promising with advancements in AI.

Chapters:

00:00 Introduction to Bayesian Experimental Design

07:51 Understanding Bayesian Experimental Design

19:58 Computational Challenges in Bayesian Experimental Design

28:47 Innovations in Bayesian Experimental Design

40:43 Practical Applications of Bayesian Experimental Design

52:12 Future of Bayesian Experimental Design

01:01:17 Real-World Applications and Impact

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor,, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan, Francesco Madrisotti, Ivy Huang and Gary Clarke.

Links from the show:

Transcript

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.


Speaker:

Today I am delighted to host Desi Ivanova, a distinguished research fellow in machine

learning at the University of Oxford.

2

:

Desi's fascinating journey in statistics has spanned from quantitative finance to the

frontiers of Bayesian experimental design, or BED.

3

:

In our conversation, Desi dives into the deep

4

:

world of BED where she has made significant contributions.

5

:

She begins by elucidating the core principles of experimental design, discussing both the

theoretical underpinnings and the complex computational challenges that arise in its

6

:

application.

7

:

Desi shares insights into the innovative solutions she's developed to make BED more

practical and applicable in real-world scenarios, particularly

8

:

highlighting its impact in sectors like healthcare and technology.

9

:

Throughout the discussion, Desi also touches on the exciting future of BED, especially in

light of recent advancements in AI and machine learning.

10

:

She reflects on the critical role of real-time decision-making in today's data-driven

landscape and how Bayesian methods can enhance the speed and accuracy of such decisions.

11

:

This is Learning Bayesian Statistics, episode 117,

12

:

recorded September 26, 2024.

13

:

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods,

the projects and the people who make it possible.

14

:

I'm your host.

15

:

Alex Andorra.

16

:

You can follow me on Twitter at Alex underscore Andorra, like the country, for any info

about the show.

17

:

LearnBayesStats.com is Laplace to be.

18

:

Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on

Patreon.

19

:

Everything is in there.

20

:

That's LearnBayesStats.com.

21

:

If you're interested in one-on-one mentorship, online courses or statistical consulting,

feel free to reach out and book a call at topmate.io slash Alex underscore Andorra.

22

:

See you around, folks, and best Bayesian wishes to you all.

23

:

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

24

:

Check us out at pymc-labs.com.

25

:

Hello my dear Bayesians, today I want to welcome our two new patrons in the full posterior

tier.

26

:

Thank you so much Ivy Huang and Gary Clarke, your support literally makes this show

possible.

27

:

I am

28

:

looking forward to interacting with you guys in the LBS Slack channel.

29

:

Now, before we start the episode, I have a short story for you guys.

30

:

A few years ago, I started learning Bayesian stats by watching all the tutorials I could

find that a teacher I really liked was teaching.

31

:

That teacher was none other than Chris Fonnesbeck, PyMC's creator and BDFL.

32

:

And five years down the road, a very

33

:

unpredictable road, I am beyond excited to share that I will now be teaching a tutorial

alongside Chris.

34

:

That will happen at PyData New York from November 6 to 8, 2024.

35

:

And I would be delighted to see you there.

36

:

We will be teaching you everything you need to know to master Gaussian processes with PyMC.

37

:

And of course, I will record a few live LBS episodes while I'm there.

38

:

But

39

:

I'll tell you more about that in the next episode.

40

:

In the meantime, you can get your ticket at pydata.org slash NYC 2024.

41

:

I can't wait to see you there.

42

:

Okay, on to the show now.

43

:

Desi Ivanova, welcome to Learning Bayesian Statistics.

44

:

Thank you for having me, Alex.

45

:

Pleased to be here.

46

:

Yeah, yeah.

47

:

Thanks a lot for taking the time, for being on the show.

48

:

And thanks to Marvin Schmidt for putting us in contact.

49

:

He was kind enough to do that on the BayesFlow Slack where we interact from time to time.

50

:

Today, though, we're not going to talk a lot about amortized Bayesian inference.

51

:

We're going to talk mostly about experimental design, Bayesian experimental design.

52

:

So BED, or B-E-D. I like the acronym.

53

:

But before that, as usual, we'll start with your origin story, Desi.

54

:

Can you tell us what you're doing nowadays and also how you ended up working on what

you're working today?

55

:

Yeah, of course.

56

:

So broadly speaking, I work in probabilistic machine learning research, where I've worked

on a few different things, actually.

57

:

So the audience here would be mostly familiar with Bayesian inference.

58

:

So I've worked on approximate inference methods, namely, you know, variational inference.

59

:

You mentioned Marvin, right?

60

:

So we've actually collaborated with him on some amortized inference work.

61

:

I've also done some work in causality.

62

:

But my main research focus so far has been in an area called Bayesian experimental design,

as you correctly pointed out, BED for short, a nice acronym.

63

:

So BED, Bayesian experimental design was the topic of my PhD.

64

:

And yeah, will be the topic of this podcast episode.

65

:

Yeah, really, really keen on discussing.

66

:

and very, very close to my heart.

67

:

You know, how I ended up here.

68

:

That's actually quite random.

69

:

So before, before getting into research, right, so before my PhD, I actually worked in

finance for quite a few years as a, as a quantitative researcher.

70

:

At some point,

71

:

I really started missing sort of the rigor in a sense of, you know, conducting research,

you know, being quite principled about, you know, how we measure uncertainty, how we

72

:

quantify robustness of our models and of the systems that we're building.

73

:

And right at the height of COVID, I decided to start my PhD back in 2020.

74

:

And

75

:

Indeed, the area, right, based on experimental design, that was originally not the topic

of my PhD.

76

:

I was supposed to work on certain aspects of variational autoencoders.

77

:

If you're familiar with these types of models, they're not as popular anymore, right?

78

:

So if I had ended up working on variational autoencoders, I guess a lot of my research

would have been, I mean, not wasted, but not as relevant.

79

:

not as relevant today as it was, you know, four or five years ago.

80

:

And how I ended up working with Bayesian experimental design specifically, basically,

approached my supervisor a few months before starting my PhD and I said, Hey, can I can I

81

:

read about something interesting to prepare for a PhD?

82

:

And he was like, yeah, just read these papers on Bayesian experimental design.

83

:

And that's how it happened.

84

:

Really?

85

:

Yeah.

86

:

Okay, cool.

87

:

Yeah, I love these.

88

:

I love asking this question because often, you know, with hindsight bias, when you're a

beginner, it's easy to trick yourself into thinking that people who are

89

:

experts on a particular topic and know that topic really well, because they did a PhD

on it, like they

90

:

They have been doing that since they were, I don't know, 18 or even 15 or it was like all,

all planned and part of a big plan.

91

:

But most of the time when you ask people, it was not at all.

92

:

And it's the result of experimenting with things and also the result of different people

they have met and, and, and encounters and mentors.

93

:

And so I think this is also very valuable to, to

94

:

tell that to beginners because otherwise it can be very daunting.

95

:

100%. Yeah, I would 100% agree with that.

96

:

And actually experimenting is good.

97

:

You know, again, we'll be talking about experimental design, I think.

98

:

Yeah, many times, you know, just by virtue of trying something new, you discover, you

know, I actually quite liked that.

99

:

And it actually works better, you know, for whatever purpose it might be, it might be your

commute to work, right?

100

:

There was this

101

:

very interesting research.

102

:

You know, when there is like a tube closure, right, if the metro is getting closed, you

know, some people, like 5 % of people actually discover an alternative route that actually

103

:

is much better for the daily commute.

104

:

But they wouldn't have done that had the closure not happened.

105

:

So almost like being forced to experiment may lead to actually better outcomes, right?

106

:

So it's quite interesting.

107

:

Yeah, yeah, no.

108

:

I mean,

109

:

completely agree with that and that's also something I tell to a lot of people who reach

out to me, you know, wondering how they could start working on Bayesian stats and often I'm

110

:

like, you know, trying to find something you are curious about, interested in and then

start from there because it's gonna be hard stuff and there are gonna be a lot of

111

:

obstacles.

112

:

So if you're not, you know, really curious about

113

:

what you are studying, it's going to be fairly hard to maintain the level of work that you

have to maintain to, to in the end enjoy what you're doing.

114

:

So experimenting is very important.

115

:

I completely agree.

116

:

and actually do you remember yourself?

117

:

so I'm curious first how Bayesian is your work.

118

:

And also if you remember when you were introduced to Bayesian stats.

119

:

When was I introduced to Bayesian stats?

120

:

That must have been probably in my undergrad days.

121

:

I remember I took some courses on kind of Bayesian data analysis, but then I didn't do any

of that during my time in industry.

122

:

Yeah.

123

:

And again, as I said, I ended up working on

124

:

Bayesian experimental design a little bit, a little bit randomly.

125

:

The work itself is, you know, it does use Bayesian principles quite a lot.

126

:

You know, we do Bayesian inference, we do, we start with a Bayesian model, right?

127

:

So the modeling aspect is also quite important.

128

:

You know, it's very important to have a good Bayesian model for all of these things to

actually make sense and work well in practice.

129

:

So I would say overall, the work is quite, quite Bayesian, right?

130

:

Yeah.

131

:

Yeah.

132

:

Yeah.

133

:

Yeah, for sure.

134

:

so actually, I think that's a good segue to introduce now, Bayesian experimental design.

135

:

So it's not the first time we talk about it on the show, but

it's really a dedicated episode about that.

136

:

So could you introduce the topic to our listeners and basically explain and define what

Bayesian experimental design is?

137

:

Yeah, of course.

138

:

So can I actually take a step back and talk a little bit about experimental design first?

139

:

Yeah.

140

:

yeah.

141

:

And then we'll add the Bayesian kind of the Bayesian aspect to it.

142

:

So, you know, when, when I say, I work on Bayesian experimental design, most people

immediately think lab experiments, right?

143

:

For example, you're in a chemistry lab and you're trying to synthesize a new drug or a new

compound or something.

144

:

But actually, you know, the field of experimental design is much broader than that,

right?

145

:

And to, you know, give a few concrete examples, you can think about surveys, right?

146

:

You may need to decide what questions to ask.

147

:

Maybe you want to tailor your questions as, you know, the survey progresses so that, you

know, you're asking very tailored, customized questions to each of your survey

148

:

participants.

149

:

You can think of clinical trials, right?

150

:

So how do you dose drugs appropriately?

151

:

Or, you know, when should you test for certain properties of these drugs, things like

absorption and so on?

152

:

So all of these things can be, you know, cast as a as an experimental design problem, as

an optimal experimental design problem.

153

:

So in my mind, designing experiments really boils down to optimal or at least intelligent

data gathering.

154

:

Does that make sense?

155

:

So we're trying to kind of optimally collect data in order to kind of learn about the

thing that we want to learn about.

156

:

So some underlying quantity of interest, right?

157

:

And the Bayesian framework, so the Bayesian experimental design framework specifically

takes an information theoretic approach to what intelligent or optimal means in this

158

:

context.

159

:

So as I already mentioned, it is a is a model based approach, right?

160

:

So we start with an underlying Bayesian model that actually describes or simulates the

outcome of our experiment.

161

:

And then the optimality part, right?

162

:

So the optimal design will be the one that maximizes the amount of information about the

thing that we're trying to learn about.

163

:

Yeah.

164

:

That makes sense?

165

:

can actually give a concrete example.

166

:

Maybe that will make it easier for you and for the listeners, right?

167

:

So if you think about, you know, the survey, the survey example, right?

168

:

kind of a simple but I think very easy to understand concept is you know trying to learn

let's say about time value of money preferences of different people right yeah so what

169

:

does that mean? Imagine you're

170

:

a behavioral economist, right?

171

:

And you're trying to understand some risk preferences, let's say, of people.

172

:

Generally, the way that you do that is by asking people a series of questions of the form,

do you prefer some money now or you prefer some money later?

173

:

Right?

174

:

So do you prefer 50 pounds now or you prefer 100 pounds in one year?

175

:

Right.

176

:

And then you can choose, are you going to propose 50 pounds or 60 pounds or 100 pounds

now?

177

:

how much money you're going to propose in what time, right?

178

:

So you're going to do a hundred pounds in one month or in three months or in one year,

right?

179

:

So there is like a few choices that you can make.

180

:

And there is a strong incentive to do that with as few questions as possible because you

end up paying actually the money to the participants, right?

181

:

So basically,

182

:

we can start with an underlying Bayesian model that sort of models this type of

preferences of different human participants in this survey.

183

:

There's plenty of such models from psychology, from behavioral economics.

184

:

And at the end of the day, what we want to learn is a few parameters, right?

185

:

You can think about this model almost like a mechanistic

186

:

model that explains how preferences relate to things like, you know, are described by

things like a discount factor or sensitivity to various other things.

187

:

And by asking these series of questions, we're learning about these underlying parameters

in our Bayesian model.

188

:

Did that make sense?

189

:

Yeah.

190

:

Yeah.

191

:

I understand better now.
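
To make the example above a bit more tangible, here is a minimal Python sketch of the kind of Bayesian simulator being described, with entirely hypothetical priors and function names: a hyperbolic discount rate k and a choice sensitivity beta, plus a function that simulates a participant's answer to one "money now versus money later" question.

import numpy as np

rng = np.random.default_rng(0)

def sample_prior(n):
    # Hypothetical priors over the discount rate k and the choice sensitivity beta
    k = rng.lognormal(mean=-3.0, sigma=1.0, size=n)
    beta = rng.lognormal(mean=0.0, sigma=0.5, size=n)
    return k, beta

def simulate_choice(design, k, beta):
    # design = (amount offered now, amount offered later, delay in days)
    r_now, r_later, delay = design
    utility_later = r_later / (1.0 + k * delay)              # hyperbolic discounting
    p_later = 1.0 / (1.0 + np.exp(-beta * (utility_later - r_now)))
    return rng.binomial(1, p_later)                           # 1 = participant picks the later amount

# Example design: offer 50 pounds now versus 100 pounds in one year
k, beta = sample_prior(1000)
answers = simulate_choice((50.0, 100.0, 365.0), k, beta)

Choosing the next (amount now, amount later, delay) triple so that the answer is expected to tighten the posterior over k and beta as much as possible is exactly the Bayesian experimental design problem being described.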

192

:

And so I'm wondering, it sounds a lot like, you know, just doing also causal modeling,

right?

193

:

So you write your causal graph and then based on that, you can have a generative model and

then, and fitting the model to data is just one part, but it's not what you start with to

194

:

write the model.

195

:

How is that related?

196

:

Right.

197

:

The fields are, in a sense, closely related in the sense that, you know, in order for you

to uncover kind of the true underlying causal graph, let's say if, you know, you start

198

:

with some assumptions, you don't know if X causes Y or Y causes X or, you know, or

something else, the way that you need to do this is by intervening in the system.

199

:

Right.

200

:

So

201

:

You can only, in a sense, have causal conclusions if you have rich enough data and by rich

enough data we generally mean experimental or interventional data, right?

202

:

So you're totally right in kind of drawing parallels in this, right?

203

:

And indeed we may...

204

:

design experiments that actually maximize information about the underlying causal graph,

right?

205

:

So if you don't know the graph and you want to uncover the graph, you can set up a

Bayesian experimental design framework that will allow you to, you know, select, let's

206

:

say, which nodes in my causal graph should I intervene on, with what value should I

intervene on, so that with as few experiments as possible, with as few interventions as

207

:

possible,

208

:

can I actually uncover the true, the ground truth, right?

209

:

The true underlying causal graph, right?

210

:

And, you know, kind of the main thing that you're optimizing for is this notion of

information content.

211

:

So how much information is each intervention, each experiment actually bringing us, right?

212

:

And...

213

:

And I think that's part of the reason why I find the Bayesian framework quite appealing as

opposed to, I guess, non-Bayesian frameworks.

214

:

You know, it really centers around this notion of information gathering.

215

:

And with the Bayesian model, we have a very precise definition of or a precise way to

measure an information content of an experiment.

216

:

Right.

217

:

If you think about

218

:

Imagine again, we're trying to learn some parameters in a model, right?

219

:

The natural, again, once we have the Bayesian model, the natural way to define information

content of an experiment is to look at, you know, what is our uncertainty about these

220

:

parameters under our prior, right?

221

:

So we start with a prior.

222

:

We have uncertainty that is embedded or included in our prior beliefs.

223

:

We're going to perform an experiment to collect some data, right?

224

:

So perform an experiment, collect some data.

225

:

we can update our prior to a posterior.

226

:

So that's classic Bayesian inference, right?

227

:

And now we can compare the uncertainty of that posterior to the uncertainty of our prior.

228

:

And the larger the drop, the better our experiment is, the more informative our experiment

is.

229

:

And so the best...

230

:

or the optimal experiment in this framework would be the one that maximizes this

information gain.

231

:

So the reduction in entropy, we're going to use entropy as a measure of uncertainty in

this framework.

232

:

So it is the experiment that reduces our entropy the most.
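
In symbols, the objective described here is usually called the expected information gain (EIG) of a design $d$: the entropy of the prior minus the expected entropy of the posterior, averaged over the outcomes $y$ the model predicts for that design,

$$\mathrm{EIG}(d) \;=\; \mathbb{E}_{p(y \mid d)}\Big[\, H\big[p(\theta)\big] \;-\; H\big[p(\theta \mid y, d)\big] \,\Big],$$

and the Bayesian-optimal design is the one that maximizes it, $d^{*} = \arg\max_d \mathrm{EIG}(d)$.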

233

:

Does that make sense?

234

:

Yeah.

235

:

Total sense.

236

:

Yeah.

237

:

Total sense.

238

:

That's amazing.

239

:

I didn't know.

240

:

So yeah, I mean, that's, that's pretty natural then to include the causal framework into

that.

241

:

And I think that's one of the most powerful features of experimental design, because I

guess most of the time what you want to do when you design an experiment is you want to

242

:

intervene.

243

:

on a causal graph and see actually if your graph is close to reality or not.

244

:

So that's amazing.

245

:

And I love the fact that you can use experimental design to validate or invalidate your

causal graph.

246

:

That's really amazing.

247

:

Correct.

248

:

100%.

249

:

But I do want to stress that

250

:

The notion of causality is not necessary for the purposes of describing what Bayesian

experimental design is.

251

:

I'll give you a couple of other examples, actually.

252

:

So you may...

253

:

You may want to do something like model calibration.

254

:

Let's say you have a simulator with a few parameters that you can tweak, right?

255

:

So that it, I don't know, produces the best outcomes, right?

256

:

Or is optimally calibrated for the thing that you're trying to measure, right?

257

:

It is like, again, I don't think you need, you know, any concepts of causality here,

right?

258

:

You're just turning a few knobs.

259

:

And you know, again, you can formulate this as an experimental design problem where, you

you are trying to calibrate your system with as few knob turns as possible.

260

:

Yeah.

261

:

Yeah, yeah, yeah.

262

:

That makes a ton of sense.

263

:

Something I'm curious about hearing you talk is, and that's also something you've worked

extensively on, is the computational challenges.

264

:

Can you talk about that?

265

:

What are the computational challenges associated with traditional BED, so Bayesian

experimental design, and how they affect the feasibility?

266

:

of BED in real-world applications.

267

:

Yeah.

268

:

Yeah, that's that's an excellent point.

269

:

Actually.

270

:

Yeah, I see you read some of my papers.

271

:

So, all right.

272

:

So all of these kind of information objectives.

273

:

So what I just described, you know, we can look at the information content, we can

maximize information, and so on, like, it's all very natural.

274

:

And it's all very mathematically precise and beautiful.

275

:

But working with those information-theoretic objectives is quite difficult in practice.

276

:

And the reason for that is precisely as you say, they're extremely computationally costly

to compute or to estimate, and they're even more computationally costly to optimize.

277

:

And the careful listener would have noticed that I mentioned posterior inference.

278

:

Right.

279

:

So in order to compute the information content of an experiment, you actually need to

compute a posterior.

280

:

Right.

281

:

You need to compute a posterior given your data.

282

:

Now, where the problem lies is that you need to do this before you have collected your

data.

283

:

Right.

284

:

Because you're designing an experiment, and only then will you be performing it and

observing the outcome, and then you can do

285

:

the actual posterior update.

286

:

Now, what we have to then do is look at our prior entropy minus our posterior entropy and

integrate over all possible outcomes that we may observe under the selected experiment.

287

:

And we have to do that for a number of experiments to actually find the optimal one.

288

:

So what we end up with is this sort of nesting

289

:

of expectations.

290

:

So we have an expectation, we have an average with respect to all possible outcomes that

we can observe.

291

:

And inside of our expectation, inside of this average, we have this nasty posterior

quantity that, generally speaking, is intractable.

292

:

Unless you're in a very specific case where you have a conjugate model, where your

posterior is available in close form, you actually don't have access to that posterior.

293

:

which means that you will need to do some form of approximation, right?

294

:

Whether it's exact like MCMC or is going to be a variational posterior computation.

295

:

Again, there is many ways of doing this.

296

:

The point is that for each design that you may want to try, you need to compute all of

these posteriors.

297

:

for every sample of your potential outcome, right?

298

:

Of your possible outcome under the experiment.

299

:

So what I was just describing is what is known as a doubly intractable quantity, right?

300

:

So again, this podcast audience is very familiar with Bayesian inference and how Bayesian

inference is intractable in general.

301

:

Now computing...

302

:

the EIG, the sort of computing the objective function that we generally use in Bayesian

experimental design is what is known as doubly intractable objective, which is quite

303

:

difficult to work with in practice, right?
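
For readers who want to see the double intractability concretely, here is a rough Python sketch of the standard nested Monte Carlo estimator of the EIG. The functions sample_prior, simulate and log_lik are placeholders to be supplied for your own model; the inner Monte Carlo sum stands in for the intractable posterior or marginal likelihood.

import numpy as np
from scipy.special import logsumexp

def nested_mc_eig(design, sample_prior, simulate, log_lik, n_outer=500, n_inner=500):
    # sample_prior(n)           -> n prior draws of theta
    # simulate(design, theta)   -> one simulated outcome per theta
    # log_lik(y, theta, design) -> log p(y | theta, design), elementwise over y
    theta_outer = sample_prior(n_outer)
    y = simulate(design, theta_outer)                     # outer samples y_n ~ p(y | theta_n, d)
    log_joint = log_lik(y, theta_outer, design)           # log p(y_n | theta_n, d)

    theta_inner = sample_prior(n_inner)
    # Inner estimate of the marginal log p(y_n | d) for every outer sample
    inner = np.stack([log_lik(y, th, design) for th in theta_inner])   # (n_inner, n_outer)
    log_marginal = logsumexp(inner, axis=0) - np.log(n_inner)

    return float(np.mean(log_joint - log_marginal))       # EIG estimate for this one design

The cost is roughly n_outer times n_inner likelihood evaluations, and the whole estimate has to be recomputed (or differentiated) for every candidate design, which is where the practical pain comes from.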

304

:

Now, what this means for sort of real world applications is that you either need to throw

a lot of compute

305

:

on the problem.

306

:

Or you need to do, you know, some, you need to sort of give up on the idea of being

Bayesian optimal, right?

307

:

You may use some heuristics or something else.

308

:

And where this problem really becomes limiting is when we start to think about, you know,

running experiments in real time, for example.

309

:

So the survey example that I started with, you know,

310

:

you know, asking participants in your survey, do you prefer some money now or some money

later?

311

:

You know, it becomes quite impractical for you to, you know, run all these posterior

inferences and optimize all of these information theoretic objectives in between

312

:

questions, right?

313

:

So it's a little bit, you know, I asked you the first question, now let me run my MCMC.

314

:

Let me optimize some doubly intractable objective.

315

:

Can you just wait five minutes, please?

316

:

And then I'll get back to you with the next question.

317

:

Obviously, it's not something that you can realistically do in practice.

318

:

So I think, historically, the computational challenge of the objectives that we use for

Bayesian experimental design has really...

319

:

limited the feasibility of applying these methods in kind of real-world applications.

320

:

And how, so how did you, which innovations, which work did you do on that front?

321

:

That makes all that better.

322

:

Right.

323

:

So there is a few things that I guess we can discuss here.

324

:

So number one, I mentioned posterior inference, right?

325

:

And I mentioned we have to do many posterior inference approximations for every possible

outcome of our experiment.

326

:

Now, I think it was the episode with Marvin.

327

:

right, where you talked about amortized Bayesian inference.

328

:

So in the context of Bayesian experimental design, amortized Bayesian inference plays a

very big role as well, right?

329

:

So one thing that we can do to sort of speed up these computations is to learn a...

330

:

to learn a posterior that is amortized over all the outcomes that we can observe, all the

different outcomes that we can observe, right?

331

:

And the beautiful part is that we know how to do that really well, right?

332

:

So we have all of these very expressive, variational families.

333

:

that we can pick from and optimize with data that we simulate from our underlying Bayesian

model.

334

:

So this aspect of Bayesian experimental design definitely touches on related fields of

amortized Bayesian inference and simulation-based inference.

335

:

So we're using simulations from our model to learn an approximate posterior.

336

:

that we can very quickly draw samples from, as opposed to having to fit an HMC for every

new data set that we may observe.
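
As a rough illustration of what such an amortized posterior can look like (a deliberately minimal sketch, not the architecture of any particular paper or of BayesFlow itself): a small network maps simulated data to the parameters of a Gaussian over theta, and is trained on prior-predictive simulations, so that at experiment time the posterior for any new data set is a single forward pass.

import torch
import torch.nn as nn

class AmortizedPosterior(nn.Module):
    # Maps (summarised) data y to a diagonal Gaussian approximation of p(theta | y)
    def __init__(self, data_dim, theta_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(data_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * theta_dim),
        )

    def forward(self, y):
        mean, log_std = self.net(y).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.exp())

# Training sketch: simulate theta ~ prior and y ~ model(theta, design) in batches, then
# maximise the amortized log-density, e.g.
#   loss = -posterior_net(y_batch).log_prob(theta_batch).sum(-1).mean()
# Once trained, q(theta | y_observed) costs one forward pass instead of a fresh MCMC run.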

337

:

That makes sense.

338

:

Yeah.

339

:

And so I will refer listeners to the episode with Marvin, episode 107, where we dive into

amortized Bayesian inference.

340

:

I'll put that in the show notes.

341

:

I also put for reference three other episodes where we mentioned, you know, experimental

design.

342

:

So episode 34 with Lauren Kennedy, 35 with Paul Bürkner and 45 with Frank Harrell, that

one.

343

:

focuses more on clinical trial design.

344

:

But that's going to be very interesting to people who are looking to these.

345

:

And yeah, so I can definitely see how amortized Bayesian inference here can be extremely

useful, based on everything you've used it for before.

346

:

Maybe do you have an example, especially I saw that during your PhD,

347

:

You worked on policy-based Bayesian experimental design and you've developed these methods.

348

:

Maybe that will give a more concrete idea to listeners about what all of this means.

349

:

Exactly.

350

:

One way in which we can speed up computations is by utilizing, as I said, amortized

variational inference.

351

:

Now this will speed up the estimation of our information theoretic objective, but we still

need to optimize it.

352

:

Now, given that we have to do after each experiment iteration, right?

353

:

So we have collected our first data point, and with it we

need to...

354

:

update our model and with this new model under this new model, updated model, we need to

kind of decide what to do next.

355

:

Now, this is clearly also very computationally costly, right?

356

:

The optimization step of our information theoretic objective is quite computationally

costly, meaning that it is very hard to do in real time, right?

357

:

Again, going back to the survey example, you still can't do it, right?

358

:

You can estimate it a little bit more quickly, but you still can't optimize it.

359

:

And this is where a lot of my PhD work has actually been focused on, right?

360

:

So developing methods that will allow you to run Bayesian optimal design in real

time.

361

:

Now, how are we going to do that?

362

:

So there is a little bit of a conceptual shift in the way that we think about designing

experiments, right?

363

:

What we will do is rather than choosing

364

:

the design, the single design that we're going to perform right now, right in this

experiment iteration.

365

:

What we're going to do is learn or train a design policy that will take as an input our

experimental data that we have gathered so far, and it will produce as an output the

366

:

optimal design for the next experiment iteration.

367

:

So our design policy is just a function, right?

368

:

It's just a function that takes past experimental data as an input and produces the next

design as an output.

369

:

Does that make sense?

370

:

Yeah, yeah, yeah.

371

:

I can see what that means.

372

:

How do you integrate that though?

373

:

You know, like I'm really curious concretely.

374

:

Yeah.

375

:

what does integrating all those methods, so amortized Bayesian inference, variational

inference, into the Bayesian experimental design, and then you have the Bayesian model that

376

:

underlies all of that.

377

:

How do you do that concretely?

378

:

Yes, excellent.

379

:

When we say the model, I generally mean the underlying Bayesian model.

380

:

This is our model that we use to train our

381

:

let's say, variational amortized posterior.

382

:

And this is the same model that we're going to use to train our design policy network.

383

:

And I already said it's a design policy network, which means that we're going to be using,

again, deep learning.

384

:

We're going to be using neural networks to actually learn a very expressive function that

will be able to take our data as an input, produce the next design as an output.

385

:

Now, how we do that concretely?

386

:

There is, you know, by now a large number of architectures that we can pick that is

suitable for, you know, our concrete problem that we're considering.

387

:

So one very important aspect in everything that we do is that our policy, our neural

network should be able to take variable size data sets as an input.

388

:

Right?

389

:

Because every time we're calling our policy, every time we want a new design, we will be

feeding it with the data that we have gathered so far.

390

:

Right?

391

:

And so it is quite important to be able to condition on or take as an input variable

length sequences.

392

:

Right?

393

:

And so concretely, how can we do that?

394

:

Well, you

395

:

One kind of standard way of doing things is to basically take our experimental data that

we've gathered so far and embed each data point.

396

:

So we have an X for our design, a Y for our outcome.

397

:

Take this pair and embed it to a fixed dimensional representation, right, in some latent

space.

398

:

Let's say with a small neural network, right?

399

:

And we do that for each

400

:

individual design outcome pair, right?

401

:

So if we have n design outcome pairs, we're gonna end up with n fixed dimensional

representations after we have embedded all of this data.

402

:

Now, how can we then produce the next sort of optimal design for the next experiment

iteration?

403

:

There is many choices, and I think it will very much depend on the application.

404

:

So certain Bayesian models, certain underlying Bayesian models are what we call

exchangeable, right?

405

:

So the data conditional on the parameters can be, the data conditional on the parameters

is IID, right?

406

:

Which means that the order in which our data points arrive doesn't matter.

407

:

And again, the survey example.

408

:

is quite a good example of this precisely, right?

409

:

Like it doesn't really matter which question we ask first or second, you know, we can

interchange them and the outcomes will be unaffected.

410

:

This is very different to time series models where, you know, if we design, if we are

choosing the time points at which to take blood pressure, right, for example,

411

:

If we decide to take blood pressure at t equals five, we cannot then go back and take the

blood pressure at t equals two.

412

:

So the choice of architecture will very much depend on, as I said, the underlying problem.

413

:

And generally speaking, we have found it quite useful to explicitly embed the structure

that is known.

414

:

So if we know that our model is exchangeable, we should be using an appropriate

architecture, which will ensure that the order of our data doesn't matter.

415

:

If we have a time series, we can use an architecture that takes into account the order of

the data.

416

:

So for the first one, we have...

417

:

kind of standard architecture such as I don't know how familiar the audience would be, but

you know, in deep learning, there is an architecture called deep sets, right?

418

:

So basically, take our fixed dimensional representations and we simply add them together.

419

:

Very simple, right?

420

:

Okay, we have our n design-outcome pairs.

421

:

We add them together, they're all of them are in in the same fixed dimensional

representation, we add them together.

422

:

Now this is our

423

:

representation or a summary of the data that we have gathered so far.

424

:

We take that and we maybe map it through another small neural network to produce the next

design.

425

:

If we have a time series model, then we can, you know, pass everything through an LSTM or

some form of recurrent neural network to then produce the next design.

426

:

And that will keep sort of the order in

427

:

and it will take the order into account.
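
To make the two architectural choices concrete, here is a minimal, hypothetical PyTorch sketch of the exchangeable case: each past (design, outcome) pair is embedded by a small network, the embeddings are summed so the order cannot matter, and the pooled summary is mapped to the next design. For the time-series case one would replace the sum-pooling with a recurrent network such as an LSTM, as just described.

import torch
import torch.nn as nn

class DeepSetsDesignPolicy(nn.Module):
    # Permutation-invariant design policy: embed each (design, outcome) pair,
    # sum the embeddings, and map the pooled summary to the next design.
    def __init__(self, design_dim, outcome_dim, hidden=64):
        super().__init__()
        self.hidden = hidden
        self.encoder = nn.Sequential(
            nn.Linear(design_dim + outcome_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden),
        )
        self.emitter = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, design_dim),
        )

    def forward(self, designs, outcomes):
        # designs: (n_so_far, design_dim), outcomes: (n_so_far, outcome_dim)
        if designs.shape[0] == 0:
            pooled = torch.zeros(self.hidden)          # before any data: an "empty" summary
        else:
            pooled = self.encoder(torch.cat([designs, outcomes], dim=-1)).sum(dim=0)
        return self.emitter(pooled)                    # proposed next design

In policy-based BED the network would be trained by simulating whole experiments from the Bayesian model and maximising an information-based objective over the trajectory, so that at deployment time each next design is just one cheap forward pass.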

428

:

Did that answer your question in terms of like how specifically we think about these

policies?

429

:

Yeah.

430

:

Yeah, that's fascinating.

431

:

And so basically, and we talked about that a bit with Marvin already, but the choice of

neural network is very important depending on the type of data because if you have, many

432

:

time series are complicated, right?

433

:

Like they already are, even if you're not using a neural network, time is always

complicated to work with.

434

:

because there is an autocorrelation, right?

435

:

So you have to be very careful.

436

:

So basically that means changing the neural network you're working with.

437

:

then so concretely, like what, you know, for practitioners, someone who is listening to us

or watching us on YouTube, and they want to start implementing BED in their projects,

438

:

what's practical advice

439

:

you would have for them to get started?

440

:

Like how, why, and also when, you know, because there may be some moments, some cases

where you don't really want to use BED.

441

:

And also what kind of packages you're using to actually do that in your own work.

442

:

So that's a big question, I know, but like, again, repeat it as you give the answers.

443

:

Yeah, yeah, yeah.

444

:

Let me start with kind of...

445

:

If people are looking to implement BED in their projects, I think it is quite important

to sort of recognize where Bayesian experimental design is applicable, right?

446

:

So it can be applied whenever we can construct an appropriate model for our experiments,

right?

447

:

So the modeling part, like the underlying Bayesian model, is actually doing a lot of the heavy

lifting in sort of in this framework.

448

:

simply because this is basically what we're using to assess the quality of the designs, right?

449

:

So the model is informing what valuable information is.

450

:

And so I would definitely advise not to gloss over that part.

451

:

If your model is bad, if your model doesn't represent the data generating process, in

reality, the results might be quite poor.

452

:

Now, I think it's also good to mention that you don't need to know the exact probability

distribution of the outcomes of the experiment, right?

453

:

So you can, you know, as long as you can simulate, right?

454

:

So you can have a simulator-based model that simply samples outcomes of the experiment,

given the experiment.

455

:

which I think, you know, it simplifies things a little bit.

456

:

You know, you don't have to write down exact probability distributions, but still you need to

be able to sample or simulate this outcome.

457

:

So that would be step number one, right?

458

:

So ensuring that you have a decent model that you can start sort of experimenting with,

you know, in the sense of like...

459

:

designing the policies or like training the policies or sort of designing experiments.

460

:

The actual implementation aspect in terms of software: unfortunately, Bayesian

experimental design is not as well developed

461

:

from a software point of view as, for example, amortized Bayesian inference is, right?

462

:

So I'm sure that you spoke about the BayesFlow package with Marvin, which is a really

amazing sort of open source effort.

463

:

They have done a great job of implementing many of the kind of standard architectures that

you can basically, you know,

464

:

pick whatever works or like pick something that is relatively appropriate for your problem

and it will work out, right?

465

:

I think that is like a super powerful, super powerful framework that includes, you know, the

latest and greatest architectures in fact.

466

:

Unfortunately, we don't have anything like this for Bayesian experimental design yet, but

I am in touch with the BayesFlow guys and I'm definitely looking into

467

:

implementing some of these experimental design workflows in their package.

468

:

So I have it on my to-do list to actually write a little tutorial in BayesFlow, how you can

use BayesFlow and your favorite deep learning framework of choice, whether it's PyTorch or

469

:

JAX or like whatever, TensorFlow, to train sort of a...

470

:

a policy, a design policy along with all of the amortized posteriors and all the bells and

whistles that you may need to run, you know, some pipeline like that.

471

:

Right.

472

:

So I mentioned the modeling aspect, I mentioned the software aspect.

473

:

I think thinking about the problem.

474

:

in like other aspects of like, are you going to run an adaptive experiment or are you going to

run a static experiment?

475

:

Right.

476

:

So adaptive experiments are much more complicated than static experiments.

477

:

So in an adaptive experiment, you're always conditioning on the data that you have

gathered so far, right?

478

:

In a static experiment, you just design a large batch of experiments and then you run it

once, you collect your data and then you do your Bayesian analysis from there.

479

:

Right.

480

:

And so

481

:

I generally always recommend starting with the simpler case, figure out whether the

simpler case works, do the proof of concept on a static or non-adaptive type of Bayesian

482

:

experimental design.

483

:

And then, and only then, start to think about, let me train a policy or let me try to do

an adaptive experimental design.

484

:

pipeline.

485

:

I think this is a bit of a common pitfall if I may say, like people tend to like jump to

the more complicated thing before actually figuring out kind of the simple case.

486

:

Other than that, I think, again, I think it's a kind of an active area of research to, you

know, figure out ways to evaluate your designs.

487

:

I think by now we have

488

:

pretty good ways of evaluating the quality of our posteriors, for example, right?

489

:

You have various posterior diagnostic checks and so on; that doesn't really exist as much

for designs, right?

490

:

So what does it mean, like, you know, I've maximized my information objective, right?

491

:

I have collected as much information as I can, right?

492

:

According to this information objective.

493

:

But what does that mean in practice, right?

494

:

Like there is no kind of real world information that I can...

495

:

test with, right?

496

:

Like if you're doing predictions, you can predict, observe and then compare,

right?

497

:

And you can compute an accuracy score or a root mean squared error or like whatever makes

sense.

498

:

There doesn't really exist anything like this in design, right?

499

:

So it becomes much harder to quantify the success of such a pipeline.

500

:

And I think it's a super interesting

501

:

area for development.

502

:

It's part of the reason why I work in the field.

503

:

I think there is many open problems that if we figure out, I think we can advance the

field quite a lot and make data gathering an actual thing, principled and robust and

504

:

reliable so that you run your expensive pipeline, but you end up with

505

:

you kind of want to be sure that the data that you end up with is actually useful for the

purposes that you want to use it.

506

:

yeah, did that answer the question?

507

:

So you have the modeling aspect, you have the software aspect, which we are developing, I

think, you know, we will hopefully eventually get there.

508

:

Think about your problem, start simple.

509

:

Try to think about diagnostics.

510

:

And I think, again, I mentioned, you know, it's very much an open

511

:

problem, but maybe for your concrete problem at hand, you might be able to kind of

intuitively say, this looks good or this doesn't look good.

512

:

Automating this is like, it's a very interesting open problem and something that I'm

actively working on.

513

:

Yeah.

514

:

Yeah.

515

:

And thank you so much for all the work you're doing on that because I think it's super

important.

516

:

I'm really happy to see you on the BayesFlow side because yeah, those guys are doing

517

:

Amazing work.

518

:

There is the new version that's now been merged on the dev branch, which is backend

agnostic.

519

:

So people can use it with their preferred deep learning package.

520

:

So I always forget the names, but TensorFlow, PyTorch, and JAX, I'm guessing.

521

:

I'm mostly familiar with JAX because that's the one.

522

:

and a bit of PyTorch, because those are the ones we're interacting with in PyMC.

523

:

This is super cool.

524

:

I've linked to the BayesFlow documentation in the show notes.

525

:

Is there maybe, I don't know, a paper, a blog post, something like that you can link

people to with a workflow of Bayesian experimental design and that way people will get an

526

:

idea of how to do that.

527

:

So hopefully by the time the episode is out, I will have it ready.

528

:

Right now I don't have anything kind of practical.

529

:

I'm very happy to send some of the kind of review papers that are out there on Bayesian

Experimental Design.

530

:

hopefully in the next couple of weeks I'll have the tutorial, like a very basic

introductory tutorial.

531

:

you know, we have a simple model, we have our simple parameters, you know, what we want to

learn, here is how you define your posteriors, here is how we define your policy, you

532

:

know, and then you switch on BayesFlow, and then you know, voila, you

have your results.

533

:

So yeah, I'm hoping to get a blog post of this sort done in the next

couple of weeks.

534

:

So once ready, I will thank you.

535

:

I will thank you with that.

536

:

Yeah, for sure.

537

:

yeah, for sure.

538

:

Can't wait. And Marvin and I are gonna start working on setting up a modeling webinar,

amazing, which is, you know, another format I have on the show. So this is like, you know,

539

:

it's like, Marvin will come on and share his screen and show how to do the amortized

Bayesian inference workflow with BayesFlow, also using PyMC and all that cool stuff, now that

540

:

the new API

541

:

is merged, we're going to be able to work on that together and set up the modeling

webinar.

542

:

So listeners, definitely stay tuned for that.

543

:

I will, of course, announce the webinar a bit in advance so that you all have a chance to

sign up.

544

:

And then you can join live, ask questions to Marvin.

545

:

That's going to be super fun.

546

:

And mainly see how you would do amortized Bayesian inference, concretely.

547

:

Great.

548

:

Amazing.

549

:

Sounds fun.

550

:

Yeah, that's going to be super fun.

551

:

Something I was thinking about is that your work mentions enabling real-time design

decisions.

552

:

And that sounds really challenging to me.

553

:

So I'm wondering how critical is this capability in today's data-driven decision-making

processes?

554

:

Yeah.

555

:

I do think it really is quite critical, right?

556

:

In most kind of real world practical aspects, practical problems, you really do need to

be able to make decisions fairly quickly.

557

:

Right?

558

:

Again, all the surveys as an example, you know, you have anything that involves a human,

and you want to adapt as you're performing the experiment, you kind of need to ensure that

559

:

things are

560

:

you're able to run things in real time.

561

:

And honestly, I think part of the reason why we haven't seen a big explosion of Bayesian

experimental design in practice is partly because we couldn't until recently actually run

562

:

these things fast enough, both because of the computational challenges, now that we know

how to do amortized inference very well, now that we know how to train policies

563

:

that will produce designs very well.

564

:

I am expecting things to, you know, to improve, right?

565

:

And to start to see some of these some of these methods applied in practice.

566

:

Having said that, I do think and please stop me if that's totally unrelated, but to make

things

567

:

successful in practice, there are a few other things that in my opinion have to be

resolved before, you know, we're confident that we can, you know, apply such black

568

:

boxes in a sense, right, because we have all these neural networks all over the place.

569

:

And it's not entirely clear whether all of these things are robust to various aspects of

the complexities of the real world.

570

:

Right.

571

:

So

572

:

things like model mis-specification, right?

573

:

So is your Bayesian model actually a good representation of the thing that you're trying

to study?

574

:

That's a big open problem again.

575

:

And again, I'm going to make a parallel to Bayesian inference actually.

576

:

For inference purposes, model mis-specification may not be as bad as it is for design

purposes.

577

:

And the reason for that is you will still get valid under some of the assumptions, of

course, you will still get valid inferences or like you will still be close.

578

:

You still get the best that you can do under the assumption of a wrong

model.

579

:

Now, when it comes to design, we have absolutely no guarantees.

580

:

And oftentimes we end up in very pathological situations where because we're using our

model to inform the data collection.

581

:

and then to also evaluate, right, fit that model on the same data that we've gathered.

582

:

If your model is misspecified, you might not even be able to detect the misspecification

because of the way that the data was gathered.

583

:

It's not IID, right?

584

:

Like it's very much a non-IID data collection process.

585

:

And so I think when we talk about practical things,

586

:

we really, really need to start thinking about how are we going to make our systems or the

methods that we develop a little bit more robust to misspecification.

587

:

And I don't mean we should solve model misspecification.

588

:

I think that's a very hard task that is basically unsolvable, right?

589

:

Like it is solvable under assumptions, right?

590

:

If you tell me what your misspecification is, you know, we can improve things, but in

general, this is not

591

:

something that we can sort of realistically address uniformly.

592

:

But yeah, so again, going back to practicalities, I do think it's of crucial importance to

sort of make our pipelines and diagnostics sort of robust to some forms of

593

:

mis-specification.

594

:

Yeah.

595

:

Yeah, yeah, for sure.

596

:

And that's where also

597

:

I really love amortized Bayesian inference because it allows you to do simulation-based

calibration.

598

:

And I find that especially helpful and valuable when you're working on developing a model

because already before fitting to data, you already have more confidence about what your

599

:

model is actually able to do and not do and where the possible pain points would be.

600

:

And I find that.

601

:

super helpful.

602

:

And actually talking about all that, I'm wondering where you see the future of Bayesian

experimental design heading, particularly with advancements in AI and machine learning

603

:

technologies.

604

:

Wow.

605

:

Okay.

606

:

So I do view this

607

:

type of work.

608

:

So this type of research is a little bit orthogonal to all of the developments in sort of

modern AI and machine learning.

609

:

And the reason for this is that we can literally borrow the latest and greatest

development in machine learning and plug it into our pipelines.

610

:

If there is a better architecture to do X, right?

611

:

Like we can take that architecture and, you know, utilize it for our purposes.

612

:

So I think

613

:

you know, when it comes to the future of Bayesian experimental design, given, you know,

all of the advancements, I think this is great because it's kind of helping the field even

614

:

more, right?

615

:

Like we have more options to choose from, we have better models to choose from, and kind

of the data gathering aspect will always be there, right?

616

:

Like we will always want to collect better data for the purposes of, you know, our data

analysis.

617

:

And so, you know, the design aspect will still be there and with the better models, we'll

just be able to gather better data, if that makes sense.

618

:

Yeah, that definitely makes sense.

619

:

Yeah, for sure.

620

:

And that's interesting.

621

:

Yeah, I didn't anticipate that kind of answer, that's okay.

622

:

I definitely

623

:

see what you mean.

624

:

Maybe before like, yeah, sorry, but even if you think, you know, now everybody's

gonna have their AI assistant, right?

625

:

Now, wouldn't it be super frustrating if your AI assistant takes three months to figure

out what you like for breakfast?

626

:

And like, it's experimenting or like, it's just randomly guessing.

627

:

do you like fish soup for breakfast?

628

:

Like,

629

:

How about I prepare you a fish soup for breakfast or like, or I propose you something like

that, right?

630

:

And so I think again, like this personalization aspect, right?

631

:

Like again, kind of sticking to, I don't know, personal AI assistance, right?

632

:

The sooner or the quicker they are able to learn about your preferences, the better that

is.

633

:

And again, you know, we're learning about preferences.

634

:

Again, I'm gonna refer back to the...

635

:

you know, time value of money preference learning, like it is just a more complicated

version of that.

636

:

Right.

637

:

And so if your latest and greatest AI assistant is able to learn and customize itself to

your preferences much more quickly than otherwise, you know, that's a huge win.

638

:

Right.

639

:

And I think this is precisely where all these sort of principled data gathering techniques

can really shine.

640

:

Once we figure out, you know, the

641

:

the sort of the issues that I was talking about, I mean, that makes sense.

642

:

Maybe to play us out, I'm curious if you have something like applications in mind,

practical applications of bed that you've encountered in your research, particularly in

643

:

the fields of healthcare or technology that you found particularly impactful.

644

:

Right.

645

:

Excellent question.

646

:

And actually, I was going to mention that aspect as, you know, where you see the future of

Bayesian experimental design.

647

:

Part of kind of our blind spots, if I may refer to that as sort of blind spots, is that in

our research so far, we have very much focused on developing methods, developing

648

:

computational methods to sort of make some of these

649

:

Bayesian experimental design pipelines actually feasible to run in practice.

650

:

Now, we haven't really spent much time working with practitioners, and this is a gap that

we're actively trying to sort of close.

651

:

In that spirit, we have a few applications in mind.

652

:

We're sort of reaching out to people,

653

:

particularly in the context of healthcare, as you mentioned.

654

:

So clinical trials design is a very big one.

655

:

So again, things like getting to the highest safe dose as quickly as possible, while, again, being personalized to the human, given their context, given their various characteristics.

656

:

That is one area where we're looking to sort of start some collaborations and explore this further.
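
To make the dose-finding idea concrete, here is a schematic sketch of a model-based escalation rule in the style of the continual reassessment method: a one-parameter dose-toxicity curve, a grid prior, and each new patient assigned to the dose whose posterior-mean toxicity is closest to a target rate. The skeleton probabilities, target, and prior are assumed values for illustration; a real trial would add cohorts, no-skipping rules, and stopping criteria.

```python
import numpy as np

rng = np.random.default_rng(1)

# Schematic one-parameter dose-toxicity model (a CRM-style power model):
# p_tox(dose_i | theta) = skeleton_i ** exp(theta), with a grid prior over theta.
skeleton = np.array([0.05, 0.10, 0.20, 0.35, 0.50])  # assumed prior toxicity guesses
target = 0.25                                         # target toxicity rate (assumed)
theta_grid = np.linspace(-2.0, 2.0, 201)
prior = np.exp(-0.5 * theta_grid ** 2)                # roughly N(0, 1) prior on theta
prior /= prior.sum()

def tox_curve(theta):
    # Toxicity probability at every dose level, shape (len(theta), n_doses).
    return skeleton[None, :] ** np.exp(theta)[:, None]

post, true_theta = prior.copy(), 0.4                  # true_theta only drives the simulation

for patient in range(12):
    # Treat the next patient at the dose whose posterior-mean toxicity is closest to target.
    p_tox = post @ tox_curve(theta_grid)
    dose = int(np.argmin(np.abs(p_tox - target)))
    tox = rng.random() < skeleton[dose] ** np.exp(true_theta)
    lik = tox_curve(theta_grid)[:, dose]
    post = post * (lik if tox else 1.0 - lik)
    post /= post.sum()
    print(f"patient {patient}: dose level {dose}, toxicity={tox}, "
          f"estimated toxicity at that dose = {p_tox[dose]:.2f}")
```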

657

:

Our group in Oxford has a new PhD student joining who will be working in collaboration with biologists to actually design experiments for something about cells.

658

:

I don't know anything about biology, so.

659

:

I'm not the best person to actually describe that line of work.

660

:

But hopefully there will be some concrete, exciting applications in the near future.

661

:

So that's applications in biology.

662

:

And finally, you know, there's constantly lots of different bits and pieces like, you

know, people from chemistry saying, hey, I have this thing, can we...

663

:

Can we work on maybe, you know, performing or like setting up a Bayesian experimental design pipeline?

664

:

I think the problem that we've had so far or I've had so far is just lack of time.

665

:

There's just so many things to do and so little time.

666

:

But I am very much actively trying to find time in my calendar to actually work on a few

applied projects because I do think, you know,

667

:

It's all like developing all these methods is great, right?

668

:

I mean, it's very interesting math that you do.

669

:

It's very interesting coding that you do.

670

:

But at the end of the day, you kind of want these things to make someone's life better, right?

671

:

Like a practitioner that will be able to save some time or save some money or, you know, improve their data gathering and therefore make, you know, the downstream analysis much better

672

:

and more efficient thanks to some of this research.

673

:

So I hope that answered your question in terms of concrete applications.

674

:

I think we'll see more of that.

675

:

But so far, you know, the two things are, yeah, clinical trial design that we're exploring

and some of this biology cell stuff.

676

:

Yeah, yeah, no worries.

677

:

that's, I mean, definitely looking forward to it.

678

:

That sounds absolutely fascinating.

679

:

Yeah, if you can make that

680

:

happen in important fields like that, that's going to be extremely impactful.

681

:

Awesome, Desi.

682

:

I've already...

683

:

Do you have any sort of applications in mind that you think Bayesian experimental design might be suitable for?

684

:

I know you're quite experienced in various aspects of Bayesian modeling.

685

:

So, yeah, do you have anything in mind?

686

:

Yeah, I mean a lot.

687

:

So marketing, I know, is already using that a lot.

688

:

Yeah.

689

:

Clinical trials for sure.

690

:

Also, now that I work in sports analytics, well, definitely sports. You know, you could include that in the training of elite athletes and design some experiments to actually

691

:

test causal graphs and see if pulling that lever during training is actually something

that makes a difference during

692

:

the professional games that actually count.

693

:

yeah, I can definitely see that having a big impact in the sports realm, for sure.

694

:

Nice.

695

:

Well, if you're open to collaborations, we can do some designing of experiments once you set up your sports models.

696

:

Yeah.

697

:

Yeah.

698

:

Yeah.

699

:

I mean, as soon as I work on that, I'll make sure to reach out because that's going to be...

700

:

It's definitely something I want to work on and dive into.

701

:

So that's gonna be fascinating to work on that with you for sure.

702

:

Sounds fun, yeah.

703

:

Yeah, exactly.

704

:

Very, very exciting.

705

:

Well, thanks, Desi.

706

:

That was amazing.

707

:

I think we covered a lot of ground.

708

:

I'm really happy because I had a lot of questions for you.

709

:

But thanks a lot for keeping your answers very...

710

:

focused and not getting distracted by all my digressions.

711

:

Of course, I have to ask you the last two questions I ask every guest at the end of the

show.

712

:

So first one, if you had unlimited time and resources, which problem would you try to

solve?

713

:

It's a really hard one.

714

:

Honestly, because I know you ask those questions,

715

:

I was like, what am I going to say?

716

:

I honestly don't know.

717

:

There's so many things.

718

:

But

719

:

Again, I think it would be something of high impact for humanity in general. Probably something in climate change would be what I would dedicate my unlimited time and

720

:

resources to.

721

:

That's a good answer.

722

:

That's definitely a popular one.

723

:

So you're in great company and I'm sure the team already working on that is going to be

very...

724

:

happy to welcome you.

725

:

No, let's hope.

726

:

I think we need to speed up the solutions.

727

:

Like seeing what is happening, right?

728

:

I think it's rather unfortunate.

729

:

And second question, if you could have dinner with any great scientific mind, dead, alive

or fictional, who would it be?

730

:

Yeah.

731

:

So I think it will be Claude Shannon.

732

:

So, you know,

733

:

the godfather of information theory or like actually the father of information theory.

734

:

Again, partly because a lot of my research is inspired by information theory principles in

Bayesian experimental design, but also outside of Bayesian experimental designs.

735

:

It sort of underpins a lot of the sort of modern machine learning development, right?

736

:

And

737

:

What I think will be really quite cool is that if you were to have dinner with him, if I

were to have dinner with him and basically tell him like, hey, look at all these language

738

:

models that we have today.

739

:

Like Claude Shannon was the person that invented language models back in 1948, right?

740

:

So that's many years ago.

741

:

And like, literally not even having computers, right?

742

:

So he would, he calculated things by hand and produced output that actually looks like

English, right?

743

:

In 1948.
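
For readers who have not seen it, the "language models by hand" remark refers to Shannon's n-gram approximations of English in the 1948 paper, where he tabulated letter and word transition frequencies manually and sampled from them. A toy sketch of the word-bigram version, using a short placeholder corpus (any larger body of English text works better):

```python
import random
from collections import defaultdict

# Toy word-level second-order approximation in the spirit of the 1948 paper:
# count which word follows which in a corpus, then sample word by word.
corpus = (
    "the quick experiments in information theory gave the engineers a new way to "
    "think about the structure of english and the structure of messages in general"
).split()  # placeholder text; swap in any English corpus

followers = defaultdict(list)
for w1, w2 in zip(corpus, corpus[1:]):
    followers[w1].append(w2)

def generate(start="the", n_words=15, seed=0):
    random.seed(seed)
    words = [start]
    for _ in range(n_words - 1):
        nxt = followers.get(words[-1]) or corpus   # dead end: restart anywhere
        words.append(random.choice(nxt))
    return " ".join(words)

print(generate())
```

Scaling the same idea from hand-tabulated bigram tables to neural networks trained on web-scale text is, loosely, the path from 1948 to today's language models.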

744

:

And so I think, you know, a brilliant mind like him, you know, just seeing kind of the

progress that we've made since then.

745

:

And like, we actually have language models on computers that behave like humans.

746

:

I'd be super keen to hear, like, what is next from him.

747

:

And I think he will have some very interesting answers to that.

748

:

What is the future for information processing and the path to artificial, I guess people

call it artificial general intelligence.

749

:

So what would be the path to AGI from here onwards?

750

:

Yeah, for sure.

751

:

That'd be a fascinating dinner.

752

:

Make sure it comes like that.

753

:

Awesome.

754

:

Well.

755

:

Desi, thank you so much.

756

:

I think we can call it a show.

757

:

I learned so much.

758

:

I'm sure my listeners did too, because, as you showed, this is a topic that's very much on the frontier of science.

759

:

So thank you so much for all the work you're doing on that.

760

:

And as usual, I put resources and a link to your website in the show notes for those who

want to dig deeper.

761

:

Thank you again, Desi, for taking the time and being on this show.

762

:

Thank you so much for having me.

763

:

It was my pleasure.

764

:

This has been another episode of Learning Bayesian Statistics.

765

:

Be sure to rate, review, and follow the show on your favorite podcatcher, and visit

learnbayesstats.com for more resources about today's topics, as well as access to more

766

:

episodes to help you reach true Bayesian state of mind.

767

:

That's learnbayesstats.com.

768

:

Our theme music is Good Bayesian by Baba Brinkman, featuring MC Lars and Mega Ran.

769

:

Check out his awesome work at bababrinkman.com.

770

:

I'm your host.

771

:

Alex Andorra.

772

:

You can follow me on Twitter at alex_andorra, like the country.

773

:

You can support the show and unlock exclusive benefits by visiting patreon.com/LearnBayesStats.

774

:

Thank you so much for listening and for your support.

775

:

You're truly a good Bayesian.

776

:

Change your predictions after taking information in.

777

:

And if you're thinking I'll be less than amazing.

778

:

Let's adjust those expectations.

779

:

Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making, let's get them on a solid foundation.
