#100 Reactive Message Passing & Automated Inference in Julia, with Dmitry Bagaev

Episode 100 •
21st February 2024 • Learning Bayesian Statistics • Alexandre Andorra

*Proudly sponsored by **PyMC Labs**, the Bayesian Consultancy. **Book a call**, or **get in touch**!*

In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference.

Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models.

Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference.

Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference.

Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University.

Beyond that, Dmitry is also a drummer (you’ll see his cool drums if you’re watching on YouTube), and an adept of extreme sports, like skydiving, wakeboarding and skiing!

*Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at **https://bababrinkman.com/** !*

**Thank you to my Patrons for making this episode possible!**

*Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser and Julio*.

Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)

**Takeaways:**

- Reactive message passing is a powerful approach to Bayesian inference that allows for real-time updates and adaptivity in probabilistic models.

- RxInfer.jl is a toolbox for reactive message passing in Bayesian inference, designed to be scalable, efficient, and adaptable.

- Julia is a preferred language for RxInfer.jl due to its speed, macros, and multiple dispatch, which enable efficient and flexible implementation.

- Variational inference plays a crucial role in RxInfer.jl, allowing for trade-offs between computational complexity and accuracy in Bayesian inference.

- Lazy Dynamics is a startup focused on commercializing research in Bayesian inference, with the goal of making RxInfer.jl accessible and robust for industry applications.

**Links from the show:**

- LBS Physics & Astrophysics playlist: https://learnbayesstats.com/physics-astrophysics/
- LBS #51, Bernoulli’s Fallacy & the Crisis of Modern Science, with Aubrey Clayton: https://learnbayesstats.com/episode/51-bernoullis-fallacy-crisis-modern-science-aubrey-clayton/
- Dmitry on GitHub: https://github.com/bvdmitri
- Dmitry on LinkedIn: https://www.linkedin.com/in/bvdmitri/
- RxInfer.jl, Automatic Bayesian Inference through Reactive Message Passing: https://rxinfer.ml/
- Reactive Bayes, Open source software for reactive, efficient and scalable Bayesian inference: https://github.com/ReactiveBayes
- LazyDynamics, Reactive Bayesian AI: https://lazydynamics.com/
- BIASlab, Natural Artificial Intelligence: https://biaslab.github.io/
- Dmitry's PhD dissertation: https://research.tue.nl/en/publications/reactive-probabilistic-programming-for-scalable-bayesian-inferenc
- *Effortless Mastery*, by Kenny Werner: https://www.amazon.com/Effortless-Mastery-Liberating-Master-Musician/dp/156224003X
- The Book of Why, by Judea Pearl: https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
- Bernoulli’s Fallacy, by Aubrey Clayton: https://www.amazon.com/Bernoullis-Fallacy-Statistical-Illogic-Science/dp/0231199945
- Software Engineering for Science: https://www.amazon.com/Software-Engineering-Science-Chapman-Computational/dp/1498743854

**Transcript**

*This is an automatic transcript and may therefore contain errors. Please **get in touch** if you're willing to correct them.*

Speaker:

In this episode, Dmitry Bagaev discusses

his work in Bayesian statistics and the

2

:development of RxInfer.jl, a reactive

message passing toolbox for Bayesian

3

:inference.

4

:Dmitry explains the concept of reactive

message passing and its applications in

5

:real-time signal processing and autonomous

systems.

6

:He discusses the challenges and benefits

of using RxInfer.jl, including

7

:its scalability and efficiency in large

probabilistic models.

8

:Dmitry also shares insights into the

trade-offs involved in Bayesian inference

9

:architecture and the role of variational

inference in RxInfer.jl.

10

:Additionally, he discusses his startup

Lazy Dynamics and its goal of

11

:commercializing research in Bayesian

inference.

12

:Finally, we also discuss the

user-friendliness and trade-offs of different

13

:inference methods, the future developments

of RxInfer,

14

:and the future of automated Bayesian

inference.

15

:Coming from a very small town in Russia

called Nizhnekamsk, Dmitry currently

16

:lives in the Netherlands, where he did his

PhD.

17

:Before that, he graduated from the

computational science and modeling

18

:department of Moscow State University.

19

:Beyond that, Dmitry is also a drummer,

you'll see his cool drums if you're

20

:watching on YouTube, and an adept of

extreme sports like skydiving,

21

:wakeboarding, and skiing.

22

:Learning Bayesian Statistics, episode 100.

23

:Dmitry Bagaev, welcome to Learning Bayesian

Statistics.

24

:Thanks.

25

:Thanks for inviting me to your great

podcast.

26

:Really, I feel very honored.

27

:Yeah, thanks a lot.

28

:The honor is mine.

29

:That's really great to have you on the

show.

30

:So many questions for you and yeah, we're

also gonna be able to talk again about

31

:Julia, so that's super cool.

32

:And I wanna thank of course Albert

Podusenko for putting us in contact.

33

:Thanks a lot Albert, it was a great idea.

34

:I hope you will love the episode.

35

:Well I'm sure you're gonna love Dmitry's

part, and mine is always...

36

:more in the air, right?

37

:And well, Dmitry, thanks again, because I

know you're a bit sick.

38

:So I appreciate it even more.

39

:And so let's start by basically defining

what you're doing nowadays, and also how

40

:did you end up doing what you're doing

basically?

41

:Yes.

42

:So I'm currently working at the Eindhoven University

of Technology, in BIASlab.

43

:And I just recently finished my PhD in

Bayesian statistics, essentially.

44

:So now I just, like, supervise students.

45

:I did some of the projects there, and BIASlab

itself is a group in the university

46

:that primarily works on, like, real-time

Bayesian signal processing.

47

:And we do research in that field.

48

:And the slogan, let's say of the lab is

sort of like, is natural artificial

49

:intelligence and it's phrased.

50

:Uh, like specifically like that, because

there's, there cannot be natural

51

:artificial intelligence.

52

:So it's like a play on words, let's say.

53

:Um, and the, the lab is basically trying

to like develop automated, um, control

54

:systems or like novel signal processing

applications.

55

:And it's basically inspired by, uh,

neuroscience.

56

:I know.

57

:And we also opened a startup with my

colleagues.

58

:which is called Lazy Dynamics.

59

:And the idea is basically to commercialize

the research in the lab, but also to find

60

:the new funding for new PhD students for

the university.

61

:But we're still quite young, so we are

still like less than one year, and we are

62

:currently like in search of clients and

potential investors.

63

:But yeah, my main focus still remains

being a postdoc in the university.

64

:Yeah, fascinating.

65

:So many things already.

66

:Um, maybe what do you do in your postdoc?

67

:Um, so my main focus, like primary is, uh,

supporting, uh, the toolbox that we wrote,

68

:uh, in our lab, of which I am the primary author.

69

:We call this toolbox, uh, RxInfer.

70

:Uh, and this is, like, an essential part of my

PhD project.

71

:Um, and basically I love to code.

72

:So, um, more or less like, uh,

73

:my scientific career was always aligned

with software development.

74

:And the RxInfer project was a really

big project and many other projects in

75

:BiasLab, they depend on it.

76

:And it requires maintenance, like bug

fixing, adding new features, performance

77

:improvements.

78

:And we currently have several

subprojects that we develop alongside

79

:RxInfer.

80

:And that's just like the main focus for

me.

81

:And as something else, I also supervise

students for this project.

82

:Yeah, yeah.

83

:Of course.

84

:That must also take quite some time,

right?

85

:Yes, exactly.

86

:Yeah.

87

:Yeah, super cool.

88

:So let me start basically by diving a bit

more into the concepts you've just named,

89

:because you've already talked about a lot

of the things you work on, which is.

90

:a lot, as I guess listeners can hear.

91

:So first, let's try and explain the

concept of reactive message passing in the

92

:context of Bayesian inference for

listeners who may not be familiar with it,

93

:because I believe it's the first time we

really talk about that on the show.

94

:So yeah, talk to us about that.

95

:Also, because from what I understand, it's

really the main focus of your work, be it

96

:through RxInfer.jl,

97

:Lazy Dynamics, or BIASlab.

98

:So let's start by having the landscape

here about reactive message passing.

99

:Yes, good.

100

:So yeah, RxInfer is what we call a

reactive message-passing-based Bayesian

101

:inference toolbox.

102

:And basically in the context of Bayesian

inference, we usually work with

103

:probabilistic models.

104

:And the probabilistic model is usually a

function of some variables and some

105

:variables are being observed.

106

:And we want to infer some probability

distribution over unobserved variables.

107

:And what is interesting about that is that

if we have a probabilistic model, we can

108

:actually represent it as a graph.

109

:And for example, if we can factorize our

probabilistic model into a set of factors,

110

:such that each node will be a factor and

each edge will be a variable of the model,

111

:more like hidden state, and some of them

are observed or not.

112

:And basically message passing by itself is

a very interesting idea of solving Bayes

113

:rule for a probabilistic model defined in

terms of the graph.

114

:So it does it by sending messages between

nodes in the graph, along edges.
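The idea of computing a posterior by passing messages along the edges of a factor graph can be sketched on a toy example. The snippet below is a minimal illustration in plain Python, not RxInfer's API: the three-variable chain, the factor tables, and all function names are hypothetical. It computes the posterior of the middle variable once by multiplying the two messages that arrive on its edge, and once by summing over the full joint, so the two results can be compared.

```python
import itertools

# Hypothetical chain x1 - x2 - x3 of binary variables (all numbers are made up).
prior_x1 = [0.6, 0.4]              # factor f_a(x1)
pairwise = [[0.7, 0.3],            # factor f_b(x1, x2), reused as f_c(x2, x3)
            [0.2, 0.8]]
likelihood_x3 = [0.9, 0.1]         # factor f_d(x3) = p(y_observed | x3)


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def message_forward(incoming, factor):
    """Message leaving a pairwise factor toward its second variable."""
    return [sum(incoming[i] * factor[i][j] for i in range(2)) for j in range(2)]


def message_backward(incoming, factor):
    """Message leaving a pairwise factor toward its first variable."""
    return [sum(factor[i][j] * incoming[j] for j in range(2)) for i in range(2)]


def posterior_x2_by_messages():
    from_left = message_forward(prior_x1, pairwise)
    from_right = message_backward(likelihood_x3, pairwise)
    # The posterior on an edge is the normalized product of the two messages on it.
    return normalize([l * r for l, r in zip(from_left, from_right)])


def posterior_x2_by_enumeration():
    # Direct application of Bayes rule: sum the joint over every configuration.
    totals = [0.0, 0.0]
    for x1, x2, x3 in itertools.product(range(2), repeat=3):
        totals[x2] += (prior_x1[x1] * pairwise[x1][x2]
                       * pairwise[x2][x3] * likelihood_x3[x3])
    return normalize(totals)


if __name__ == "__main__":
    print(posterior_x2_by_messages())
    print(posterior_x2_by_enumeration())
```

Both functions return the same distribution; the message-based version only ever touches one factor at a time, which is what makes the approach attractive for larger graphs.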

115

:And it's quite a very big topic actually.

116

:But essentially here to understand is that

we can do that, right?

117

:So we can reframe Bayes rule as

something that has these messages in the

118

:graph. Uh, reactive message passing, uh,

is a particular implementation, uh, of

119

:this idea.

120

:So, because in the traditional message

passing, we usually have to define an

121

:order of messages, like how, in what order

do we compute them?

122

:It may be very crucial, for example, if

the graph structure has loops.

123

:So there is like some structural

dependencies in the graph and reactive

124

:message passing basically says, okay, no,

we will not do that.

125

:We will not specify any order.

126

:Instead we will react to data.

127

:So, and, uh, the, the order of message

computations, uh, becomes essentially data

128

:driven and we do not enforce any

particular, uh,

129

:order of computation.

130

:OK, so if I try to summarize, that would

be something like, usually when you work

131

:on a Bayesian model, you have to specify

the graph and the order of the graph in

132

:which direction the nodes are going.

133

:In reactive message passing, it's more

like a non-parametric version in a way

134

:where you just say, there are these stuff,

but you're not specifying the.

135

:the directions and you're just trying to

infer that through the data.

136

:How wrong is that characterization?

137

:Not exactly like that.

138

:So indeed the graph that we work with,

they don't have any direction in them,

139

:right?

140

:Because messages, they can flow in any

direction.

141

:The main difference here is that reactive

message passing reacts to changes in data

142

:and updates posteriors automatically.

143

:Right?

144

:So.

145

:There is no particular order in which we

update the posteriors.

146

:For example, if we have some variables in

our model, like A, B, C, we don't know which

147

:will be updated first and which will be

the last.

148

:It basically depends on our observations.

149

:Uh, but, uh, it works like that, that as

soon as we have new observation, uh, the

150

:graph reacts to this observation and

updates the posteriors as soon as it can.

151

:without explicitly specifying this order.
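The data-driven ordering described here can be sketched with a single hidden variable. This is a simplified illustration under stated assumptions, not how RxInfer is implemented: the class `ReactiveGaussianPosterior`, the sensor names, and the numbers are all hypothetical, and the model is just a scalar state with a Gaussian prior and Gaussian sensor noise. Each observation triggers an immediate posterior update and notifies subscribers, and the final posterior does not depend on the order in which the sensors happened to report.

```python
class ReactiveGaussianPosterior:
    """Posterior over a scalar state with a Gaussian prior and Gaussian sensor noise."""

    def __init__(self, prior_mean, prior_var):
        # Natural parameters simply add up when a new observation arrives.
        self.precision = 1.0 / prior_var
        self.weighted_mean = prior_mean / prior_var
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    @property
    def mean(self):
        return self.weighted_mean / self.precision

    @property
    def var(self):
        return 1.0 / self.precision

    def on_observation(self, y, noise_var):
        # React immediately: update the posterior, then notify listeners.
        self.precision += 1.0 / noise_var
        self.weighted_mean += y / noise_var
        for callback in self.subscribers:
            callback(self.mean, self.var)


def run(stream, prior_mean=0.0, prior_var=10.0):
    posterior = ReactiveGaussianPosterior(prior_mean, prior_var)
    history = []
    posterior.subscribe(lambda mean, var: history.append((mean, var)))
    for _sensor, y, noise_var in stream:
        posterior.on_observation(y, noise_var)
    return posterior.mean, posterior.var, history


# Two hypothetical arrival orders of the same three measurements.
VIDEO_FIRST = [("video", 1.2, 0.5), ("audio", 0.8, 2.0), ("video", 1.1, 0.5)]
AUDIO_FIRST = [("audio", 0.8, 2.0), ("video", 1.1, 0.5), ("video", 1.2, 0.5)]

if __name__ == "__main__":
    print(run(VIDEO_FIRST)[:2])
    print(run(AUDIO_FIRST)[:2])
```

The subscriber list stands in for whatever needs the posterior next (for example a controller), so it can act on every intermediate posterior rather than waiting for a synchronized batch.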

152

:And why would you do that?

153

:Why would that be useful?

154

:So it's a very good question.

155

:So because in BiasLab, we essentially work

with, we try to work with autonomous

156

:systems.

157

:And autonomous systems, they have to work

in the field, right?

158

:So like in the real world environment,

let's say, right?

159

:And

160

:Real world environment is extremely

unpredictable.

161

:If we want to, to be more clear, let's say

we try to develop a drone, which tries to

162

:navigate the environment and it has like

several sensors and we want to build a

163

:probabilistic model of the environment,

such that the drone can act in this

164

:environment and like in sensors, it has

some noise in it.

165

:Like, uh, so essentially.

166

:We cannot predict in what order the data

will be arriving, right?

167

:Because you may have a video signal, you

may have an audio signal and this, um,

168

:devices that record video, let's say they

also have unpredictable update rate.

169

:Usually it's maybe like 60 frames per

second, but it may change.

170

:Right.

171

:Um, so instead of like fixing the

algorithm and saying, okay, we wait for

172

:like new frame.

173

:from a video, wait for a new frame from an

audio, then we update, then we wait again.

174

:Instead of doing that, we just simply let

the system react to new changes and update

175

:the posteriors as soon as possible.

176

:And then based on new posteriors, we act

as soon as possible.

177

:This is kind of the main idea of reactive

implementations.

178

:And in traditional software,

179

:for Bayesian inference, for example, we

just have a model, and we have a data set,

180

:and we feed the data set to the model, and

we have the posterior, and then we analyze

181

:the posterior, and it also works really

great, right?

182

:But it doesn't really work in the field

where you don't have time to synchronize

183

:your data set and to react as soon as you

can.

184

:Okay, okay, I see.

185

:So that's where, basically,

186

:This kind of reactive message passing is

extremely useful when you receive data in

187

:real time that you don't really know the

structure of.

188

:Yes, we work primarily with real-time

signals.

189

:Yes.

190

:Okay, very interesting.

191

:Actually, do you have any examples, any

real-life examples that you've worked on

192

:or...

193

:You know, this is extremely useful to work

on with RxInfer.jl or just in general,

194

:this kind of reactive message passing.

195

:Yes.

196

:So I myself, I usually do not work with

applications.

197

:So my primary focus lies in the actual

Bayesian inference engine.

198

:But in our lab, there are people who work,

for example, on audio signals.

199

:Right.

200

:So you want to you want, for example,

maybe create a probabilistic model of

201

:environment to be able to denoise speech

or it or it may be like a position

202

:tracking system or a planning system in

real time.

203

:In our lab, we also very often refer to

the term active inference.

204

:which basically defines a probabilistic

model, not only of your environment, but

205

:also of your actions, such that you can

infer the most optimal course of actions.

206

:And this might be useful in control

applications, also for the drone, right?

207

:So we want to infer not only the position

of the drone based on sensors that we

208

:have, but also how it should act to avoid

an obstacle, for example.

209

:I see.

210

:Yeah, OK, super interesting.

211

:So basically, any case where you have

really high uncertainty, right, that kind

212

:of stuff, OK, yes, super interesting.

213

:And so what prompted you to create a tool

for that?

214

:What inspired you to develop

RxInfer.jl?

215

:And maybe also tell us how it differs from

traditional Bayesian inference tools.

216

:be it in Python or in R or even in Julia.

217

:If I'm a Julia user, I'm used to use

probabilistic programming language in

218

:Julia, then what's the difference with

RxInfer?

219

:This is a good question.

220

:But there are two questions in one about

inspiration.

221

:So I joined BIASlab in 2019,

222

:without really understanding what it is

going to be about.

223

:So, without really understanding how difficult

it is really.

224

:So, and the inspiration for me came from

the project that I started my PhD on.

225

:And basically the main inspiration in our

lab is like the so-called the free energy

226

:principle, which kind of tries to explain.

227

:how natural biotic systems behave.

228

:Right.

229

:So, and they basically say they define

so-called Bayesian brain hypothesis and

230

:free energy principle.

231

:So they basically say that any biotic

system defines a probabilistic model

232

:of its environment and tries to infer the

most optimal course of action to survive

233

:essentially.

234

:But all of this is based on Bayesian

inference as well.

235

:So, right.

236

:At the end.

237

:It kind of, it's a very good idea, but at

the end, it all boils down to the, to the

238

:Bayesian inference.

239

:And basically if you look at how biotic

systems work, we note that there are

240

:very specific properties of these biotic

systems.

241

:So they do not consume a lot of power.

242

:Right.

243

:It's actually, it has been proven that our

brain consumes like about 20 Watts of

244

:energy, right.

245

:And it's like an

246

:extremely efficient device, if we can say,

right?

247

:It does not even compare with

supercomputers.

248

:It's also scalable because we live in the

very complex environment with many

249

:variables.

250

:We act in real time, right?

251

:And we are able to adapt to the

environment.

252

:And we are also kind of robust to what is

happening around us, right?

253

:So...

254

:If something new happens, we are able to

adapt to it instead of just failing.

255

:Right.

256

:And this is kind of the idea.

257

:So the inspiration for this Bayesian

inference toolbox is that it needs to be

258

:scalable, real time, adaptive, robust,

super efficient, and also low power.

259

:Right.

260

:So these are the main ideas behind the

RxInfer project.

261

:And here we go to the second part of the

question.

262

:How does it differ?

263

:Because this is exactly where we differ,

right?

264

:So other solutions in Python or in Julia,

also very cool.

265

:There are actually a lot of cool libraries

for Bayesian inference, but most of them,

266

they have a different set of trade-offs or

requirements.

267

:And maybe I will be super clear.

268

:We are not trying to be better.

269

:But we are trying to have a different set

of requirements for the Bayesian inference

270

:system.

271

:Yeah.

272

:Yeah, you're working on a different set of

needs, in a way.

273

:Yes, yes.

274

:And it's application-driven.

275

:Yeah, you're trying to address another

type of applications.

276

:Exactly.

277

:And if we directly compare to other

solutions, they are mostly based on

278

:sampling, like HMC or NUTS.

279

:Or maybe they are like black box methods

like ADVI, automatic differentiation

280

:variational inference, or VDI.

281

:And they basically, they are great methods

but they tend to consume a lot of

282

:computational power or like energy, right?

283

:So they do a very expensive simulation.

284

:It may run for maybe hours, maybe even

days in some situations.

285

:And they were great, but you cannot really

apply them in these autonomous systems where

286

:you need to...

287

:Uh, like if we're again talking about

audio, it's like 44 kilohertz.

288

:So we need to really perform Bayesian

inference at an extremely fast scale.

289

:And HMC or NUTS, uh, are not

really applicable in this situation.

290

:So.

291

:Yeah, fascinating.

292

:And you were talking, well, we'll get back

to the computation part a bit later.

293

:Maybe first I'd like to ask you, why did

you do it with Julia?

294

:Why did you choose Julia for RxInfer?

295

:And what advantages does it offer for your

applications of Bayesian inference?

296

:The particular choice of Julia was

actually driven by the needs of

297

:BIASlab in the university because all the

research which we do in the university now

298

:in our lab is done in Julia and that

decision has been made by our professor

299

:many, many years ago.

300

:Interestingly enough, our professor

doesn't really code.

301

:But Julia is a really great language.

302

:So if I had to choose myself,

303

:I would still choose Julia.

304

:It's, it's, it's a great language.

305

:It's fast.

306

:Right.

307

:So, and our primary concern is efficiency.

308

:Um, and like Python can also be fast.

309

:Uh, if you like know how to use it, if you

use NumPy or like some specialized

310

:libraries, uh, but with Julia, it's, it's

really easy.

311

:It is easier.

312

:In some situations, of course, you need to

know a bit more.

313

:So my background is in C and C++.

314

:And I understand, like, how compilers work,

for example.

315

:So maybe for me, it's a bit easier to

write performant Julia code.

316

:But in general, it's just, it's just

really, it's a nice, fast language.

317

:And it also develops fast in the sense

that new versions of Julia, they,

318

:come up like every several months.

319

:And it really gets better with each

release.

320

:Another thing which is actually very

important for us as well is macros.

321

:Macros in Julia.

322

:So for people who are listening, so macros

basically allow us to apply arbitrary code

323

:transformations to the existing code.

324

:And it also allows you to create

sublanguage within a language.

325

:And why it is particularly useful for us

is that specifying probabilistic models in

326

:Bayesian inference is a bit hard or

tedious.

327

:We don't want to directly specify these

huge graphs.

328

:And instead, what we did and what Turing

also did and many other libraries in

329

:Julia, they came up with a domain-specific

language for specifying probabilistic

330

:programs.

331

:And it's extremely cool.

332

:So it's much, much simpler to define a

probabilistic program in Julia than in

333

:Python, in my opinion.

334

:And I really like this feature of Julia.

335

:Yeah, these basically building block

aspect of the Julia language.

336

:Yeah, yeah, I've heard that.

337

:There are other aspects I can mention of

Julia.

338

:By the way, maybe I also can make an

announcement regarding Julia is that the

339

:next JuliaCon is happening in Eindhoven,

the city where I'm currently in.

340

:And it's going to be very cool.

341

:It's going to be in the PSV stadium, in the

football stadium.

342

:Right.

343

:A technical conference

about a programming language is going to be

344

:in a stadium.

345

:So, but so another aspect.

346

:about Julia is this notorious dynamic

multiple dispatch.

347

:And it was extremely useful for us in

particular for reactive message passing

348

:implementation.

349

:Because again, so if we think about how

this reactiveness works and how we

350

:compute these messages on the graph, in

order to compute the message, we wait for

351

:inputs.

352

:And then when all inputs have arrived, we

have to decide

353

:how to compute the message.

354

:And computation of the message is

essentially solving an integral.

355

:But if we know types of the arguments, and

if we know the type of the node, it might

356

:be that there is an analytical solution to

the message.

357

:So it's not really necessary to solve a

complex integral.

358

:And we do it by multiple dispatch in

Julia.

359

:So multiple dispatch in Julia helps us to

pick the most efficient message update

360

:rule.

361

:on the graph, and it's basically built

into the language.

362

:It's also possible to emulate it in

Python, but in Julia, it's just fast and

363

:built-in, and it works super nice.
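The rule-selection idea, picking an analytical update when the node and message types allow it and falling back to a generic computation otherwise, can be emulated in Python with a lookup table keyed on argument types. This is only a sketch of the concept: the message classes, the `rule` decorator, and the sampling fallback are hypothetical and are not RxInfer's actual rules. The node here is a simple sum `z = x + y`, and the function returns the outgoing message toward `z`.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Gaussian:
    mean: float
    var: float

    def sample(self, rng):
        return rng.gauss(self.mean, math.sqrt(self.var))


@dataclass
class PointMass:
    value: float

    def sample(self, rng):
        return self.value


@dataclass
class Uniform:
    low: float
    high: float

    def sample(self, rng):
        return rng.uniform(self.low, self.high)


RULES = {}  # (type of x message, type of y message) -> update rule


def rule(x_type, y_type):
    """Register an analytical update rule for a pair of message types."""
    def register(fn):
        RULES[(x_type, y_type)] = fn
        return fn
    return register


@rule(Gaussian, Gaussian)
def sum_gaussian_gaussian(x, y):
    # Closed form: the sum of independent Gaussians is Gaussian.
    return Gaussian(x.mean + y.mean, x.var + y.var)


@rule(Gaussian, PointMass)
def sum_gaussian_pointmass(x, y):
    # Closed form: adding a constant only shifts the mean.
    return Gaussian(x.mean + y.value, x.var)


def sum_generic(x, y, n_samples=20000, seed=0):
    # Fallback without a closed form: sample, then moment-match a Gaussian.
    rng = random.Random(seed)
    draws = [x.sample(rng) + y.sample(rng) for _ in range(n_samples)]
    mean = sum(draws) / n_samples
    var = sum((d - mean) ** 2 for d in draws) / n_samples
    return Gaussian(mean, var)


def sum_node_message(x, y):
    """Dispatch on the types of both incoming messages."""
    update = RULES.get((type(x), type(y)), sum_generic)
    return update(x, y)


if __name__ == "__main__":
    print(sum_node_message(Gaussian(1.0, 2.0), Gaussian(3.0, 4.0)))
    print(sum_node_message(Uniform(0.0, 1.0), Gaussian(0.0, 1.0)))
```

In Julia the dictionary lookup is unnecessary because method selection on all argument types is part of the language; the table above is only the emulation the conversation alludes to.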

364

:No idea.

365

:Yeah, super cool.

366

:Yeah, for sure.

367

:Super interesting points.

368

:And I'm very happy because it's been a

long time since we've had a show with some

369

:Julia practitioners, so that's always very

interesting to hear of what's going on in

370

:that.

371

:in that field and yeah, I would be

convinced just by coming to PSV Eindhoven

372

:Stadium.

373

:You don't have to tell me more.

374

:I'll be there.

375

:Let's do a live show in the stadium.

376

:Yes, I will be there.

377

:Yeah.

378

:Yeah, that sounds like a lot of fun.

379

:And actually, so I'm myself an open source

developer, so I'm very biased to ask you

380

:that question.

381

:What were some of the biggest challenges

you faced when you developed RxInfer?

382

:And how did you overcome them?

383

:I guess that's like the main thing you do

when you're an open source developer is

384

:putting out fires.

385

:This is an amazing question.

386

:I really like it.

387

:So, and I even have like some of the

answers in my PhD dissertation.

388

:And I will probably just go ahead.

389

:I'll probably just quote it, but I don't

remember exactly how I framed it.

390

:But I took it from the book, which is

called, um, uh, software engineering for

391

:science.

392

:So, and basically it says that people

usually underestimate how difficult it is

393

:to create, um, a software in scientific

research area.

394

:Uh, and the main difficulty with that is

that there are no clear guidelines to

395

:follow.

396

:Uh, it's not like designing a website with

clear, like a framework rules and you just

397

:need to split tasks between, like, people in a team.

398

:No, it's like, um, new insights of

science, like, or like an area where we

399

:work in that they happen every day.

400

:Right.

401

:And the requirements for the software,

they may change every day.

402

:Uh, and it's really hard to like come up

with a specific design before we start

403

:developing because

404

:requirements change over time because you

may create some software for research

405

:purposes and then you found out something

super cool which works better or faster or

406

:scales better and then you realize that

well you actually have to start over

407

:because this is just better we just we

just found out something cooler and

408

:It also means that a developer must invest

time into this research.

409

:So it's not only about coding, like you

should understand how it all works from

410

:the scientific point of view, from a

mathematical point of view.

411

:And sometimes if this is like a cutting

edge research, there are no books about

412

:how it works, right?

413

:So we must invest time in reading papers.

414

:Um, and also being able to write a good

code, which is fast and efficient.

415

:Uh, and all of these problems, they, they

also occurred, uh, when we developed

416

:RxInfer, uh, even though I'm the main

author, uh, a lot of people have helped

417

:me, right?

418

:I'm, like, uh, very thankful for that.

419

:Uh, and for RxInfer in particular,

for me, I also needed to learn a very big

420

:part of statistics because when I joined

the lab,

421

:I actually didn't have a lot of experience

with Bayesian inference and with graphs

422

:and with message passing.

423

:So I really needed to dive into this field.

424

:And many people helped me to understand

how it works.

425

:A lot of my colleagues, they have spent

their time explaining.

426

:And even though, right, so we have already

this stack of difficulties at the end or

427

:like maybe not at the end, but the

software that we use, we would like it to

428

:be.

429

:Easy to use, like, or user friendly.

430

:So we already have this difficulties about

we don't know how to design it.

431

:We have to invest time into reading

papers.

432

:But then we at the end, we want to have a

functional software that is easy to use,

433

:addresses different needs and allows you

to find new insights.

434

:So the software should be designed such

that it does not.

435

:impose a lot of constraints on what you

can do with this software, right?

436

:Because scientific software is about

finding new insights, not about like doing

437

:some predefined set of algorithms.

438

:You want to find something new

essentially.

439

:And software should help you with that.

440

:Yeah, yeah, for sure.

441

:That's a good point.

442

:What do you think, what would you say are

the key challenges in achieving

443

:scalability and efficiency in this

endeavor and how does RxInfer address

444

:this?

445

:Basically, we are talking in the context

of Bayesian inference and the key

446

:challenge is that

447

:Bayes rule doesn't scale, right?

448

:It's, the formula looks very simple, but

in practice, when we start working with

449

:large probabilistic models.

450

:Just blind application of Bayes rule

doesn't scale because it has exponential

451

:complexity with respect to the number of

variables.

452

:And RxInfer tries to tackle this

by...

453

:having essentially two main components in

the recipe, like maybe three, let's say

454

:three.

455

:So first of all, we use factor graphs to

specify the model.

456

:So we work with factorized models.

457

:We work with message passing, and message

passing essentially converts the

458

:exponential complexity of the Bayes rule

to linear, but only for highly factorized

459

:models.

460

:And like highly factorized here is a

really crucial component, but many models

461

:are indeed highly factorized.

462

:It means that variables do not

directly depend on all other variables.

463

:They directly depend on maybe a very small

subset of variables in the model.
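
To make the complexity claim concrete, here is a minimal sketch in plain Python. It is not RxInfer code: the chain model, the factor tables and the function names are illustrative assumptions. It computes the marginal of the last variable in a chain of binary variables twice, once by summing the full joint (which visits 2^n configurations) and once by passing messages along the chain (one small update per edge, so linear in n).

```python
from itertools import product

# Hypothetical model: a chain x0 - x1 - ... - x_{n-1} of binary variables.
PRIOR = [0.6, 0.4]          # factor on x0
PAIR = [[0.9, 0.1],         # PAIR[a][b]: factor linking x_i = a to x_{i+1} = b
        [0.2, 0.8]]


def marginal_brute_force(n):
    """Marginal of the last variable by summing over all 2**n configurations."""
    totals = [0.0, 0.0]
    for xs in product((0, 1), repeat=n):
        weight = PRIOR[xs[0]]
        for a, b in zip(xs, xs[1:]):
            weight *= PAIR[a][b]
        totals[xs[-1]] += weight
    z = sum(totals)
    return [t / z for t in totals]


def marginal_message_passing(n):
    """Same marginal via a forward sum-product pass: one update per edge."""
    message = PRIOR[:]
    for _ in range(n - 1):
        message = [sum(message[a] * PAIR[a][b] for a in (0, 1)) for b in (0, 1)]
    z = sum(message)
    return [m / z for m in message]


if __name__ == "__main__":
    print(marginal_brute_force(12))
    print(marginal_message_passing(12))
```

Both functions return the same distribution; only the second stays cheap as the chain grows, and only because each factor touches just two neighbouring variables.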

464

:And the third component here is

variational inference.

465

:So because it allows us to trade off the

computational complexity with accuracy.

466

:So if the task is too difficult or it

doesn't scale, basically what variational

467

:inference gives you is the ability to

impose a set of constraints into your

468

:problem, because it reframes the original

problem as an optimization task.

469

:And we can optimize up to a certain

constraint.

470

:For example, we may say that this variable

is distributed as a Gaussian distribution.

471

:It may not be true in reality and we lose

some accuracy, but at the end it allows us

472

:to solve some equations faster.

473

:And we can impose more and more

constraints if we don't have enough

474

:computational power and if you have a large

model, or we may relax constraints if we

475

:have enough computational power and we

gain accuracy.

476

:So we have this sort of a slider.

477

:which allows us to scale better.
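
The accuracy-for-speed trade can be illustrated with a textbook case, sketched here in plain Python under stated assumptions: the target is a made-up two-dimensional correlated Gaussian, and the imposed constraint is that the approximation factorizes as q(x1)q(x2). The names `MU`, `SIGMA` and `mean_field_gaussian` are hypothetical and not part of RxInfer.

```python
# Hypothetical target: a 2-D Gaussian "posterior" with correlated components.
MU = (1.0, -2.0)
SIGMA = ((1.0, 0.8), (0.8, 1.0))   # true covariance


def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mean_field_gaussian(mu, sigma, sweeps=50):
    """Coordinate-ascent variational inference under q(x1, x2) = q(x1) q(x2)."""
    lam = invert_2x2(sigma)          # precision matrix of the target
    m = [0.0, 0.0]                   # initial means of q(x1) and q(x2)
    for _ in range(sweeps):
        # Each factor's optimal mean depends on the other factor's current mean.
        m[0] = mu[0] - lam[0][1] / lam[0][0] * (m[1] - mu[1])
        m[1] = mu[1] - lam[1][0] / lam[1][1] * (m[0] - mu[0])
    # The optimal factor variances are fixed by the diagonal of the precision.
    variances = (1.0 / lam[0][0], 1.0 / lam[1][1])
    return tuple(m), variances


if __name__ == "__main__":
    means, variances = mean_field_gaussian(MU, SIGMA)
    print("means:", means)
    print("variances:", variances, "true marginal variance:", SIGMA[0][0])
```

The factorized approximation recovers the means, but it reports a variance of about 0.36 where the true marginal variance is 1.0. That lost uncertainty is the accuracy given up in exchange for simpler updates.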

478

:But here's the thing, right?

479

:We always can come up with such a large

model with so many variables and so

480

:difficult relationships between variables

where it still will not scale.

481

:And this is fine.

482

:But RxInfer tries to push this boundary

for like scaling Bayesian inference to

483

:large models.

484

:And actually, so you're using variational

inference quite a lot in this endeavor,

485

:right?

486

:So actually, can you discuss the role of

variational inference here in RxInfer and

487

:maybe any innovations that you've

incorporated in this area?

488

:So the role I kind of touched upon a

little bit is that it acts as like a

489

:slider.

490

:Right.

491

:In controlling the complexity and

the accuracy of your inference result.

492

:This is the main role.

493

:Of course, for some applications, this

might be undesirable.

494

:For some applications, you may want to

have a perfect posterior estimation.

495

:But for some applications, it's not a very

big deal.

496

:Again, we are talking about different

needs for different applications

497

:here.

498

:And the innovation that RxInfer brings,

I think it's like one of the few

499

:implementations as message passing, like

variational inference as message passing,

500

:because it's usually implemented as like

black box method that takes a function

501

:like a probabilistic model function and

maybe does some automatic differentiation

502

:or some extra sampling under the hood.

503

:And message passing by itself has a very

long history, but I think people

504

:mistakenly think that it's quite limited

to like the sum-product algorithm.

505

:But actually, variational inference can

also be implemented as message passing.

506

:And it's quite good.

507

:So it opens the applicability of the

message passing algorithms.

508

:And also.

509

:As we already talked a little bit about

this reactive nature of the inference

510

:procedure, so it's also maybe even the

first reactive variational inference

511

:engine, which is designed to work with

infinite data streams.

512

:So it continuously updates this posterior

continuously does minimization.

513

:It does not stop.

514

:And as soon as new data arrive, we

basically update our posteriors.

515

:But in between these kinds of data windows,

we can spend more computational resources

516

:to find better approximation for the

variational inference.
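
A minimal sketch of the streaming idea in plain Python, under loud assumptions: it uses exact conjugate updates for a Gaussian mean with known noise variance rather than variational message passing, and `gaussian_stream_posterior` and `endless_sensor` are hypothetical names, not RxInfer API. The point is only the control flow: the posterior is refreshed each time an observation arrives, and the stream never has to end.

```python
from itertools import islice


def endless_sensor():
    """Stand-in for a never-ending data stream (deterministic for the demo)."""
    k = 0
    while True:
        yield 2.0 + 0.5 * (-1) ** k
        k += 1


def gaussian_stream_posterior(observations, prior_mean=0.0, prior_var=100.0,
                              noise_var=1.0):
    """Yield the posterior (mean, variance) of an unknown mean after each sample."""
    precision = 1.0 / prior_var
    weighted_mean = prior_mean / prior_var      # precision * mean
    for y in observations:                      # may be an infinite iterator
        precision += 1.0 / noise_var
        weighted_mean += y / noise_var
        yield weighted_mean / precision, 1.0 / precision


if __name__ == "__main__":
    # Consume only the first five updates of a stream that never terminates.
    for mean, var in islice(gaussian_stream_posterior(endless_sensor()), 5):
        print(f"mean={mean:.4f} var={var:.4f}")
```

Because the generator is lazy, any extra work done between two arrivals (for example, more optimization iterations) would slot in between two `yield`s without changing the interface.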

517

:But yeah, but all other solutions, let's

say that are also variational inference,

518

:they basically require you to, yeah.

519

:to wait for the data, then feed the

data, or wait for the entire data set,

520

:feed the data set, and then you have the

result, then you analyze the result, and

521

:then you repeat.

522

:So RxInfer works a bit differently in that

regard.

523

:Yeah.

524

:Fascinating.

525

:And that, I'm guessing you have some

examples of that up on the RxInfer

526

:website, maybe we can...

527

:add a link to that in the show notes for people who

are interested to see how you would apply

528

:that in practice?

529

:So I,

530

:So it does not really require reactivity,

but because it's kind of like easy to use

531

:and fast, students can do some homework

for signal processing applications.

532

:What I already mentioned is that we work

with audio signals and with control

533

:applications.

534

:I don't really have a particular example

of RxInfer being used in the field.

535

:or by industry.

536

:So it's primarily our research tool

currently, but we want to extend it.

537

:So it's still a bit more difficult to use

than Turing, let's say.

538

:Turing, which is also written in Julia,

because yeah, message passing is a bit

539

:maybe more difficult to use and it is not

as universal as HMC and NUTS; it still

540

:requires some approximation methods.

541

:Yeah.

542

:So we still use it as a research tool

currently, but we have some ideas in the

543

:lab, how to expand the available set of

probabilistic models we can run

544

:inference on.

545

:And yes, indeed, in our documentation, we

have quite a lot of examples where we can

546

:use, but these examples, they are, I would

say, educational in most of the cases.

547

:at least in the documentation.

548

:So we are at this stage where we have a

lot of ideas how we can improve the

549

:inference, how we make it faster, such

that we can actually apply it for real

550

:tasks, like for real drones, for real

robots, to do real speech

551

:denoising or something similar.

552

:Yeah, definitely.

553

:That would be super interesting, I'm

guessing, for people who are into these

554

:and also just want to check out.

555

:I have been checking out your website

recently to prepare for the episode.

556

:Actually, can you now...

557

:So you've shared some, like the overview

of the theory, how that works, what

558

:RxInfer does in that regard.

559

:Can you share what you folks are doing

with Lazy Dynamics, how that's related to

560

:that?

561

:How does that fit into this ecosystem?

562

:So yeah, Lazy Dynamics, we created this

company to commercialize the research that

563

:we do at our lab to basically find funding

to make RxInfer better and ready for

564

:industry.

565

:Because currently, let's say,

566

:RxInfer is a great research tool for our

purposes, right?

567

:But industry needs some more properties in

addition to those I have already

568

:mentioned.

569

:Right?

570

:For example, indeed the Bayesian inference

engine must be extremely robust, right?

571

:It is not allowed to fail if we really

work in the field.

572

:And this is not really a research

question.

573

:It's more about like implementational

side.

574

:Right.

575

:It's like good code, good code

coverage, like great documentation.

576

:And this is what we kind of also want to

do with Lazy Dynamics.

577

:We want to take this next step and want to

create a great product for other

578

:companies, especially, that can rely on

RxInfer, maybe in their research or

579

:maybe even in the field.

580

:Right.

581

:And maybe we create some sort of a tools,

a tool set around RxInfer that will allow

582

:you to maybe debug the performance of your

probabilistic problem or your

583

:probabilistic inference, right?

584

:It's also not about research.

585

:It's about like having it more accessible

to other people, like finding bugs or

586

:mistakes in their model specification,

make it easier to use.

587

:Or maybe, for example, we could.

588

:come up with some sort of a library of

models, right?

589

:So you would want to build some autonomous

system and it may require a model for

590

:audio recognition, it may require a model

for video recognition.

591

:And this kind of set of models, they can

be predefined, very well tested, have a

592

:great performance, super robust.

593

:And basically Lazy Dynamics may provide an

access to this kind of a library.

594

:right?

595

:So, and for this kind of, because these are

not research-related questions, it

596

:must be done in a company with like very

good programmers and very good code

597

:coverage and documentation.

598

:But for research purposes, RxInfer is

already a great toolbox.

599

:And basically many students in our lab,

they already use it.

600

:But.

601

:Yeah, because we are all sitting in the

same room, let's say on the same floor, we

602

:can kind of brainstorm, find bugs, fix it

on the fly, and they keep working like that.

603

:But if we want RxInfer to be used

in industry, it really needs to be a

604

:professional toolbox with like a

professional support.

605

:Yeah.

606

:Yeah, I understand that makes sense.

607

:Surprised you can, I don't know when you

sleep though, between the postdoc, the

608

:open source project and the company.

609

:So yeah, it's a great comment, but yeah,

it's hard.

610

:Yeah, hopefully we'll get you some sleep

in the coming months.

611

:To get back to your PhD project, because I

found that very interesting.

612

:So your dissertation will be in the show

notes.

613

:But something I was also curious about is

that in this PhD project, you explore

614

:different trade-offs for Bayesian

inference architecture.

615

:And you've mentioned that already a

bit earlier, but I'm really curious about

616

:that.

617

:So could you elaborate on these trade-offs

and why they are significant?

618

:Yes, we already touched a little bit on

that.

619

:So the main trade-offs here are kind of

computational load, efficiency,

620

:adaptivity, low power consumption, magic.

621

:Yeah.

622

:And another aspect actually, which we

didn't talk about yet is structural model

623

:adaptation.

624

:So these are the requirements that we

favor

625

:in RxInfer.

626

:And these are the requirements that were

like central to my PhD project.

627

:And this all arises, all of these

properties, they are not just coming from

628

:a vacuum.

629

:They are coming from real time signal

processing applications on autonomous

630

:systems.

631

:We don't have a lot of battery power.

632

:We don't have very powerful CPUs on these

autonomous devices, because essentially

633

:what we want to do also is that

634

:We want to be able to run very

difficult, large probabilistic models on

635

:the Raspberry Pi.

636

:And Raspberry Pi doesn't even have a GPU.

637

:So we can buy some small sort of a GPU and

put it on the Raspberry Pi.

638

:But still, the computational capabilities

are very, very limited on edge devices.

639

:For example, one may say, let's just do

everything in the cloud, which is a very

640

:valid argument, actually.

641

:But we also, in some situations, the

latencies are just too big.

642

:And also, maybe we don't have access to

the internet in some areas, but we still

643

:want to create these adaptive Bayesian

inference systems, like a drone that

644

:may...

645

:explore some area maybe in the mountain or

something where we don't really have an

646

:internet so we cannot really process

anything in the cloud.

647

:So it must work as efficiently as possible.

648

:On a very, very small device that doesn't

have a lot of power doesn't have a lot of

649

:battery and still this should work in real

time.

650

:Yeah, I think, I think this is mostly the

main trade-off. And

651

:In terms of how we do it, we use this

variational inference and we sacrifice

652

:accuracy with respect to scalability.

653

:Reactive message passing allows us to

scale to very large models because it

654

:works on factor graphs.

655

:Yeah.

656

:And I think that's, these are very

important points to make, right?

657

:Because always when you work and you build

an open source

658

:package, you have trade-offs to make.

659

:So that means you have to choose whether

you're going for a general package or a

660

:more specialized one.

661

:And that will dictate in a way your trade

off.

662

:In RxInfer, it seems like you're quite

specialized, specialists in message passing

663

:inference.

664

:So the cool thing here is that I'm

665

:choices because you're like, no, our main

use case is that.

666

:And so we can use that.

667

:And the variational inference choice, for

instance, is quite telling because in your

668

:case, it seems to be really working well,

whereas we could not do that in PyMC, for

669

:instance.

670

:If we remove the ability to use HMC, we

would have quite a drop in the user

671

:numbers.

672

:So yeah, that's always something I

673

:try to make people aware of when they are

using open source packages.

674

:You can't do everything.

675

:Yeah, exactly.

676

:Exactly.

677

:So I actually really, when I have a need,

I really enjoy working with like HMC or

678

:NUTS-based methods because they just work,

just like magic.

679

:And, but, and here's the trade off, right?

680

:They work magically in many situations.

681

:But they're slow in some sense.

682

:Let's say they're not slow, but they're

slower than message passing.

683

:So here is this trade-off.

684

:So user friendliness is a really, really

important key in this equation.

685

:Yeah, and what do you call user

friendliness in your case?

686

:So what I refer to as user friendliness here

is that a user can specify a model, press

687

:a button with HMC and it just runs and the

user gets a result.

688

:Yes, a user needs to wait a little bit

more.

689

:But anyway, like user experience is great.

690

:Just specify a model, just run inference,

just get your result.

691

:With RxInfer, it's a bit less easy

because in most of the cases,

692

:uh, message passing works like that, that

it favors like analytical solutions on the

693

:graph.

694

:And if analytical solution for a message

is not available, uh, basically a user

695

:must specify an approximation method.

696

:Uh, it actually also can be HMC, uh, just

in case.

697

:Uh, but still RxInfer does not

really specify a default approximation

698

:method.

699

:Currently,

700

:fine default approximation, but because it

does not define it currently, if a user

701

:specifies a complex probabilistic model,

it will probably throw an error saying

702

:that, okay, I don't know how to solve it,

please specify what should I do here and

703

:there.

704

:And for a new user, it might be a bit

unintuitive how to do that, what to

705

:specify.
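
The behaviour described here (use a closed-form rule when one exists, otherwise ask the user for an approximation) can be sketched in a few lines of plain Python. Everything in this block is a hypothetical illustration: `combine`, `ANALYTICAL_RULES`, `MissingRuleError` and `grid_moment_matching` are invented names, and this is not how RxInfer itself is implemented or configured.

```python
import math


class MissingRuleError(Exception):
    """Raised when no closed-form rule exists and no approximation was given."""


def gaussian_times_gaussian(a, b):
    """Closed-form product of two Gaussian messages given as (mean, variance)."""
    precision = 1.0 / a[1] + 1.0 / b[1]
    mean = (a[0] / a[1] + b[0] / b[1]) / precision
    return ("gaussian", (mean, 1.0 / precision))


# Hypothetical rule table: only pairs listed here have an analytical solution.
ANALYTICAL_RULES = {("gaussian", "gaussian"): gaussian_times_gaussian}

# Unnormalized log-densities, params = (location, scale-like parameter).
LOG_DENSITY = {
    "gaussian": lambda x, p: -0.5 * (x - p[0]) ** 2 / p[1],
    "laplace": lambda x, p: -abs(x - p[0]) / p[1],
}


def grid_moment_matching(msg_a, msg_b, lo=-10.0, hi=10.0, steps=2001):
    """User-chosen fallback: fit a Gaussian to the product by moments on a grid."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    ws = [math.exp(LOG_DENSITY[msg_a[0]](x, msg_a[1])
                   + LOG_DENSITY[msg_b[0]](x, msg_b[1])) for x in xs]
    z = sum(ws)
    mean = sum(w * x for w, x in zip(ws, xs)) / z
    var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / z
    return ("gaussian", (mean, var))


def combine(msg_a, msg_b, approximation=None):
    """Multiply two messages, each a (family, params) pair."""
    rule = ANALYTICAL_RULES.get((msg_a[0], msg_b[0]))
    if rule is not None:
        return rule(msg_a[1], msg_b[1])
    if approximation is None:
        raise MissingRuleError(
            f"no analytical rule for {msg_a[0]} x {msg_b[0]}; "
            "please specify an approximation method")
    return approximation(msg_a, msg_b)
```

Calling `combine` on a Gaussian and a Laplace message without the third argument raises the error; passing `approximation=grid_moment_matching` resolves it. A sampler could be plugged in through the same argument, which mirrors the remark that HMC can also serve as the approximation.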

706

:So for HMC, there's no need to do it, it

just works.

707

:But with RxInfer, it's not that easy yet.

708

:That's what I was referring to,

user-friendliness.

709

:Yeah, that makes sense.

710

:And again, here, the interesting thing is

that the definition of user-friendliness

711

:is going to depend on what you're trying

to optimize, right?

712

:What kind of use case you're trying to

optimize on.

713

:Yes.

714

:Actually, what's the future for RxInfer?

715

:What are the future developments or

enhancements that you are planning?

716

:So, we have already touched a little bit

about like Lazy Dynamics side, which tries

717

:to make a really, like a commercial

product out of RxInfer, where we have

718

:great support.

719

:This is one side of the future, but we

also have a research side of the project.

720

:And research side of the project includes

structural model adaptation.

721

:We just, uh, which in my opinion is quite

cool.

722

:So what it basically means in a few words

is that we, we may be able in the future

723

:to change the structure of the model on

the fly without stopping the inference

724

:procedure and you may need it for several

reasons, for example, uh, computational

725

:power, uh, computational budget changes,

and we are no

726

:longer able to run inference.

727

:on such a complex model.

728

:So we want to reduce the complexity of the

model.

729

:We want to change the structure, maybe put

some less demanding factor nodes.

730

:And we want to do it on the fly.

731

:Without actually stopping the inference,

because for like sampling based methods,

732

:if we change the model, we basically are

forced to restart because we have this

733

:change and it's quite difficult to reuse

the previous result if the structure of

734

:the model changes.

735

:With factor graphs, it's actually possible.

736

:So another point why we would need that in

the field is that if you could imagine

737

:different sensors, so we have different

observations, and one sensor all of a

738

:sudden just burned out, or glitched, or

something like that.

739

:So essentially, we no longer have

this sort of observation.

740

:So we need to change the structure of our

model

741

:to account for this glitch or breakage of

the sensor.

742

:And this is also where reactive message

passing helps us because we basically,

743

:because we do not enforce the particular

order of updates, we stop reacting to this

744

:observation because it's no longer

available.

745

:And we also change the structure of the

model to account for that.
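
A minimal sketch of the sensor-dropout scenario, in plain Python and under stated assumptions: one scalar state, Gaussian sensor noise, and a hypothetical `FusionNode` class that is not part of RxInfer. Each sensor contributes one message; removing a sensor removes only its message, and everything else is reused without restarting.

```python
class FusionNode:
    """Posterior over one scalar state as a product of Gaussian messages."""

    def __init__(self, prior_mean, prior_var):
        # Messages are stored in natural parameters: (precision, precision * mean).
        self.messages = {"prior": (1.0 / prior_var, prior_mean / prior_var)}

    def update(self, sensor, reading, noise_var):
        # The latest reading of a sensor replaces that sensor's previous message.
        self.messages[sensor] = (1.0 / noise_var, reading / noise_var)

    def drop(self, sensor):
        # Structural change: delete the factor; other messages stay untouched.
        self.messages.pop(sensor, None)

    def posterior(self):
        precision = sum(p for p, _ in self.messages.values())
        weighted = sum(w for _, w in self.messages.values())
        return weighted / precision, 1.0 / precision


if __name__ == "__main__":
    node = FusionNode(prior_mean=0.0, prior_var=10.0)
    node.update("gps", 1.0, noise_var=0.5)
    node.update("radar", 3.0, noise_var=2.0)
    print("both sensors:", node.posterior())
    node.drop("radar")                      # the radar burned out
    print("gps only:    ", node.posterior())
```

After the drop, the posterior is the same one a freshly built model with only the remaining sensor would give, which is the property that makes on-the-fly restructuring attractive.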

746

:Another thing for the future of RxInfer in

terms of research is that we want to be able

747

:to support natively different update rates

for different variables.

748

:And so what I mean by that is that if you

imagine an audio recognition system, let's

749

:say, or audio enhancement system, let's

say, and you have modeled the

750

:environment of like a person who is

talking around several persons and let's

751

:say their speech signal.

752

:arrives at the rate of like 44 kilohertz if

we are talking about a typical microphone.

753

:But their environment, where they are

currently sitting, doesn't really change

754

:that fast because they may sit in a bar

and it will be a bar an hour later.

755

:So there's no need to infer this

information as often as their speech.

756

:So it changes very rarely.

757

:So we have a different set of variables

that may change at different scales.

758

:And we want also to support this natively

in RxInfer.

759

:So we can also make it easier for the

inference engine.

760

:So it does not spend computational

resources on variables, which are not

761

:updating fast.
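
The scheduling idea can be sketched as follows, in plain Python. This is only an illustration of updating variables at different rates, not a Bayesian model and not RxInfer functionality; the class name `MultiRateTracker`, the smoothing rule and the averaging rule are all assumptions made for the example.

```python
class MultiRateTracker:
    """Update a fast variable on every sample and a slow one only occasionally."""

    def __init__(self, slow_every=1000, fast_gain=0.1):
        self.slow_every = slow_every
        self.fast_gain = fast_gain
        self.fast_estimate = 0.0    # e.g. the current speech level
        self.slow_estimate = 0.0    # e.g. a property of the environment
        self.slow_updates = 0
        self._buffer_sum = 0.0
        self._buffer_count = 0

    def observe(self, sample):
        # Fast path: cheap exponential smoothing on every sample.
        self.fast_estimate += self.fast_gain * (sample - self.fast_estimate)
        # Slow path: only accumulate a summary statistic for now.
        self._buffer_sum += abs(sample)
        self._buffer_count += 1
        if self._buffer_count == self.slow_every:
            # The expensive update runs once per window, not once per sample.
            self.slow_estimate = self._buffer_sum / self._buffer_count
            self.slow_updates += 1
            self._buffer_sum = 0.0
            self._buffer_count = 0


if __name__ == "__main__":
    tracker = MultiRateTracker(slow_every=1000)
    for _ in range(2500):
        tracker.observe(1.0)
    print(tracker.fast_estimate, tracker.slow_estimate, tracker.slow_updates)
```

With a 1000-sample window, 2500 samples trigger 2500 fast updates but only two slow ones, so the slowly changing variable costs almost nothing per sample.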

762

:We want to be able to support

non-parametric models in RxInfer.

763

:And this includes like Gaussian processes.

764

:And we have a research, so currently we

have a PhD student in our lab who is

765

:working a lot on that and he has a great

progress.

766

:It's not available in the current version

of RxInfer, but he has like experiments

767

:and it works all nicely.

768

:At some point it will be integrated into

the public version.

769

:And...

770

:Yeah, and it just, you know, just

maintenance and fixing bugs and this kind

771

:of stuff, improving the documentation.

772

:So the documentation currently needs

improvement because we have quite some

773

:features and additions that we have

already integrated into the framework and

774

:we happily use them ourselves in our lab

for our research.

775

:But it's like maybe poorly documented,

let's say.

776

:So other people in theory can use this

functionality, but because they cannot go

777

:to my table in the office at the Eindhoven

University of Technology, they cannot ask

778

:how to use it properly.

779

:So we should just put it into

documentation and so other people can use

780

:that as well.

781

:Yeah, yeah.

782

:Yeah.

783

:That makes sense.

784

:That's a nice roadmap for this year.

785

:And looking ahead, what's...

786

:your, you know, what's your vision, let's

say, for the future of automated Bayesian

787

:inference in the way you do it, especially

in complex models like yours.

788

:Yeah, what's your vision about that?

789

:What would you like to see in the coming

years?

790

:Also, what would you like to not see?

791

:A good question.

792

:So in my opinion, the future is very

bright.

793

:the future of automated Bayesian inference, and like a

lot of great people working on this and

794

:start to work on that more people are

coming.

795

:Right.

796

:So so many toolboxes in Python and Julia,

like PyMC, Turing in Julia, in R, there

797

:are like in C++.

798

:So, so many implementations and it's only

getting better every year.

799

:Right.

800

:But I think in my opinion, the future is

that there will be several applications,

801

:like in our case, these autonomous systems

or maybe something else.

802

:And these packages, they will basically not

really compete.

803

:But in a sense, they will, like for

different applications, you will choose a

804

:different solution because all of them

will be kind of great in their own

805

:application.

806

:But I'm not sure if there will be like a

super ultra cool method that solves all

807

:problems of all applications in Bayesian

inference.

808

:And maybe we'll have who knows.

809

:But in my opinion, there will be always

these trade-offs in different

810

:applications and we'll just use

different methodologies.

811

:Yeah.

812

:Yeah, that makes sense.

813

:In a way.

814

:I like your point here, about all these

different methods cooperating in a way

815

:because they are addressing different

workflows or different use cases.

816

:So yeah, definitely I think we'll have

stuff to learn from one type of

817

:application to the other.

818

:I like this analogy of like, no, we don't

cut the bread with a fork.

819

:But it doesn't really make a fork a

useless tool.

820

:I mean, we can use a fork for something

else and we are not eating a soup with a

821

:knife, but it doesn't make knife a useless

tool.

822

:So these are tools that are great, but for

their own purposes.

823

:So RxInfer is like a good tool for like

real-time signal processing applications.

824

:And Turing in Julia is like a great tool

for other applications.

825

:So we'll just live together and learn from

each other.

826

:Yeah.

827

:Fascinating.

828

:I really love that.

829

:And well, before closing up the show,

because I don't want to take too much time

830

:with you, but a question I

really like asking from time to time is if

831

:you have any favorite type of model that

you always like to use and you want to

832

:share with listeners?

833

:You mean probabilistic model?

834

:Sure, or it can be a different model for

sure.

835

:But yeah, probabilistic model.

836

:I actually, yeah, I mentioned a little bit

that I do not really work from an application

837

:point of view.

838

:I really work on the compiler for Bayesian

inference.

839

:So I don't really have a favorite model,

let's say.

840

:It's hard to say.

841

:Yeah, that's interesting because basically

you work, that's always an interesting

842

:position to me because you really work on

the, basically making the modeling

843

:possible, but usually

844

:you're not one of the people using that modeling

platform yourself.

845

:Exactly.

846

:Yes.

847

:Yeah.

848

:That's always something really fascinating

to me.

849

:Because me, I'm kind of on the bridge, but

a bit more to the applied modeling side of

850

:things.

851

:So I'm really happy that there are people

like you who make my life easier and even

852

:possible.

853

:So thank you so much.

854

:That's cool.

855

:Awesome.

856

:Dmitry, that was super cool.

857

:Thanks a lot.

858

:Before letting you go, though, as usual,

I'm going to ask you the last two

859

:questions.

860

:I ask every guest at the end of the show.

861

:First one, if you had unlimited time and

resources, which problem would you try to

862

:solve?

863

:Yes, I thought about this question.

864

:It's kind of an interesting one.

865

:And I thought it would be cool.

866

:to have if we have an infinite amount of

time to try to solve some sort of

867

:unsolvable paradox because we already have

unlimited time.

868

:So one of the areas which I never worked

with, but I'm really fascinated about is

869

:like astronomy and one of the paradoxes in

astronomy which is kind of

870

:I find interesting, but maybe it's not

really a paradox, but anyway, it's like

871

:the Fermi paradox, which basically in a few

words, it tries to explain the discrepancy

872

:between the lack of evidence of other

civilizations, even though apparently

873

:there is a high likelihood for their

existence.

874

:Right?

875

:So this is maybe a problem I would work on

if I would have an infinite amount of

876

:resources: I could just fly into space and

try to find them.

877

:That sounds like a fun endeavor.

878

:Yeah, for sure.

879

:I'd love the answer to that paradox.

880

:And if people are interested in the physics

side of things.

881

:There is a whole bunch of physics-related

episodes of this show.

882

:So for sure, refer to that.

883

:I'll put them in the show notes.

884

:My whole playlist about physics episodes.

885

:Yeah, I know.

886

:And I know also you're a big fan of

Aubrey...

887

:Clayton's book, Bernoulli's Fallacy.

888

:So I also put this episode with Aubrey

Clayton in the show notes for people who

889

:have missed it.

890

:If you have missed it, I really recommend

it.

891

:That was a really good episode.

892

:No, I know.

893

:I know.

894

:I know this episode.

895

:Yeah, awesome.

896

:Well, thanks for listening to the show,

Dmitry.

897

:Awesome.

898

:Well.

899

:Thanks a lot, Dmitry.

900

:That was really a treat to have you on.

901

:I'm really happy because I had so many

questions, but you helped me navigate

902

:that.

903

:I learned a lot and I'm sure listeners did

too.

904

:As usual, I put resources and a link to

your website in the show notes for those

905

:who want to dig deeper.

906

:Thank you again, Dmitry, for taking the

time and being on this show.

907

:Yeah, thanks for inviting me.

908

:It was a pleasure to talk to you.

909

:Really, super nice and super cool

questions.

910

:I like it.