#114 From the Field to the Lab – A Journey in Baseball Science, with Jacob Buffa
Behavioral & Social Sciences • Episode 114 • 5th September 2024 • Learning Bayesian Statistics • Alexandre Andorra
00:00:00 01:01:31


Shownotes

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways:

  • Education and visual communication are key in helping athletes understand the impact of nutrition on performance.
  • Bayesian statistics are used to analyze player performance and injury risk.
  • Integrating diverse data sources is a challenge but can provide valuable insights.
  • Understanding the specific needs and characteristics of athletes is crucial in conditioning and injury prevention.
  • The application of Bayesian statistics in baseball science requires experts in Bayesian methods.
  • Traditional statistical methods taught in sports science programs are limited.
  • Communicating complex statistical concepts, such as Bayesian analysis, to coaches and players is crucial.
  • Conveying uncertainties and limitations of the models is essential for effective utilization.
  • Emerging trends in baseball science include the use of biomechanical information and computer vision algorithms.
  • Improving player performance and injury prevention are key goals for the future of baseball science.

Chapters:

00:00 The Role of Nutrition and Conditioning

05:46 Analyzing Player Performance and Managing Injury Risks

12:13 Educating Athletes on Dietary Choices

18:02 Emerging Trends in Baseball Science

29:49 Hierarchical Models and Player Analysis

36:03 Challenges of Working with Limited Data

39:49 Effective Communication of Statistical Concepts

47:59 Future Trends: Biomechanical Data Analysis and Computer Vision Algorithms

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.

Links from the show:

Transcript:

This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.

Transcripts

Speaker:

Today I am joined by Jacob.

2

:

Buffa, the senior director of performance science and player development for the Houston

Astros.

3

:

Growing up with a deep-rooted passion for sports in St.

4

:

Louis, Missouri, Jacob's journey from aspiring baseball player at Missouri State

University to leading player development and performance science is nothing short of

5

:

inspiring.

6

:

Jacob discusses the critical role of nutrition and conditioning in athlete development,

emphasizing the innovative use

7

:

of education and visual communication tools to help athletes understand how their dietary

choices impact performance.

8

:

He also explains how Bayesian stats play a pivotal role in analyzing player performance

and managing injury risks, and delves into how complex concepts like Bayesian analysis are

9

:

communicated effectively to coaches and players,

10

:

ensuring they understand the uncertainties and limitations of the models used.

11

:

Finally, Jacob and I discuss emerging trends in baseball science, such as biomechanical

analysis and the application of computer vision algorithms.

12

:

This is Learning Bayesian Statistics, episode 114, recorded June 20, 2024.

13

:

Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the

projects, and the people who make it possible.

14

:

I'm your host, Alex Andorra.

15

:

You can follow me on Twitter at alex_andorra,

16

:

like the country.

17

:

For any info about the show, learnbayesstats.com is Laplace to be.

18

:

Show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on

Patreon, everything is in there.

19

:

That's learnbayesstats.com.

20

:

If you're interested in one-on-one mentorship, online courses, or statistical

consulting, feel free to reach out and book a call at topmate.io slash alex underscore

21

:

andorra.

22

:

See you around, folks.

23

:

and best Bayesian wishes to you all.

24

:

And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can

help bring them to life.

25

:

Check us out at pymc-labs.com.

26

:

Hello my dear fans, I'm coming to you with fantastic news because Learn Bayes Stats is going

live.

27

:

We're indeed going to have the two first live shows of Learn Bayes Stats history.

28

:

It's going to happen very soon at StanCon 2024 in Oxford, UK.

29

:

We're going to have two panel discussions.

30

:

We're going to kick things off amazingly on September 10 with Charles Margossian, Steve

Bronder and Brian Ward.

31

:

talking about the past, present and future of Stan.

32

:

And then on September 11, Elizaveta Semenova and Chris Wymant are gonna make science look

really cool because we're gonna talk about how Bayesian stats are used in the very

33

:

important field of computational biology.

34

:

So if that sounds like fun, if you wanna ask us embarrassing questions, if you wanna meet

us in person, if you wanna have exclusive LBS stickers, well, get your StanCon tickets now

35

:

and...

36

:

Honestly, I can't wait to meet you all on September 10 and 11.

37

:

See you very soon, my dear Bayesians.

38

:

Jacob Buffa, welcome to Learning Bayesian Statistics.

39

:

Hey Alex, how are you doing?

40

:

I am doing very well, thank you so much for being on the show Jacob.

41

:

So as I said in the introduction, you work for the Houston Astros, which I'm sure the

American listeners know about. For non-baseball listeners, the

42

:

Houston Astros is a big MLB...

43

:

team, so baseball. And thanks a lot, actually, to JJ Robbie for putting us in contact.

44

:

JJ was here on the show, damn, a few years ago.

45

:

I don't even remember the number of the episode but for people curious about what JJ is

doing at the Astros he was not at the Astros at the time but you'll get an idea of what

46

:

he's doing. He's doing an absolutely tremendous job

47

:

in the R&D department.

48

:

So yeah, I referred to that episode.

49

:

I put in the show notes the link to the sports analytics playlist.

50

:

And I'm sure if you're into sports, that's going to be worth your time.

51

:

But today we have Jacob with us and I'm having you on the show because you're doing a lot

of different things and your background is actually super interesting.

52

:

So yeah, maybe

53

:

Tell us what you're doing nowadays, but mainly how you ended up working on this, because I

know your path is marked by a passion for baseball, sure, but it was still a bit sinuous and

54

:

random.

55

:

So I love that.

56

:

Yeah.

57

:

So currently I serve as the senior director of player development and performance science

for the Houston Astros.

58

:

My path to here was definitely unique.

59

:

I played baseball in high school and then actually went to Missouri State University for

baseball as well, but wound up very quickly realizing that I was much smarter than I was

60

:

good at baseball.

61

:

so wound up actually pursuing an interest in just overall human performance.

62

:

I was very passionate about

63

:

basically, you know, training to be bigger, faster, stronger.

64

:

so, you know, wound up spending a lot of time around the strength and conditioning staff

there, you know, wound up doing some, some internships and really learning as much as I

65

:

could actually outside of school.

66

:

You know, I chose my, my degree is actually marketing and then wound up adding economics

in there.

67

:

But

68

:

was never really big on learning like inside the classroom.

69

:

It was just something that was not a passion of mine.

70

:

So through college, you know, was able to gain a lot of knowledge around just general

kinesiology, strength and conditioning principles.

71

:

And actually, I think it was like my junior year, I was approached by a friend of mine

named Denton McRomey, who was like, hey, man, like,

72

:

we should start a gym, you know, after college, like that's what we should do.

73

:

And actually at first I was like, you're crazy, like starting a business, like that's, I

don't know how to do that.

74

:

But he kind of talked me into it.

75

:

And so that's, you know, after graduating in 2016, moved back to St.

76

:

Louis, Missouri.

77

:

And, you know, we had some connections.

78

:

We played baseball together in high school.

79

:

He went off to play at Rockhurst, but we had some connections in the St.

80

:

Louis area with

81

:

baseball teams and we wound up leveraging those to be able to start training some kids and

got a building and basically kind of step by step, you know, figured out, you how to do

82

:

it, how to run the business.

83

:

And one of the things that we did that turned out to be relatively unique there was, you

know, I was very, we were very passionate about identifying what underlying physical

84

:

qualities, you know, were truly being impacted

85

:

to help improve on-field performance.

86

:

Because deadlifting more or squatting more is definitely important, but there's not

necessarily like a causal relationship to throwing five miles an hour harder.

87

:

But there are certain first principles that we're trying to impact.

88

:

So this is where I started to learn a lot about force plate research and just general

linear physics.

89

:

And we purchased a set of force plates ourselves.

90

:

started jumping athletes, really diving into the movement signatures, the force velocity

profiling.

91

:

And we started testing guys' bat speed, their throwing velocity, and just started keeping

all this information with players.

92

:

And over the course of a couple years, you know, wound up, you know, being able to have some

research around why we do what we do, and we're really enjoying it.

93

:

And then the Houston Astros in 2019 opened up a job called a performance coach.

94

:

And this was traditionally what baseball would call a fourth coach or a development coach,

which coaches first base, you know, maybe coaches defense and base running.

95

:

But this role, they actually expanded to help in performance science.

96

:

So the Astros were actually the first organization to have a sports scientist in baseball.

97

:

So they were very passionate about this.

98

:

part of this role was helping to do the sports science testing, helping to do the workload

monitoring, a lot of the grunt work, quite frankly, but it was definitely insights into

99

:

multiple departments.

100

:

it was something that I, given I never had actually a formal degree in kinesiology or

anything like that, I actually felt like this was my first shot.

101

:

to actually work professionally or work for someone, work for an organization or in sport.

102

:

So I applied and wound up getting the job actually.

103

:

I actually remember Bill Firkus, Pete Putila and Jose Fernandez were the three who I

interviewed with and who wound up hiring me and I'm forever grateful to all three of them

104

:

for taking a chance on me because my resume was nothing spectacular.

105

:

And so I wound up doing that for 2019.

106

:

In 2020, or actually after that season, they offered me a position as a sports

science analyst.

107

:

So I accepted that role, moved down to West Palm Beach with my wife, where the spring

training complex is, and was a sports science analyst for two years.

108

:

And then in 2022, you know, the Astros decided that they wanted to make a bigger

109

:

more formal investment in biomechanics and sports science.

110

:

So they started a performance science department and they asked me to be the director and

build it out.

111

:

So I was very grateful for that opportunity.

112

:

In 2022 and 2023, I was the director of performance science, building out that team and

trying to get that research off the ground.

113

:

then just this past year, at the end of the year, they

114

:

asked to expand my responsibilities again to oversee player development.

115

:

So that is the long story of how I wound up where I'm at.

116

:

Yeah, I love it.

117

:

I love it.

118

:

It's absolutely, absolutely fantastic.

119

:

that's also why I wanted to have you on the show.

120

:

Because as you were saying, also you did quite a lot of weightlifting, which I am

personally very interested about.

121

:

I do that very amateurly in my local gym.

122

:

But something I discovered when digging into the science of weightlifting is

123

:

And that was surprising to me because like, I didn't know anything about that before doing

that myself, diving into the science of it and basically conducting small RCTs on me at

124

:

the gym and, you know, coming up with my own macro cycles and so on.

125

:

So something I was really, really surprised by is by the importance of nutrition.

126

:

Actually, you know, because when you're like, when you start a training like that, you're

like, yeah, the training is like 90% of the results, right.

127

:

But actually, I discovered nutrition is extremely important and is an integral part of the

training program.

128

:

So I'm also curious if that's the case in sports teams.

129

:

like baseball and then like, yeah, basically how do you apply that kind of knowledge that

we have from a much more controlled sport like weightlifting?

130

:

How does that help you in your job today?

131

:

Yeah, that's a really good question.

132

:

absolutely, like nutrition plays a huge role in, I think all professional sport, but

definitely within our organization, we do take it very seriously.

133

:

And yeah, I think that there are, you know, interesting parallels. You know, you talked about

weightlifting specifically, you know, I did spend probably three, three, four years

134

:

competitively weightlifting.

135

:

And, you know, like one example of something that is, it's just a staple in weightlifting

is you have to make a weight class.

136

:

And so, you know, one of things that you have to do is you have to be able to manipulate

your body weight to essentially be

137

:

the lowest body weight that you can possibly be while like lifting the most weight that

you can possibly lift.

138

:

And, you you have weigh ins at a certain time and, you know, so you wind up needing to

basically weigh in at a certain time and then understanding this is what I need to eat and

139

:

when to be able to, you know, lift at my fullest capacity, you know, over X number of

hours later.

140

:

And while the exact scenario is

141

:

significantly different than baseball, right?

142

:

There's no weight classes or anything like that.

143

:

The general principle of being able to understand essentially, you know, what your body

needs to perform at its highest level and how long that takes to get in your system and

144

:

get out of your system is extremely valuable, right?

145

:

And so, you know, on like something like that, that is certainly applicable is, you know,

even something as simple as like caffeine intake.

146

:

You know, we play night games and so we know how important sleep is for overall

performance.

147

:

And it could be very easy for someone to take extreme amounts of caffeine before the game,

you know, because they don't know how long it takes for caffeine to actually get through

148

:

their system.

149

:

And so they wind up actually it not being that useful, you know, for the first portion of

the game.

150

:

And then they wind up basically not being able to sleep for several hours post game.

151

:

So I think even basic principles like that, understanding what to put in your body and

when is extremely impactful.

152

:

Yeah.

153

:

Yeah.

154

:

That's a very good example.

155

:

I'm actually very curious.

156

:

How do you follow that?

157

:

Because like, you cannot be behind the players all the time, right?

158

:

So here you have to also rely, I guess, on the professional character of the

player.

159

:

So I'm guessing there is variation on that.

160

:

How do you guys handle that?

161

:

Because, yeah, in the end, we know about that stuff, but also there is a lot of personal

variation, not only as you were saying on caffeine intake and the timing, but also effect

162

:

of caffeine on people.

163

:

I'm personally very sensitive to caffeine.

164

:

So that's cool because it wakes me up in the morning.

165

:

But definitely I know that if I take caffeine after more or less 12 p.m., I'm gonna have

troubles at night.

166

:

I'm wondering how, yeah, how do you implement that stuff once, like how do you implement

the science on the players?

167

:

Yeah, you know, I'm not gonna lie and say that we have it down perfectly or that all of

our players, you know, follow

168

:

everything to a T.

169

:

We largely rely on education.

170

:

And I think that that's something that resonates with me.

171

:

you know, I think for anyone that has kids, it's not significantly different in that you

can tell them the right thing over and over and over again, but until they believe it

172

:

themselves, they may not do it.

173

:

And so, yeah, to your point, we don't have oversight over these guys.

174

:

all the time, nor do we want to have to.

175

:

So the best thing that we can do is essentially educate them on why it's important for

their performance, why it's important for their careers, and trying to distill complex

176

:

science into very simple but impactful infographics and try and communicate things

visually and essentially get them to believe and understand that if they want to improve

177

:

their performance.

178

:

this is something that they should do.

179

:

And definitely when players do that, you can definitely see it because they take ownership

over their careers and we definitely see changes on the field as well.

180

:

Okay, yeah, I see.

181

:

That must be super interesting.

182

:

So you basically get the players somewhere together and you go through the application of

the science.

183

:

I don't know, like today is about caffeine, tomorrow is about sleep, and next week is

going to be about meal timing, stuff like that.

184

:

Is that how that works?

185

:

Yeah, essentially.

186

:

You know, we have the draft coming up here, July 14th to the 16th, and that's a great

example of like after the draft, we have an onboarding process for all of our players.

187

:

And so they will learn about the Astros' philosophies.

188

:

In many areas, they'll learn about what our hitting philosophy is.

189

:

They'll learn about what our defensive philosophy is.

190

:

They'll learn about what our strength and conditioning philosophy is.

191

:

And one of the things that they'll learn about is our nutrition philosophy.

192

:

And it's definitely on the education side.

193

:

It's why are carbs important?

194

:

Why are fats important?

195

:

Why is protein important?

196

:

How much of that should you intake?

197

:

What are the proper sources?

198

:

And ultimately, you know, we always try and tie it back to on-field performance, you

know.

199

:

So, you know, for example, you know, we can educate players that if, you know, if you're

playing in the field, you know, carbs are important for essentially like high bouts of energy,

200

:

right?

201

:

And if you, one of the key performance indicators of basically being a good defender in

the outfield is how fast you can run.

202

:

and how much ground you can cover.

203

:

And if multiple balls are hit to you, can you do that multiple times?

204

:

And so, it's a non-trivial thing to be able to fuel your body correctly for maximum

effort sprints multiple times over several hours.

205

:

And so, if we can tie it back to basically what they value, I think it has a better chance

of landing.

206

:

Okay, yeah, that's definitely super interesting.

207

:

I love that.

208

:

And yeah, that...

209

:

Also, personally, that timing of things is, I can see very interesting.

210

:

You have also to understand, you know, like there are definitely some moments of

the day where I'm more efficient at the gym than others. Like, I'm definitely much more efficient

211

:

in the morning than at night.

212

:

Right.

213

:

So now I almost never train at night or in the evening if I don't have to.

214

:

And I much rather do that in the morning.

215

:

Also, because I have the caffeine.

216

:

boost, you know, some like, and going to the gym after a full day of work is just like,

that's hard.

217

:

You know, I much prefer to go for a walk, or something like that.

218

:

But definitely something I resonated with, and that's like, that's very anecdotal.

219

:

But you're saying that there are some late night games, right?

220

:

And so you have to take your caffeine at the right moment so that it

221

:

gives you the boost for the game, but at the same time doesn't disturb your sleep.

222

:

So it's a completely different field.

223

:

But I do some stand up from time to time.

224

:

And stand up shows are at night.

225

:

And so I actually have the same issue.

226

:

I came up with that timing stuff, like very nerdy caffeine timing, the other day,

just before a show, because I wanted to have that, but I knew if I

227

:

took my caffeine too late, I would not sleep, for instance, before like

three or 4am which happened to me before.

228

:

So that made me laugh when you talked about that, because I was

like, well, not only, you know, high-level sports professionals have that issue.

229

:

So thank you so much, Jacob, for all the work you do.

230

:

And that's actually useful to many more people than you thought.

231

:

That's good to know.

232

:

You see?

233

:

Well, that's actually the same issue.

234

:

I mean, for any people who have to do some stuff at night where they need to be alert, I

guess that will be useful to them.

235

:

Now, I'm curious also about what you do, the kind of work you do for analyzing player

performance and injury risk, because I know these two topics are extremely important for a

236

:

professional sports team.

237

:

I'm wondering how Bayesian stats are applied here and how they can be helpful.

238

:

Yeah, I think that there's a significant way that just the overall Bayesian framework is

applied.

239

:

And I think if we think about the components of professional sports, being successful in

professional sports, some of them being skill acquisition, cognitive processing, in-game

240

:

strategy, and then obviously kinesiology.

241

:

you know, injury risk.

242

:

Kinesiology is probably the most publicly researched area, you know, of all of them.

243

:

If anyone wants some answers, it's easiest, you know, to Google how to make a player

bigger, faster, stronger, and you'll get dozens of research articles that are

244

:

applicable, which, you know, in my area means that

245

:

we can leverage that information as priors and then be able to apply our observations from

our population to both improve the resolution of the insights that we glean from the data

246

:

that we have, but also to be able to infer maybe where our specific processes or our

specific population might differ from the research population.
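The idea of using published research as a prior and then updating it with observations from one's own players can be sketched with the simplest conjugate case, a normal prior on a mean with known measurement noise. This is only an illustration of the principle, not the Astros' method: the function name `normal_update`, the strength metric, and every number below are hypothetical.

```python
def normal_update(prior_mean, prior_sd, obs, obs_sd):
    """Posterior mean and sd of a normal mean, given a normal prior
    and observations with known noise sd (conjugate update)."""
    prior_prec = 1.0 / prior_sd ** 2          # precision of the literature prior
    data_prec = len(obs) / obs_sd ** 2        # precision contributed by our data
    post_prec = prior_prec + data_prec        # precisions add
    sample_mean = sum(obs) / len(obs)
    # Posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    return post_mean, post_prec ** -0.5


# Hypothetical numbers: the literature suggests a mean of 100 (sd 10) for
# some strength metric; five measurements come from our own population.
post_mean, post_sd = normal_update(
    prior_mean=100.0,
    prior_sd=10.0,
    obs=[112.0, 108.0, 115.0, 110.0, 105.0],
    obs_sd=10.0,
)
print(f"posterior mean={post_mean:.2f}, sd={post_sd:.2f}")
```

The posterior lands between the literature value and the in-house average, and the gap between the two is itself informative about how the local population differs from the research population.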

247

:

Okay, okay, I see.

248

:

And what would you say is the state of the science on these

fronts?

249

:

Like, are we somewhat confident?

250

:

Or is that something that's really at the frontier and that's evolving almost every year?

251

:

I think that there are aspects of it that we are very confident in.

252

:

And there are aspects of it that are definitely evolving.

253

:

So an example of aspects that we are confident in is like we are very confident in how

specific musculature and their functions apply to injuries and human performance.

254

:

Very confident in the static state.

255

:

You know, I think that one area that the research is improving in is

understanding maybe how these function in a

256

:

in a dynamic state.

257

:

And an example of that would be, you know, it's maybe easy to take a look at hamstring

strength, right, and player's hamstring strength and then track that over a season and

258

:

see, okay, who hurts their hamstring more or less, right?

259

:

But it's, you know, there are certainly aspects to like sprint mechanics, that impact

that.

260

:

that maybe are less obvious because there's not quite as much quantifiable information on

it right now.

261

:

It also requires essentially getting more nuance and understanding what is the muscle

doing at the time of injury.

262

:

And given that injuries are relatively sparse in nature when compared to non-injured

instances, that type of information is tough to come by.

263

:

But there are definitely people doing good work and trying to understand how

264

:

coordination fits into injury mitigation.

265

:

So that's one area that we're improving.

266

:

But I do think it's very good and we're very confident in overall, how does musculature

impact injury risk?

267

:

Okay.

268

:

What is a question or topic in particular in that realm that you'd love to see answered in

the coming month?

269

:

that you're really curious about.

270

:

Well, I guess, you know, I don't know if this is specific enough.

271

:

It's definitely not in the coming months.

272

:

But you know, one thing that like we are always pursuing in the baseball industry is one

of the things that is most important to a pitcher being successful on the field is how

273

:

hard they throw.

274

:

That's, that's like pretty common.

275

:

The harder you throw, generally, the better the results are going to be.

276

:

However, we also know from

277

:

external research that how hard you throw is pretty much the driving factor to whether or

not you're going to get hurt.

278

:

You know, you put more torque on the elbow and more strain and that winds up essentially

escalating your injury risk a ton.

279

:

And so like one of the things that I think we've been trying to look at is

280

:

tendon and ligament adaptations and trying to understand, can we periodize workload of a

pitcher to be able to maximize their in-season performance and mitigate their injury

281

:

risk?

282

:

Because ultimately, the answer of throw the ball slower is not gonna work.

283

:

I think baseball has tried to take the approach of

284

:

just throw less overall and injuries continue to increase.

285

:

So, you know, I think that there's, I don't know the answer to the question.

286

:

I don't think external research will get to it.

287

:

Hopefully, you know, we're able to get to it internally.

288

:

Yeah.

289

:

Yeah.

290

:

I guess that's, that would be quite, be quite interesting, I'm guessing.

291

:

And what about the, so I guess you talked a bit, a bit about that right now, but what do

you,

292

:

see? Like, how do Bayesian models help in predicting the impact of training loads on the

athletes' well-being and performance in general, like not only injury?

293

:

Yeah. You know, I think when it comes to training loads, you know, we know that there

are broad truths about how stress, you know, impacts the human body.

294

:

We also know that there are nuances around how specific people or players adapt to

stresses.

295

:

So we can essentially use those broad truths to overcome sparse data where we may not have

a whole lot of information on any specific player.

296

:

But we do have a few observations.

297

:

And then if we combine that with something like a multi-level model, we can

298

:

then really glean some robust insights where maybe robust data actually doesn't exist.
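To make the multi-level idea concrete, here is a minimal sketch of partial pooling: each player's estimate is pulled toward the group mean, and players with fewer observations are pulled harder. It assumes the within-player noise (`sigma`) and between-player spread (`tau`) are known, which a full hierarchical model fitted in PyMC or Stan would instead estimate from the data. The player names and values are made up.

```python
def partial_pool(player_obs, sigma, tau):
    """Shrink each player's sample mean toward the grand mean.

    sigma: assumed within-player measurement sd
    tau:   assumed between-player sd of true values
    """
    all_obs = [y for obs in player_obs.values() for y in obs]
    grand_mean = sum(all_obs) / len(all_obs)
    estimates = {}
    for name, obs in player_obs.items():
        n = len(obs)
        # Weight on the player's own data grows with the sample size.
        w = n / (n + sigma ** 2 / tau ** 2)
        estimates[name] = w * (sum(obs) / n) + (1 - w) * grand_mean
    return grand_mean, estimates


# Hypothetical workload-response measurements for three players.
data = {
    "player_a": [30.0, 32.0, 31.0, 29.0, 33.0, 31.0, 30.0, 32.0],
    "player_b": [40.0],            # a single, extreme observation
    "player_c": [27.0, 29.0],
}
grand_mean, est = partial_pool(data, sigma=3.0, tau=2.0)
for name, value in est.items():
    print(name, round(value, 2))
```

The player with one extreme observation is shrunk strongly toward the group, while the player with eight observations barely moves. That is the sense in which broad truths can compensate for sparse individual data.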

299

:

I see.

300

:

Yeah.

301

:

Yeah, for sure.

302

:

I mean, that's definitely where, where hierarchical models would definitely be super

helpful.

303

:

Like if you can relate the different, the different positions and the different players

and the different population of players, definitely super powerful.

304

:

And so, you also work on

305

:

athlete conditioning, right?

306

:

And you do that, like you use the science of that to improve the training of the

players, is that right?

307

:

Yes.

308

:

Yeah.

309

:

Okay.

310

:

So how do you use, like, how do you do that?

311

:

And how do you use Bayesian approaches here?

312

:

Yeah.

313

:

Good question.

314

:

So I mean, I think it's not unrelated to the last answer that I gave, but one of the

limitations of overall injury prevention is the amount of data that can be collected.

315

:

We only have so many players that come through our system in any given year or even

through a couple of years.

316

:

And then we only have so many samples even within a given player.

317

:

And especially on an injured population, right?

318

:

The injured population is significantly less than the healthy population to the point

where, you know, if you have one or two players who maybe got injured with what looks like

319

:

healthy data, you know, it can be difficult to discern.

320

:

You know, even if we go back to leveraging previous research, you know, I think

321

:

If we take the example, stick with hamstring strength and hamstring injuries.

322

:

If we have hamstring strength information on players, we can absolutely take information

from research and say, we believe within a certain degree of certainty that this is what a

323

:

healthy hamstring signature or force profile would actually look like.

324

:

And we can play around with how confident we are in that.

325

:

you know, to basically see what gets us closest to the actual outcomes.

326

:

And then that allows us to, you know, obviously be more confident in what we're looking

at.
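"Playing around with how confident we are" in a research-based prior can be read as a prior sensitivity check: fit the same model under tighter and looser priors and see which predicts held-out observations best. The sketch below does this with a normal prior and the log predictive density as the score. The scoring choice, the prior widths, and all data are assumptions for illustration only.

```python
import math


def posterior(prior_mean, prior_sd, obs, obs_sd):
    """Conjugate normal update; returns posterior mean and variance."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(obs) / obs_sd ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    sample_mean = sum(obs) / len(obs)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * sample_mean)
    return post_mean, post_var


def log_normal_pdf(y, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)


# Hypothetical force-profile measurements: a few to fit on, a few held out.
train = [108.0, 112.0, 110.0]
held_out = [109.0, 111.0, 113.0, 107.0]
obs_sd = 10.0

scores = {}
for prior_sd in (1.0, 5.0, 20.0):   # very confident ... weakly confident prior
    mean, var = posterior(100.0, prior_sd, train, obs_sd)
    # Predictive variance = posterior uncertainty + measurement noise.
    scores[prior_sd] = sum(
        log_normal_pdf(y, mean, var + obs_sd ** 2) for y in held_out
    )

best = max(scores, key=scores.get)
print(scores, "best prior sd:", best)
```

In this made-up case the in-house data sit well away from the literature value, so an overconfident prior predicts the held-out points worst. With real data the answer could go either way, which is the point of checking.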

327

:

Yeah, yeah, that makes sense.

328

:

That, I mean, that sounds pretty challenging, but that does make sense.

329

:

So from all that you're saying here, something I can really see is that, yes, if you look

at the

330

:

You know, like one question in particular, the data can be limited.

331

:

But if you look at the overall amount of data, and definitely in comparison to other

sports, baseball is quite rich in data.

332

:

Because you have inputs from game statistics, you have player tracking systems, you have

physiological data, you have a lot of these sources. So how do you integrate these

333

:

diverse data sources to then provide interesting insights?

334

:

Yeah, I think it, from my perspective or just my opinion on it, I think it starts by

layering the data properly, which I think to me means understanding what level of

335

:

information is important depending on the question being asked or what level of

information do we need to start with.

336

:

And so, for example, if we want to know, let's say how we can make someone a better

outfielder, right?

337

:

If we first jump right to, well, what's their reactive strength index from the force plates,

and the reactive strength index is low, hamstring strength is low, I think we're going to

338

:

lose a lot of people, right?

339

:

There's not going to be a whole lot of people that will immediately make that connection

and say,

340

:

yeah, that makes sense.

341

:

We fix that, he'll be a better outfielder.

342

:

But if we start with maybe asking the question, how many runs is this player worth as a

defender?

343

:

How many runs has he saved as a defender?

344

:

Which may come from our ball tracking data, right?

345

:

That may come from understanding which balls were hit to him.

346

:

How many other defenders would have actually made that play on average?

347

:

Something simple like that.

348

:

you know, then we can maybe work backwards to the next level of information and say, well,

it looks like maybe he doesn't catch quite as many balls as the average outfielder because

349

:

he's not as fast.

350

:

Like he's slower than average as well.

351

:

And we know that the amount of ground that you can cover is certainly important.

352

:

Then I think you can make the next step to the physiological data and say, he

doesn't produce a whole lot of force and he's not super strong.

353

:

So now,

354

:

people start to actually link the two and say, okay, now I can see how improving his

hamstring strength and his force production qualities makes him a better outfielder.
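
The first layer described above (which balls were hit to him, and how many other defenders would have made that play on average) can be sketched as a "plays above average" tally converted to runs. This is a generic illustration, not the team's metric: the `BattedBall` structure, the catch probabilities, and the flat run value of 0.8 per play are assumptions.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    catch_prob: float  # assumed share of average defenders who make this play
    caught: bool       # did this defender make the play?
    run_value: float   # assumed run difference between an out and a hit


def plays_above_average(balls):
    """Plays made minus the plays an average defender would be expected to make."""
    return sum((1.0 if b.caught else 0.0) - b.catch_prob for b in balls)


def runs_saved(balls):
    """Same tally, with each play weighted by its assumed run value."""
    return sum(((1.0 if b.caught else 0.0) - b.catch_prob) * b.run_value for b in balls)


# Hypothetical sample of balls hit toward one outfielder.
sample = [
    BattedBall(0.95, True, 0.8),   # routine play, made
    BattedBall(0.50, False, 0.8),  # coin-flip play, missed
    BattedBall(0.20, True, 0.8),   # difficult play, made
    BattedBall(0.75, False, 0.8),  # fairly easy play, missed
]

print(f"plays above average: {plays_above_average(sample):+.2f}")
print(f"runs saved:          {runs_saved(sample):+.2f}")
```

A negative total is the starting point for working backwards: first to speed and ground covered, then to the strength and force data.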

355

:

Okay, yeah, that's fascinating.

356

:

It's like, yeah, different hints basically that you're picking up from the data.

357

:

Yeah, yeah, essentially, and making sure that basically each one is applied at the right

time.

358

:

Yeah, yeah. And well, I think that's...

359

:

And a question I have that's related to that also is, then, what do you think are the

most significant challenges that you face?

360

:

Not only you, but the whole, you know, science team: what are the challenges that

you face when you're applying Bayesian stats in baseball science?

361

:

And how do you address them?

362

:

Really good question.

363

:

I mean, actually, I think that

364

:

The largest challenge is actually getting people that are extremely familiar with Bayesian

methods and fluent in Bayesian methods.

365

:

And I would not consider myself a Bayesian expert by any means.

366

:

And the field of sports science doesn't teach this.

367

:

It's very limited in the statistical methods.

368

:

that it actually teaches.

369

:

So I guess maybe another tangent, but it's related, is that

people in the field of sports science tend to be very tied to previous research, to the

370

:

methods that previous research has used, right?

371

:

So they'll come in and they'll say, I want to do project X and...

372

:

this paper, these two papers were written on this project and they did it this way.

373

:

So this is exactly how I want to do it.

374

:

These are the statistical methods that were used.

375

:

And a lot of times these papers are written by very, very intelligent strength and

conditioning coaches or, you know, exercise physiologists, but

376

:

they're not written by people with strong stats backgrounds.

377

:

So I think getting people in the field that are actually familiar with this type of

approach.

378

:

is the largest obstacle.

379

:

But I do think once we get people with that skill set, there's actually very few barriers

to it, just given, I think, two things.

380

:

The first one being that the number of tools available for Bayesian methods across

both Python and R, with very simple syntax and fast computation, has expanded

381

:

tremendously.

382

:

you know, just over the last six or seven years since I've been paying attention.

383

:

And I also think that how we communicate Bayesian stats generally

aligns with how people think.

384

:

People know that there are uncertainties around every decision that is made.

385

:

And we know that some uncertainties are wider than others.

386

:

And depending on our risk tolerance, you know, that may factor in more so than just a

single point estimate.

387

:

And so I do think that overall, communicating them, I think that's actually one of the

strengths of Bayesian approaches.
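
The point above, that the width of the uncertainty and the decision-maker's risk tolerance can matter more than a single point estimate, can be sketched with two hypothetical posteriors. The two "training plans", their normal posteriors, and the 10% downside quantile are assumptions chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical posteriors for the benefit of two training plans (arbitrary units).
plan_a = NormalDist(mu=5.0, sigma=1.0)  # modest benefit, narrow uncertainty
plan_b = NormalDist(mu=6.0, sigma=4.0)  # larger expected benefit, wide uncertainty


def summarize(posterior, downside_quantile=0.10):
    """Point estimate plus two risk-oriented summaries of the same posterior."""
    return {
        "mean": posterior.mean,
        "downside": posterior.inv_cdf(downside_quantile),  # "bad case" outcome
        "p_negative": posterior.cdf(0.0),                  # chance the plan backfires
    }


for name, posterior in (("plan A", plan_a), ("plan B", plan_b)):
    s = summarize(posterior)
    print(f"{name}: mean={s['mean']:.2f}, 10% downside={s['downside']:.2f}, "
          f"P(benefit < 0)={s['p_negative']:.3f}")
```

Ranked by the mean alone, plan B looks better. A risk-averse decision-maker looking at the downside quantile would prefer plan A.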

388

:

Okay.

389

:

I see.

390

:

Yeah, that's very interesting.

391

:

It's a mix of not only the data and its availability, and also, I

guess, the importance of having at least a part of the organization focused on that, but

392

:

it's also a technical side in the sense that

393

:

You definitely need people who are able to work with these kinds of methods that

you're using a lot and Bayesian stats are definitely a very important part of that

394

:

workflow.

395

:

Yeah, yeah, absolutely.

396

:

Yeah.

397

:

And actually how, like, because you have to communicate, as you were saying, your findings

and the results of your models to a lot of different stakeholders.

398

:

So how do you do that?

399

:

I know from experience that it can be challenging.

400

:

So how do you communicate these complex statistical concepts like those from Bayesian

analysis to coaches and players to ensure that they are effectively utilized?

401

:

Yeah, that's a non-trivial task as well.

402

:

I do think one of the things that we try and do is

403

:

we do communicate it in different ways to different groups of people, right?

404

:

I think when talking with JJ's group in R&D, we're actually gonna wanna be as technical

as possible because we actually want their input on the methods and they're gonna wanna

405

:

know, can I trust these results based on the process?

406

:

If I'm communicating with a coach or a scout,

407

:

they don't care about that, right?

408

:

If I'm communicating with them, it's actually more so like one of the general approaches

that we take is, we distill the information that we have down to as few dimensions as

409

:

possible.

410

:

So, oftentimes what that looks like is maybe at most three or four dimensions where

obviously if we're relaying it in a graph, we have our X, Y axis and then maybe it's

411

:

you know, gradiented with a specific color and faceted by different positions, right?

412

:

So, you know, for example, if we're trying to communicate the elbow injury

risk of a pitcher, we might take a look at the, you

413

:

know, x axis being shoulder strength, the y axis maybe being lower body strength or

something like that, it may be gradiented out.

414

:

by injury risk or probability and it may be faceted by how hard you throw.

415

:

So that way, we can communicate four different variables, but very, very simply put and

hopefully easy to distill down.
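
The four-variable layout described above (x axis, y axis, color gradient, facets) can be sketched without any plotting library by rendering each facet as a small text heatmap. Everything numeric here is invented for illustration: the logistic risk surface, its coefficients, the z-score grid, and the velocity bands are assumptions, not the team's model.

```python
import math


def injury_risk(shoulder, lower_body, velocity_mph):
    """Toy logistic risk surface; coefficients are made up for illustration."""
    z = -3.0 - 0.8 * shoulder - 0.5 * lower_body + 0.15 * (velocity_mph - 90.0)
    return 1.0 / (1.0 + math.exp(-z))


SHADES = " .:*#"  # light to dark stands in for the color gradient (low to high risk)


def facet_panel(velocity_mph, levels=(-2, -1, 0, 1, 2), max_risk=0.30):
    """One facet: rows are lower-body strength (high at top), columns are shoulder strength."""
    rows = []
    for lower in reversed(levels):   # y axis, as z-scores
        row = ""
        for shoulder in levels:      # x axis, as z-scores
            r = injury_risk(shoulder, lower, velocity_mph)
            idx = min(int(r / max_risk * len(SHADES)), len(SHADES) - 1)
            row += SHADES[idx]
        rows.append(row)
    return rows


for velo in (88, 94, 100):           # facets: how hard the pitcher throws
    print(f"velocity {velo} mph")
    print("\n".join(facet_panel(velo)))
    print()
```

In practice the same structure would map onto a faceted chart in a plotting library. The point is that one picture carries all four variables.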

416

:

I see.

417

:

And what do you, in your experience, what are the most common challenges of consumers of

these models?

418

:

Be they players, coaches, or people from the business side.

419

:

What do you see as the main difficulties and what would you recommend?

420

:

What would be your advice to listeners who have to do the same at work?

421

:

Maybe not for coaches and players, but for other stakeholders who are not part of the

model building team, but have to use the models.

422

:

in their own work?

423

:

Yeah, so I think that there are two obstacles and they're actually kind of probably

competing obstacles.

424

:

I mean, the first one is, you know, we want to be as concise, as quick as possible, right?

425

:

We don't want to say, okay, you know, look at this visual, then this visual, then this

visual, then this visual to make your decision, right?

426

:

If we can encompass it all in a single visual or

427

:

you know, a single pillar of philosophy, that's what's going to resonate.

428

:

Otherwise, you know, they may forget or if it gets too complex, they may not even try and

use it.

429

:

The second obstacle is actually, you know, related to that: when we do that, I think we

run the risk of glossing over or smoothing over a lot of information, maybe

430

:

meaningful information. Maybe it's nuanced, but nuanced cases do come up, right?

431

:

And so we don't want to overgeneralize in an effort to simplify too much.

432

:

so, you know, like one of the things that we have tried to do, and I'm not saying that we

are great at it, so maybe other people have better approaches, but we have tried to keep

433

:

the information or the philosophy or the tagline as simple as possible, but then try and

highlight, you know, these scenarios.

434

:

maybe possible scenarios where it's worth, if the results don't look intuitive to you, ask

a question.

435

:

And we just try and highlight where these possible scenarios could go wrong, where

certainly we want people to actually think through it.

436

:

And if they see a result that says, I don't think this is right, this doesn't make any

sense to me, as an expert in their field, just ask the question, or please just don't take

437

:

these at face value all the time.

438

:

Hmm, yeah, definitely.

439

:

I think it's something very useful.

440

:

So in my experience, making sure to communicate not only what the model can do, but also

and maybe most importantly, what it cannot do.

441

:

And that way, that will mitigate a lot of these issues of over or under confidence in the

model.

442

:

Because, I mean, it's well documented in the science that

443

:

humans have a different way of handling uncertainty around algorithmic decisions, right?

444

:

We tolerate much more the fact that a human is gonna underperform and be wrong in a

special case, but algorithms, when they are wrong in just one instance, then people will

445

:

lose trust extremely fast.

446

:

in the algorithm.

447

:

So yeah, I think it's something to be very careful of when we communicate our model

because well, people will be way more, will be way harsher on the model than on a scout,

448

:

for instance, right?

449

:

A scout can be wrong many more times than a model for recruiting players can be

because of that.

450

:

bias that humans have.

451

:

So I think it's very important to communicate that as you were saying, and also to

communicate that the model is not just a machine.

452

:

The model is made by humans.

453

:

So be a bit kinder to it. Yeah.

454

:

Yeah.

455

:

I think a good example of that, at least for us, and you put it well, is communicating what the

model can't do.

456

:

For us, our injury risk models, I think,

457

:

fall a lot in that category where, you know, if we

actually communicated the exact probabilities that the model outputs of someone getting

458

:

hurt, it's almost always going to say that the odds are they don't get hurt because those

are the true odds, right?

459

:

That at any given time, if someone goes out there and plays, the odds are that they won't

get hurt.

460

:

And so then, you know, we can communicate that this model that we're using is

461

:

not necessarily to make the prediction of whether or not this player is going to get

hurt.

462

:

It's to infer what physical qualities or what features are actually important that we can

impact that lead to more or less risk.

463

:

And so maybe it's less about is this player at 45% risk or 35% risk.

464

:

It's more about what do we deem as important that would put that player at more or less

risk and then is that worth it?
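
The distinction drawn above, between predicting who gets hurt and inferring which qualities move risk, can be sketched with a toy logistic model. The intercept, the coefficient, and the single "weak hamstring" feature are hypothetical values chosen for illustration, not fitted results.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients on the log-odds scale.
INTERCEPT = -3.5            # low baseline: about a 3% chance of injury
COEF_WEAK_HAMSTRING = 0.9   # assumed effect of a weak-hamstring flag


def risk(weak_hamstring):
    return sigmoid(INTERCEPT + COEF_WEAK_HAMSTRING * (1.0 if weak_hamstring else 0.0))


baseline, flagged = risk(False), risk(True)

# As a yes/no prediction the model says "not injured" for both profiles...
print("predicted injured?", baseline > 0.5, flagged > 0.5)

# ...but as an inference tool it shows how much the feature moves risk.
print(f"risk: {baseline:.3f} vs {flagged:.3f}")
print(f"relative risk: {flagged / baseline:.2f}")
print(f"odds ratio:    {math.exp(COEF_WEAK_HAMSTRING):.2f}")
```

Both probabilities stay well under 50%, so the hard prediction never changes. The useful output is the size of the effect, which is what informs a training decision.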

465

:

Yeah.

466

:

Yeah.

467

:

So basically communicating all the uncertainties around the decision to make.

468

:

Nice.

469

:

Yeah.

470

:

Cool.

471

:

Well, I think I've already asked you a lot of very precise, you know,

science questions.

472

:

So maybe now to play us out a bit more looking towards the future.

473

:

Are there any emerging trends?

474

:

that you see in baseball science that you believe will significantly impact how teams

manage training and performance in the near future.

475

:

And yeah, are there also some breakthroughs that you would really want to see?

476

:

Yes.

477

:

So actually, as far as future,

478

:

you know, I guess, more or less like innovations in this field.

479

:

One of the things that makes me very excited about my role is I actually do believe like

performance science in general fits into that category.

480

:

And I guess more specifically as it relates to biomechanical information.

481

:

So like we've talked a lot about just general kinesiology and physiology in this

conversation.

482

:

You know, I think it was three or four years ago,

483

:

Major League Baseball rolled out Hawk-Eye information, which is tracking the individual

joints of every single player.

484

:

And that is where a lot of injury research comes from, especially in the baseball field, around

the torque of the elbow and things like that.

485

:

So I'm very excited about that data set.

486

:

And I believe that that's where the arms race

487

:

is in baseball: who can leverage that information the best.

488

:

As far as breakthroughs that I'm hoping for, I don't know, maybe I could probably change

my answer if you would like me to change it.

489

:

But I actually think that, not necessarily on the research side, but in the quality of the

computer vision algorithms and the player tracking, I'm hoping that

490

:

breakthroughs occur there and maybe even more specifically the speed at which those

algorithms or those models are processed.

491

:

And I guess that's for two reasons.

492

:

First of all, when we're talking about elbow torque, the difference of one inch of the

wrist placement is exponentially more in degrees, which is exponentially more in force or

493

:

torque.

494

:

And so

495

:

if a model misses by an inch, that's significant.

496

:

And that's a very high standard for a computer vision model.

497

:

It's a high standard for the human eye.

498

:

But ultimately, if you want to get the most precise information possible, that's where I

think some of the innovation will come from.
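
To give a feel for why a one-inch tracking miss matters, here is a small geometric sketch: if a tracked wrist position is off by some distance perpendicular to the forearm, the estimated forearm angle shifts by the arctangent of that error over the segment length. This is plain geometry, not a torque model, and the 10-inch forearm length is an assumption for illustration.

```python
import math


def angle_error_deg(position_error_in, segment_length_in):
    """Angle shift (degrees) from a perpendicular position error at the end of a segment."""
    return math.degrees(math.atan2(position_error_in, segment_length_in))


FOREARM_LENGTH_IN = 10.0  # assumed elbow-to-wrist length, in inches

for miss in (0.25, 0.5, 1.0):
    err = angle_error_deg(miss, FOREARM_LENGTH_IN)
    print(f"wrist off by {miss:.2f} in -> forearm angle off by about {err:.1f} degrees")
```

Under this assumption a one-inch miss moves the estimated joint angle by several degrees, and any force or torque estimate built on that angle inherits the error.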

499

:

And then in a practice setting, there's a lot of research around

500

:

feedback loops and skill acquisition and basically being able to provide a target and then

just providing that player with feedback of whether or not they hit that target and how

501

:

far they were.

502

:

And just given the complexity of the computer vision models and the size and the compute

power, those results, those biomechanical results don't come back for an hour or two,

503

:

which is fast, but it's not, you know, fast enough;

504

:

we'd want to be able to use it inside of a minute, right?

505

:

To really get to apply it in a practice setting.

506

:

And so, yeah, those are maybe not specific kinesiology or physiology innovations, but I'm

hoping that somebody can figure that out in the next several years.

507

:

Yeah, I mean, yeah, for sure.

508

:

That's like, I agree, that sounds absolutely amazing.

509

:

So listeners, you've heard Jacob like...

510

:

get going on that if you're a fan of computer vision algorithms in baseball.

511

:

Definitely that would be used by the Astros.

512

:

And I'm guessing a lot of other teams.

513

:

Yeah, that's super cool.

514

:

I completely agree.

515

:

And, well, I think that's the show, Jacob.

516

:

I mean, I think we've already covered a lot of topics.

517

:

Before we close up,

518

:

I have the last two questions I ask everybody, of course, at the end of the show.

519

:

But is there a topic you would have liked to mention but I failed to ask you about?

520

:

I actually think that we covered it.

521

:

I mean, these are probably my three favorite topics: baseball, human performance, and

maybe statistical methods.

522

:

So I think we hit on it all.

523

:

Well, I'm glad.

524

:

to hear that.

525

:

So then, let's play us out.

526

:

First question, if you had unlimited time and resources, which problem would you try to

solve?

527

:

Yeah, I...

528

:

Is this specific to my field or just in general?

529

:

No, just in general.

530

:

Yeah, with unlimited time and resources, I'd probably dive into that.

531

:

But then more specifically, like in my field, I would absolutely love to be able to solve

the elbow injury risk with pitchers.

532

:

I think it's something that is just an extremely complex problem.

533

:

And I very much enjoy complex problems.

534

:

And there's an extremely high return on investment.

535

:

I think, for someone who can help with that.

536

:

Yeah, I mean, I'm really impressed, because the players play such a number of games per

year.

537

:

It's absolutely incredible.

538

:

Me, honestly, I was anchored on European sports teams.

539

:

So like in football,

540

:

I mean, soccer, they will play at most,

541

:

let's say, 50 or 60 games per season; rugby is less.

542

:

So like, yeah, when I started working in baseball and I saw the number of games that these

guys play per year at such a high level, I'm honestly surprised that they don't get

543

:

injured more often.

544

:

And yeah, like I understand why you're saying the elbow injury because like, yeah, that

was one of my first questions when I started looking at the data.

545

:

It was like, damn, but the pitchers must throw, I don't know how many thousand balls in

each season.

546

:

And that's not even counting the training.

547

:

So the amount of joint pain and risk that you have with that is absolutely incredible.

548

:

I really don't know how they don't get injured more often, to be honest.

549

:

Yeah, I agree.

550

:

What they do and what they go through is impressive.

551

:

Yeah, 162, that's a lot of games.

552

:

damn.

553

:

And is that actually, so maybe last question before the very last question, do you see

any, like is the pitcher position really the one that's the most at risk for injury or is

554

:

that pretty much

555

:

widespread across the positions, or do you have some positions that are much more

prone to injury?

556

:

No, I mean, it's pretty centralized at the pitcher position.

557

:

There are definitely injury risks all over the field, but in terms of the biggest, I mean,

the injury risk on the mound is exponentially higher than any other injury.

558

:

I think if we look at the

559

:

the game of baseball, the throwing motion is probably the only one that like truly pushes

the limits of the human body.

560

:

You know, sprinting, no offense to any of my baseball players, love you guys, but

they're not the fastest in the world.

561

:

You know, they're not pushing that barrier.

562

:

They're not the strongest, you know, in the world, but that right or left arm and the

delivery is moving, you know, the fastest in the world.

563

:

And so I think that's the one that pushes the boundaries the most.

564

:

Yeah.

565

:

Okay.

566

:

Interesting.

567

:

Yeah.

568

:

I mean, I'm not surprised, but that's good to hear you say that.

569

:

Yeah.

570

:

I mean, the, you know, number of pitches they have to make is just, like, I

can't believe that.

571

:

It's just, it's just absolutely incredible.

572

:

And also, like, if you have any baseball players listening to that episode, well done.

573

:

that's like, that's impressive.

574

:

Like, you let me know if you have some.

575

:

Houston Astros players listening to that episode.

576

:

That's like great publicity.

577

:

We need to, you know, advertise that on social media.

578

:

It's like that.

579

:

That'd be quite amazing.

580

:

Actually, you know, do we have any studies about pitchers who retire and

how their joints age?

581

:

Because I know for US football, for instance, that can be quite a big issue: they can still be

at a high injury risk even after their professional career.

582

:

Is that the case also in baseball?

583

:

Or do we not know about that?

584

:

You know, that's a good question.

585

:

I'm not gonna say that there's not research.

586

:

I, you know, there could be something that I'm not aware of.

587

:

But I haven't read personally, you know, any, any research on it.

588

:

So, yeah, I'm not aware of any.

589

:

Okay, yeah.

590

:

Yeah, I'd be interested in that.

591

:

If anybody in the audience knows about that, let us know.

592

:

And well, finally, last question for you, Jacob.

593

:

If you could have dinner with any great scientific mind, dead, alive, or fictional, who

would it be?

594

:

Oh, man.

595

:

You know, I'm going to go with fictional.

596

:

And I'm going to go with Tony Stark as Iron Man.

597

:

I'm a big fan of the Marvel movies.

598

:

And so I think he's the one that I'd like to have dinner with.

599

:

That's a great answer.

600

:

I have never had that one on the show.

601

:

So yeah, you're the first one.

602

:

But I understand.

603

:

That's definitely my favorite of all the Marvel superheroes.

604

:

So yeah, I love that.

605

:

Yeah, that would

606

:

Definitely be super cool.

607

:

Would you ask him if you can fly the iron suit?

608

:

Definitely.

609

:

And I'd hope that he would say no, but I would have to ask.

610

:

Yeah, I mean, yeah, for sure.

611

:

Yeah, I would definitely ask.

612

:

Like you should probably also ask if you could play baseball with the iron suit.

613

:

That'd be probably super fun.

614

:

Yeah, that might be my only chance to make it professionally.

615

:

Yeah, I mean, with the iron suit.

616

:

You must throw pretty fast.

617

:

Like, you should think about that, Jacob.

618

:

That would mitigate injury risk a lot.

619

:

Yeah, probably.

620

:

Well, on that note, I think it's the perfect time to close.

621

:

So thank you so much, Jacob.

622

:

That was a pleasure to have you on the show.

623

:

Thanks again, JJ, for putting us in contact.

624

:

As usual.

625

:

We'll add links to your website and socials and any resource that you think is interesting

for listeners who want to dig deeper and start learning about baseball science, sports

626

:

science in general, and baseball analytics.

627

:

Thanks again, Jacob, for taking the time and being on this show.

628

:

Thank you very much, Alex.

629

:

I really enjoyed it.

630

:

This has been another episode of Learning Bayesian Statistics.

631

:

Be sure to rate, review, and follow the show on your favorite podcatcher, and visit

learnbayesstats.com for more resources about today's topics, as well as access to more

632

:

episodes to help you reach true Bayesian state of mind.

633

:

That's learnbayesstats.com.

634

:

Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran.

635

:

Check out his awesome work at bababrinkman.com.

636

:

I'm your host.

637

:

Alex Andorra.

638

:

You can follow me on Twitter at alex_andorra, like the country.

639

:

You can support the show and unlock exclusive benefits by visiting patreon.com/learnbayesstats.

640

:

Thank you so much for listening and for your support.

641

:

You're truly a good Bayesian change your predictions after taking information and if

you're thinking I'll be less than amazing.

642

:

Let's adjust those expectations.

643

:

Let me show you how to be a good Bayesian Change calculations after taking fresh data in

Those predictions that your brain is making Let's get them on a solid foundation
