Philippe Pasquier on MIDI-GPT and Creative AI
Episode 24 • 16th May 2025 • Kunstig Kunst: Kreativitet og teknologi med Steinar Jeffs • Universitetet i Agder
00:00:00 01:03:47


Shownotes

Philippe Pasquier is a professor at Simon Fraser University and director of the Metacreation Lab for Creative AI. In this episode, he shares insights from two decades at the intersection of artificial intelligence, creative tools, and generative music.

We talk about what creativity means in the context of computer science and how AI systems like MIDI-GPT are designed to collaborate with musicians rather than replace them. Philippe explains how MIDI-GPT works, why prompting with music instead of text might be more useful for composers, and how the system allows for granular musical control.

The conversation also explores the limitations of current AI tools, ethical questions around training on data from deceased artists, and how digital infrastructure might be governed in the future. Philippe speaks openly about the challenges independent labs face compared to tech giants and argues for more public and artist-centered AI development.


Transcripts

Speaker:

So, hi and welcome.

2

:

Philippe Pasquier.

3

:

You are a professor at Simon Fraser University School of Interactive Arts and Technology,

where you direct the Meta Creation Lab for Creative AI.

4

:

You also lead a research creation program around generative systems for creative tasks.

5

:

And as such, you're a scientist specialized in artificial intelligence, software designer,

multidisciplinary media artist, educator, and a community builder.

6

:

Pursuing a multidisciplinary research creation program, your contributions bridge fundamental research on generative systems, machine learning, affective computing and

7

:

computer assisted creativity, applied research in the creative software industry and

artistic practice in interactive and generative art.

8

:

Does that sound about right?

9

:

Hello Steinar, yes, that sounds about right.

10

:

Excellent.

11

:

And thank you so much for joining the podcast.

12

:

I have a pretty broad question for you to start with, and that is: what is

creativity and what does it mean for you?

13

:

Yeah, well, you know, creativity is sort of a very trendy concept in our day and age.

14

:

It has gained value, apparently, uh in every possible way in the last uh few years and

decades.

15

:

um Since uh the rise of the individual, um it has become an important value, sort of grown

and made popular in our society uh through...

16

:

the lens of economical growth.

17

:

There was this idea that the creative individual would be a more productive individual in

our society and it became a value.

18

:

When it comes to defining it, that has been a challenge for the last 2000 years where

philosophers all over the world have discussed the notion of creativity and that's what in

19

:

philosophy we call an essentially contested concept, meaning that there's no agreed upon

definition really and uh

20

:

and the debate is going on.

21

:

And so at the Meta Creation Lab for Creative AI, which I direct at Simon Fraser

University, we take a very computer scientist approach to creativity because that's what

22

:

we do.

23

:

We sort of reduce it to the generativity of it.

24

:

Often creativity is defined as the possibility to generate new ideas, new options, new

items that are both original and valuable.


26

:

But rather than defining the concept of creativity, what the creative process is, and how it works in human cognition, we try ourselves to reduce it to the notion of a

27

:

creative task, which allows us to capture the generative aspect of creativity.

28

:

So I don't really talk about creativity in general.

29

:

I often talk and look at creativity through the lens of, in the case of AI, of the partial

or complete automation of a creative task.

30

:

And so it doesn't really matter what it takes to, for example, compose a piece of music with four tracks; the task itself is clearly easy to define, and you know when you

31

:

have achieved it.

32

:

And so as a computer scientist, this sort of reduction of creativity to a set of tasks has been a very productive way to frame it, as opposed to getting into the muddy

33

:

territory of debating.

34

:

the philosophy of the concept, although that's really interesting too.

35

:

But yeah, I'll keep it at that for now.

36

:

So yeah, we talk about creative tasks and creative tasks are in computer science and for

all intents and purposes, all of those tasks, and that's a lot, for which there's no

37

:

rational, optimal solution.

38

:

So if I want to find the shortest path on a Google map, I will use A star.

39

:

It's an algorithm that does that for you.

40

:

And most people, you know, they're really...

41

:

follow the instructions of Google Maps when they move.

42

:

So what's orienting people in the world right now is this algorithm, A star.

43

:

And it's capable of finding the shortest path from A to B, given the driving conditions or

given the state of the city and the streets.

44

:

And that's an optimal solution.

45

:

And so that's not a creative task.
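As a side note for readers, the A* search mentioned here can be sketched in a few lines of Python. The grid world, unit step costs, and Manhattan-distance heuristic below are illustrative assumptions for the sketch, not Google Maps' actual routing code.

```python
import heapq

def a_star(grid, start, goal):
    # A* shortest path on a 4-connected grid; 0 = free cell, 1 = wall.
    # Manhattan distance is an admissible heuristic here, so the first
    # time the goal is popped the path is optimal (shortest).
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start, [start])]  # (f = g + h, g, node, path)
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in best_g and best_g[node] <= g:
            continue  # already reached this cell more cheaply
        best_g[node] = g
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                step = (nr, nc)
                heapq.heappush(frontier, (g + 1 + h(step), g + 1, step, path + [step]))
    return None  # goal unreachable

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = a_star(grid, (0, 0), (2, 0))
print(len(path) - 1)  # 6 steps: the single route around the wall
```

This is exactly what makes it a non-creative task in the sense discussed above: the cost function is given, so there is a provably optimal answer and nothing left to judge subjectively.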

46

:

Playing chess, you win or you lose.

47

:

It's rational problem solving.

48

:

So it's not a creative task.

49

:

So that's under my definition of creativity as a task.

50

:

So tasks that are creative are tasks that, by definition, do not accept optimal solutions.

51

:

So things like composing music, interpreting music, writing lyrics, everything that has to

do with what you talk about usually and what your center is about, that uh it does fall

52

:

under the idea of a creative task.

53

:

Mastering a track is a creative task.

54

:

There's no such thing as an optimal mastering.

55

:

or a Pareto-dominant mastering, or a utility function for mastering, right?

56

:

And so, yeah, jokes, choosing clothes in the morning, deciding the gaze we take when we look at something or the type of movement we make, all those things often do not

57

:

have rational optimal solutions, and therefore they're considered creative.

58

:

So at the Metacreation Lab, that's really what we focus on.

59

:

Automating

60

:

creative tasks for which there's no rational solution across domains.

61

:

So we do it for dance and choreography, we do it for visual, for animation, but more

importantly we've been doing it in the last 15 years that the group has been together,

62

:

we've been doing it intensively for music.

63

:

So if I got you right, creativity for you guys means creating something that doesn't

have an optimal rational solution.

64

:

That's right.

65

:

I haven't actually heard that definition before.

66

:

It was really interesting and really fits.

67

:

It really fits well.

68

:

A couple of other definitions of creativity.

69

:

I've found, for instance, a stock version of creativity from Wikipedia, which says that creativity is the ability to form novel and valuable ideas or works using one's

70

:

imagination.

71

:

And of course, it gets to be a muddy territory um when you talk about novel and valuable.

72

:

I mean, who's the judge of what's novel or what's valuable?

73

:

And I mean, something that's really hyped these days is: can a machine be creative?

74

:

Can they create something new, or are they just regurgitating or imitating stuff that's been made before? You know, that discourse.

75

:

And I recently read a post by an earlier podcast guest I've had on the show; she's called Esther Fierre-Einhart.

76

:

I don't know if you've heard about her, but she's made this definition that creativity is

the decomposition of what is perceived, selection and recombination with the ability to

77

:

compare an intermediate result with an inner theme, an emotion or idea.

78

:

And the confrontation with the inner theme in turn requires the ability to feel emotions

and a consciousness.

79

:

Aha, so more of a cognitive definition.

80

:

Yeah, that's interesting.

81

:

you know, I could go that way and there's plenty of attempts that way.

82

:

When we do model creativity ourselves, we do look at what the process is and what the cognition involved is, in terms of the beliefs, the desires, the intentions of an

83

:

artificial agent.

84

:

But I like my definition better because it is so functional.

85

:

And it captures the first one from Wikipedia that you...

86

:

that you sort of read, which is also the one from Margaret Boden, who is maybe the most well-known philosopher of creativity in the 20th century.

87

:

And her definition is exactly that, that it's about generating.

88

:

So you produce something when you're being creative.

89

:

There's an act of generation, an idea, an object, a painting, a piece of music.

90

:

And then what you create needs to be original and valuable.

91

:

And then there's different types of creativity.

92

:

Personal creativity, peak creativity.

93

:

Sometimes a kid makes a drawing, it is uh an act of creation, it is generative, it is

novel, original, but is it valuable?

94

:

Well, it's valuable for the kid, it may be valuable for the parents who are like, now my

kid can draw a house, right?

95

:

But it's not valuable for the world; it's not gonna go to a Christie's auction, right?

96

:

So this value, how valuable it is, that's subjective.

97

:

And of course, we all know that in art and when creativity is involved, there is

subjectivity.

98

:

You know, someone's creativity might...

99

:

might rub me the right way and someone else's might rub me the wrong way.

100

:

And so I'm not going to like all those creative artifacts the same way.

101

:

And that's really that subjectivity that for a computer scientist, for someone who works

in AI, makes it more complicated than rational problem solving, more complicated than

102

:

actually getting a self-driving car to go from A to B, taking the shortest path.

103

:

And therefore it's a little bit more interesting to me. You know, I come from mathematics myself,

104

:

and then I became a computer scientist as a way to do applied mathematics.

105

:

And on the side, I've been growing a career as an artist.

106

:

And so I've been fascinated by this very question of what is it that makes creativity

different from pure rational problem solving?

107

:

And at the beginning uh of trying to identify creativity as different from rational

problem solving, I had the idea that it meant sort of by definition that it would be

108

:

really, really hard.

109

:

to get computers to be any good at any of those tasks.

110

:

And it turns out, especially in the last 10 years, after efforts by us and many other AI researchers, the whole field of generative AI, with algorithms that specifically tackle

111

:

those tasks for which there's no optimal solution has been booming.

112

:

And now it's everywhere.

113

:

And we'll get back to the issues with evaluating originality and value later on.

114

:

But let's get into the reason I found you in the first place, which was an article about

um a thing called MIDI GPT.

115

:

Could you tell

116

:

us about what that is and why you have built it.

117

:

Right.

118

:

So, yeah, we've been working at the Metacreation Lab for Creative AI on generative music systems for a long time.

119

:

Often those systems, and that circles back to our previous conversation about definitions as well, often those systems in machine learning and in AI have to do with

120

:

training a model based on a set of data that we call the inspiration set or the training

set.

121

:

And then what those generative AI algorithms that we've been focusing on at the Metacreation Lab do is that they tend to try to imitate the data.

122

:

And so they will generate new artifacts, so again, novel and potentially valuable uh

artifacts, let's say pieces of music, in the case of MIDI GPT, that would be classified by

123

:

an unbiased observer, that is you and me, Steinar.

124

:

if I do an evaluation study, they will be classified as belonging to the training set.

125

:

In other words, the goal here is uh to pass a little test uh that is to compare what the

system generates with what's in the training set.

126

:

And if you can't tell the difference, then it means my system is a good system.

127

:

And that's really the tenet of any machine learning system.

128

:

is to learn from the data, the distribution of the data, in order to generate new elements

that would be classified by an unbiased observer as belonging to the data set.

129

:

And so we've been doing that uh since 2008 in the case of the Metacreation Lab.

130

:

And then we went through all of the generations of different algorithms as machine learning as a field progressed, and as computing, which is the substrate

131

:

in which these machine learning algorithms work, became more and more powerful.

132

:

And as we accrued more and more data, the possibilities started to increase and increase.

133

:

And back in 2018 and 2019 is when the first wave of transformers came through.

134

:

And to an expert in the domain, someone who has been publishing in AI conferences for a long time,

135

:

it was clear very quickly that this breed of models and approaches was better and

stronger.

136

:

And so everyone was flabbergasted by, you know, ChatGPT and the launch of AI agents and

bots that we can talk to.

137

:

And then it turns out that I knew from my upbringing in AI that language is just, you

know, monophonic music in a way.

138

:

It's a sequence of tokens.

139

:

In music, we also have a temporal aspect of sequences of events, notes in general, in

different tracks and instruments, but there is uh a vertical dimension that does not exist

140

:

in language.

141

:

In language, we do turn taking, but we don't talk at the same time together, especially

not different sentences.

142

:

And in music, there's this extra complexity of having horizontal dependencies and vertical

dependencies.

143

:

And so...

144

:

As a researcher in AI, it seemed like a really good thing to try very quickly: applying this new architecture to musical data.

145

:

We were not the first ones.

146

:

I mean, all of that happened in parallel, in a way.

147

:

We probably all had the same idea at the same time when the model came out.

148

:

uh And I think one or two models, the model from Google...


150

:

Music Net, I don't remember; there were one or two models that came out just before ours, because those big companies are really quick at releasing.

151

:

You know, they have engineers and PR people and all the mechanics to be the first ones and make a big splash.

152

:

But we were among the first, among the first three in the world, in applying this technology.

153

:

And then interestingly, the way we applied it was a little bit more powerful, more generic

than others.

154

:

And so the model has been...

155

:

and since 2019 I've been training, basically, MIDI GPT transformers, but the paper just came out.

156

:

So that's how much work it took really.

157

:

The paper just came out at the latest edition of AAAI, and it's been published with all the refinements that we did add through those years. And during those years, we worked

158

:

with that algorithm ourselves.

159

:

We worked with companies to adapt that algorithm to their tool and we trained many, many

versions of that algorithm.

160

:

One of the things that we were really interested in, and that is new in our algorithm, is how to

control the generation.

161

:

And so we've been doing a lot of work on that particular aspect.

162

:

But to backtrack for a second, so MIDI GPT is just a transformer.

163

:

But instead of taking textual prompts, and there are a lot of text-to-music models out there and there's nothing wrong with them, there's a debate whether they're more useful or

164

:

less useful, but in our case the prompt is made of music, is made of existing MIDI music.

165

:

And so MIDI GPT is an infilling model: it will basically generate whatever you want, and what it generates will work with what's before, what's

166

:

after, what's above and what's under.

167

:

So I can say: hey, this is a song, redo the bassline.

168

:

Or: hey, this is a song, redo those three bars of the melody.

169

:

Or: hey, this is a song, make me a continuation of those four tracks.

170

:

Or: hey, this is a song, add a marimba track.
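For readers who want a concrete picture, here is a minimal sketch of what "infilling" means at the data level: keep the surrounding music as context and mask only the region to be regenerated. The bar-string representation, the `<FILL>` sentinel, and the function name are illustrative assumptions for the sketch, not the actual MIDI-GPT token format or API.

```python
# Sketch of music-prompt infilling: replace the bars to regenerate with a
# sentinel, keep every other track and bar intact as musical context.
def build_infill_prompt(tracks, target_track, bar_range):
    """Mask bars [start, end) of one track; the rest of the piece stays."""
    start, end = bar_range
    prompt = {}
    for name, bars in tracks.items():
        if name == target_track:
            prompt[name] = bars[:start] + ["<FILL>"] * (end - start) + bars[end:]
        else:
            prompt[name] = list(bars)
    return prompt

song = {
    "melody": ["m1", "m2", "m3", "m4"],
    "bass":   ["b1", "b2", "b3", "b4"],
    "drums":  ["d1", "d2", "d3", "d4"],
}
# "Redo bars 2-3 of the bassline": everything else is left as context.
prompt = build_infill_prompt(song, "bass", (1, 3))
print(prompt["bass"])  # ['b1', '<FILL>', '<FILL>', 'b4']
```

A model would then be asked to fill the sentinel positions so that the result fits what is before, after, above, and below.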

171

:

And so what's nice with MIDI GPT is that it supports all the MIDI channels.

172

:

And so all of the MIDI instruments are known by the system.

173

:

It actually knows what range a bass uses.

174

:

It generates parts that are playable by humans, so I can actually have people interpreting

that music.

175

:

And then it offers a lot of control, and we've been working really hard on making it

controllable.

176

:

And I can go on and explain why that is the case, but it's mostly because of our back and forth and exploration with musicians themselves, and with software companies that

177

:

make software for musicians, that we felt there needed to be...

178

:

more additions to this model before we released it publicly. And now it's available on Hugging Face, and there's been some good updates.

179

:

So if I got that right, MIDI GPT works in a way that you feed it MIDI, your own playing, and then it kind of searches its training, or it has a foundation of a training

180

:

set, which it then references to find some structures or patterns, so as to make a coherent musical result,


182

:

that could be combined both horizontally or vertically with the input.

183

:

So you could harmonize a melody, for instance.

184

:

Yeah, exactly.

185

:

So you can be like, hey, this is a melody, make me a polyphonic piano part uh or an organ

or a guitar and then a bass line and a drum for it.

186

:

No problem.

187

:

Is it restricted in terms of how many bars you get as input or output?

188

:

No, the model itself has an attention window, and we have different versions, but we like it to run on a CPU, so the current model that works everywhere has about 16

189

:

bars to 32 bars of attention. And what it will do is, there's a windowing mechanism: so if I load a song from Stevie Wonder or Michael Jackson and I resample the entire bassline,

190

:

then it's gonna go and slide through like that.
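That sliding-window behaviour can be sketched roughly as follows; the window size, the half-window hop, and the `regenerate` stand-in for the model call are illustrative assumptions, not MIDI-GPT's actual parameters.

```python
# Sketch of resampling a track longer than the model's attention span:
# process a fixed-size window, then slide by half a window so each new
# window still sees recently regenerated bars as context.
WINDOW = 16  # bars the model can attend to at once (illustrative)
HOP = 8      # slide by half a window to keep overlap as context

def resample_track(track, regenerate):
    bars = list(track)
    for start in range(0, len(bars), HOP):
        window = bars[start:start + WINDOW]
        bars[start:start + WINDOW] = regenerate(window)
        if start + WINDOW >= len(bars):
            break  # the last window reached the end of the song
    return bars

# Dummy "model" that just tags each bar it rewrote.
song = [f"bar{i}" for i in range(40)]
out = resample_track(song, lambda w: [b + "*" for b in w])
print(len(out))  # 40: every bar visited, song length unchanged
```

Bars in the overlap regions are rewritten more than once, which is what keeps the seams between windows musically coherent.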

191

:

Yeah.

192

:

So.

193

:

The idea of prompting the system with music was that artists are in the business of making music, and often they have their own musical material.

194

:

And explaining in text what style you want and all of the detail of what you want is

actually really hard to do.

195

:

And there are now a lot of studies showing that text prompting is not necessarily a great human-computer interface

196

:

uh in a number of domains.

197

:

And so we tried this alternative, in which you use the musical material itself as part of the prompt.

198

:

It's often a lot easier to come up with a great idea than to continue it or finish it.

199

:

So it might be a good companion in that sense.

200

:

So I have this project with a few music students where we do a kind of hackathon, because I teach a subject in composition and arranging, and

201

:

Their task is to try to cheat on these assignments using AI tools.

202

:

So could you imagine using MIDI GPT to cheat on an assignment, for instance making an ABA jazz song in that format?

203

:

Could you use MIDI GPT for that purpose, you think?

204

:

Yeah, 100%.

205

:

Yeah, 100% you can use MIDI GPT on any composition assignment.

206

:

And yeah, so that's really entering the realm of computer-assisted creativity.

207

:

And that's one of the reasons why we like the idea that, of course, with MIDI GPT, you can

generate from scratch.

208

:

But if you generate from scratch, this system is trained on a dataset called MetaMIDI,

which we also released at the Metacreation Lab.

209

:

And it's the largest data set of MIDI data in the world.

210

:

It's at the moment 1.6 million MIDI tracks, with multi-track pieces of MIDI.

211

:

And so it knows all of the music, every kind of music.

212

:

It knows way more music than a human can.

213

:

And so if you generate from scratch, you might end up with medieval classical music.

214

:

And even then, you can tell the system: I want a pop song

215

:

with like four instruments and this and that.

216

:

Like, describing in words again, with text, is so difficult to get right.

217

:

And a lot of the system out there, and AI is for the most part a for-profit Californian

enterprise, right?

218

:

You can name them; I mean, the software that people use is not typically from...

219

:

Europe or anywhere other than the US or China.

220

:

You get those; you have the Asian stack and the American stack.

221

:

And those systems are made for the layman and they're made for everyone to use, but

therefore they're really not specialized.

222

:

In other words, the systems that we hear about, we hear about them because of the massive PR machine of those big companies.

223

:

And those big companies, they want to have everyone as a user, like for ChatGPT.

224

:

So Udio, Suno.

225

:

those prompt-based systems, they're great, and we can't take that away from them; they do work.

226

:

But at the same time, if you're a musician and you have a specific idea of the aesthetic

you want, you typically will never get there with those systems.

227

:

You will have to keep adjusting it, changing it over and over and over, until you get what you want.

228

:

So we took the reverse approach, which is like: no, a musician has music, and that music is actually telling you what style it is.

229

:

It's telling you, you know, the density of notes, the length of notes, even the micro-timing of the interpretation, the velocity that is used, the dynamics: all of the things

230

:

that you would be hard-pressed, you know, to describe with words. All of that is actually given. And so I don't want you to start from scratch; I want you to start from

231

:

your music, or someone else's music if ethics isn't a problem for you; you do what you want, right. But

232

:

But the truth is, by prompting with music, we find that we get way faster to the actual content that the creator is looking for and wants to listen to.

233

:

And then we work more on, hey, what are the new features that no software has?

234

:

Like things like batch generation.

235

:

Hey, I want to make a variation of this.

236

:

And a lot of professional musicians, Steinar, you probably know that.

237

:

They do pastiche, they do style imitation, they do what my algorithm does for a living.

238

:

They receive a call from the BBC on Monday: there is this documentary, it's going into 64 countries, you need to remake the soundtrack because we're not gonna pay rights for those

239

:

countries.

240

:

And so a lot of musicians do music style imitation for a living.

241

:

And so the system would really allow them to take that track and say: okay, we're gonna resample half of the bars of the entire

242

:

music piece, on all of the tracks except for the drums, and we're going to do that 60 times or 100 times or 200 times, and I want you to rank all those generations based on their

243

:

proximity to the original.

244

:

I listen to them one by one and as soon as it sounds far enough from the original to be

different but with a good style that I'm interested in, I'll take that and I start working

245

:

on it.
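That generate-many-and-rank workflow can be sketched like this; the integer "notes", the random variation function, and the Jaccard set distance are all illustrative stand-ins for a real model and a real musical similarity measure.

```python
import random

def note_set_distance(a, b):
    # Jaccard distance between two pieces viewed as sets of notes:
    # 0 = identical note content, 1 = nothing in common.
    sa, sb = set(a), set(b)
    return 1 - len(sa & sb) / len(sa | sb)

def batch_generate(original, variant_fn, n=60):
    # Generate n variations and sort them from closest to furthest,
    # so the musician can audition them in order of proximity.
    variants = [variant_fn(original) for _ in range(n)]
    return sorted(variants, key=lambda v: note_set_distance(original, v))

random.seed(1)
original = list(range(32))  # 32 "notes" standing in for a piece

def variant_fn(notes):
    # Randomly replace some notes; stands in for the model resampling bars.
    return [n if random.random() < 0.7 else n + 100 for n in notes]

ranked = batch_generate(original, variant_fn, n=60)
print(note_set_distance(original, ranked[0]) <= note_set_distance(original, ranked[-1]))
```

Listening then starts from the closest variation and stops at the first one that is far enough from the original while keeping the right style.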

246

:

You know, use cases like that, which are real use cases actually,

247

:

is really what we were aiming for from the get-go.

248

:

Being musicians ourselves, we really believe, in my lab, which is a bit different from a lot of computer science departments and labs that I know, we believe in

249

:

participatory design.

250

:

So there's this idea that when we design a system, and that starts with designing an algorithm, we want

the end user to be there in the room.

251

:

And to do that, we work with artists, a second-person view on the system, and we work ourselves as artists,

252

:

a first-person view on the system.

253

:

And that's important to us.

254

:

And I think that's what allows us to design a slightly different type of system.

255

:

And so far, in interaction with the industry, it's been quite uh successful and fruitful

because I think the industry understands uh that what we're looking at and what we

256

:

actually need for real creators that make a living out of making music is different from

what Californian AI has to offer.

257

:

and how generic the tools are, and how we need things that are a bit more specialized.

258

:

And so that's really the direction we took and how we went that way.

259

:

Hmm, yeah, it makes sense.

260

:

Different...

261

:

Of course, musicians look for entirely different things, and an entirely different level of detail, in their workflows.

262

:

Have you been surprised by any feedback you've received from the users so far?

263

:

Yeah, well, we've been really happy.

264

:

There's been albums made.

265

:

There are all sorts of projects, ongoing and past.

266

:

With it, we qualified for the final of the AI Song Contest last year.

267

:

And there were some of my students on stage during that presentation.

268

:

And so we've been really happy to see that there's actual real world artistic application

of the systems.

269

:

You know, in AI, very, very often we do the work with graduate

270

:

students, we publish it, and then it ends up on a shelf.

271

:

Or, even more often, Google, Facebook, Meta, you know, all of those companies would take it on, develop it, and then push something, and then the world will discover it, right?

272

:

So we've moved to a world where things have changed a lot in terms of research, because before the GAFA,

273

:

the best in any domain, especially computer science and the type of things we do in AI, would always be a professor and his or her PhD students.

274

:

And nowadays, you know, we have a number of companies that, for some reason that is still unclear and there's still conversation around it, started doing research and started

275

:

publishing in the same places where my students publish, right?

276

:

Except that when they publish their paper,

277

:

ah they have 1.4 billion followers, let's say, if you say Google, for example.

278

:

So it doesn't matter if that paper is good or bad, new or not new.

279

:

The world is going to take it as: this is the first time, you know, these are the first people doing that, and those are the best, et cetera, et cetera.

280

:

Turns out that three years earlier, maybe, my student came up with a way better algorithm for the same thing.

281

:

And they've been, you know, copying it, improving it through engineering, and then

publishing it.

282

:

And it's very often the case like that.

283

:

So the world has really changed.

284

:

And often my students say that it's unfair, because there's a big difference between doing the same work in my lab and doing the work as an intern at Google.

285

:

The quality of the work would be the same, but the visibility of the work would be completely different.

286

:

And so there's been, not just in research but of course also in the software industry and in applications, a very big asymmetry of information and power.


288

:

because a lot of those companies are so big that they sort of...


290

:

put everything else in the shade and then it's really hard to exist outside of the realm

of those very big players.

291

:

And so that's something that we've been trying to do and work on with our little ecosystem here, which involves both artists and communities that are hybrid, basically, like the AIMC

292

:

conference, the AI Music Creativity Conference, which will be in Belgium in September this

year.

293

:

It was at Oxford last year.

294

:

next year, and after that we're looking at maybe moving it to North America. And that's the type of conference in which we have a mix of academics and actual musicians and

295

:

composers who, as you know, take their time and work together on tools. And in those forums that I attend and help animate, there is a strong critique of the

296

:

mainstream AI and what the GAFA are doing

297

:

and the way they are doing it.

298

:

I guess that's a bit like the rest of the world in terms of uh visibility and influence

being the currency.

299

:

That's how it is for artists; or, yeah, I mean, influencers kind of rule the world in that way.

300

:

So if you have a big following.

301

:

It's also a reality in research now, and that's really new.

302

:

And even when I applied for grants, it used to be that what mattered was excellence.

303

:

and the quality of the work.

304

:

Now they have all those measures called impact: not just the citations, which is the normal, regular type of impact we used to measure in academia, but now it's like how many

305

:

followers do you have and how many people have downloaded or tried your software?

306

:

If I give demo software for people to try, it's going to be hard for me to compete

with Google, you know what I mean?

307

:

Because Google is going to put it on the front page of their search engine,

308

:

and it's going to be really hard to exist that way.

309

:

And so, the ecosystem is rapidly changing, but I'm always one to try to...


311

:

to reaffirm the fact that there is a real diversity.

312

:

And I'm not saying that what the GAFA are doing is not good, or is useless.

313

:

You know, having generic agents and generic tools that you can prompt is really useful for a lot of people, including myself.

314

:

But there's also a space in which, for professionals, for experts, different types of tools need to exist.

315

:

And those tools won't have billions of users, because there are, you know, only millions of people who actually want to compose music from the ground up.

316

:

And so those big companies have no interest in serving those niche markets.

317

:

There's not enough money basically for them to go there.

318

:

And so there's a real space there of opportunity for people like you and I to exist and to

do interesting things.

319

:

So.


321

:

In the world of AI and music these days, if we kind of look past the major commercial players, who are the most interesting guys at the moment, do you think?

322

:

What kind of communities or institutions or who's making the coolest stuff right now?

323

:

Well, it's mostly happening, I think, I feel, in the academic world.

324

:

So you get different pools.

325

:

You get IRCAM in Paris, who has always done really interesting things, and they continue

doing it.

326

:

I work with Jérôme Nika right now, because we have also a thread of work in my lab on

musical agents.

327

:

And that, again, has never been, and so far is not, a mainstream activity.

328

:

And therefore, there's been...

329

:

It's been very active and very interesting, but there's been none of the GAFA presence in

that space.

330

:

Although Google just released a paper inspired by some of our work uh using transformers

for musical agents.

332

:

But yes, there is the academic world.

333

:

There's Queen Mary in London, which is possibly the main center for computer music; for anything music, computing, and AI related, it's the largest concentration on earth.

334

:

And then there is, you know, humbly on our side, the Metacreation Lab, where we're also doing pretty well, and a few others.

335

:

And so we gather in those communities such as AIMC.

336

:

There's NIME, the New Interfaces for Musical Expression, a conference

337

:

that is more about human-machine interaction and how to design that in the context of what's happening right now with music, musical tools, and AI.

338

:

And there's plenty of other uh conferences and that's really where we live.

339

:

I also like to have a public facing uh part of the work at the lab.

340

:

So we do play music ourselves.

341

:

I will be playing at the MUTEK festival this year in...

343

:

in Montreal at the end of August. It's the largest electronic music festival in Canada, and we're really happy we'll be playing a piece called Revival, in which, in a sort of

344

:

cheeky way, we train musical agents on the music of dead composers and explore this idea of digital afterlife, but for music composition.

345

:

It's very trendy, there's a lot of companies now making agents that are going to survive

you and learn from your data and all that.

346

:

So we're interested in exploring those concepts in the context of music.

347

:

Concretely, I always wanted to play with David Tudor.

348

:

He's dead, so I'm going to make an AI agent of him, for example, and play on stage with an

agent that has learned from the material of those dead musicians, both the sound material

349

:

but also the temporal dynamic.

350

:

And those agents are capable of

351

:

reasoning on stage, in a live situation: given what Steinar just played, and what Philippe just played, and what I just played, what should I play next to sound like David

352

:

Tudor, right? So we present that type of style imitation live on stage. But really, what we try to do is to link the academic community with an audience, and then

353

:

the artistic community and try to serve those communities the best we can because we

354

:

are in a way part of those communities ourselves, right?

355

:

And so we feel that we're sincere in our effort.

356

:

And then we work with companies, Steinar. We work with Steinberg; they've been funding our research for a number of years.

357

:

But so far nothing with Steinberg, with Teenage Engineering, with...

359

:

Audiokinetic in Montreal. No product has been released, and that has more to do with the fact that all of our systems are data-driven.

360

:

And when we talk about...

361

:

MIDI GPT.

362

:

So it's trained on MetaMIDI, and MetaMIDI was acquired because I'm a researcher: under fair use in Canada, for research purposes, I can scrape the internet, right?

363

:

There's no way I would have gotten so many MIDI files otherwise.

364

:

And then it used to be that you scrape the internet to make a data set, you train a model, and the model doesn't have the data.

365

:

If it's not...

367

:

if it's trained properly, it's not memorizing the data, and it generates things that I can

show are not plagiarizing the data.

368

:

And we show that, of course, because the company would not even consider the model

otherwise.

369

:

And then it used to be the case that they could pack the model and sell it.

370

:

And there's still a number of companies doing that, like Suno, Udio.

371

:

And now you have an ecosystem with two types of players.

372

:

Players...

373

:

who are either very big, like the GAFA, or smaller players like Suno or Udio that just do AI.

374

:

And so for them, it's a make-it-or-break-it sort of game; they get lawsuits, and we'll see what's going to happen.

375

:

I'm actually...

377

:

an expert witness in federal court in the US on some of those cases, so I can't talk too much about it. But there are those companies who will try it, you know, so they have one

378

:

product, and this product is based on AI, so they're gonna try their best; maybe they're gonna get sued, maybe they win, maybe they lose.

379

:

And then you have the companies I work with, like Teenage Engineering or like Steinberg; they have plenty of products.

380

:

Cubase has a million licenses and hundreds of functionalities.

381

:

They don't want to damage their reputation and to take a risk of losing everything because

of one AI feature.

382

:

no matter how good that AI feature is, right?

383

:

And so for those guys, and there are a lot of companies in the world like that in every domain, they're just waiting to see what the legislation is really gonna look like and how

384

:

it's gonna be applied.

385

:

And it's been a real pity. Our legislators... you know, governance in general in the world is a little bit broken, but in this particular case it has been so slow to move, and

386

:

therefore the GAFA and all the people who are much more like, we do this and that's all we

do, we can take the risk,

388

:

got such a lead, such a lead in that adventure, that it's very complicated now for regular software companies that existed before this AI craze.

389

:

It's really hard for them to push those products.

390

:

And still, in a way, with all of the companies I work with, we do the research together; we're still preparing things in the background, we're testing things, we're refining

391

:

designs, but they can't release anything, because the legal team is like, this is not an ecosystem in which

392

:

we want to take that risk.

393

:

So that's where we are right now, with this divide between what people are exposed to, typically Californian AI, and what actually exists all across the world, not just

394

:

here in Canada, but also in Europe.

395

:

That's why it's hard for people to name AI and software coming from Europe: they don't really exist commercially, for the most part.

396

:

Hmm.

397

:

So there was a lot of stuff to unpack in what you just said.

398

:

So among other things, you mentioned that you wanted to train an AI agent on a dead artist.

399

:

That kind of raises some ethical questions.

400

:

And you're also talking about all the pending lawsuits and slow legislation.

401

:

And I'm wondering...

402

:

both when it comes to the ethical viewpoint on using dead musicians' music as a training set.

403

:

How do you feel that fits into your ethical view of fair use?

404

:

Yeah, well, exactly.

405

:

And we do it uh in an artistic context because it's provocative.

406

:

It's obvious that we won't do it with Beyoncé or Michael Jackson, because first, Beyoncé is not dead. But Michael Jackson is dead.

407

:

We won't do it on Michael Jackson because we probably would get sued real quick.

409

:

Yeah, on the ethical side, of course, this is really to have that discourse; we also make that type of artwork because we do believe that the question of the

410

:

digital afterlife is a real one.

411

:

And then, you know, everyone's digital footprint,

412

:

whether you want it or not, is increasing, because everything we do is computer-mediated.

413

:

And so there is already the case that Facebook exploits your data while you're alive.

414

:

And guess what?

415

:

They'll continue exploiting your data while you're dead.

416

:

And you're not going to be removed from the database, et cetera, et cetera.

417

:

So whether you want it or not, this question is a very, very

418

:

current one.

419

:

And so with this artwork in particular, we really put that to the forefront by being

explicit and saying, hey, this is what we're doing.

420

:

But then I can link it to a lot of things that have already been done without people being explicit about it.

421

:

So that's bringing the discourse forward.

422

:

Then, concretely and a bit more technically in terms of ethics: well, we can in some cases get the rights from

423

:

stakeholders, the people who have inherited the rights of those composers.

424

:

In some cases, uh the work is in public domain, right?

425

:

So for example, I'm training AI on some uh Wagner...

426

:

Wagnerian music, some music from Wagner.

427

:

uh He died more than 70 years ago.

428

:

And so in most legislations... there are differences between countries, which is another thing with those types of notions. What is quite incredible is that regulating AI is hard, but

429

:

regulating AI in a world that doesn't have, unfortunately, the capacity to do worldwide regulations that are concerted and harmonized

430

:

is sort of the dramatic situation we're in.

431

:

And so this is also another part of the conversation we want to have: you know, where are you training that model?

432

:

Where are the servers?

433

:

Where is the data?

434

:

All of that will completely change the legal status of what we do.

435

:

And then when it comes to the ethics, there are other questions, such as the legitimacy of the work.

436

:

So...

437

:

A lot of the arguments of companies defending themselves, being taken to court right now for scraping data from the internet, is: well, music is a socially constructed

438

:

phenomenon, right?

439

:

No human will be born alone on an island and end up composing a symphony in their lifetime, right?

440

:

We are musicians, and we do what we do because there were musicians before us doing what they did,

441

:

and we are inspired by it, right?

442

:

And I can listen to the radio, get inspired by a piece and write a piece that is sort of

related but different.

443

:

How is that different for my neural network listening to the radio, except it's all of the radio, very quickly? In other words, you know, eating all that data, getting inspired by it,

444

:

and generating something that is arguably not... in that case, you know, if I imitate the work of David Tudor, the result is not the work of David Tudor.

445

:

It is original, there's no copyright issue, but it's been inspired by the work of David Tudor.

446

:

What is ethically wrong with that?

447

:

This seems to be what humans are actually doing, right?

448

:

And so we want to go a little bit deeper into that discourse. Right now, people are afraid and scared, and it's true, because some people are getting sued,

449

:

some companies are getting sued, so everyone is like, ooh, if you use data, if it's not

your data, then...

450

:

you probably shouldn't.

451

:

But the truth is, that then starts a new question, which is: who can afford data, right?

452

:

And if it is the case that you need to be able, like the big American companies, to afford entire catalogs of music and pay a massive amount of money to be able to train and

453

:

eventually use and exploit those models, then again, you create another type of

inequality, which is inequality to access to data, right?

454

:

Or you see, so...

455

:

So we do that in an artistic context because we think it's harmless, right?

456

:

uh It's not like the family of David Tudor is making a fortune and they're gonna sue us.

457

:

We love the aesthetic, you know, we're earnest in our approach.

458

:

We really think it's a strong art project and the output is really, really good.

459

:

It has obviously nothing to do with the actual work of the artist itself.

460

:

It's our work, you know, but part of it is that we use that data.

461

:

And so we do that consciously and it's part of our practice and our process and it's also

triggering discussions, right?

462

:

And that's why we do it.

463

:

But so far, at least, I've looked at it from every angle, and I've not been found guilty of anything unethical that I know of.

464

:

What would be the counterargument, do you think, if you were to play devil's advocate?

465

:

I think there's plenty.

466

:

Some of them would be cultural.

467

:

There are cultures in which, for example, pictures of dead people are not okay.

468

:

Even for memory purposes or things like that.

469

:

There are religions, like the Muslim religion, which is the fastest-growing on earth and the second largest after the Christian one.

470

:

And so in that religion, for example, explicit representation

471

:

of human figures is not okay.

472

:

So the status of data, depending on the culture, depending on the way death is treated, depending on the way remembrance is treated, depending on the definition of

473

:

respect that you have, the status of the data of these people would be vastly different.

474

:

The concrete thing, though, is that right now, nobody reads them when they sign

:

the agreement, you know. But when you open your Facebook account, your Instagram account, your TikTok account, your Google account, you give away the entirety of the data of the

476

:

interactions you will have with those platforms, for an infinite time, right?

477

:

So you sell your afterlife, your digital afterlife, by definition; you literally sign it away when you sign up.

478

:

And so again, you know, this artwork is a way to also raise awareness.

479

:

of that, because typically people don't think about it.

480

:

It's not really pleasant to think about one's death.

481

:

It's not really pleasant to think about databases with our own data in it.

482

:

And mixing the two is not something people think about on a daily basis.

483

:

People are really busy with the present, their immediate future, and maybe their immediate

past.

484

:

But they rarely think about, hey, what's going to happen to all of that in 50 years,

right?

485

:

So it's probably a good market idea to make a halal Facebook or something.

486

:

I think there's gonna be, yeah, absolutely, I think there are gonna be some platforms, as those conversations evolve, that are gonna start being explicit about exactly that, about

487

:

talking about what's happening, right?

488

:

But you also mentioned that the way humans learn and make music is community-driven or socially driven, and that the way we learn is similar to a neural network.

489

:

And that's a common argument when it comes to defending these lawsuits.

490

:

But do you have any opinions on what AI music legislation should look like?

491

:

Do you think it should be completely open, or should there be any boundaries?

492

:

That's a good question, and a complex question, and the response... I want to be a little bit nuanced here, but the answer is that it depends.

493

:

But there is surely something to be said about the protection of intellectual property and

it's almost like a political...

495

:

a political view that people take on it, you know, from people that are extremely copyleft, aka copyright is a wrong thing and we should not bother with it, to people that are

496

:

typically you move toward corporations that are extremely...

497

:

copyright-oriented, with massive legal instruments to defend those copyrights.

498

:

And right now, most, no, not all, but most of the lawsuits I know of in the US and in Canada have to do with big stakeholders attacking AI companies.

499

:

And so it's Universal.

500

:

It's Sony.

501

:

It's really big stakeholders attacking AI companies because of the monetary dimension of

it.

502

:

Sometimes

503

:

more rarely it is artists themselves, right?

504

:

And I'm more interested in those cases because I think it's more earnest, it's less of a

game of money then.

505

:

And I know personally some artists, visual artists, painters that used to make a living

painting...

507

:

portraits that people would ask them to make of themselves, and they can't earn a living now, because now you can do, you know, image-to-image with a few prompts and get a

508

:

really good uh portrait of you.

509

:

Of course, it's not made of oil paint on a canvas, so it's actually completely different, but it looks really great.

510

:

You get to steer the system towards something you like, and it's sort of flattering. And so there's a real impact of those generative AI systems on the market.

511

:

In France for example, 35 % of

512

:

the market for graphic design has been eaten by those systems.

513

:

And so there's literally 35 % less money to go around if you're a graphic designer in

France.

514

:

And so a lot of young graphic designers can't make it anymore.

515

:

So the impact of all those systems on job displacement and all that is already real.

516

:

So legislation and some protection, at least in the short term, I think is the right way

to go.

518

:

We need the right tools, right?

519

:

How as an artist can I really decide to give my data and make sure that I'm gonna be paid

for it?

520

:

Or how can I make sure that, if I decide not to give my data, no one can actually pump it?

521

:

No one can scrape it.

522

:

So technically we're not quite there yet.

523

:

Legislation-wise, we're not quite there yet.

524

:

But there are those niche markets appearing.

525

:

Some companies, for example Adobe, are already doing the right thing.

526

:

Like Adobe, because with Firefly they have, I think it's called Adobe Stock, the largest stock picture site in the world, and they let every creator on that stock picture

527

:

site decide if they allow their data to be used to train AI or not.

528

:

And if they do, then a little stream of money will come back each time the model is being

used.

529

:

And so those new systems to remunerate...

530

:

So, to clear... how would you call that?

531

:

to make sure that the artist is on board, and then to remunerate, to pay back the artist.

532

:

Those systems already exist.

533

:

You know, some companies are going that route.

534

:

But here again, it takes a massive infrastructure, and I'm afraid that only big companies can afford to go that route for now.

535

:

And so we need, and that's something we work very hard on in Canada, what we call AI commons.

536

:

So we need clear regulations, but we also need AI commons, that is, tools and

537

:

approaches that allow ordinary people to either sell their data, or protect their data, or manipulate it, and to do that without having a team of engineers and legal experts at hand.

538

:

And that last mile, from some companies doing the right thing to everyone being in a position to work and

539

:

at the same time do the right thing: we're not there yet.

540

:

We're not there yet.

541

:

So there's quite a bit of work.

542

:

And so in Canada, we are trying to work towards AI commons and look at the picture in a holistic fashion and say: AI, the internet, and the digital

543

:

are just infrastructure.

544

:

And infrastructures are commons.

545

:

Roads aren't private, they're public.

546

:

The mailing system, the postal system, Canada Post, is public.

547

:

And we should have an equivalent for AI.

548

:

And if we do, then it means we would have places to store data; all sorts of things would happen that are not the case now.

549

:

Because it would be public infrastructure, it would follow the laws, sort of by definition.

550

:

And then that will empower people to either offer their data or protect their data, with some sort of guarantee, because professional computer scientists would run

551

:

public servers to do that.

552

:

So that's one of the potential futures that I can see in terms of

553

:

untangling and trying to answer all those complex questions: to go back to basically considering the internet and artificial intelligence as part of what is managed by public

554

:

services.

555

:

It used to be the case in France, before the internet, that we had the Minitel, a French system in which everyone had a terminal.

556

:

It was using the telephone lines at the time.

557

:

And you would have access to a network.

558

:

It was like the Internet.

559

:

It was an Internet of the time.

560

:

uh Different networks would be linked together.

561

:

And then there was this idea that you would rent it from the post office and then the

servers would be at the city level or the department or even at the state level.

562

:

And then only computer scientists would work on those servers.

563

:

And so the software would always work and you didn't have to install anything.

564

:

Right now we're back to that model, which is a centralized model where you have client-server and everything is in the cloud, except that the cloud is not public, it's private.

565

:

And so we're already in a situation where if every communication and every action we do is

digitally mediated, well, uh those...

566

:

infrastructures we use, they're all private.

567

:

And so all of a sudden, by the very fact of using them, the data escapes you. The future of that data and your behavior, everything is being logged, obviously, and everything

568

:

escapes you, because everything all of a sudden becomes preempted by private industry.

569

:

So, without digressing too, too, too much from your question, Steinar, which I love to do, what I want

570

:

to do here is just point out that it has not always been the way it is right now, and there is nothing saying that in the future it should be the way it is right now.

571

:

But the way it is right now benefits a few very big companies, and no one else, really.

572

:

So you're saying that the digital infrastructure and cloud services should be publicly owned, or governed by the state, or something like that?

573

:

Absolutely, absolutely, there should be. And there is right now a big movement in most countries to re-nationalize a lot of things, and I think the digital infrastructure is one

574

:

of those things that people are now being increasingly worried about.

575

:

Yes, because it's almost impossible to live without Californian AI,

576

:

and it's almost impossible to enter the market and propose an alternative.

577

:

And so those two facts together are a little bit problematic, especially in the

geopolitical context in which we are entering.

578

:

Yeah, yeah.

579

:

So there's lots of stuff to dig into there.

580

:

I was also thinking, since you mentioned that 35% of the market for graphic designers in France has been eaten up by AI...

582

:

How do you see anything similar happening with tools like MIDI GPT?

583

:

You mentioned earlier that a film music composer could use it in their workflow, in terms of imitating styles and doing that more efficiently.

584

:

Could that also take away some of the market opportunities?

585

:

Well, uh possibly, but I don't think so.

586

:

And we really took care of trying to design MIDI GPT in a way whereby it's an assistant to

the artist, but it's not a way to replace the artist.

587

:

With Suno or Udio, you can, without knowing anything, generate a great track from scratch in 30 seconds.

588

:

But with MIDI GPT...

589

:

you need to know what to start with, and then there's the granularity at which it works: because of all those controls, there are so many parameters and things. You have to work,

590

:

like, it's work; it's a tool you have to get to know and learn how to use, and some people get good at it or not. In other words, with our systems, you cannot

591

:

completely remove the human from the loop.

592

:

And that's something that is important to us: to make sure that the creator stays the human, and that eventually we do computer-assisted creativity

594

:

and/or even co-creation, where we can sometimes acknowledge the system as part of the composition.

595

:

And in fact we did studies around that, because we were really interested, both for beginners and for professionals, in what the differences are, what's at stake, and how they

596

:

use the system.

597

:

And we find that indeed they use it very differently.

598

:

When we work with uh Ableton or Steinberg, in this case the studies take place in Cubase.

599

:

You know, a beginner

600

:

Typically, their problem is that they bought music software and they want to make music, but they never finish a track.

601

:

They don't know how to do it yet.

602

:

And often that software, especially in music, is an aspirational purchase, where people buy Ableton because they want to make electronic music.

603

:

There are six million Ableton licenses.

604

:

I guarantee you, 95% of those license holders have never finished a track.

605

:

Right?

606

:

So by putting something like MIDI GPT in their hands, they will start, maybe make a melody, and maybe they can generate the rest.

607

:

They can also then generate things that they like but would not have been able to compose anyway.

608

:

And by steering the system, and that's where the balance of control comes in, you know, the one-button generate versus all of the controls being explicitly shown: beginners get lost

609

:

if there's too much control; professionals are annoyed if there's not enough control.

610

:

So this balance is always really hard to strike in any software.

611

:

but by manipulating and steering the system, if we do it well and if we give them the

right control, they can compose music that they like because they become the curator and

612

:

the system becomes the composer.

613

:

And they have this little assistant that does what they want and they can converge, and we

have evidence of that, to music that they like that they could not compose otherwise.

614

:

So for beginners, and that's really the vast majority of the market that...

615

:

those companies we work with would want to eat: they want their beginners to be able to become, in a way, composers.

616

:

The composers themselves, the professionals, those guys they know how to make music.

617

:

They actually don't even need the system, right?

618

:

And we did a study with those professionals; it was published at IJCAI, the International Joint Conference on AI, last year.

619

:

We define professionals as people who actually make a living out of using Cubase.

620

:

So really, there's a direct link between this use of Cubase, this software, and their

livelihood.

621

:

And then we put this new AI system in it.

622

:

And for them, it's really scary.

623

:

It's been 30 years that they've used and composed in Cubase, because Cubase is an old piece of software.

624

:

And also then, you put a machine that is almost competing with them, potentially, inside

their favorite software.

625

:

And then we observe what happens.

626

:

And what we find is for composers, they actually love it.

627

:

But they don't really use the output of the system at all.

628

:

Unlike the beginners, who would use the output of the system verbatim.

629

:

The composers would be more like, what if I put a marimba here?

630

:

Give me three examples.

631

:

yeah, I like that.

632

:

I'll work with that.

633

:

And then they start composing a part.

634

:

Because they can.

635

:

Because that's what they do.

636

:

Some of them were even surprised when we asked if they used the output. They were like, of course I'm not going to use the content of the system.

637

:

That's not my content.

638

:

But I'm interested to generate and try.

639

:

And so they do what-if scenarios.

640

:

So you see two different populations of users, two completely different ways of using it, with two completely different outcomes.

641

:

And so we are interested in that: not just making MIDI GPT the best algorithm for music composition, but also how to integrate such systems into existing workflows and software.

642

:

And then studying not just usability, whether the system works or the interface is the right interface, which we do,

643

:

of course, usability studies, but more importantly, we're interested in authorship.

644

:

At the end, do you feel this is your music that you made with ChatGPT, or with MIDI GPT in this case?

645

:

Or is it the machine music and you've been directing the machine and it's good music, but

it's not your music?

646

:

Or is it really your music and the machine was just an assistant?

647

:

Or is it a collaboration?

648

:

What is the status of that music?

649

:

And even more importantly,

650

:

We now experiment; we also look at what we call the phenomenological aspect of using that software.

651

:

In other words, how do you feel?

652

:

So if you've been using Cubase for 20 years and you come and now there's this new system

and you sort of have to use it because you compose faster with it, how do you feel in the

653

:

morning?

654

:

Do you feel like, oh no, this is really ruining your work, which is the passion of your life, or do you feel like you're still excited to use it?

655

:

So we really try to go...

656

:

deeper than just, you know, this is a toy, use it and tell me that it's amazing.

657

:

You know, no.

658

:

Like, I think in generative AI we need to get past this sort of, you know, flamboyant quick demo of like, my God, look what it does.

659

:

It does a little video that moves like that.

660

:

But yeah, but I can't use that video anywhere because it's totally not my aesthetic and I

can't get it to do what I want.

661

:

Well, you know, it's just a toy.

662

:

We're trying to build actual tools and to really study what are the implications of those

tools.

663

:

Excellent.

664

:

I think that was a great finishing statement for this episode, actually. So, thank you so much for joining.

665

:

Yeah, thanks a lot and sorry for digressing in my answers.

666

:

I like to not really answer questions.

667

:

I prefer to put out provocative thoughts, and I hope your audience will appreciate it.

Follow

Chapters

Video

More from YouTube