Valerio Velardo on AI Music: From Classical Piano to Generative Soundscapes and the Future of Music Creation
Episode 12 • 30th September 2024 • Kunstig Kunst: Kreativitet og teknologi med Steinar Jeffs • Universitetet i Agder
Duration: 01:13:33


Shownotes

Valerio Velardo shares his journey from classical pianist to AI music engineer, researcher and entrepreneur. He discusses the development of Melodrive, the first real-time adaptive music engine for video games, and the growth of The Sound of AI, a thriving ecosystem with over 9,000 members.

Throughout the conversation, Valerio delves into the role of open source in AI music, the limitations of current AI music models, and the potential of neuro-symbolic integration for music creation. They also explore the future of AI-generated music in interactive content, the relationship between AI and human creativity, the role of AI in music education, and the ethical considerations of copyright in generative music.

Valerio Velardo is an AI music engineer, entrepreneur, and researcher with a passion for the intersection of AI and music. He builds AI music products and startups, consults in the AI music space, and helps companies hire and form AI teams. He founded Melodrive, the first AI music engine for generating real-time video game soundtracks, and later sold the company in 2020. Valerio also runs The Sound of AI, a YouTube channel and community with over 9,000 members, offering education and consulting services on AI music. Currently, he's writing The AI Music Revolution for UC Press.

Transcripts

Speaker:

[Automatic captions by Autotekst using OpenAI Whisper V3. May contain recognition errors.]

Speaker:

[SPEAKER_00]: Welcome to the podcast Artificial Art.

Speaker:

[SPEAKER_00]: My name is Steinar Jeffs.

Speaker:

[SPEAKER_00]: I'm a musician and a music teacher.

Speaker:

[SPEAKER_00]: And in this podcast, I'll be interviewing guests about technology and creativity.

Speaker:

[SPEAKER_00]: So hi and welcome.

Speaker:

[SPEAKER_00]: Valerio Velardo.

Speaker:

[SPEAKER_00]: I'll give a brief introduction of you first.

Speaker:

[SPEAKER_00]: You are an AI music engineer, entrepreneur and researcher.

Speaker:

[SPEAKER_00]: You have a PhD in AI music and you've studied both music composition and piano performance as well as astrophysics.

Speaker:

[SPEAKER_00]: You're the founder of an AI music ecosystem called The Sound of AI.

Speaker:

[SPEAKER_00]: So it's great to have you.

Speaker:

[SPEAKER_00]: And I was wondering if you could start us off by telling us your backstory, how this all began, this interest in music and AI, and maybe you can go on and tell us something about your current and previous activities.

Speaker:

[SPEAKER_01]: Oh yeah, absolutely.

Speaker:

[SPEAKER_01]: So first of all, thank you very much for having me here.

Speaker:

[SPEAKER_01]: It's a pleasure, really.

Speaker:

[SPEAKER_01]: And yeah, in terms of how I got where I am right now, it's a little bit of a journey, I have to say.

Speaker:

[SPEAKER_01]: So I've always been super interested in music.

Speaker:

[SPEAKER_01]: It's my number one passion.

Speaker:

[SPEAKER_01]: And I started to play piano, classical piano, since I was like...

Speaker:

[SPEAKER_01]: a kid, really, right?

Speaker:

[SPEAKER_01]: And so there I started to, as I said, playing piano, training as a classical pianist.

Speaker:

[SPEAKER_01]: From there, I moved on to music composition and then to conducting.

Speaker:

[SPEAKER_01]: But at the same time, I've always been interested in the more technical, scientific side of things.

Speaker:

[SPEAKER_01]: And because of that, I studied physics

Speaker:

[SPEAKER_01]: And I've always been involved in programming, really, right, since I was a teen, more or less.

Speaker:

[SPEAKER_01]: And then after that, at some point, I thought, okay, so how can I combine these two huge passions of mine?

Speaker:

[SPEAKER_01]: So the scientific side, technical side, and the one end, and on the other end of the spectrum, the more like artistic creative.

Speaker:

[SPEAKER_01]: And that's when I basically decided to explore how I could teach machines to be creative.

Speaker:

[SPEAKER_01]: So in other words, how I could generate music using machines or artificial intelligence.

Speaker:

[SPEAKER_01]: And so this is a journey that started for me in my very early 20s, I have to say 20, when I was like that age more or less.

Speaker:

[SPEAKER_01]: And then after a while, I started to like study like quite a lot.

Speaker:

[SPEAKER_01]: And after that, basically decided to do a PhD.

Speaker:

[SPEAKER_01]: I left Italy.

Speaker:

[SPEAKER_01]: So I'm originally from Italy.

Speaker:

[SPEAKER_01]: So after like all my degrees, then I moved on to the UK where at the University of Huddersfield, I did a PhD there.

Speaker:

[SPEAKER_01]: which was basically like in between the music department and computer science department.

Speaker:

[SPEAKER_01]: And I basically did a lot of research in generative music at that point.

Speaker:

[SPEAKER_01]: And so once I'm done with that, the next step was, okay, so what do I do now?

Speaker:

[SPEAKER_01]: So I have like a PhD.

Speaker:

[SPEAKER_01]: So I had like a couple of directions in front of me.

Speaker:

[SPEAKER_01]: One was just staying in academia.

Speaker:

[SPEAKER_01]: And the other one was, yeah, why not start a startup company leveraging like all the research that I've done?

Speaker:

[SPEAKER_01]: And that's basically what I did.

Speaker:

[SPEAKER_01]: So I found a couple of guys and together we started a company called Melodrive.

Speaker:

[SPEAKER_01]: And at Melodrive...

Speaker:

[SPEAKER_01]: Basically, we built a really cool real-time generative music system that can generate music for video games.

Speaker:

[SPEAKER_01]: And it has one feature that's very, very unique to it, or it was very unique to it, which was

Speaker:

[SPEAKER_01]: generating music in real time that adapts to the sort of emotional context of a video game.

Speaker:

[SPEAKER_01]: So I'm going to give you an example.

Speaker:

[SPEAKER_01]: So you have this music that gets generated in real time, if you are

Speaker:

[SPEAKER_01]: relaxing, like for example with your character on a beach, then the very same musical themes are going to be played in a very relaxed manner, but all of a sudden there's a monster that comes over and starts kicking you, then probably you're going to hear the very same musical themes, but now they're going to be transposed into a more aggressive style, for example, right?
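
To make the adaptive idea concrete, here is a minimal Python sketch of re-rendering one fixed theme according to an emotional state. It is illustrative only: the theme, the two states, and the rendering choices are assumptions for the example, not Melodrive's actual engine.

```python
# Minimal sketch of the adaptive-music idea described here: the SAME theme is
# re-rendered to match the game's emotional state. Illustrative only, not
# Melodrive's actual engine.

from dataclasses import dataclass

THEME = [60, 62, 64, 67, 64, 62, 60]  # a simple motif as MIDI pitches (C major)

@dataclass
class Note:
    pitch: int
    duration: float  # seconds
    velocity: int    # MIDI velocity (loudness)

def render_theme(emotion: str) -> list[Note]:
    """Re-render the fixed theme according to the current emotional context."""
    if emotion == "relaxed":
        # slow, soft, original register
        return [Note(p, 0.8, 60) for p in THEME]
    if emotion == "aggressive":
        # faster, louder, transposed down, with a flattened third for a darker colour
        darker = [p - 12 if (p % 12) != 4 else p - 13 for p in THEME]
        return [Note(p, 0.25, 110) for p in darker]
    raise ValueError(f"unknown emotion: {emotion}")

# In a game loop, the engine would be driven by gameplay events:
for state in ["relaxed", "relaxed", "aggressive"]:
    print(state, [(n.pitch, n.duration, n.velocity) for n in render_theme(state)])
```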

Speaker:

[SPEAKER_01]: And so we basically built this engine, this music engine, and it worked actually, actually well.

Speaker:

[SPEAKER_01]: And this was before all the buzz that we currently have with AI music, right?

Speaker:

[SPEAKER_01]: So it was back in 2016.

Speaker:

[SPEAKER_01]: And at some point, we did the typical things that you do with a startup company like Silicon Valley Life.

Speaker:

[SPEAKER_01]: So we went to Silicon Valley, went through an accelerator there, which was really, really cool, super educational at the same time.

Speaker:

[SPEAKER_01]: So we raised some capital, came back to Berlin, and there we just continued for a few years.

Speaker:

[SPEAKER_01]: And at some point, we just moved on to other things.

Speaker:

[SPEAKER_01]: And I worked for a couple of companies, one as head of machine learning operations.

Speaker:

[SPEAKER_01]: The other was a music tech company, where I was a senior AI music researcher.

Speaker:

[SPEAKER_01]: And then after that, I started to work on the sound of AI, which is basically a...

Speaker:

[SPEAKER_01]: an ecosystem, really, where I do a lot of things related to AI music.

Speaker:

[SPEAKER_01]: So it started as a simple YouTube channel where I would just share my passion and knowledge of AI music and all things audio, really, beyond music, right?

Speaker:

[SPEAKER_01]: But the important thing was to do things at the crossroads of artificial intelligence and audio music things.

Speaker:

[SPEAKER_01]: And after that, I saw that there was like some significant interest within the niche.

Speaker:

[SPEAKER_01]: And so I started the Slack channel, which became the Sound of AI Slack community.

Speaker:

[SPEAKER_01]: Now it's quite strong, like with almost 10,000 people in there.

Speaker:

[SPEAKER_01]: It's very active.

Speaker:

[SPEAKER_01]: It's a very friendly community.

Speaker:

[SPEAKER_01]: And then with this community, we started to do really, really cool things, like for example,

Speaker:

[SPEAKER_01]: a project which we called the Open Source Research Project, where basically we had more than 150 to 200 people from all over the world contributing to a research project.

Speaker:

[SPEAKER_01]: So we had seven different research groups with their leaders and everything, with project managers.

Speaker:

[SPEAKER_01]: And so we managed to basically create a really, really

Speaker:

[SPEAKER_01]: cool piece of software, which was like an AI model that could generate sounds with vocal inputs.

Speaker:

[SPEAKER_01]: So, like, for example, give me a Jimi Hendrix guitar sound, right?

Speaker:

[SPEAKER_01]: And, like, this model would give you that.

Speaker:

[SPEAKER_01]: And the really cool thing is that, again, we also published this paper because we sort of, like, synthesized, like, summarized all the research that we did into a paper, and we published that in a very important generative music conference.

Speaker:

[SPEAKER_01]: And from there, we started doing a lot of other things like hackathons.

Speaker:

[SPEAKER_01]: Currently, we run this generative music AI workshop in collaboration with the Music Technology Group at Universitat Pompeu Fabra.

Speaker:

[SPEAKER_01]: So yeah, it's been quite active.

Speaker:

[SPEAKER_01]: And at the same time, I've been doing a lot of consulting in the space of AI music.

Speaker:

[SPEAKER_01]: And advisorship, and also recruitment of AI music talent. So yeah, that's me in a nutshell.

Speaker:

[SPEAKER_00]: You're a busy guy, Valerio, it sounds like, and have been busy for a while now, I guess. And there's a few things I'm thinking about as you tell your backstory.

Speaker:

[SPEAKER_00]: First of all, your YouTube channel, The Sound of AI, that's how I found you in the first place.

Speaker:

[SPEAKER_00]: I would just recommend anyone listening to go check that channel out because it's really good, really informative.

Speaker:

[SPEAKER_00]: Thank you for that.

Speaker:

[SPEAKER_00]: You have a lot of educational videos on how AI works and how you can also leverage it for yourself.

Speaker:

[SPEAKER_00]: Yeah.

Speaker:

[SPEAKER_00]: You seem quite intent on the idea of open source and building things from the ground up, like a grassroots movement.

Speaker:

[SPEAKER_00]: Is that kind of a philosophy you live by?

Speaker:

[SPEAKER_01]: Oh yeah, absolutely.

Speaker:

[SPEAKER_01]: I think it is something that we should all embrace, especially now, with this new technological revolution that is the AI revolution.

Speaker:

[SPEAKER_01]: And in our specific niche, it's going to be the AI music revolution, right?

Speaker:

[SPEAKER_01]: And I think open source is going to be instrumental in order to sort of like give back control, especially to copyright holders in our case, right?

Speaker:

[SPEAKER_01]: Generative music or AI music in general.

Speaker:

[SPEAKER_01]: We've seen a lot of, well, not a lot, but a few companies right now that are sort of like the flagship companies in generative music AI these days that are not necessarily being super transparent with the type of data that they're using and also the type of

Speaker:

[SPEAKER_01]: sort of relationship that they have with the copyright holders.

Speaker:

[SPEAKER_01]: It's not really that clear whether they're using copyrighted data for training their generative music models, and if they do, whether they have consent and whether they are sort of paying for that right, for that privilege of using that type of music.

Speaker:

[SPEAKER_01]: Because of that, I go by, let's say, like I follow a certain, let's say, a few principles.

Speaker:

[SPEAKER_01]: Like for me, whenever I work like in generative music, I think the most important thing, first of all, is consent.

Speaker:

[SPEAKER_01]: So basically making the copyright holders

Speaker:

[SPEAKER_01]: aware of the work that I'm doing and getting the green light from them so that I know that I'm doing something that the people who actually created that data that I'm currently using are actually okay with whatever I'm doing.

Speaker:

[SPEAKER_01]: That is number one thing, and along with that goes the part of transparency, right?

Speaker:

[SPEAKER_01]: So you need to be transparent.

Speaker:

[SPEAKER_01]: And I think this is a huge problem in the industry, not just the music industry, but the industry at large.

Speaker:

[SPEAKER_01]: Take ChatGPT, for example, right?

Speaker:

[SPEAKER_01]: So I remember there was this really cool interview with OpenAI CTO, right?

Speaker:

[SPEAKER_01]: And there was the interviewer asking the CTO what type of data they use for training Sora, which is this really, really cool experimental text-to-video model.

Speaker:

[SPEAKER_01]: And the CTO was like completely shocked in a sense, like by the question and she couldn't really utter an answer, a proper answer to that question, right?

Speaker:

[SPEAKER_01]: And that speaks volumes in and of itself, right?

Speaker:

[SPEAKER_01]: So it has happened with ChatGPT, so basically using whatever type of text found on the internet and elsewhere.

Speaker:

[SPEAKER_01]: And probably it's happening right now with music AI companies.

Speaker:

[SPEAKER_01]: And I believe that this is a huge problem.

Speaker:

[SPEAKER_01]: So transparency is a necessary step.

Speaker:

[SPEAKER_01]: we need to know the type of data that these huge models have been trained on.

Speaker:

[SPEAKER_01]: So I think like it is like a very ethical point for me to respect.

Speaker:

[SPEAKER_01]: And then obviously there's a third point, which is that of paying

Speaker:

[SPEAKER_01]: copyright holders. If I build a generative music system and I'm going to monetize it, I think it's only ethical for me or my company or my institution to actually send some of the revenues or the profits that I get out of that model to the copyright holders whose work I have leveraged in order to actually implement that particular generative model.

Speaker:

[SPEAKER_01]: There's a fourth point and now we sort of like loop back to your question regarding open source.

Speaker:

[SPEAKER_01]: I think open source, it is extremely important because in a world where AI is going to sort of transform everything or most of the things that we already know, I think it is extremely important to have models that we can really trust.

Speaker:

[SPEAKER_01]: and models that will be reliable, and the only way to have that level of trust in the models is if these models are publicly available, let's say, on GitHub, on code repositories, for everybody to check them and to play around with them.

Speaker:

[SPEAKER_01]: Now, obviously, I know that it is extremely important for companies to make profits.

Speaker:

[SPEAKER_01]: Of course, it's their main goal, right?

Speaker:

[SPEAKER_01]: So what I sort of suggest is to, of course, on the one hand, try to have open source models, but on the other hand, what you can do is basically...

Speaker:

[SPEAKER_01]: Don't allow third parties to use those models if they want to commercialize those models.

Speaker:

[SPEAKER_01]: So for me, the most important thing is that those models will be available out there for everybody to see and check.

Speaker:

[SPEAKER_01]: But then of course, commercialization is another thing.

Speaker:

[SPEAKER_01]: So I think that is something that companies should do, and third parties shouldn't be able to use those models in order to commercialize them.

Speaker:

[SPEAKER_01]: Unless, of course, you're fully open source, but that is another type of category, like a startup, and it's something different.

Speaker:

[SPEAKER_01]: But for your typical AI company, I think open source should be a foundational step, but at the same time, you should save

Speaker:

[SPEAKER_01]: commercialization for yourself.

Speaker:

[SPEAKER_00]: Okay.

Speaker:

[SPEAKER_00]: Another thing I was thinking about while we were talking about your background was

Speaker:

[SPEAKER_00]: The model you made where you could input sound and then get Jimi Hendrix sound out of it.

Speaker:

[SPEAKER_00]: That was an open source thing, right?

Speaker:

[SPEAKER_01]: Yeah, of course.

Speaker:

[SPEAKER_00]: So is it still available?

Speaker:

[SPEAKER_00]: Could our listeners try it for themselves?

Speaker:

[SPEAKER_01]: Yeah, so basically it's in a GitHub repository.

Speaker:

[SPEAKER_01]: So you can just go there.

Speaker:

[SPEAKER_01]: So we have a website there, right?

Speaker:

[SPEAKER_01]: So where we just like put a lot of information about the entire process there.

Speaker:

[SPEAKER_01]: So I'm going to share a link with you like later so that you can just like

Speaker:

[SPEAKER_01]: And from there, you also have access to the code.

Speaker:

[SPEAKER_01]: And there was one of these research groups, or subgroups, I should say, which was responsible for creating an interface so that you can also play around with it through an interface, which is quite cool.

Speaker:

[SPEAKER_01]: And we also have a companion website where you can hear some of the sounds generated.

Speaker:

[SPEAKER_01]: Some of those are quite canonical, whereas others are quite, quite out there, I should say.

Speaker:

[SPEAKER_01]: They're really, really weird sounds, but everything is open source, yes.

Speaker:

[SPEAKER_00]: Cool.

Speaker:

[SPEAKER_00]: So in the generative music space, there's kind of two giants at the moment, it feels like, Suno and Udio.

Speaker:

[SPEAKER_00]: At least that's what generally people are telling me about.

Speaker:

[SPEAKER_00]: And one comment that's often made is that they tend to make,

Speaker:

[SPEAKER_00]: like generic music, kind of boring music, a lot of people say. Um, what would you consider the forefront or cutting edge, artistically, in the AI space at the moment?

Speaker:

[SPEAKER_01]: Uh, yeah, I think probably out of all the companies that are actively producing these models out there, like, the demo that impressed me the most was the one from Eleven Labs.

Speaker:

[SPEAKER_01]: It seems to me that they had a little bit of a quantum leap if you compare their model against Suno's and Udio's models.

Speaker:

[SPEAKER_01]: For sure.

Speaker:

[SPEAKER_01]: I think the music is way more coherent, a little bit more creative, if you just let me pass that word in this context, and also the overall quality in terms of the audio fidelity

Speaker:

[SPEAKER_01]: is higher.

Speaker:

[SPEAKER_01]: So out of the current text-to-music models, I would say that Eleven Labs is the one that impressed me the most.

Speaker:

[SPEAKER_01]: But I'm not necessarily super keen on these types of models.

Speaker:

[SPEAKER_01]: We're talking about text-to-music models.

Speaker:

[SPEAKER_01]: So in other words, you input some music description, like for example, hey,

Speaker:

[SPEAKER_01]: generate a Jimi Hendrix... Well, you shouldn't be really able to do that because of copyright reasons, but let's assume you could.

Speaker:

[SPEAKER_01]: You would write something like, hey, generate something in the style of Jimi Hendrix with these lyrics, and that has a guitar solo or a ukulele solo, and that's it.

Speaker:

[SPEAKER_01]: Now, I think this is quite impressive, especially for your typical internet user.

Speaker:

[SPEAKER_01]: For them, this is fantastic.

Speaker:

[SPEAKER_01]: Imagine you are on TikTok and you want to create some music that's unique and customized for whatever TikTok you've created or Instagram story.

Speaker:

[SPEAKER_01]: I mean, that's fantastic.

Speaker:

[SPEAKER_01]: And it taps into this idea of user-generated content, which is really, really nice.

Speaker:

[SPEAKER_01]: But the musician in me tells me that this is not really the most exciting application of artificial intelligence within the music environment.

Speaker:

[SPEAKER_01]: What I strive for instead is to try to create models that can augment musicians.

Speaker:

[SPEAKER_01]: And by musicians, I mean both professionals and semi-professionals or amateurs.

Speaker:

[SPEAKER_01]: I think there is a whole world to discover, like, for example, the opportunity of augmenting and streamlining music generation, well, music composition, I should say, or music production.

Speaker:

[SPEAKER_01]: And imagine you have a model that can help you come up with new melodic ideas or harmonizations or accompaniments.

Speaker:

[SPEAKER_01]: So that at the end of the day, you'll be a little bit like a film director who can work at the high level.

Speaker:

[SPEAKER_01]: high level and then have all the... sorry about that.

Speaker:

[SPEAKER_01]: Sorry about that.

Speaker:

[SPEAKER_01]: Yeah, so I'll just go ahead.

Speaker:

[SPEAKER_01]: So imagine you can basically like come up with a lot of like melodic ideas and then harmonizations or just like a complement.

Speaker:

[SPEAKER_01]: So what those AI models will do for you is basically just streamline the process and let you be a sort of film director: instead of taking care of all the minutiae in the music, you can think of the higher-level concepts and then have the model actually fill in all the tiny little details for you, and this will

Speaker:

[SPEAKER_01]: streamline your compositional production process quite a lot.

Speaker:

[SPEAKER_01]: And this is really something that I'm striving for, this type of implementations of artificial intelligence in the music space.

Speaker:

[SPEAKER_00]: And in some of your YouTube videos, you talk about this combination of symbolic and more like machine learning type of AI.

Speaker:

[SPEAKER_00]: And you use this word neuro-symbolic integration.

Speaker:

[SPEAKER_00]: Could you explain what that is and what it potentially could do to musicians?

Speaker:

[SPEAKER_00]: Or for musicians, I mean?

Speaker:

[SPEAKER_01]: Yeah, of course.

Speaker:

[SPEAKER_01]: So I feel like I have to give you a little bit of context here to get into neuro-symbolic integration.

Speaker:

[SPEAKER_01]: So right now, the way we generate music or audio in general can follow like two paths, let's say.

Speaker:

[SPEAKER_01]: So one is a more traditional path, and this is like generative music from its inception, let's say towards like the end of the 50s until 2000, 2010.

Speaker:

[SPEAKER_01]: So your typical generative music models at the time were based off of traditional AI algorithms.

Speaker:

[SPEAKER_01]: So not really machine learning data-driven, but rather

Speaker:

[SPEAKER_01]: rules-driven.

Speaker:

[SPEAKER_01]: So in this space we like to refer to these types of implementations as good old-fashioned AI.

Speaker:

[SPEAKER_01]: So this is like what came before machine learning and deep learning.

Speaker:

[SPEAKER_01]: So here we're talking about those models like expert systems, generative grammars, and a bunch of other things, but you can have that, right?

Speaker:

[SPEAKER_01]: And so these models are based off of the music knowledge that we have and that we can transfer

Speaker:

[SPEAKER_01]: to program directly within a piece of software, right?

Speaker:

[SPEAKER_01]: So this was like the typical way of doing generative music before the advent of machine learning.

Speaker:

[SPEAKER_01]: Then we had machine learning and more specifically deep learning.

Speaker:

[SPEAKER_01]: Now all of a sudden it's not the composer, it's not the programmer who

Speaker:

[SPEAKER_01]: sort of like programs all the rules within a certain program, but rather it is the program itself that figures out the rules and extracts the patterns from the data.

Speaker:

[SPEAKER_01]: In other words, I have a deep learning model, so it could be a neural network, and I pass a lot of data

Speaker:

[SPEAKER_01]: For example, in the symbolic sphere, like, I could pass a lot of, like, MIDI files, let's say, of Bach chorales, right?

Speaker:

[SPEAKER_01]: It's your typical, like, Hello World application for generative music.

Speaker:

[SPEAKER_01]: So you pass all of the MIDI files of all the Bach chorales into a neural network, and then the neural network hopefully will figure out all the rules and patterns hidden within Bach's music.

Speaker:

[SPEAKER_01]: And in this way, the model learns by looking at data just by itself.
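
As a toy illustration of that data-driven approach, here is a minimal PyTorch sketch that trains a tiny next-note model on a couple of hard-coded pitch sequences standing in for Bach chorales. Everything here (the two-sequence "corpus", the network size, the training loop) is a simplified assumption for intuition, not a real chorale model.

```python
# Toy illustration of the data-driven approach described above: a small neural
# network is shown note sequences (hard-coded toy fragments as MIDI pitch
# numbers) and learns next-note statistics purely from the data.

import torch
import torch.nn as nn

# A tiny symbolic "corpus": sequences of MIDI pitches (stand-ins for real chorales).
corpus = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [67, 65, 64, 62, 60, 62, 64, 65, 67],
]
vocab = sorted({p for seq in corpus for p in seq})
to_idx = {p: i for i, p in enumerate(vocab)}

class NextNoteLSTM(nn.Module):
    def __init__(self, vocab_size, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, 16)
        self.lstm = nn.LSTM(16, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h)  # logits over the next note at every time step

model = NextNoteLSTM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for _ in range(200):  # tiny toy training loop
    for seq in corpus:
        idx = torch.tensor([[to_idx[p] for p in seq]])
        x, y = idx[:, :-1], idx[:, 1:]            # predict each next note
        logits = model(x)
        loss = loss_fn(logits.reshape(-1, len(vocab)), y.reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()
```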

Speaker:

[SPEAKER_01]: This is a great thing, and it has allowed us to create things that would be unthinkable

Speaker:

[SPEAKER_01]: So, for example, if you listen to Suno's results or Eleven Labs' results, they directly generate music in the audio space.

Speaker:

[SPEAKER_01]: So they directly generate waveforms.

Speaker:

[SPEAKER_01]: And it would be basically impossible to come up with rules, human-made rules, to actually generate waveforms.

Speaker:

[SPEAKER_01]: It's just impossible because it's too complex.

Speaker:

[SPEAKER_01]: So there's a huge advantage in using these deep learning models, these data-driven models, but they come with a certain limitation.

Speaker:

[SPEAKER_01]: And the limitation that they have is that they don't afford enough control to the end user.

Speaker:

[SPEAKER_01]: In other words,

Speaker:

[SPEAKER_01]: They are kind of black boxy, so we really don't know what's going on, and it's difficult to control them.

Speaker:

[SPEAKER_01]: And when they generate things, they tend to be like really, really wild, right?

Speaker:

[SPEAKER_01]: It's like a super wild horse, and you need to direct that horse in a direction that makes sense for the horse race, right?

Speaker:

[SPEAKER_01]: And my idea basically is to combine both good old-fashioned AI, so the more traditional models, also called symbolic-based models, with deep learning, so that we're going to have this neuro-symbolic integration.

Speaker:

[SPEAKER_01]: So we're going to have

Speaker:

[SPEAKER_01]: hybrid models that leverage the power of deep learning models, so neural networks, and here's the neural part in neuro-symbolic, but at the same time that they are controlled to a certain extent by good old-fashioned AI models, so models that rely on symbols and that manipulate symbols.

Speaker:

[SPEAKER_01]: And the great advantage of that approach is that you're going to have

Speaker:

[SPEAKER_01]: In the end, algorithms that a musician could relatively easily control, which is, I think, the most important missing piece of the puzzle so far.

Speaker:

[SPEAKER_00]: Yeah.

Speaker:

[SPEAKER_00]: So in music theory, when you start to learn how to compose melodies, for instance, a common approach is to use some large melodic leaps, a lot of stepwise motion.

Speaker:

[SPEAKER_00]: Some skips and often a contrary motion downwards after you've gone upwards, for instance.

Speaker:

[SPEAKER_00]: And those are some sort of rules one could encode in your good old-fashioned AI system.
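
Rules like those (mostly stepwise motion, an occasional leap, then stepping back down) are exactly the kind of thing a good old-fashioned, symbolic generator can encode directly. A minimal sketch, with the scale, the probabilities and the rule choices as assumptions for the example:

```python
# Rule-based ("good old-fashioned AI") melody backbone: mostly stepwise motion,
# the occasional upward leap, and contrary stepwise motion after a leap.
# Purely illustrative.

import random

C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]  # one octave of the scale (MIDI pitches)

def rule_based_melody(length=8, seed=None):
    rng = random.Random(seed)
    melody = [rng.choice(C_MAJOR[2:5])]          # start somewhere mid-register
    last_leap_up = False
    for _ in range(length - 1):
        i = C_MAJOR.index(melody[-1])
        if last_leap_up:
            step = -1                             # rule: after a leap up, step back down
            last_leap_up = False
        elif rng.random() < 0.2 and i + 3 < len(C_MAJOR):
            step = 3                              # rule: occasional upward leap
            last_leap_up = True
        else:
            step = rng.choice([-1, 1])            # rule: mostly stepwise motion
        i = min(max(i + step, 0), len(C_MAJOR) - 1)
        melody.append(C_MAJOR[i])
    return melody

print(rule_based_melody(seed=42))
```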

Speaker:

[SPEAKER_00]: Well, in a neural system, you would just give the system a lot of melodies and let them figure it out for themselves.

Speaker:

[SPEAKER_00]: So you're saying one could combine both rule-based symbolic systems and the wild horse that is the neural network together.

Speaker:

[SPEAKER_00]: Yeah.

Speaker:

[SPEAKER_01]: Absolutely.

Speaker:

[SPEAKER_01]: And there's quite some interesting research in the space, and this is also research that I've been doing for quite some time, and it tends to work way, way better, especially if we are dealing with symbols.

Speaker:

[SPEAKER_01]: Now, there's a little bit of ambiguity using the term symbol here, because it means two things in this context.

Speaker:

[SPEAKER_01]: Symbolic music generation means generating music that has a symbolic representation, like for example MIDI or piano roll representation, right?

Speaker:

[SPEAKER_01]: But then we have symbolic AI, and those are algorithms that use symbols and manipulate symbols somehow.

Speaker:

[SPEAKER_01]: We should be aware of these two distinctions.

Speaker:

[SPEAKER_01]: But yeah, I totally agree with what you said.

Speaker:

[SPEAKER_01]: And as I said, there's a lot of research that pushes, well, not really a lot, but there is some research that pushes in this direction.

Speaker:

[SPEAKER_01]: And in my experience, this is what works best.

Speaker:

[SPEAKER_01]: tends to work better, especially in the symbolic representation space.

Speaker:

[SPEAKER_01]: So whenever we're dealing with scores or MIDI or piano roll representation.

Speaker:

[SPEAKER_01]: And to your point, let's say you have a melody generator, right?

Speaker:

[SPEAKER_01]: So one of the things that have been experimented is to basically come up with an initial

Speaker:

[SPEAKER_01]: melodic idea that is very well structured through some rules, right? Because one of the drawbacks of neural networks is that they are wild horses, meaning that it is very difficult to let them create

Speaker:

[SPEAKER_01]: for example, melodies that are self-consistent and self-coherent.

Speaker:

[SPEAKER_01]: So in other words, that they are structured in such a way where you have, okay, so you have this initial, let's say, theme here that gets repeated, and then the two themes get developed, and now I have the level of a phrase or a semi-phrase, and semi-phrase A

Speaker:

[SPEAKER_01]: is followed by semi-phrase B, which has a contrasting material, but then this is encapsulated at a higher level of abstraction into the phrase level, and then now I have phrase 1 and phrase 1', that's just a variation.

Speaker:

[SPEAKER_01]: So this level of fractal structure within the music, it's very difficult to capture with a wild-horse algorithm like a neural network.

Speaker:

[SPEAKER_01]: But this is where a symbolic generative algorithm can help you quite a lot.

Speaker:

[SPEAKER_01]: So you can come up with rules to generate the backbone of this melody that's consistent.

Speaker:

[SPEAKER_01]: And then you can basically use that as an input to a wild horse algorithm, to say a transformer, a neural network, and now all of a sudden you can use that transformer to add complexity and variety and interest to a melody.
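
A sketch of the pipeline described here: a rule-based backbone is generated first and then handed to a neural elaboration step that adds detail. In this toy version the "neural" step is a clearly labelled placeholder (a random embellisher) standing in for a trained transformer; the rules and numbers are illustrative assumptions, not any particular system.

```python
# Neuro-symbolic sketch: symbolic backbone first, then "neural" elaboration.

import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # C major, one octave (MIDI pitches)

def symbolic_backbone(bars=2):
    """Rule layer: a well-formed skeleton, one structural note per beat, arch shape."""
    up = [SCALE[i] for i in (0, 2, 4, 6)]
    return (up + up[::-1])[: bars * 4]

def neural_elaboration(backbone, rng=random):
    """Placeholder for a trained model: splits some notes and adds neighbour tones."""
    out = []
    for pitch in backbone:
        if rng.random() < 0.5:
            out.append((pitch, 1.0))                      # keep the structural note
        else:
            neighbour = pitch + rng.choice([-2, -1, 1, 2])
            out.extend([(pitch, 0.5), (neighbour, 0.5)])  # embellish it
    return out

backbone = symbolic_backbone()
print("backbone:  ", backbone)
print("elaborated:", neural_elaboration(backbone))
```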

Speaker:

[SPEAKER_01]: And this tends to work quite well because you have the best of the two worlds, really.

Speaker:

[SPEAKER_01]: And at the same time,

Speaker:

[SPEAKER_01]: The really good thing about an approach like this is that you are the one who can actually set up the parameters for the initial generations, and you can also set up the rules.

Speaker:

[SPEAKER_01]: So you may have a product or an algorithm where you have the flexibility to say, hey, I want this level of, I don't know, like, let's say, harmonic complexity or like chromaticism or density.

Speaker:

[SPEAKER_01]: in this melody, right?

Speaker:

[SPEAKER_01]: And then that will allow any composer to have more control over the generation process, which I think in the end is the most important thing and that right now is actually missing.

Speaker:

[SPEAKER_00]: Yeah, often as a composer you could get a good idea, a riff or a bit of a melody or something and then you don't really know where to take it.

Speaker:

[SPEAKER_00]: So that would be a great addition to the toolbox to be able to input that idea and then

Speaker:

[SPEAKER_00]: get those variations, reharmonizations, rhythmic complexity, and actually be able to adjust the parameters, as you were saying, by yourself.

Speaker:

[SPEAKER_00]: Google Magenta, last time I tried it, it had, I think it was only one parameter you could adjust, which was temperature, which was kind of how wild it was.

Speaker:

[SPEAKER_00]: Yes.

Speaker:

[SPEAKER_00]: And basically that experience for me at least was you could choose between wild or extremely bizarre wild.

Speaker:

[SPEAKER_00]: It didn't really give me any feasible results.

Speaker:

[SPEAKER_00]: It was just completely crazy.

Speaker:

[SPEAKER_00]: Oh, yeah.

Speaker:

[SPEAKER_00]: But if one were able to control parameters on a more nuanced level, that would be really great.

Speaker:

[SPEAKER_01]: Yeah, I think, if I can just quickly add here, the problem with those type of models, the wild horse models, is that, like, for example, the temperature parameter that you had with Google Magenta's plugin, well, that is a...

Speaker:

[SPEAKER_01]: parameter that comes directly from the model, the deep learning model.

Speaker:

[SPEAKER_01]: I think it's probably an LSTM RNN.

Speaker:

[SPEAKER_01]: And so you have like that temperature parameter that you can use, but that is part of that architecture.

Speaker:

[SPEAKER_01]: So it doesn't really have much to do with music theory at all.
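
For intuition, this is roughly what that temperature knob does under the hood: it rescales the model's next-note probabilities before sampling, so low values make the likely notes even more likely and high values flatten everything out. The note names and probabilities below are made up for the example.

```python
# What a "temperature" parameter does, in a nutshell: rescale the output
# distribution before sampling. Low temperature = tamer, high = wilder.

import math
import random

def sample_with_temperature(probs, temperature, rng=random):
    """probs: dict mapping candidate notes to model probabilities."""
    # raise each probability to the power 1/temperature, then renormalise by sampling
    scaled = {n: math.exp(math.log(p) / temperature) for n, p in probs.items()}
    total = sum(scaled.values())
    r, acc = rng.random() * total, 0.0
    for note, w in scaled.items():
        acc += w
        if acc >= r:
            return note
    return note

next_note_probs = {"C": 0.6, "D": 0.25, "E": 0.1, "F#": 0.05}
for t in (0.3, 1.0, 2.0):
    draws = [sample_with_temperature(next_note_probs, t) for _ in range(1000)]
    print(t, {n: draws.count(n) for n in next_note_probs})
```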

Speaker:

[SPEAKER_01]: What I'm talking here instead is having parameters that are really, really connected to concepts in music theory.

Speaker:

[SPEAKER_01]: Of course, you may also have different levels of abstraction, where you're going to just, let's say,

Speaker:

[SPEAKER_01]: show higher level concepts to semi-professionals.

Speaker:

[SPEAKER_01]: Like for example, I don't know, like the mood.

Speaker:

[SPEAKER_01]: So like the type of mood of a melody or a certain harmonization maybe, or things like how complex it is, things that are a little bit more

Speaker:

[SPEAKER_01]: understandable by people who don't necessarily have that music theory grammar. But the moment we transition to professionals, it could be film composers, it could be producers, what have you, at this point we can also tap into music theory related things. Like, say, yeah, I want some music that has a lot of chromaticism, I want some music that has this percentage of tonic

Speaker:

[SPEAKER_01]: like in the mix or in a chord progression, or I want it to be like way more atonal, let's say, right?

Speaker:

[SPEAKER_01]: So you can start to speak the very same language that most of these music professionals would actually understand.
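
A sketch of what such musician-facing controls might look like as an interface: parameters named after music-theory concepts (chromaticism, note density, mode) rather than model internals. The dataclass, the generate_melody function and all numbers are hypothetical, just to illustrate the idea.

```python
# Hypothetical musician-facing controls mapped onto a toy generator.

from dataclasses import dataclass
import random

@dataclass
class TheoryControls:
    chromaticism: float = 0.1   # 0 = strictly diatonic, 1 = freely chromatic
    note_density: float = 0.5   # 0 = sparse long notes, 1 = busy short notes
    mode: str = "major"

def generate_melody(controls: TheoryControls, length=8, seed=0):
    rng = random.Random(seed)
    scale = [0, 2, 4, 5, 7, 9, 11] if controls.mode == "major" else [0, 2, 3, 5, 7, 8, 10]
    melody, pitch = [], 60
    for _ in range(length):
        if rng.random() < controls.chromaticism:
            pitch += rng.choice([-1, 1])                       # chromatic neighbour
        else:
            pitch = 60 + rng.choice(scale) + 12 * rng.choice([0, 0, 1])  # stay in key
        dur = 0.5 if rng.random() < controls.note_density else 1.0
        melody.append((pitch, dur))
    return melody

print(generate_melody(TheoryControls(chromaticism=0.4, note_density=0.8)))
```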

Speaker:

[SPEAKER_01]: It's a little bit like... the way I think of it is a little bit like a synthesizer, where you have a lot of knobs and levers that you can use in order to craft the perfect sound that you have in mind, right?

Speaker:

[SPEAKER_01]: So if I want a particular sound, I know that perhaps I have to use a sine wave, right?

Speaker:

[SPEAKER_01]: or I have to use this filter or this reverb.

Speaker:

[SPEAKER_01]: So it's different ingredients, right?

Speaker:

[SPEAKER_01]: And so you can mix the ingredients differently in order to get what you want.

Speaker:

[SPEAKER_01]: So that works very nicely, for example, with subtractive synthesis, but it could work more or less the same way with a proper interface with generative music.

Speaker:

[SPEAKER_00]: And then synth players, when confronted by this, for instance with software which can re-synthesize sounds,

Speaker:

[SPEAKER_00]: you would maybe ask them, do you want to use this software, you can re-synthesize that particular sound? And really good synth players would probably answer, why would I do that when I can do it myself, just twisting attack and decay and those sorts of things? And probably a composer would say the same thing right now, but maybe in the future one would perhaps not

Speaker:

[SPEAKER_00]: necessarily need to learn that part of the craft anymore, if you could do it with a model instead. Do you think future musicians and composers will be less versed in the craft of making music than they are now?

Speaker:

[SPEAKER_01]: No, I don't think so.

Speaker:

[SPEAKER_01]: I just think that AI is just another tool in the composer's palette.

Speaker:

[SPEAKER_01]: And if you think about the history of music, and I'm writing a book about the AI music revolution, and there's a chapter, I think it's, yeah, it's chapter number two, where I go through the history of music sort of like in a nutshell.

Speaker:

[SPEAKER_01]: And the point that I tried to make there is that the history of music is a history of technological revolutions and technological development.

Speaker:

[SPEAKER_01]: So imagine in prehistoric times when the first human being found that they could use like a pipe

Speaker:

[SPEAKER_01]: to generate sound, right?

Speaker:

[SPEAKER_01]: Now, all of a sudden, they've externalized music making to a tool that they can leverage to amplify an idea that they have, a musical idea, right?

Speaker:

[SPEAKER_01]: So already a pipe is an amplifier, a creative amplifier.

Speaker:

[SPEAKER_01]: Then you move on and you have like the pipe organ, and, let's jump hundreds if not thousands of years, and then you start having things like electric guitars or things like synthesizers, sampled

Speaker:

[SPEAKER_01]: sounds, right? And all of these things, what they do is they just afford you more creative options. And I think that AI is just another link in this super long chain of music technology in the end, right? And so I don't think, uh,

Speaker:

[SPEAKER_01]: composers or producers will be less music savvy just because there's AI.

Speaker:

[SPEAKER_01]: No, I think quite the opposite.

Speaker:

[SPEAKER_01]: Now composers have the opportunity to leverage more or less creative intelligence systems that can help them streamline the compositional process.

Speaker:

[SPEAKER_01]: For professional composers, I think the advantage of using AI will be just streamlining the process and at the same time using them as sparring partners.

Speaker:

[SPEAKER_01]: Because sometimes you have writer's block.

Speaker:

[SPEAKER_01]: We've always had it whenever we write something or we compose something, right?

Speaker:

[SPEAKER_01]: You have that thing.

Speaker:

[SPEAKER_01]: Now,

Speaker:

[SPEAKER_01]: Having a companion that can be there and suggesting a few options could be a really, really good, valuable resource to have in order to streamline the process and breaking out of composer's block, which is a real thing.

Speaker:

[SPEAKER_01]: And there's also another thing that I think is extremely interesting for AI in music,

Speaker:

[SPEAKER_01]: And that is potentially creating a type of music that was not possible earlier.

Speaker:

[SPEAKER_01]: Let me explain.

Speaker:

[SPEAKER_01]: I'm talking about interactive music or adaptive music.

Speaker:

[SPEAKER_01]: And I'm going to just make an example with video games, because that's what I know and that's what I've done, also with Melodrive, but I think it's quite fitting.

Speaker:

[SPEAKER_01]: Now, if you take a huge game like Skyrim, like these huge fantasy games, like people can spend thousands of, hundreds if not thousands of hours in them, right?

Speaker:

[SPEAKER_01]: Problem is that at the end of the day, the soundtrack is going to last just like five to ten hours, like what have you, but not more than that usually, right?

Speaker:

[SPEAKER_01]: And that's just like normal, because you can't ask a composer to write a thousand

Speaker:

[SPEAKER_01]: hours of music.

Speaker:

[SPEAKER_01]: That's just logical.

Speaker:

[SPEAKER_01]: But now, what if you had a model, an AI that could actually take your ideas, musical ideas, musical things,

Speaker:

[SPEAKER_01]: as input and then implement those ideas in real time and change them depending on the type of emotional feedback that the AI model gets from the game and, of course, from the user.

Speaker:

[SPEAKER_01]: So that is going to create a sort of infinite stream of music that's always perfectly fitting the particular emotional context of the video game.

Speaker:

[SPEAKER_01]: So this is for all sorts of interactive content, not just video games, but also virtual reality and augmented reality applications.

Speaker:

[SPEAKER_01]: Beyond that, we can also think of other types of music where, for example, there may be a sort of newly found relationship between a composer, songwriter, and their fans, where

Speaker:

[SPEAKER_01]: You have an AI that actually re-implements the initial musical ideas of the songwriter, the composer, in such a way that maximizes the interest of that particular user.

Speaker:

[SPEAKER_01]: Perhaps having an orchestration, an instrumentation that just is in tune with whatever particular genre that fan is usually into.

Speaker:

[SPEAKER_01]: So now all of a sudden, you may have, let's say, a Beatles song.

Speaker:

[SPEAKER_01]: in a reggae style or a Bob Marley song in a, I don't know, a grunge style, for example.

Speaker:

[SPEAKER_01]: Things like that, right?

Speaker:

[SPEAKER_01]: And I think this is a really, really interesting discussion to have and it's going to open up situations that are just impossible without AI.

Speaker:

[SPEAKER_00]: For sure.

Speaker:

[SPEAKER_00]: Do you see any downsides to such a development?

Speaker:

[SPEAKER_01]: One of the downsides that I see is that there's going to be possibly a huge influx of music, relatively low quality to mid-quality music.

Speaker:

[SPEAKER_01]: And we already have a ton of that made by humans.

Speaker:

[SPEAKER_01]: And now all of a sudden we're going to have

Speaker:

[SPEAKER_01]: That's the human-made one plus the AI-made one.

Speaker:

[SPEAKER_01]: But I think at the end of the day, probably what it's going to end up being is a market divided into different brackets.

Speaker:

[SPEAKER_01]: So you're going to have...

Speaker:

[SPEAKER_01]: Let's say players like Suno, Eleven Labs, or Udio, which are going to provide models for non-musicians, just to create this half-decent music.

Speaker:

[SPEAKER_01]: And that's all they need, like really in the end.

Speaker:

[SPEAKER_01]: they just probably don't want to create the next huge hit or the next Beethoven symphony.

Speaker:

[SPEAKER_01]: They don't care.

Speaker:

[SPEAKER_01]: They just care to express themselves creatively through the help of an artificial intelligence.

Speaker:

[SPEAKER_01]: But this, of course, is going to result into billions and billions of songs or sections of songs pumped into the market.

Speaker:

[SPEAKER_01]: But again, I don't think that this is going to touch the more creative part

Speaker:

[SPEAKER_01]: of the market.

Speaker:

[SPEAKER_01]: Let's call it like the higher end of the market where that's going to still be, of course, like the craft of a human being behind most of the creations.

Speaker:

[SPEAKER_00]: Yeah, one take might be that as the public gets exposed to a lot of, let's say, generic music, kind of a bit boring music, entertaining but boring, they might be even more interested in really creative, more out there kind of music as well.

Speaker:

[SPEAKER_01]: I think this is a terrific point.

Speaker:

[SPEAKER_01]: And this is something that I've been sort of rehearsing in my mind for quite some time.

Speaker:

[SPEAKER_01]: And thinking that, so if we are going to end up having a ton of already heard music, because at the end of the day, especially like this type of algorithms that we have these days, all they do is like recreate music in the style that they've been exposed to.

Speaker:

[SPEAKER_01]: So they can't create a new music genre by themselves.

Speaker:

[SPEAKER_01]: So they're just going to spit variations and re-elaborations of like what they've already seen.

Speaker:

[SPEAKER_01]: If you live in that world now, all of a sudden, perhaps even the music makers

Speaker:

[SPEAKER_01]: will be pushed to create something that's outside of the typical music content that you get out of these models, which perhaps in a sense could facilitate a new musical renaissance, just because of the need of being unique and outside of the AI bubble somehow.

Speaker:

[SPEAKER_01]: I think this could be an interesting thing to see, and I don't know if it's going to sort of go down in this direction, but it's definitely a possibility that I see there.

Speaker:

[SPEAKER_00]: Yeah. Um, one of the examples you mentioned earlier was that you could adapt the music to different moods

Speaker:

[SPEAKER_00]: and such, and that might be falling into the category of music as entertainment, since it's kind of catered to the user's needs. And some artists will argue that art is really something that is not necessarily entertaining, but it's something that pulls you out of your comfort zone, and it's something different than what you were expecting or something different than what you wanted, you know.

Speaker:

[SPEAKER_00]: That's where real art lies.

Speaker:

[SPEAKER_00]: Could you imagine using or how one would use AI tools to make art that kind of takes you out of your comfort zone?

Speaker:

[SPEAKER_01]: Uh, yeah.

Speaker:

[SPEAKER_01]: So first of all, I just want to mention that I guess 90% of the, or 99% of the music that we have on the charts these days doesn't belong to the category like that.

Speaker:

[SPEAKER_01]: In my opinion, at least it doesn't belong to that category.

Speaker:

[SPEAKER_01]: I mean, like I come from classical music, so I have like a completely like different sort of like

Speaker:

[SPEAKER_01]: let's say, understanding and opinions in terms of what makes artistic music artistic.

Speaker:

[SPEAKER_01]: But beyond that, I think one could make the case for having an AI that actually helps you generate things that are

Speaker:

[SPEAKER_01]: outside of your comfort zone. I don't know if you've seen some of the cool experiments that some artists have already made with AI. So for example, there's a group, like, Dadabots, I don't know if you know them. They've been playing around with audio-based generation, I think, for five, six years now.

Speaker:

[SPEAKER_01]: A few years back, they were coming out with these crazy-ass sounds that were super, super interesting and kind of completely unheard of.

Speaker:

[SPEAKER_01]: And they could create these sounds slash almost compositions by using these weird models, audio-based models.

Speaker:

[SPEAKER_01]: And I think they also had a very, very nice collaboration with a beatboxer.

Speaker:

[SPEAKER_01]: Beatbox?

Speaker:

[SPEAKER_01]: Beatboxer?

Speaker:

[SPEAKER_01]: I can't remember.

Speaker:

[SPEAKER_01]: It's basically like when you start using your mouth to generate weird sounds.

Speaker:

[SPEAKER_01]: Things like that, right?

Speaker:

[SPEAKER_01]: I think it's called beatboxing, something like that, but I'm not 100% sure.

Speaker:

[SPEAKER_01]: But they had this really, really cool collaboration where this guy

Speaker:

[SPEAKER_01]: with the help of Dadabots, trained a Dadabots model on his beatboxing performances.

Speaker:

[SPEAKER_01]: And so you had this sort of like collaboration and dialogue in a piece of music between the human beatboxer

Speaker:

[SPEAKER_01]: and the machine beatboxer, which was a sort of in-silico variation of the human.

Speaker:

[SPEAKER_01]: And the result was stunning.

Speaker:

[SPEAKER_01]: Really, really interesting things on the edge.

Speaker:

[SPEAKER_01]: Also, another thing is that many of these models

Speaker:

[SPEAKER_01]: that do not want to actually recreate your typical pop music or rock music perfectly tend to create things that are quite noisy, with a lot of artifacts, quite weird, and there you can leverage that weirdness to create things that sound really, really cool and otherworldly.

Speaker:

[SPEAKER_01]: So there's a lot of opportunity there.

Speaker:

[SPEAKER_01]: It's just a matter of having like the right ideas and the right creative approach and open-mindedness in order to integrate that input that comes from all of these models.

Speaker:

[SPEAKER_00]: Yeah.

Speaker:

[SPEAKER_00]: I was just now, as you were talking about it, envisioning a world where microtonal music perhaps is more prevalent, or maybe a post-Neuralink world where maybe one expands the range of hertz one can listen to.

Speaker:

[SPEAKER_00]: And then you could leverage AI to make interesting music in a larger spectrum of sound, basically.

Speaker:

[SPEAKER_01]: Oh, yeah, absolutely.

Speaker:

[SPEAKER_00]: But in your opinion, who are the top people or institutions or companies in the AI music field right now?

Speaker:

[SPEAKER_00]: Who's doing the most interesting stuff?

Speaker:

[SPEAKER_01]: Yeah, as I mentioned, I think at the level of text-to-music, I would say Eleven Labs.

Speaker:

[SPEAKER_01]: So probably they are the ones that have, at the moment, at least that we know of, the most advanced model out there, text-to-music model for sure.

Speaker:

[SPEAKER_01]: In terms of academia, there's a few research groups out there that are really, really good.

Speaker:

[SPEAKER_01]: I think...

Speaker:

[SPEAKER_01]: There's definitely the Music Technology Group in Barcelona, the one that I'm collaborating with.

Speaker:

[SPEAKER_01]: They do some amazing work in the AI music space and they are starting to tap into generative music ever more, which is super cool.

Speaker:

[SPEAKER_01]: And I think another research group that's worth checking out is the Center for Digital Music at Queen Mary University.

Speaker:

[SPEAKER_01]: They also have a really, really strong group in AI music.

Speaker:

[SPEAKER_01]: I would say most academic research...

Speaker:

[SPEAKER_01]: happens in Europe. That's obviously not completely true, but there's a lot of centers of excellence in AI music, especially generative music, here in Europe. But then there's also another really, really cool research group, and it's in Canada, at

Speaker:

[SPEAKER_01]: Simon Fraser University, and it's led by Philippe Pasquier.

Speaker:

[SPEAKER_01]: They've been doing some great work in generative music before it was all the rage.

Speaker:

[SPEAKER_01]: So yeah, I would say these are more or less the people, institutions, and companies that are sort of leading the charge.

Speaker:

[SPEAKER_00]: Yeah, nice, thank you. Maybe you could send me a list afterwards, because it was a bit difficult to write down as you were saying it.

Speaker:

[SPEAKER_01]: Yeah, absolutely.

Speaker:

[SPEAKER_00]: Do you know, have you ever heard about the YouTuber Adam Neely?

Speaker:

[SPEAKER_01]: Oh yeah.

Speaker:

[SPEAKER_00]: Yeah, he had an episode where he talked a bit about AI and he said that he thought we would never get human level musician robots because of incentives not being lined up to make such kind of robots.

Speaker:

[SPEAKER_00]: I don't know if you saw that video.

Speaker:

[SPEAKER_01]: No, I haven't watched that video in particular, but I'm always a little bit skeptical when people say AI is not going to arrive at this level.

Speaker:

[SPEAKER_01]: It's not going to arrive like at that level.

Speaker:

[SPEAKER_01]: I mean, we've seen it with ChatGPT and I would say like large language models in general.

Speaker:

[SPEAKER_01]: So if you take them as they are right now, they pass

Speaker:

[SPEAKER_01]: a huge amount of university-level tests and score better than the majority of students, which is incredibly impressive.

Speaker:

[SPEAKER_01]: And I think there's something very human about these positions, and it is

Speaker:

[SPEAKER_01]: I just don't want to become obsolete.

Speaker:

[SPEAKER_01]: So that's the sort of like underlying tone that I hear every time I hear, well, but machines are not necessarily like creative and they can't reproduce like what we can do.

Speaker:

[SPEAKER_01]: But I would actually.

Speaker:

[SPEAKER_01]: disagree with that type of argument, and the reason is I don't think that there's anything special about us as human beings in terms of our creative capacity. At the end of the day, we have a brain which sort of responds to very mechanistic laws, and so there's no soul, there's no romantic genius kind of thing.

Speaker:

[SPEAKER_01]: No, there are processes, and processes like that can be captured and reproduced also in silico, through machines.

Speaker:

[SPEAKER_01]: And just like to give you an example, I'm a huge fan of a book that came out, I think like it was like in the mid seventies, something like that.

Speaker:

[SPEAKER_01]: It's called like Gödel, Escher, Bach.

Speaker:

[SPEAKER_01]: I don't know like if you know the book, it was like a super hit.

Speaker:

[SPEAKER_01]: fantastic book by, I think, Hofstadter is the author, an amazing mathematician, logician.

Speaker:

[SPEAKER_01]: And he was sort of going through logic and comparing it to the paintings of Escher and the music of Bach.

Speaker:

[SPEAKER_01]: And towards the end of this book, he had a section where he would make some guesses about where AI is going to be in a few years.

Speaker:

[SPEAKER_01]: And I still remember that there was a passage where he said,

Speaker:

[SPEAKER_01]: We're not going to have chess AIs that are going to be as strong as chess grandmasters until these machines are going to be able to feel what we feel, see what we see, and live like what we live.

Speaker:

[SPEAKER_01]: Of course, like this is not verbatim like the very same like quote, but more or less like this is like the gist of it.

Speaker:

[SPEAKER_01]: It turns out that less than 15, 20 years later, we had Deep Blue, which actually won against the chess world champion of the time, the incredible Garry Kasparov.

Speaker:

[SPEAKER_01]: And right now, it's impossible for a human being to even hope to win against a chess machine, chess AI.

Speaker:

[SPEAKER_01]: All of this to say that machines

Speaker:

[SPEAKER_01]: are capable of performing at human or even uber human level.

Speaker:

[SPEAKER_01]: The great thing about music creation is that there isn't such a thing as uber human level just because it's so subjective.

Speaker:

[SPEAKER_01]: It's not a game like chess.

Speaker:

[SPEAKER_01]: There's no: you win and you get one point, you draw and you get half a point, you lose and you get zero points.

Speaker:

[SPEAKER_01]: No, it's very, very open.

Speaker:

[SPEAKER_01]: So in that respect,

Speaker:

[SPEAKER_01]: I don't think we're going to ever have machines that are better composers, better musicians than humans, just because it's very difficult to create that kind of ranking there.

Speaker:

[SPEAKER_00]: Yeah.

Speaker:

[SPEAKER_00]: I might have botched the explanation at the beginning a bit because Adam Neely's point wasn't that he doesn't believe that AI could reach the level of humans because he absolutely thinks it can.

Speaker:

[SPEAKER_00]: But that the economic incentives for companies aren't aligned with making robots capable to play music in a live setting as well as humans.

Speaker:

[SPEAKER_00]: Because it's a lot cheaper to just have humans do it than to make robots that can do the same.

Speaker:

[SPEAKER_01]: Well, I have to say that in the history of humanity, I think like it's actually the right thing is always like the, well, not the right thing, but the answer has always been like the opposite, right?

Speaker:

[SPEAKER_01]: So if I can automate something, like I tend to automate that thing and that has

Speaker:

[SPEAKER_01]: uh, sort of, uh, economic incentives. So I don't see any major difference in this case. So I don't know if these robots are going to be hardware-based, so

Speaker:

[SPEAKER_01]: kind of like robots with bolts and metal, or if it's going to be just virtual instruments and things like that. But I do think that there's already quite a lot of competition in the generative music AI space, so I'm assuming that certain companies probably will see an economic benefit going down that route. Again, I don't know if they're going to be humanoid robots or if it's just going to be software programs, but

Speaker:

[SPEAKER_01]: honestly, I think that we're going to get there sooner rather than later.

Speaker:

[SPEAKER_00]: Do you think you will go to a concert featuring a quartet of robots playing like a Kansas cover or something in your lifetime?

Speaker:

[SPEAKER_01]: It's going to be super, super cool, especially if I have the luck of designing those systems.

Speaker:

[SPEAKER_01]: Well, by the way, if we're talking about, let's say, a robot that performs in a string quartet, that is going to be way, way harder to do, just because of the complexity of performing a string instrument and things like that.

Speaker:

[SPEAKER_01]: It's incredibly complex.

Speaker:

[SPEAKER_01]: The compositional and production side of things, I think, is going to be there relatively soon.

Speaker:

[SPEAKER_00]: Yeah, I think so as well.

Speaker:

[SPEAKER_00]: But it's a lot more expensive to make robots that can do all the movements and stuff.

Speaker:

[SPEAKER_00]: So I guess Neely's point is that it's just way cheaper to have humans do it.

Speaker:

[SPEAKER_00]: But at some point, probably, you'll have a robot that can do basically anything.

Speaker:

[SPEAKER_00]: So it won't be designed to play a violin, but it will be able to do that as well as play ice hockey or whatever.

Speaker:

[SPEAKER_00]: Maybe.

Speaker:

[SPEAKER_00]: Yeah, it's likely.

Speaker:

[SPEAKER_00]: I recently saw you post on LinkedIn about creativity.

Speaker:

[SPEAKER_00]: And I think the question was whether people thought a machine could be creative.

Speaker:

[SPEAKER_00]: And there was one guy named Fahim Hassan who gave an answer which I thought was quite interesting.

Speaker:

[SPEAKER_00]: So I was thinking I'll read his answer and then you could comment on it.

Speaker:

[SPEAKER_00]: So he says: To add my thoughts here, as a multi-instrumentalist musician, I would say that the end goal of making music is not so much the recorded WAV file to share with the world, but rather the journey of capturing one's emotional state

Speaker:

[SPEAKER_00]: and channeling that into something worthwhile that has meaning for the artist as well as their audience.

Speaker:

[SPEAKER_00]: In that context, making art is purely an emotional act of discovering and transmuting one's own feelings that come from the experiences of life.

Speaker:

[SPEAKER_00]: Machines can definitely execute the mechanics of music making, but unless we have machines that are living emotional lives and need to channel their emotions in some way, I don't see the output as anything worthwhile in an artistic context.

Speaker:

[SPEAKER_00]: Making art allows us artists to make sense of the world we live in and derive meaning from the process of creating something.

Speaker:

[SPEAKER_00]: In that sense, I feel that machine creativity will always lack that direct emotional experience that fuels the very intention to create.

Speaker:

[SPEAKER_01]: Yeah, it's a very interesting comment, and it sounds a lot like Hofstadter's comment on chess, right?

Speaker:

[SPEAKER_01]: Until we get machines that are able to feel emotions and live experiences,

Speaker:

[SPEAKER_01]: then we're not going to have a chess AI that's better than, let's say, a chess grandmaster.

Speaker:

[SPEAKER_01]: In that respect, I don't think that that's necessary for a machine.

Speaker:

[SPEAKER_01]: And I don't think that that's going to be a problem for the output of the machine.

Speaker:

[SPEAKER_01]: Because at the end of the day, the user, I think, is the one; well, the 'user' in this case would be the listener.

Speaker:

[SPEAKER_01]: The listener is the one who projects

Speaker:

[SPEAKER_01]: the emotions and projects the meaning onto the song, onto the composition.

Speaker:

[SPEAKER_01]: So I think... Indeed, take a song.

Speaker:

[SPEAKER_01]: Even one with lyrics. Now ask different people to give you an interpretation of that song, right?

Speaker:

[SPEAKER_01]: To break down that song for you and to just tell you the type of emotions that they've experienced.

Speaker:

[SPEAKER_01]: You'll hear a lot of variation there.

Speaker:

[SPEAKER_01]: There's going to be a lot of different comments and different ideas about that song, depending on who's talking about that song.

Speaker:

[SPEAKER_01]: And the reason is because art, especially music, is very loose from a semantic standpoint.

Speaker:

[SPEAKER_01]: And so people are able to project onto that piece of art, that piece of music, whatever, their internal world.

Speaker:

[SPEAKER_01]: And I think this is what happens.

Speaker:

[SPEAKER_01]: So it's a little bit like when you...

Speaker:

[SPEAKER_01]: When you look up at the sky and you see clouds and you start seeing patterns that obviously are not there, you start seeing familiar faces, or perhaps a car, or perhaps a tree with a cat running on top of it, right?

Speaker:

[SPEAKER_01]: It's you, it's your internal world finding patterns.

Speaker:

[SPEAKER_01]: We are pattern finding machines in the end, right?

Speaker:

[SPEAKER_01]: And because of that, I don't think a level of emotional input on the side of the creator is required in order for the listener to find or experience that level of emotion.

Speaker:

[SPEAKER_01]: And by the way, the machines that we have right now, like Udio or Suno,

Speaker:

[SPEAKER_01]: we could assume that they have internalized a certain level of emotion as part of the composition and the production, just like ChatGPT has internalized the meaning of words, even if at the end of the day it really doesn't understand the meaning, but it has internalized those patterns and can leverage them.

Speaker:

[SPEAKER_01]: In the same way, I believe that a system like Suno or ElevenLabs does something similar to that.

Speaker:

[SPEAKER_01]: Of course, it's not perfect, but it does it in a decent way because all of that information regarding emotions, internal worldviews,

Speaker:

[SPEAKER_01]: is somehow, I would say, baked into the music that has been used to train the models.

Speaker:

[SPEAKER_01]: So it's something that those models have probably internalized without having a real understanding of it.

Speaker:

[SPEAKER_01]: Now there's another point.

Speaker:

[SPEAKER_01]: So on the side of the listener, I don't think there's going to be a huge difference if we just look at the output.

Speaker:

[SPEAKER_01]: But music is not just the output.

Speaker:

[SPEAKER_01]: Music is also the interest that a listener has in the creator.

Speaker:

[SPEAKER_01]: We don't just follow, I don't know, Taylor Swift, Pink Floyd or whomever for their music, but also for what they represent as human beings,

Speaker:

[SPEAKER_01]: right?

Speaker:

[SPEAKER_01]: And we cheer when they cheer, and we cry when they cry, right?

Speaker:

[SPEAKER_01]: And of course, this part is completely missing with all the models that we have right now.

Speaker:

[SPEAKER_01]: So that is on the side of experiencing, of processing music, but then there's the other side, which is creation.

Speaker:

[SPEAKER_01]: And I think

Speaker:

[SPEAKER_01]: I strongly believe that we have an urge to create music, right?

Speaker:

[SPEAKER_01]: Because music is just a medium for us to express our internal worldviews, internal states and emotions.

Speaker:

[SPEAKER_01]: And this is something that's going to be there regardless of whether we have, I don't know, like AI Beethoven or AI Taylor Swift.

Speaker:

[SPEAKER_01]: No, that process is going to stay.

Speaker:

[SPEAKER_01]: because we have a need to express ourselves creatively.

Speaker:

[SPEAKER_01]: And we've seen this with all the previous technological revolutions, like for example the advent of the digital audio workstation, which all of a sudden opened up the possibility of producing music to tens or hundreds of millions of people

Speaker:

[SPEAKER_01]: who started to create music they couldn't have created earlier, just because the technology wasn't there.

Speaker:

[SPEAKER_01]: So the more we facilitate music making, the more people we'll have making music, and that's going to stay there regardless of whether we have a Taylor Swift AI and a Beethoven AI or not.

Speaker:

[SPEAKER_00]: Yeah, absolutely.

Speaker:

[SPEAKER_00]: So it's kind of ingrained in the human experience.

Speaker:

[SPEAKER_00]: Absolutely.

Speaker:

[SPEAKER_00]: And also, something I was thinking about as you were talking: I think it's a quote from Wynton Marsalis, a jazz musician and educator, who in an interview is asked whether the music is for the musician or for the listener.

Speaker:

[SPEAKER_00]: And his answer is the music is for the listener, but the first listener is the musician.

Speaker:

[SPEAKER_00]: It's quite a clever quote. So probably people are interested in the filter that is the artist and their take on the world, I think. Authenticity is already a very developed term in popular music, and its importance doesn't seem like it's going away anytime soon.

Speaker:

[SPEAKER_00]: I have one last question for you, Valerio, because I work at an educational institution and we're supposed to advance technological innovation practices in higher music education.

Speaker:

[SPEAKER_00]: Do you have any opinion on how we should go about doing it or how people in music education should deal with AI tools in general?

Speaker:

[SPEAKER_01]: Right.

Speaker:

[SPEAKER_01]: So I'm not an expert in education.

Speaker:

[SPEAKER_01]: Yeah, of course, like I make tutorials, courses and things like that, but I am by no means an expert in it.

Speaker:

[SPEAKER_01]: But if I were to give you my two cents, it's just: embrace the technology, because the technology is going to be there no matter what, right?

Speaker:

[SPEAKER_01]: If you don't embrace it, then it just means that your students are going to be disadvantaged the moment they get out.

Speaker:

[SPEAKER_01]: of the education bubble, right, and into the real world, because the real world is using and will be using this type of technology.

Speaker:

[SPEAKER_01]: So this is true for AI music, but also for AI in general.

Speaker:

[SPEAKER_01]: Tools like ChatGPT, Perplexity, and so on.

Speaker:

[SPEAKER_01]: Again, my two cents is just to embrace them, because at the end of the day I see a clear parallel here with the advent of Google back in the day, right? I guess I was a little bit too young to actually remember that at an academic level, but I was probably in high school, I think, when it came out, and there were a lot of

Speaker:

[SPEAKER_01]: teachers who were mad at it, because they thought, okay, now we have Google, so we're going to lose the ability to search for information, which was a highly praised skill.

Speaker:

[SPEAKER_01]: But now, 20 or 25 years later, we've just found out that

Speaker:

[SPEAKER_01]: all Google did was facilitate research at a level which was incomparable with what we did earlier.

Speaker:

[SPEAKER_01]: Imagine just the hassle of finding papers in a library with a limited number of articles, and now doing the same thing in Google Scholar or with a simple Google search.

Speaker:

[SPEAKER_01]: I think the same thing will happen with AI-based tools.

Speaker:

[SPEAKER_01]: In particular in the AI music niche, or the music niche, I think what's possibly going to happen is the appearance of new products that are going to help students.

Speaker:

[SPEAKER_01]: Let's say you are majoring in classical music, right?

Speaker:

[SPEAKER_01]: So one thing is counterpoint, perhaps you're studying counterpoint.

Speaker:

[SPEAKER_01]: So now all of a sudden you have an AI music tool, an AI counterpoint tool, that helps you find the best

Speaker:

[SPEAKER_01]: counterpoints and gives you real-time feedback on all the counterpoint music that you write, this kind of thing.

Speaker:

[SPEAKER_01]: I think this is going to be instrumental in streamlining the educational process, and I think, if done correctly, this can be

Speaker:

[SPEAKER_01]: yet another tool, because at the end of the day, AI is another tool.

Speaker:

[SPEAKER_01]: It's a very smart one, or at least some of these tools tend to be quite smart, but it's just a tool, and as such, we should use it and embrace it.
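
To make the kind of real-time counterpoint feedback Valerio describes a bit more concrete, here is a minimal sketch of one rule such a tool might check. It only flags parallel fifths and octaves between two voices given as lists of MIDI pitches; the function name and the example fragment are hypothetical illustrations, not the API of any existing product.

```python
# Minimal sketch of a rule-based counterpoint check (hypothetical example).
# Each voice is a list of MIDI pitch numbers; index i is the i-th note,
# so the two voices are assumed to move in lockstep (first-species style).

def find_parallel_perfects(upper, lower):
    """Return indices where consecutive note pairs form parallel fifths or octaves."""
    issues = []
    for i in range(1, min(len(upper), len(lower))):
        prev_interval = (upper[i - 1] - lower[i - 1]) % 12
        curr_interval = (upper[i] - lower[i]) % 12
        both_moved = upper[i] != upper[i - 1] and lower[i] != lower[i - 1]
        # 7 semitones = perfect fifth; 0 = unison/octave (reduced modulo the octave).
        if both_moved and prev_interval == curr_interval and curr_interval in (7, 0):
            issues.append(i)
    return issues

if __name__ == "__main__":
    upper = [67, 69, 71, 72]  # G4 A4 B4 C5
    lower = [60, 62, 67, 64]  # C4 D4 G4 E4
    print(find_parallel_perfects(upper, lower))  # -> [1]: C-G moving to D-A is parallel fifths
```

A real tutoring tool would run many such rules on every edit the student makes and stream the results back into the notation interface; this sketch only shows the shape of a single check.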

Speaker:

[SPEAKER_00]: Great.

Speaker:

[SPEAKER_00]: Well, it's been a great honor to have you on the show, Valerio.

Speaker:

[SPEAKER_00]: It's been a pleasure.

Speaker:

[SPEAKER_00]: Thank you so much.

Speaker:

[SPEAKER_00]: Thank you.
