Singing Blobs and Electric Melodies
Episode 86 • 7th July 2021 • Audio Branding • Jodi Krangle


Shownotes

Machine learning has helped shape just about every aspect of our digital lives, whether it’s deciding which Netflix show or YouTube video to recommend to us or even teaching cars to drive themselves. One of the most innovative uses for machine learning, however, is in creating music. Just recently Google released Blob Opera, a machine learning tool by David Li that “pays tribute to and explores the original musical instrument: the voice.” There’s a link below so you can try it out for yourself: all you have to do is direct each of the singing blobs by sliding its range up and down with your mouse, and listen as they compose their own harmonies.

https://experiments.withgoogle.com/blob-opera

This sort of musical collaboration between humans and computers has been evolving for a surprisingly long time. There’s some debate on just when the first electronic music was created, but the oldest recording comes from 1951. It’s a sample of three songs created by Alan Turing’s Ferranti Mark 1 computer, which filled up a whole room; the melodies were programmed by Christopher Strachey, a computer scientist who also drew upon his experience as a piano player to teach the computer how to play music. This early melding of art and science would pave the way for similar fusions of musical and scientific genius over the years.

Want to hear the Ferranti’s groundbreaking music for yourself? Just check out the link below for a digitally restored recording of that historic moment, and what the people listening had to say:

https://soundcloud.com/guardianaustralia/first-ever-recording-of-computer-music

Of course, synthetic music’s come a long way over the past seventy years. Now, thanks to machine learning and the development of artificial neural networks, computers can compose their own songs with hardly any human guidance at all. Here’s a link to “Mister Shadow,” a song entirely composed and performed by Sony Computer Science Lab’s “Flow Machines” AI system:

https://www.youtube.com/watch?v=lcGYEXJqun8

They can even mimic human voices, using deep learning paired with existing recordings to study and then duplicate a particular kind of voice. In 2019 Yamaha used its new VOCALOID:AI vocal software to recreate the voice of legendary singer Hibari Misora on the 30th anniversary of her passing. There’s a link to the song below, and I think you’ll agree that the results are uncanny:

https://www.youtube.com/watch?v=eq_YIvx-lVc

But how did we come all the way from a computer beeping the national anthem to writing and singing its own songs? What does machine learning really mean, and what does it mean for the future of the audio industry? The answer lies in patterns, and a computer’s ability to recognize patterns and then generate new ones.

Computers don’t really know what music means, or even that they’re making music. But by studying thousands and thousands of different songs, machine learning allows them to build a profile of all the elements those songs have in common, much the same way facial recognition programs study thousands of pictures of faces to learn what to look for.
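
To make that a little more concrete, here’s a minimal sketch in Python. It’s nothing like the deep neural networks behind Blob Opera or Flow Machines, and the melodies and note numbers are made up for illustration; it just builds the simplest possible “profile” of a toy song collection by counting which notes tend to follow which.

```python
from collections import Counter, defaultdict

# A toy "training set": three short melodies written as MIDI note numbers.
# Real systems study thousands of songs; three is just enough to show the idea.
melodies = [
    [60, 62, 64, 65, 67, 65, 64, 62, 60],
    [60, 64, 67, 72, 67, 64, 60],
    [62, 64, 65, 67, 69, 67, 65, 64],
]

# Build the "profile": how often each note is followed by each other note.
# This simple transition table stands in for the far richer statistical
# profiles that neural networks learn.
transitions = defaultdict(Counter)
for melody in melodies:
    for current, following in zip(melody, melody[1:]):
        transitions[current][following] += 1

# Inspect what the songs have in common; stepwise motion shows up a lot.
for note in sorted(transitions):
    print(note, dict(transitions[note]))
```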

Then the computer takes everything it’s learned about those songs and tries to create something new that fits the same profile. The first try probably won’t be very good, but computers work fast, and each failure gives them more to learn from for their next try. With enough samples and enough feedback, the results start to sound less like noise and more like real music, even real singing.
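
Continuing the toy sketch above, here’s what “try, fail, and try again” might look like in miniature: generate lots of candidate melodies from the profile, score each one by how well it matches the patterns in the training songs, and keep the best. Real systems learn from feedback on a vastly grander scale, but the generate-and-evaluate loop has the same basic shape.

```python
import random

def generate(transitions, start=60, length=8):
    """Sample a new melody by walking the learned transition profile."""
    melody = [start]
    for _ in range(length - 1):
        followers = transitions.get(melody[-1])
        if not followers:
            break  # no observed continuation from this note
        notes = list(followers)
        weights = [followers[n] for n in notes]
        melody.append(random.choices(notes, weights=weights)[0])
    return melody

def fit(melody, transitions):
    """Score a melody by how well its note-to-note moves match the profile."""
    return sum(transitions[a][b] for a, b in zip(melody, melody[1:]))

# Generate many candidates and keep the one that best fits the profile:
# a crude stand-in for the feedback loop that real training relies on.
candidates = [generate(transitions) for _ in range(200)]
best = max(candidates, key=lambda m: fit(m, transitions))
print(best)
```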

For Google’s Blob Opera experiment, David Li recorded 16 hours of audio from four opera singers and used it to teach the program how to sing opera. What we hear when we play it, however, doesn’t come from any of those singers, but from the program’s own attempt to create music based on what it’s learned.

Machine learning’s already starting to make a big impact on the audio industry. Amper, an online composition tool, offers computer-generated music based on user settings like genre and tempo as a substitute for stock music. Another app, Endel, composes its own personalized soundscapes, taking into account factors like the time of day, the weather, and even the user’s vital signs. Content creators in particular need more music than ever before, and machine learning is helping to meet that growing demand and broaden the market for original music.

The next time you hear a piece of music in a commercial or streaming content, you may want to give it a closer listen. With more and more audio content now being produced through machine learning, you might just find a singing blob behind the microphone.

Would you consider giving this podcast an honest review? You can do that here: https://lovethepodcast.com/audiobranding. And if you like what you hear (and read!) – please do share it with anyone you think might be interested. Thanks so much!

And if you’re interested in crafting an audio brand for your business, why not check out my FREE download – 5 Tips For Implementing An Intentional Audio Strategy at https://voiceoversandvocals.com/audio-branding-strategy/
