Google’s new Lyra 3 Pro music generation AI model is undeniably impressive. From a short prompt, it can make a complete song, up to three minutes long. And it does a good job, in that the instruments sound like instruments, the vocals sound like vocals, the song has structure, and the lyrics (for the most part) make sense. It’s good enough that if you heard one of its songs playing in a coffee shop or slipped into a Spotify mix, you probably wouldn’t notice. It is, in other words, competent at mediocrity.
But “wouldn’t notice” is also why I predict (with all the risk making predictions entails) that not many people are going to willingly listen to much AI-generated music.
If you ask Gemini to generate a ska punk song, it will generate a ska punk song that sounds just like ska punk songs. But “just like” means not any particular ska punk song, or any particular ska punk band, but instead a kind of generic ska punk, the sort that the unnamed band plays in the background of a college party movie. What Gemini won’t do is generate a Mustard Plug song, or a Suicide Machines song, or a Telegraph song. It makes competent songs that are representative of their genre, but aren’t representative of any particular take on that genre.
What this means in practice is that, if you like a genre in its generic sense and just want a vague “more of that,” the model can oblige. But none of the songs it creates will become your new jam, your new favorite song, your new always-on-repeat obsession. And I think this matters for AI music uptake: it means there won’t be much. At least not until the models improve dramatically at producing novelty, which could well never happen.
When we listen to music, we tend to do so in one of two ways. We either want to hear new stuff we haven’t heard before, or we want to hear the same stuff we’ve heard a million times already. The former is about the thrill of discovery, of finding that new band that blows your mind, or enjoying the weird new turn your favorite band made with their latest album. The latter is about the comfort of the familiar.
We might posit a third mode of listening, which is “I just want something on in the background I can ignore.” AI music could serve this mode, but it can’t serve it better than putting on what’s already out there from musicians. It can only offer a smeared average of the same. It doesn’t add anything, and human music, which was once novel, has the bigger audience anyway, because it can serve all three modes equally well, instead of having its market limited to just one.
AI music can’t give us novelty, because the music it generates isn’t novel. And AI music can’t give us the comfort of the familiar because, no matter how similar it is to existing bands or genres, it isn’t exactly them, which means it isn’t the song you’ve listened to a million times before and now want to hear for the million-and-first. AI music sits in a middle ground that, in my experience at least, just isn’t how anyone really listens to music. If I want to hear something new, I need to go to a human band for that. If I want to hear comfort music, I need to go to the human bands that at one time in my life were the “something new” bands. What’s left is “stuff that’s strictly speaking unique, and uncannily familiar, but not familiar enough that I can sing along to every word.” I just don’t see much of an audience for that.