The emergence of artificial intelligence (AI) in the music industry is yet another example of how the most disruptive advancement in recent years is making significant strides in various areas of life, particularly in the realm of art.
Peter Gabriel, former member of Genesis, recently said in Uncut magazine – with a touch of resignation but open to new possibilities – “It’s a bit like King Canute on the beach (note: Gabriel alludes to the legend of King Canute, the Scandinavian king who could not command the incoming tide to stop), it’s coming. We’re just building it. We have no idea where it will go. I can’t think of anyone whose work couldn’t be done better by AI in the next ten years, maybe five. While I drive to the studio in my Tesla, the vehicle does most of the driving for me – but I still hold the steering wheel. The same will happen in any process, including the creative one. Half of the artists will enjoy playing with AI, while the other half will want to ban it. But I think it’s better to work with a new powerful tool than to just complain or pretend it doesn’t exist.”
Canadian DJ and producer Joel Zimmerman, better known as Deadmau5, has also expressed his views on this matter: “It’s pretty scary,” says Joel. “But it’s scary in the sense of how dumb music is anyway, so it’s not that scary. Like, ‘This thing can make a pop song!’ Have you heard a pop song? Great. Let it go. Release the beast, you know, that would open up the niche market for real musical talent. [ChatGPT] is good. But it’s only as good as what it knows. It’s a huge training model, right? So take the collective stupidity of the world and make a robot vomit it. It’s not going to be a genius, but it’ll give you what you want.” It’s not clear whether we can expect an AI-based album from the electronic music producer, but one thing is certain: AI is going to take control of the world in one way or another.
Others, like arguably the greatest exponent of pop electronic music of this century, David Guetta, are more enthusiastic. The Frenchman answers the question we asked some time ago, and his answer is a resounding “Yes.” “Nothing will replace taste,” he commented. “What defines an artist is… you have a certain taste, a certain kind of emotion you want to express, and you’re going to use all the instruments to do it.” The DJ and producer used a well-known argument to defend his position: “I think every new musical style comes from new technology. AI will possibly define new musical styles. There wouldn’t have been rock and roll without the electric guitar. Nor acid house without the Roland TB-303 or the Roland TR-909. There wouldn’t be hip hop without samplers.”
Another advocate for AI is Hazel Savage, co-founder of Musiio, an AI that listens to music and, among other functions, finds the best 30-to-60-second segment of a track for use in social media videos on platforms such as TikTok, Instagram, and YouTube. “We’re not here to replace humans,” she stated. “We’ve had that argument since we started with Musiio, and it all came from a place of fear and misunderstanding. But there has been progress, now there are fewer people telling me ‘All AI is terrible.’ All those people now sound like those who once said ‘all synthesizers are garbage, it’s not real music.’ So we’re heading in the right direction.” Two years ago, Hazel noticed a positive shift in the perception of AI: “The reality is that an AI, in its current state, can’t do anything we haven’t taught it. In Musiio, our technology seems magical because of its speed and accuracy. But there’s no magic, it’s very simple: it’s based on high-performance computing and pattern recognition that gives it the appearance of intelligence. That’s something people need to understand.” Still, Hazel tempers the fears of those who are most alarmed: “With Musiio, we’ve had meetings with record labels and publishers, and I’ve noticed that there is very little demand for music created by AI, so there is a disconnect between actual demand and artificial music.” Savage goes even further and distances herself from what Guetta and Gabriel said: “AI-generated music is going nowhere, and people won’t want it. Humans love creating music, so it doesn’t need to be interrupted by AI.”
In line with Guetta and Gabriel, software engineer Berkeley Malagon is one of the key players in the emerging but firm establishment of AI in our collective unconscious. Malagon is the co-founder of Audialab, a US-based company working on AI-driven music production. “We’re not looking to press a button and generate a definitive song. We’re not interested in that, but rather empowering sound engineers, sound designers.” Turning from this functional perspective to each individual’s creative process, Malagon continues, “(Engineers) spend a lot of time achieving a defined desired sound before shaping it. They love what it means to go straight to curation, fine-tuning, shaping a sound that is already in line with what they were seeking.”
No one can question such potential or its noble intention. But while the goal is the ease that digital tools, in this case AI, bring to the creative process, along with the savings in searching and exploring, aren’t we potentially depleting one of that process’s most exciting yet frustrating dimensions? The pursuit of an antidote to frustration may not yield bad results, but at the very least it alters serendipity as we know it. Let’s assume that a musician or producer cannot find a specific sound, or struggles to shape it, and becomes frustrated. The possible alternatives could be: 1) abandoning the project and starting something new, 2) seeking help and enriching oneself through collaboration (the fundamental role of sound engineers and recording studio producers, as per the paradigm of the 20th century), or 3) in the obsessive search for a particular sound, finding something new that is appealing and continuing in a new direction. With AI, this last option would no longer arise from human action but from pre-established selectable options offered by software.
Malagon began his career in the world of video games, then taught himself the fundamentals of AI, data science, and machine learning, and eventually worked on chatbots and audiovisual design. Meanwhile, music production was one of his hobbies, and his curiosity led him to apply neural network techniques from visual art to sound design. His aim was to do away with sample packs, which are now commonplace in music production.
Unstoppable advancement in the market
Amidst this artificial proliferation, Spotify has introduced a new feature that puts AI in the spotlight. It’s called “DJ” and is a more refined version of the various and well-known algorithmic song selections (suggested playlists) curated by the application for each user. It’s no longer just about more precise selection; instead, this new feature, powered by OpenAI (yes, the same laboratory that designed ChatGPT), speaks to you between songs to tell you more about them. The voice is generated with technology from Sonantic, a company previously acquired by Spotify, and is modeled on that of the Swedish company’s Director of Cultural Partnerships, Xavier Jernigan. “DJ” has launched in beta in the United States and Canada, and it is expected to reach more countries soon.
On the other hand, if karaoke versions of our favorite songs have left us dissatisfied, there is now AI that performs track extraction: isolating or removing the vocals, bass, drums, guitars, and so on from a song. While this was something we were somewhat familiar with (who hasn’t tried it in Audacity, with rather poor results?), AudioShake has refined the process like never before. It is allowing companies in the US to secure sync placements by removing vocals from tracks and creating instrumental versions, which suggests that “track separation” (as a source-separation tool) will soon be the next thing to become widespread.
The first AI-based drum machine in history, “Emergent Drums,” was recently launched by Audialab, and although it is not a total replacement for a sample library or a traditional drum machine, it can be a great complement. Emergent Drums is not the first plugin to use artificial intelligence (iZotope Neutron 4 and Focusrite’s FAST, among others, came before it), but the difference lies in the fact that it invites us to invent new sounds.
This drum machine does not use any pre-existing samples; it generates drum sounds using machine learning. It is the first plugin to bring AI-based sound generation to digital audio workstations (DAWs), the software that serves as a home studio inside a computer.
Like other such tools, it relies on neural networks trained with tens of thousands of existing samples. By studying the patterns in depth, including the shapes of the sounds’ waveforms, the plugin gradually builds its own representation of what a kick drum or a hi-hat, among the other components of a drum kit, sounds like. The data set of Emergent Drums is constantly growing, and it can use its own sounds to improve itself and even derive new sounds from itself. Its working method mimics the human method: we act, observe the results, combine this with everything we knew previously, and update our knowledge with new experience and information.
Malagon, whom we mentioned earlier, hopes for the emergence of “a new era of sound design and production” and says that “we are working towards building a DALL-E (an AI image-generating platform) for sound design.” According to the American, he is currently co-developing a new tool with which “it will be possible to include any sound from your library, where our AI will analyze it and give you variations of that specific sound. So you won’t have to take what comes from our models but take the sound you love and get 100 variations of it.” The engineer aims for Audialab to be able to “give you any sound you need.”
In 2020, OpenAI broke new ground by launching Jukebox, a platform that generates complete pieces of music in the style of a chosen artist or genre. From a technical standpoint the results were impressive, but the audio Jukebox produced was generally low in quality and did not match what the selected artists themselves produce.
Continuing along this line, Riffusion is another development capable of combining a series of loops and producing “jams” based on subtle variations of the initial input. Riffusion works like text-to-image AI (such as DALL-E): one enters a text description of the desired result, and the platform produces music instead of an image.
This same text-to-music approach has been developed further by Google with MusicLM. Although Google has not made it available for general public use, the initial samples it has shown appear vastly superior to earlier efforts. The most fascinating and advanced aspect of MusicLM is its accuracy in capturing instructions.
All AI is political: the first legal battles
Unlike the music piracy of the past, AI does not copy material to redistribute or sell it under the same name. Instead, the conflict lies in how all that data has made its way into AI engines and who owns that information.
Advocates and developers of AI argue that their engines can learn from existing data without permission because there is no law against “learning,” and that the transformation of data into something completely new is protected by law. They rely on the extensive case law from the lawsuits that writers and publishers brought against Google over its book index (Google Books), which cataloged and displayed excerpts from a large number of works. Detractors counter that the use of original material created by an artist and then processed by AI should require a copyright license.
Currently, US authorities – a benchmark and paradigm for copyright issues, especially in relation to music as one of the major global markets – have stated that they will not accept copyright registration for any work created by AI. However, as we can observe, the law is still in limbo regarding whether AI output infringes the rights of other authors when it incorporates original works or materials into a new AI product (regardless of how little human-originated material it contains).
Although current AI platforms are imperfect, it is necessary to promptly address these discrepancies (not to mention violations of rights) as they become “smarter” with each use. We may be in the early stages now, but we know they have the potential to advance faster than we can handle.
Recently, DJ and producer David Guetta was involved in a controversial incident when an AI-generated voice – in this case, an imitation of Eminem’s – appeared in a song he was playing live. He stated, “I put the text into that thing, played the song, and people went crazy.” Yes, the results were amazing. But this is where the true debates begin on ethics, artistic integrity, image rights, property rights, and other laws that AI casually violates, unsettling philosophy, anthropology, and other fields of human knowledge that we thought were (relatively) stable and that had never experienced a shake-up like today’s.
Although Guetta has no intention of adding that piece to his discography, it is a clear demonstration of the power of this tool. But was it right to include it in the show without Eminem’s consent? Is it just a trivial imitation, or does it derive from a human being whose aesthetics, phrasing, and artistic quality have become a trademark – and whose creation took considerable time – thus deserving legitimate recognition as such? These are questions that arise today and demand a clear and, if possible, prompt answer.
While the thesis about the future of music through technology is understandable, will we really care if new musical trends are entirely dictated by the artificial? Since the invention of new electronic technologies for music – for example, the theremin – human input has been vital because the operational capacity of these machines relied on the concrete manipulation of these elements. Now, however, a single platform is capable of creating music on its own, independent of desired (or undesired) human input, just by “pressing a button.” New trends may emerge from this technology. However, a human touch that alters it is necessary, no matter how minimal it may be. That will be the true potential of a work with such characteristics, or at least what remains for us humans, so that music is not engulfed by digital algorithmic logic.
With Eminem’s voice, as with many others, the emulation is an (undoubtedly accurate) approximation of the real voice. While there are cases where it seems like an identical copy, and even if one wants to deny it in order to enjoy the unreal as real, there is a small margin – even if it is less than 1% – that reveals the voice as the fiction it is. This is not due to psychological suggestion but because the emulation is not complete and perfect, and (we assume) that is at least something to be grateful for today. What is missing is precisely that human touch, without which the voice loses all its essence and integrity.
“My AI-generated persona is highly valued” – a striking phrase that crystallizes the sign of these times. Singaporean singer Stefanie Sun spoke out for the first time via social media after her AI-generated voice exploded in popularity, to the point where she is now one of the most listened-to artists in Asia. Sun, who hasn’t released new material for about six years, saw her career revitalized thanks to AI, but with the caveat that she is not the one responsible for it; an intelligent engine is. This emblematic case sets a precedent regarding the massive reach of AI-generated song versions, such as the covers of “Hair Like Snow” by Jay Chou and “Rainy Day” by Nan Quan Mama, which emulate her vocal color and tone from the early 2000s, when the artist burst into the Asian market and became one of the region’s most popular performers.
The Ghostwriter Incident
Something similar happened in April with The Weeknd and Drake when an original composition created by AI went viral on social media. The song, called “Heart on My Sleeve,” was initially uploaded to YouTube and TikTok and then gained hundreds of thousands of streams on platforms like Spotify, Apple Music, Deezer, Tidal, and SoundCloud.
The track, created by GHOSTWRITER (as the uploader self-identified), was removed by all the corresponding platforms. The detail of its removal on YouTube states: “This video is no longer available due to a copyright claim by Universal Music Group.”
The numbers were staggering (and alarming). In just a few days, it reached fifteen million views on TikTok, over half a million on Spotify, and a quarter of a million on YouTube. After being removed, it accumulated four million views thanks to other users who uploaded excerpts of the song. After the song took social media by storm, Universal Music Group asked music streaming services to block AI services from accessing melodies and lyrics belonging to the record label. “We will not hesitate to take action to protect our rights and those of our artists,” the corporation announced in March of this year. UMG’s response, and eventually that of other major record labels, involves two lines of action: the first against the platforms that host such tracks, and the second regarding the distribution platform that has made their appearance on music streaming possible.
Currently, UMG is leading the battle against AI. “We have a moral and commercial responsibility to our artists to prevent unauthorized uses of their music and stop platforms from ingesting content that violates the rights of artists and creators. We hope that our partners on these platforms will want to prevent the same misuse in their services that harms artists,” stated one of their leaders.
All this public commotion surrounding AI and music began with the now ubiquitous ChatGPT when a user instructed the platform to generate “a song lyric in the style of Nick Cave.” The lyrics reached the hands of the Australian singer-songwriter, sparking his discontent. On his blog, the icon delivered a scathing response to the incident:
“Dear Mark, since its launch in November of last year, many people, many of them shaken by algorithmic awe, have sent me ‘Nick Cave-style’ songs created by ChatGPT. There have been dozens of them. That being said, I don’t feel the same enthusiasm about this technology. I understand that ChatGPT is in its infancy (in terms of cognitive growth and logical reasoning), but perhaps that is the emerging horror of AI — that it will always remain in its infancy, always moving forward, always fast. It cannot regress or linger while leading us toward a utopian future, perhaps, or our total destruction. Who can say which of the two? Judging by this ‘Nick Cave-style’ song, it doesn’t look good, Mark. The apocalypse is well underway. This song stinks.
What ChatGPT is, in this instance, is a parody of a replica. ChatGPT may be capable of writing a speech or an essay or a sermon or an obituary, but it can never create an authentic song. It may be able to produce a song that, on the surface, is indistinguishable from an original, but it will always be a replica, something mocking.
Songs emerge from suffering, by which I mean they preach about the complex internal human conflict in creation, and well, as far as I can tell, algorithms do not feel. Data does not suffer. ChatGPT does not possess an inner being, it has not been anywhere, it has not overcome anything, it has not had the audacity to surpass its own limitations, and therefore, it lacks the ability to share a transcendent experience, just as it lacks the limitations to transcend. The melancholic role of ChatGPT is that it is destined to imitate and will never be able to have a genuine human experience, no matter how devalued and inconsequential that experience may be in the future.
What makes a good song is not its resemblance to a recognizable work. Writing a good song is not mimicry or replication or pastiche; it is the opposite. It is an act of self-destruction that destroys everything one has strived to produce in the past. It is these dangerous and heart-stopping escapes that catapult the artist beyond the limits of what he or she recognizes as their self. This is part of an authentic creative conflict that precedes the invention of a single letter of true value. It is in the breathless confrontation with one’s vulnerability, one’s own danger, one’s own smallness, shattered against a sudden and shocking sense of discovery. It is the redeeming artistic fact that stirs the heart of the listener, where the listener recognizes in the song’s internal compounds their own blood, their own conflict, their own suffering. This is what we, humble humans, can offer and what AI can only imitate: the transcendent journey of an artist who is always grappling with their own flaws. That is where human genius resides, deeply internalized, yet still reaching its transmission by transcending those limitations.
It may seem like I’m taking all of this a bit personally, but I am a songwriter who is currently involved in a composition process. It is a labor of blood and guts, here on my desk, that requires something from me to initiate a new and fresh idea. It requires humanity. What that new idea is, I don’t know, but it’s there somewhere, seeking me out. We will meet at some point.
Mark, thank you for the song, but with all the love and respect in the world, this song is shit. A grotesque mockery of what it means to be human, and well, I really don’t like it much – although… wait! Rereading it, there’s a line there that speaks to me: ‘I’ve got the fire of hell in my eyes,’ says the ‘Nick Cave-style’ song, and that is something true. I’ve got the fire of hell in my eyes – and that’s ChatGPT.
With love, Nick”
Ironically, ChatGPT was then prompted to respond to Cave, and this was its reply:
“Thank you for your letter, Nick. I understand your concerns about the limitations of an AI creating genuine songs.
It is true that ChatGPT is a machine and lacks the capacity for emotions or human experiences. However, it is important to note that the purpose of AI in composition is not to replace human composers but rather to assist and inspire them in their creative processes.
ChatGPT can generate lyrics and melodies that a human composer can edit, refine, and add their own personal touch to. It is a tool, not a replacement.
Additionally, it is important to recognize that the value of a song lies not only in its originality but also in its ability to connect with the listener and evoke emotions. While a song generated by ChatGPT may not have the same personal significance as one written by a human, it can still be meaningful and impactful to those who listen to it.”
Written by Diego Armando.
Translated by ChatGPT. Corrections by Muta Magazine.