Editor’s Note: In this article, fashion creative technologist Nini Hu gives real-world examples of the ways in which musicians and artists are using AI to further their creative vision and bring AI technology into the ‘mainstream’ creative world. She explains the potential for this technology to be used as a tool for democratizing creativity, and prompts the question, “How can the rest of us become more creative through AI?”
Keywords: AI; creativity; GAN (Generative Adversarial Network); Holly Herndon; Never Before Heard Sounds; Portrait XO
In an era where anyone can be an influencer, nonprofessionals can make use of off-the-shelf artificial intelligence (AI) solutions to participate in branding and storytelling. For content, there are mobile app tools that support novice creators in bringing their vision to life.1 For music production, music generators enable music creation by customizing parameters such as mood, genre, tempo, and instrument.2
To go further, artists and musicians are also partnering with AI researchers to push the boundaries of art using technology. By processing their art through deep generative machine learning models, for example, artists are discovering their digital selves. An example is Sony’s Continuator, a musical instrument that learns musical styles quickly and interacts with musicians in their styles. As told by jazz musician Bernard Lubat, "The system shows me all the potential ideas I could have developed, but that would take me years to actually develop. It is years ahead of me, yet everything it plays is, unquestionably, me" (Pachet, 2002, p. 188).
Renowned beatboxer Harry Yeff, aka Reeps One, worked in 2018 with CJ Carr of Dadabots, a team of musical AI technologists, to use a deep learning model to explore the ‘voice of the future.’ This collaborative process was recorded in the docuseries We Speak Music. At the beginning of training, the output generated only static noises. After a few thousand iterations, faint beats and rhythms started to come through. Yeff recounted:
I can already hear it growing, it makes me think of an embryo. It is obviously still not me, but . . . I can start to hear [me] in the audio that it is generating. It sounds like it is trying to be alive. If I am honest, it does make me feel pretty uncomfortable.
Eventually, it started to generate patterns, phrases, structures that I have never done… I had a whole new perspective on my voice and my practice that I could not achieve any other way. . . . In a 10 minute clip, I was told that I did 140 variations of the kick drum. That type of observation can only happen through machine learning...we are able to observe ourselves, understand ourselves, quantify ourselves in a way we cannot without technology (Beyond Conference, 2020).
The collaboration led to a spectacular performance, Second Self, a live duet between Yeff and his digital twin; in beatboxing terms, a 'battle' between Yeff and his AI opponent.
Lubat and Yeff are accomplished artists with a rich body of work to use as data to feed the machine learning (ML) model. How can the rest of us become more creative through AI?
Never Before Heard Sounds, a music startup founded by Yotam Mann and Chris Deaner, builds ‘ML instruments’ and tools that are designed for musicians. Mann and Deaner are musicians as well as programmers and constructed their new instruments through the artist's lens.
Gan.Style was one of their first published projects. Using generative adversarial networks (GANs), the system resynthesized audio from YouTube with neural nets trained on vocal and instrumental recordings. The result was an AI interpretation of the audio that retained the original's pitches and rhythms. A slider control enables the listener to adjust the audio interpretation between the synthesized output and the original recording, creating an eerie genre-bending experience.
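The slider described above can be pictured as a simple mix between two audio signals. The sketch below is illustrative only: the source does not say how Gan.Style implements its slider, so the linear blend, the function name `crossfade`, and the toy signals are all assumptions, not the project's actual code.

```python
import math


def crossfade(original, synthesized, mix):
    """Linearly blend two equal-length sample sequences.

    mix = 0.0 returns the original recording, mix = 1.0 returns the
    synthesized output, and values in between interpolate.
    (Hypothetical helper; a linear blend is an assumption.)
    """
    if len(original) != len(synthesized):
        raise ValueError("signals must have the same length")
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must be between 0.0 and 1.0")
    return [(1.0 - mix) * o + mix * s for o, s in zip(original, synthesized)]


# Toy stand-ins for real audio: a short sine tone and a quieter copy of it.
sample_rate = 8000
original = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(8)]
synthesized = [0.5 * x for x in original]

# Slider at the midpoint: an equal mix of both signals.
halfway = crossfade(original, synthesized, 0.5)
```

Moving `mix` from 0.0 to 1.0 corresponds to dragging the slider from the original recording toward the AI interpretation.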
The duo also created a hardware device to encapsulate the AI, so anyone can plug their chosen instrument or a microphone into the system. Musicians can experience the neural network models physically, test the range, learn its limitations, and make some 'never before heard sounds.' Deaner said in an interview with me on October 28, 2021, "I couldn't imagine this being as useful as I wanted it to be unless I was able to interact with it in real-time like a musician. . . . You will improvise with it to figure out what it can and cannot do. Each model will do something slightly different." Deaner and Mann envision musicians using this instrument to write music as well as in live performances.
Extending the same concept to a cloud-based setting, experimental musician Holly Herndon, in collaboration with Never Before Heard Sounds, released Holly+ in 2021. Holly+ is an ML instrument created from Herndon's voice. Herndon allowed others to play with the instrument of her digital likeness and the best submission was selected to make an album (Herndon & Dryhurst, 2021).
Mann described one submission that used a long screeching sound from the London Underground as input, resulting in a “beautiful kind of haunting ethereal-like sustained vocal sound; that kind of surprise moment is a really unexpected transformation” (Y. Mann, personal communication, October 28, 2021).
The massively popular Eurovision Song Contest was abruptly cancelled in 2020 due to COVID-19. The AI Song Contest, an online event, continued on YouTube. Teams of musicians, artists, scientists, and developers took on the challenge of creating a new Eurovision-like hit with the help of AI.
Anna Huang, a judge of the contest and a research scientist at Google Brain, has documented how teams of musicians, producers, and ML practitioners work together. Because AI models are not easily steerable, artists have to generate a large number of samples before curating final pieces post hoc (Huang et al., 2020). In the 2021 AI Song Contest, Huang identified a more fluid process adopted within the teams of musicians and ML practitioners, which resulted in a kind of “call and response” between human and AI creation and an “interdependence between music and sound design” (Huang & Koops, 2021).
Portrait XO, who participated in the 2020 and 2021 contests, shared her experiences with me in a conversation. For “I'll Marry You, Punk Come” in 2020, Portrait XO collaborated with Dadabots. The Dadabots duo, Carr and Zukowski, used seven different neural networks to mesh a multi-genre data set, including death metal, electronic dance music, and songs from 1950s Eurovisions. The artist “sliced lyrics and melodies” from the training output to compose part of the final song (P. XO, personal communication, October 29, 2021).
In the 2021 submission “Vessel,” Portrait XO collaborated with music producer Rezar but ran the ML training herself. The artist trained open-source ML models in Google Colab notebooks, then layered in outputs from a text-to-speech synthesis model built by Birds on Mars, an AI agency. Portrait XO also created the video using another ML model, VQGAN+CLIP. Both the visuals and the soundscape immerse the audience in a dream-like labyrinth that echoes between the real and the artificial.
Artist and creative technologist Scott Eaton, known for exploring the representation of human figures through drawing, digital sculpture, and photography, reached new creative heights when he trained generative AI using his extensive archive of stylized anatomy photographs and a sketching interface. The final artifact is beautiful, and it is mesmerizing to watch time-lapse videos capturing the human and machine interaction. The AI model assesses the lines, shapes, and contours of Eaton’s sketches, then makes decisions on shading and finishes the form started by Eaton.
Eaton predicts that artists will soon need to reassess the role of artmaking (CogX, 2020). Artists will take on a director's position, managing high-level artistic decisions, and become less involved in the physical handcrafting.
Mario Klingemann, a pioneer in the field of neural networks, computer learning, and AI art, shared the same opinion: “Everything you see that has been created with the help of AI is still the result of human creativity. The role of the machine is that of a lever or a filter that allows those who work with them to concentrate their time and effort better in order to extract what is relevant and interesting from the space of possibilities” (Crespo & Klingemann, 2021).
Looking forward, AI can empower artists, as a tool for democratizing creation and pushing boundaries, as a collaborator in connecting, influencing, and conveying intent to an audience, and as an observer that can give artists a new understanding of themselves.
The author has no disclosures to share for this manuscript.
Special thanks to David Parkes, Anna Huang, Portrait XO, Yotam Mann, Chris Deaner, Christina Hu, Todd Brous, and Elizabeth (Scotty) McConnaughey.
Beyond Conference. (2020, July 4). Beyond 2019 - Art of Intelligent Interruption & Augmented Relationships with Harry Yeff / Reeps One [Video]. YouTube. https://youtu.be/QqRw9gWfnp0
CogX. (2020, June 4). Artist vs machine | CogX 2020 [Video]. YouTube. https://youtu.be/hJ7AGWOTeKA
Crespo, S., & Klingemann, M. (2021, September). [Email exchange with Alex Estorick]. Feral File. https://feralfile.com/close-ups/on-generative-ecologies
Herndon, H., & Dryhurst, M. (Host). (2021, July 14). Approachable AI for music, model markets, new DAWs and Holly+ with Never Before Heard Sounds (No. 48) [Audio podcast episode]. In Interdependence. Simplecast. https://interdependence.simplecast.com/episodes/approachable-ai-for-music-model-markets-new-daws-and-holly-with-never-before-heard-sounds
Huang, C. A., Koops, H. V., Newton-Rex, E., Dinculescu, M., & Cai, C. J. (2020). AI song contest: Human-AI co-creation in songwriting. Proceedings of the 21st International Society for Music Information Retrieval Conference (pp. 708–716). ISMIR. https://doi.org/10.5281/zenodo.4245529
Huang, C. A., & Koops, H. V. (2021, July 6). Feedback loop: ML, music, and sound design [Panel session]. Wallifornia Music & Innovation Virtual Summit 2021. https://www.aisongcontest.com/ai-creation-day/feedback-loop
Pachet, F. (2002). Playing with virtual musicians: The continuator in practice. IEEE MultiMedia, 9(3), 77–82. https://www.francoispachet.fr/wp-content/uploads/2021/01/pachet-02b.pdf
This commentary is © 2022 by the author(s). The editorial is licensed under a Creative Commons Attribution (CC BY 4.0) International license (https://creativecommons.org/licenses/by/4.0/legalcode), except where otherwise indicated with respect to particular material included in the article. The article should be attributed to the authors identified above.