Lemiffe Music, Software, Stories & AI

The State of AI Music Generation

This is a list of tools, VSTs, services, websites and research on AI music generation. Last updated: 30/Mar/2024.

I’ve divided the list into four areas: Research, Websites & Services, AI vocalists, VSTs & Plugins for DAWs. There is quite a varied use of AI models across the board: the first section primarily focuses on prompt to music, aka “ChatGPT for music”, but there are also tools and VSTs that use RNNs, deep neural nets, diffusion models, etc.

I’ve also provided my opinion/thoughts on some of these tools. I didn’t have a chance to try out every website/service, nor did I purchase every VST as some are a bit expensive, but I did go through example videos, and sometimes comparison videos with other tools/services, to get a high-level overview of each tool/service/plugin. Hopefully it is useful to you as a reference list.

State of the art research in text to music prompts and audio effect generators

  • Meta
  • Google
  • Stability AI
    • Stable Audio: Diffusion-based generative models for latent audio diffusion
    • Had a few demos here; while they were impressive, I would put them on par with Google’s and Meta’s efforts so far. Check the official website here.
    • Another demo that seems to be more recent (and better!)
    • HN comment about Ed Newton-Rex, who built Stable Audio and quit after its release over training data concerns; he went on to found https://www.fairlytrained.org/
  • Generative Music in Games
    • Research paper on the topic
    • Could be the most interesting area of research: on-the-fly generation of music that adapts based on the character, story arc, etc. It could potentially work incredibly well for those games that have a bunch of different endings where every choice matters in the decision tree. This way the musicians and audio direction can focus on the theme and primary music while leaving atmosphere and subtle background music to embedded AI music generators.

Websites and services to generate music and sound effects with AI

  • Audiogen.co: Doesn’t mention what model they are using but it seems very likely they’ll be using one of the research ones covered above
  • Lalal.ai: Extracts or isolates stems in music - so you can, for example, remove the drums and overdub with your own, or use the vocal takes in a remix
  • AI Mastering Services: Quite a few options available here, more popping up all the time. AI mastering has already been a thing for a while, but certain tools/VSTs and websites are making it much easier to use and afford. This article from Ars Technica includes a very good breakdown and comparison amongst popular services such as LANDR, Ozone, etc.
  • SoundDraw: Interesting one as it doesn’t work with the usual prompt to music principle. It uses segments built of samples submitted by the devs, and you can turn segments on and off and rearrange them, like in a rudimentary DAW, as well as generate new segments. A big pro is that you can try it out without making an account, but I’m unsure about the licensing aspects (there is a FAQ that dives into this however)
  • Beatoven.ai: At a glance seems similar to the previous one
    • Fast registration (requires email confirmation)
    • Generation took less than a minute, with 4 alternate tracks
    • Quality was poor, and the default length was very small (plans start at $6)
  • AIVA: AI music generation assistant
    • Login / account creation was broken when I tried it out
  • Loudly: Also similar to previous websites
    • Quick registration process
    • Instead of just being prompt based, it offers parameters, where you select duration, instruments, genre, etc.
    • Generation is FAST
    • Provides 3 options per generation, liked some of the options but I’d like to see how it generates more variance with longer tracks
    • Limited to 31 second output for free, you have to upgrade to get longer tracks
  • Suno: Also similar to the previous websites.
    • BEST choice out there right now in my opinion
    • Very simple interface, got 50 credits for free upon logging in the first time.
    • V3 of their engine allows you to make a track up to 2 minutes long, you can also toggle to generate instrumental-only songs.
    • It took a couple of minutes to finish, and it generated 2 options at 5 credits per track (10 credits per prompt), which means the 50 free credits let you try 5 prompts
    • It allows you to continue from the end of a previously generated track, which would allow you to “stitch the results” together to create longer tracks
    • The results were quite interesting. I asked for “a droning techno with a slow buildup towards an epic theme and then with a decay back into techno”… The first option delivered just this, albeit a bit lackluster, but the second option went from absolute rubbish into amazing, bright, vivid, super uplifting music, which quite surprised me
    • Login only works with Discord, Google or Microsoft? (why no email option?)
  • Boomy: Untested. Con: Need to make an account
  • Musicfy AI: Similar to previous but with AI vocals as well?
  • Soundful
    • Pro: License covers everything from videos, websites, corporate use, games, etc.
    • Pricing is per year, and is less than half the price that Epidemic Sound costs per year. Or you can pay a bit more to obtain full copyright for all creations and not just a license for usage.
    • First one billed as “royalty free music for creators” — I think this is what a lot of us thought when AI music generation started becoming a viable thing… that its primary use-case would be to generate background music for videos and live streaming on Twitch and YouTube, not to create new content to flood streaming services like Spotify with auto-generated songs… yet a few of the last 6 or 7 services seem aimed towards people without music or singing knowledge to create tracks and assist with publishing them on streaming services

There’s this section in Colin and Samir’s Jacob Collier Interview where they dive into the topic of AI and music and I really loved the positive attitude, especially where he talks about crafting new sounds and using it as a tool to unlock or further creativity.

As a music world builder (as Jacob Collier describes himself) it can be instrumental to find sounds that can change the way we dream up and work with compositions, venturing into unexplored audio territory, kind of like what analog and modular synths have done to open up a new world of possibilities over the past 50 or 60 years, or granular synthesis in the past 20 to 30 years.

It seems anyone with a say on AI and music generation takes one of three stances:

  • Hate it, afraid of the implications for copyright, don’t like the potential for saturation with “fake work”, the dilution of quality music in a sea of artificial sounds.
  • Afraid, unsure about what it means, how much impact will it cause, will it be used as a tool or will people use it to create full works, will it take away jobs?
  • Positive but slightly wary, knowing there is potential for wrong-doing but that the toolset it gives us outweighs the cons, such as crafting new soundscapes, imagining sounds that we could have never thought possible otherwise, helping create drafts of soundscapes which we can then build upon, or helping unblock continuation of songs for artists struggling with a creative block

AI singers and artist imitation

AI virtual music instruments and VSTs

  • Synplant 2: Lets you craft sounds and branches - this is an EPIC plugin!
  • iZotope Elements: RX, Neutron, Ozone and Nectar plugins
  • Guitar Rig 7 Pro: Guitar and Bass Amp Simulators. Uses AI (with Convolution?) to craft epic guitar tones
    • For example you feed it the tone of an amp or a sound you are looking for and it adjusts its parameters to allow you to reach that sound without having to own expensive amps / preamps. It is expensive though.
    • Great example video
  • Focusrite’s FAST bundle: Makes mixing much easier
  • Magenta Studio (for Ableton)
    • Covered 1.0 but now 2.0 is out and it has been redesigned to be much simpler to use. Only works in session view but you can drag and drop the clips into arrangement view. Very easy plugin to make drum loops sound a bit more natural, or to extend parts of a song.
    • Powered by RNNs
  • Synthesizer V - Voice synthesizer
    • From lyrics and MIDI to vocals, without having to open your mouth! I’m still conflicted about this area of plugins; it does open doors and reduces the barrier to entry, but on the other hand live music is amazing and vocals are often the centrepiece, and I’m not sure I want to live in a world where even more emphasis is placed on the DJ, especially when no vocalist is required anymore and everything can be virtual. Using it in combination with real lyrics, or in a few tracks where it can be used stylistically, hell yes! But as a full replacement? I’m not sure. I’m still on the fence.
    • The price? 89 dollars, sounds amazing - scarily amazing - supports 3 languages as of February 2024
    • Research (don’t include in credits, mentioned below)
    • Check out the website for videos/examples
    • Deep neural network-based
  • Vocaloid 6 (by Yamaha) - Voice synthesizer
    • Another vocal synth, a bit less impressive than Synthesizer V in my opinion, with fewer voices and quite a bit more expensive out-of-the-box (although you can pay per voice pack with Synthesizer V). Also supports 3 languages. IMO sounds a bit more robotic, yet for some styles this might be the right fit!
    • Research (same link as Synthesizer V)
    • Check out this example video from their website

Continuation of my previous rant:

Even though the AI singers imitating known artists generated a lot of drama, in my opinion it pales in comparison to the voice synthesis… I mean, I think a lot of people were frustrated with autotune, as it meant you didn’t even need to know how to sing; YouTubers such as Roomie have made many videos of autotune vs real vocals. Some artists would use it covertly to fix mistakes, but then it spawned a use-case where it was used as a texture, where the whole point was cranking it to 10 and making it a stylistic choice…

And I just think in a way AI voice synthesis for singing means that some people who make a living singing on tracks will maybe make less if producers don’t have to find the right talent anymore, not even just someone with a voice to then auto-tune, but now they can skip the vocalist completely. How will that work in a live setting?

These are just some of my raw thoughts as all of this is still quite new and I don’t know how it will play out in the long run, but the pace of innovation in AI music generation and voice synthesis in general seems to be accelerating, and I’m very curious what’s going to come out next, and how these tools will be used once the dust has settled.

Other Tools and Services:

There are a bunch of lists of top AI tools including quite a few I didn’t cover like Orb Producer Suite 3, Playbeat sequencer, Atlas 2, Spark Chords and others… This website covers some of them if you are interested in more overviews and links to the plugins

Thanks for reading!

On FRUTAL's Development and Scope Creep

It’s funny how sometimes you’ll have a spark of an idea, to create something small like a sketch, or a little art project, or a small video series, or an EP, or a web app. Something not too big, yet large enough to convey a message, to craft a little world, to tell a story with changing themes, to hold and express sentiment like a roller-coaster at sunset with a glistening bay in view, going up and down as the light dims revealing another world.

And then the output of the spark that ignited the creation spirals out of control and the original work no longer suffices, so you place the first creations on the side while you work on a different set of related ideas, elevated to a higher level, and then - months later - you rediscover the original work and it not only holds sentimental value but you see how you can bring it back into the narrative, so you reintegrate them, and work on them anew, yet after more months of work you find that they are too distinct, as if two bodies of work loosely connected by an idea, but instead of separating them into two separate projects you start drawing bridges, and you try to connect the work together with more work, shorter work, like commas in a long unruly sentence.

And suddenly the small project has turned out to be an expedition lasting close to a couple of years, and the work reaching an end no longer looks like the spark of the original idea, yet you are sure there is some merit, as it is its own world now, and you have to set it free, but you are unsure if it will meet expectations.

And then you remember: There are no expectations.

And thus I’ve been working on FRUTAL for a while; an album I started working on with my friend Doorbell before I set off to Ireland. We decided to hang out for a series of 4-5 sessions to work on some material, where he’d play drums and I’d compose, play guitar and sing. The result was 4 songs related to fruits in one way or another, some based on real stories, and some fantasy. The intention was to craft them with a heavy grungy sound, hence the name of the album would be a wordplay on “Brutal”.

Yet 4 songs weren’t enough, so I paid a few voice actors and narrators on Fiverr to read out fruit names and definitions, and then these had to be woven into little musical worlds or motifs. But now we had 6 bridges and 4 songs, not enough songs for an album, but wait, this wasn’t supposed to be an album! Yet there I was, writing more songs, now in a different style, and then I had to construct more bridges, toppling some of the older ones down in the process, and more songs were crafted, about plums, about apples, about mangoes.

And the “Cantaloupes” barbershop quartet no longer matched the theme and had to be re-written thrice. And then I was satisfied, knowing the work consisted of three themes, that reference each other, that play together, that flow, that come back in one way or another throughout the album, a few themes to carry different feelings, different weights, some frutal, some brutal. In a way it resembles my personality, and life changes over the past few years, a story. Rock, indie, electronic, and bridges in between.

The work is nearly done, I hope you enjoy it and that it generates the feeling of going on a journey in you, one full of fruits with a touch of madness. Follow the blog, or subscribe to the newsletter to be notified when it drops.

And remember: Eat your Daily Mangoes.

The Discomfort of Evening (by Marieke Lucas Rijneveld)

As the weeks pass by after finishing this book, I realise more and more how perturbing the experience was. It’s like a lingering nightmare from nights gone by which looms every night, threatening to return while you sleep, to terrorise your dreams with a complex distorted reality of cold, coats, toads, rabbits, cows and “the other side”.

The Discomfort of Evening, known in Dutch as De Avond is Ongemak, written by Marieke Lucas Rijneveld, is a whirlwind to say the least.

It is an easy read, which is a pro, full of ample descriptions of a world that is so visual yet so drab and gray. I can imagine the featureless landscape as I write this, looking out at the gray skies of Belgium on a winter morning. I can picture the mud, the fields, the cows, as when I run long distances I often stumble upon many farms, with the cows grazing, often looking up at me as I run by and then returning to their rumination. I always wondered what kind of people live on these farms and how they live their lives, and this book gives a glimpse into this life, albeit heavily distorted by the family’s circumstances.

Speaking of rumination, I still can’t digest this book. It is an endurance run of pain or a marathon of grimness, every page gray and drab, a bitter or metallic taste in the mouth, like blood. This is not to say it is badly written, as the language and the story carry beautifully from page to page, similar to how the monotonous voice of Charlie (aka MoistCr1TiKaL) on his YouTube channel delivers his videos: starkly honest, without a facade, direct, blunt, piercing. In this same light the book reads like an honest down-to-earth journal, narrating the smallest yet nastiest intricacies of the daily life of a troubled family in a run-down town in the middle of nowhere in the Netherlands.

I have mixed feelings after reading this book. Marieke crafted a world so unique that if it wasn’t for the mention of a few real world characters I swear it could have been about a run-down farm in Russia or Thailand, or Mars for all I care. It transports you to this place and puts you in those muddy boots and featureless landscape, and it drags you through terrible situation after terrible situation, screaming and kicking.

At times I’d be reading on the train, bus, or plane, hiding the pages from people, afraid they’d catch a glimpse of the perturbing sentences, page after page. It was a diarrhea of all possible intrusive thoughts, everything weird you might have thought of doing, to yourself, or to others, growing up and exploring the world, wondering “why not?”. All of those crazy disgusting weird thoughts, turned into a reality.

I don’t know whether I’d recommend this book. I’m still perturbed by it, so I guess it did its job. 7/10

Krakow

I just came back a few days ago from spending a week in Krakow, and what a beautiful city it is. With an amazing historic city centre, and lots to do around the city, it is well worth visiting.

I recorded quite a bit of footage with the intent of making a somewhat experimental video. Instead of being your standard vlog narrating our strolls around the city, food, and landmarks, it takes a different tone, with very little speech. The intent is more of an audio-visual experience, with music matching the tone of the scenes. It will take quite a while to edit and then record the score for it, but that was what I was thinking about while recording the individual scenes over the course of a week. Hopefully it comes out as I intend it to, maybe a few months down the line.

A few things to visit if you happen to be passing by the city for a few days:

  • The castle (1-2 hours, I’d go mostly around the castle and not to specific exhibits… great for photos)
  • There is a dragon statue that spits fire behind the castle (next to the river)
  • The historic city centre (the x-mas market there is amazing, but also the interior of the building with artisanal wares)
  • Kazimierz (the region has great bars and an AMAZING ribs restaurant called “RZEZNIA”)
  • To the south of Kazimierz you have a few bridges crossing the Wisła river, they have cool lights at night!
  • Park Bednarskiego (climb the hills for a nice view!)
  • Galeria Krakowska (shopping mall, probably spent too much time here)
  • The botanical garden just east of the city centre
  • Behind the TAURON arena in the east there is a really beautiful (BIG) park; it was snowing when I went, which made it amazing

There is plenty more to visit, so many cool restaurants and bars, landmarks and epic sightseeing spots, but this is the bare minimum I think you should have on your list if you are visiting for 2-3 days.

Public transport is great and super fast. You can take the train to get to multiple parts of the city, which I sometimes found more convenient than buses/taxis/etc.

The Subtle Art of Not Giving a Fuck by Mark Manson

Note: This book review has minor spoilers (especially towards the end)!

I’m years late to reading this book. I bought it for my sister around 2019, because I thought the title was a bit edgy and maybe had a few glimmers of insight and protocols to deal with life, social situations, stress, anxiety, and the seemingly overwhelming need to perform and put on a face that has become ever-increasing in the era of TikTok.

Yet when it was proposed in our Book Club, even though I knew it was loosely a type of humoristic approach at a self-help book, I kind of thought it would be good to give it a go (mostly because we had just finished a longer and more complicated book, so it feels nice to dive into something that reads easily in between lengthier reads).

The core idea of the book is to set expectations to zero, the baseline is nothing, then you can only go up. If things were to get worse, that is the new baseline. Choose what things to be passionate or care for with frugality: If you care for everything and want to be everything and measure everything in terms of success then everything will cease to be worthwhile.

I think the most important chapters were 1, 6 and 9.

The last chapter was the one that felt the realest and most honest, but I feel like the whole going to a cliff in South Africa was a weird way to confront the idea of death; I feel like you can be at peace with the fact we are going to die without having to look at the bottom of a cliff.

Overall I feel it was written in a state of flow, the sentences are fluid, and whilst targeting a specific audience with a bit of an overuse of expletives, it still feels honest and straight-forward. I feel there are a few nuggets of knowledge and it was entertaining to read.

The downside? I feel like it could have been a 20-30 minute video. For this I think I’ll give it a 3.5 out of 5.