NVIDIA’s latest AI tool, Fugatto, makes music from text and can change the way we look at audio

Using Fugatto, you'll be able to create the sound of anything you can type.
#nvidia #fugatto #ai

By Ken Wong - 27 Nov 2024

Ever been stuck trying to describe a sound to someone? Well, using the new Fugatto from NVIDIA, you can generate or transform “any mix of music, voices and sounds described with prompts using any combination of text and audio files”.

This means it can create a music snippet from a text prompt, add or remove instruments in an existing song, change the accent or emotion in a voice, or even produce sounds never heard before. For example, NVIDIA said that Fugatto can make a trumpet bark or a saxophone meow. “Whatever users can describe, the model can create,” the company said in a blog post.

Rafael Valle, a manager of applied audio research at NVIDIA and an orchestral conductor and composer, said:

“We wanted to create a model that understands and generates sound like humans do. Fugatto is our first step toward a future where unsupervised multitask learning in audio synthesis and transformation emerges from data and model scale.”

NVIDIA’s examples include using Fugatto to quickly prototype or edit an idea for a song by trying out different styles, voices and instruments. A composer could also add effects and enhance the overall audio quality of an existing track. Learning a new language? Imagine being taught by a voice that sounds like your mother!

It can also “generate sounds that change over time”. Using a technique called temporal interpolation, it can, for example, create the sound of a rainstorm moving through an area, with the crack of thunder slowly fading into the distance.

