Neural network taught to separate vocals and background music in songs

Now you can create your own karaoke track from any well-known song.

The streaming service Deezer has released Spleeter, a tool that splits a piece of music into its components. The neural-network-based library is free for anyone to use and has been published on GitHub.

One song can be divided into a maximum of five components: vocals, bass, drums, piano and everything else. To do this, just pass any audio file to Spleeter, and it will produce several files in response.

Sound separation in Spleeter, demonstrated on a David Bowie song

According to developer Andy Baio, Spleeter runs on TensorFlow models that were trained on “tens of thousands of songs.” Baio notes that the tool is not perfect yet: some artifacts remain in the separated tracks, and the vocals sometimes sound robotic, but it still does better than other solutions.

To use Spleeter, you will need some technical skills. Users who have never worked with Python or TensorFlow will have to install several programs to get it running. In addition, you have to interact with Spleeter through the command line, because the library does not yet have a graphical interface.
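For readers who want to see what that command-line step might look like, here is a minimal Python sketch that assembles a `spleeter separate` call and lists the files it should produce. It assumes the argument order used by Spleeter 2.x (the input file as the last argument; earlier releases took it through an `-i` flag) and the default output layout of one folder per song with one WAV file per stem. The helper names `build_command` and `expected_outputs` and the file name `song.mp3` are made up for illustration and are not part of Spleeter.

```python
import shutil
import subprocess
from pathlib import Path

# Pretrained configurations shipped with Spleeter and the stems each one writes.
STEMS = {
    "spleeter:2stems": ["vocals", "accompaniment"],
    "spleeter:4stems": ["vocals", "drums", "bass", "other"],
    "spleeter:5stems": ["vocals", "drums", "bass", "piano", "other"],
}


def build_command(audio_file, output_dir, model="spleeter:5stems"):
    """Return the argument list for one `spleeter separate` call (hypothetical helper)."""
    if model not in STEMS:
        raise ValueError(f"unknown model: {model}")
    # Assumption: Spleeter 2.x syntax, with the input file as a positional argument.
    return ["spleeter", "separate", "-p", model, "-o", str(output_dir), str(audio_file)]


def expected_outputs(audio_file, output_dir, model="spleeter:5stems"):
    """List the files the call should write: <output_dir>/<song name>/<stem>.wav."""
    song_dir = Path(output_dir) / Path(audio_file).stem
    return [song_dir / f"{stem}.wav" for stem in STEMS[model]]


if __name__ == "__main__":
    command = build_command("song.mp3", "output")
    print(" ".join(command))
    # Run the separation only if Spleeter is installed and the input file exists.
    if shutil.which("spleeter") and Path("song.mp3").exists():
        subprocess.run(command, check=True)
        for path in expected_outputs("song.mp3", "output"):
            print(path)
```

For a karaoke track, the two-stem model is probably enough: passing `model="spleeter:2stems"` should yield only `vocals.wav` and `accompaniment.wav`.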

Deezer explained that this is not the first time machine learning has been used to automate such tasks, and that the company’s work builds on a large body of previous research. In a conversation with The Verge, representatives of the service said they had trained the system on 20,000 tracks of different genres with pre-isolated vocals.

The company is not going to turn Spleeter into a consumer tool. However, since the library was released as open source, third-party developers can modify it.

Spleeter was developed primarily for use inside Deezer. The service uses the tool to solve complex problems such as categorizing songs, transcribing them and recognizing their language.

