Software can isolate one sound in a video

New AI software can locate the regions of an image that produce sounds and isolate the sound of individual instruments without additional manual supervision.

Have you ever listened to a piece of music and wished that you could hear just one of the instruments on its own? Isolating the sound of a single instrument in an audio or video recording has always been very difficult for sound engineers. Now, researchers at MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) have developed technology that can help. The AI-based software allows users to make the audio of individual instruments louder or softer, simply by clicking on the instruments as they are played in a video.

The researchers developed a deep learning algorithm called PixelPlayer. The system is made up of three neural networks. One network analyses the video’s audio, while another examines the visuals. The third, called the synthesiser, puts it all together by connecting the different sound waves with the pixels of their instruments. The software was trained on more than 60 hours of video footage, from which PixelPlayer learned to identify the sounds of more than 20 musical instruments. It did this by examining every pixel in the videos and determining which sound should be associated with each one. According to the researchers, as the software is fed more data, it will be able to learn to single out more instruments.
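To make the idea of linking sounds to pixels more concrete, here is a toy sketch in Python. It is not the researchers' code. It assumes, purely for illustration, that the audio network splits the soundtrack into a handful of components, that the visual network gives each pixel one weight per component, and that the synthesiser step turns the weighted sum into a soft mask over the mixed audio. All function names, the `gain` parameter and the numbers are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pixel_mask(pixel_features, audio_components):
    """Build a soft mask (frequency x time) for one pixel.

    pixel_features:   K weights, a hypothetical visual-network output for the pixel
    audio_components: K grids (frequency x time), a hypothetical audio-network output
    """
    n_freq = len(audio_components[0])
    n_time = len(audio_components[0][0])
    mask = []
    for f in range(n_freq):
        row = []
        for t in range(n_time):
            # Weighted sum of the audio components, squashed into (0, 1).
            logit = sum(w * comp[f][t]
                        for w, comp in zip(pixel_features, audio_components))
            row.append(sigmoid(logit))
        mask.append(row)
    return mask

def isolate(mixture, pixel_features, audio_components, gain=1.0):
    """Apply the pixel's mask to the mixed audio; gain makes it louder or softer."""
    mask = pixel_mask(pixel_features, audio_components)
    return [[gain * m * x for m, x in zip(mask_row, mix_row)]
            for mask_row, mix_row in zip(mask, mixture)]

# Made-up data: 3 frequency bins, 4 time steps, 2 audio components.
component_low = [[10.0] * 4, [-10.0] * 4, [-10.0] * 4]   # active in the lowest bin
component_high = [[-10.0] * 4, [-10.0] * 4, [10.0] * 4]  # active in the highest bin
audio_components = [component_low, component_high]
mixture = [[1.0] * 4 for _ in range(3)]  # equal energy everywhere

pixel_a = [1.0, 0.0]  # hypothetical pixel on a low-pitched instrument
pixel_b = [0.0, 1.0]  # hypothetical pixel on a high-pitched instrument

isolated_a = isolate(mixture, pixel_a, audio_components)
isolated_b = isolate(mixture, pixel_b, audio_components)
louder_a = isolate(mixture, pixel_a, audio_components, gain=2.0)
```

In this sketch, clicking on `pixel_a` keeps almost all of the energy in the lowest frequency bin and suppresses the rest, while `pixel_b` does the opposite. Raising `gain` stands in for turning one instrument up. A real system would learn both the components and the per-pixel weights from video rather than having them written by hand.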

Possible uses for the system include editing the audio of individual instruments simply by clicking on a video, helping musicians to learn new parts, and allowing audio engineers to swap out instruments in video footage. It could even help robots to better understand environmental sounds, such as those made by animals or vehicles, and learn how to respond to them. Recently, we have seen a number of innovations using artificial intelligence, developed for uses as diverse as composing music and generating yield predictions for commercial greenhouses. What other uses might there be for PixelPlayer?