Google has developed a neural network algorithm for background replacement that can process video in real time, even on mobile devices, without a green screen. The new feature is already available to some users of the YouTube mobile app, according to Google's blog.
Background replacement is often used in modern film and television production. This is usually done with chroma key, a technique in which actors are filmed against a bright green or blue backdrop that is later replaced with another image in a video editor. The technique works well for large productions but is poorly suited to everyday use, because it requires both a green screen and the skills to process the footage. There are also algorithms that cut out the background automatically, but most of them either produce a low-quality result or require a lot of computing power.
Google's developers have created a neural network algorithm that replaces the background in real time and is suitable even for mobile devices. They based it on a previously proposed convolutional neural network architecture designed specifically for image segmentation. Because the network had to run in real time on ordinary smartphones, the developers optimized it, reducing the number of channels in the middle layers of the network by more than a factor of 10.
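To see why cutting channels matters so much for speed, it helps to count the weights in a single convolutional layer. The sketch below is only an illustration: the function name conv_weights and the channel widths (256 and 16) are assumptions chosen for the example, not figures from Google's network.

```python
def conv_weights(c_in: int, c_out: int, kernel: int = 3) -> int:
    """Number of weights in a standard 2D convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


# Hypothetical channel widths, chosen only for illustration.
wide = conv_weights(256, 256)    # a layer with 256 input and output channels
narrow = conv_weights(16, 16)    # the same layer with 16 times fewer channels

print("wide layer weights:  ", wide)
print("narrow layer weights:", narrow)
print("reduction factor:    ", wide // narrow)
```

Because the weight count depends on the product of input and output channels, shrinking both by some factor reduces the layer's weights, and roughly its multiply-add cost, by the square of that factor.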
At its core, the network performs a standard segmentation of the frame into object and background. To improve temporal consistency between neighboring frames, the developers added a fourth input channel to the three color channels (red, green, blue): the segmentation mask from the previous frame.
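The four-channel input can be sketched in a few lines of NumPy. This is a minimal illustration, not Google's implementation: build_input, dummy_segment and segment_video are hypothetical names, and dummy_segment is a simple brightness threshold that stands in for the real network so that the loop can run.

```python
import numpy as np


def build_input(frame_rgb: np.ndarray, prev_mask: np.ndarray) -> np.ndarray:
    """Stack an H x W x 3 frame and an H x W mask into one H x W x 4 array."""
    return np.concatenate([frame_rgb, prev_mask[..., None]], axis=-1)


def dummy_segment(net_input: np.ndarray) -> np.ndarray:
    """Stand-in for the real network (hypothetical).

    Thresholds brightness and averages the result with the previous mask,
    which is carried in the fourth channel.
    """
    brightness = net_input[..., :3].mean(axis=-1)
    current = (brightness > 0.5).astype(np.float32)
    return 0.5 * current + 0.5 * net_input[..., 3]


def segment_video(frames):
    """Segment frames in order, feeding each mask into the next step."""
    height, width, _ = frames[0].shape
    prev_mask = np.zeros((height, width), dtype=np.float32)  # nothing before frame 0
    masks = []
    for frame in frames:
        net_input = build_input(frame, prev_mask)
        prev_mask = dummy_segment(net_input)
        masks.append(prev_mask)
    return masks


# Two identical bright frames with values in [0, 1].
frames = [np.ones((2, 3, 3), dtype=np.float32) for _ in range(2)]
masks = segment_video(frames)
print(build_input(frames[0], masks[0]).shape)
```

Since each mask depends on the one before it, a single noisy frame has less influence on the output, which is the intuition behind the improved temporal consistency.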
To train the neural network, the developers created a dataset of tens of thousands of photos of people, annotated with parts of the face and additional objects such as glasses.
As a result, the developers obtained a model that processes camera video in real time at more than 100 frames per second on the Apple iPhone 7 and more than 40 frames per second on the Google Pixel 2. The similarity between the network's segmentation and human annotation reaches almost 95 percent. Google has already built the network into the Stories section of the YouTube mobile app, which offers several background replacement modes, although for now the new feature is available to only a few users.
Last year, developers at Adobe demonstrated a technology that removes objects, such as people, from video and realistically fills the vacated area with background. That algorithm, however, does not work in real time and requires a fairly powerful computer. Google developers have also introduced a neural network algorithm that can process video from a smartphone camera in real time and expand its dynamic range.
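The similarity of almost 95 percent between the network's output and human annotation, mentioned above, can be made concrete with an overlap metric. A common choice for segmentation is intersection over union (IoU); treating it as the metric behind that figure is an assumption made here for illustration, and the function name iou and the tiny masks are likewise hypothetical.

```python
import numpy as np


def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union of two binary masks of the same shape."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks are empty, so they agree completely
    return float(np.logical_and(pred, truth).sum() / union)


# Hypothetical 2 x 3 masks: 1 marks a "person" pixel, 0 marks background.
predicted = np.array([[1, 1, 0],
                      [0, 1, 0]])
annotated = np.array([[1, 1, 0],
                      [0, 0, 0]])

score = iou(predicted, annotated)
print(f"IoU = {score:.3f}")
```

In this example the masks share two pixels out of three marked in total, so the score is two thirds; a value near 0.95 would mean the predicted mask and the human annotation overlap almost completely.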