Neural network taught to predict gestures based on human voice

Admittedly, it is not very accurate yet.

Researchers at the University of California, Berkeley have created Speech2Gesture, a neural network that can predict gestures from a person's voice alone. It produces realistic results and is accurate in almost half of cases.

The neural network was trained on 144 hours of video recordings of 10 people who gesture a lot on the job, including television anchors, teachers, and preachers. As a result, the algorithm learned to generate realistic gestures synchronized with the original speech.

For now, the system's accuracy is modest: the result matches the original in only 44% of cases. The neural network sometimes confuses the position of the hands, but even then it generates a very plausible result.
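At its core, the idea is to map per-frame audio features to a sequence of body and hand keypoints. The sketch below illustrates only the input/output shape of such a mapping; the feature dimensions, keypoint count, and the single linear layer standing in for the model are illustrative assumptions, not the published Speech2Gesture architecture.

```python
import numpy as np

# Assumed, illustrative dimensions (not taken from the paper):
N_MEL = 64        # audio feature bins per frame (e.g. log-mel spectrogram)
N_KEYPOINTS = 49  # upper-body and hand keypoints per frame

rng = np.random.default_rng(0)
# A single random linear map stands in for the trained network's weights.
W = rng.standard_normal((N_MEL, N_KEYPOINTS * 2)) * 0.01

def predict_gestures(audio_features: np.ndarray) -> np.ndarray:
    """Map (frames, N_MEL) audio features to (frames, N_KEYPOINTS, 2) keypoints."""
    flat = audio_features @ W                 # placeholder for the learned model
    return flat.reshape(-1, N_KEYPOINTS, 2)  # (x, y) coordinates per keypoint

frames = 16
audio = rng.standard_normal((frames, N_MEL))
poses = predict_gestures(audio)
print(poses.shape)  # (16, 49, 2): one 2D skeleton per audio frame
```

The real system replaces the random linear map with a deep network trained so that predicted keypoint sequences match those extracted from the speakers' videos.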

To support future research, the team has published the dataset of characteristic gestures and the source code as open access.
