Google has taught robots hand-eye coordination for grasping objects

Peter Pastor / YouTube

Google researchers have used deep machine learning to teach robots to coordinate their movements when grasping objects. This is reported in the company’s blog; a preprint of the article is available on ArXiv.org.

Google specialists trained robots in visual-motor (hand-eye) coordination for grasping objects. To do this, they trained a convolutional neural network to predict the probability of a successful grasp from camera images alone, regardless of the camera’s calibration or the robot’s initial position.

To train the convolutional neural network, the researchers collected more than 800,000 grasp attempts across 14 robots, roughly the equivalent of 3,000 hours of practice. Because the robots could train in parallel, the process was significantly accelerated.

The system consists of two components. The first is a “predictive” convolutional neural network that takes in camera images and candidate motion commands and outputs the probability that a grasp will succeed. The second is a servoing function that continuously monitors the robot and steers the gripper toward a favorable position. In this way, the robot watches its own gripper and corrects its behavior according to the network’s predictions.
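The two-component loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors’ code: `predict_grasp_success` is a hypothetical stand-in for the trained network (here a toy scorer), and the candidate commands are chosen by simple random sampling, a greedy simplification of the search the real system performs.

```python
import random

# Hypothetical stand-in for the trained CNN: scores a candidate motor
# command against the current camera image, returning the predicted
# probability that executing it leads to a successful grasp.
# (Toy scorer: commands near an imagined object at (0.3, -0.2) score higher.)
def predict_grasp_success(image, command):
    x, y = command
    return max(0.0, 1.0 - abs(x - 0.3) - abs(y + 0.2))

def choose_command(image, n_samples=64, seed=0):
    """Sample candidate gripper motions and keep the one the network
    rates as most likely to succeed."""
    rng = random.Random(seed)
    candidates = [(rng.uniform(-1, 1), rng.uniform(-1, 1))
                  for _ in range(n_samples)]
    return max(candidates, key=lambda c: predict_grasp_success(image, c))

def servo_step(image, position, step=0.1):
    """Closed-loop servoing: re-plan after every small motion, so the
    robot continuously corrects itself based on fresh predictions."""
    target = choose_command(image)
    return tuple(p + step * (t - p) for p, t in zip(position, target))
```

The key design point is that the network is queried repeatedly during the motion rather than once at the start, which is exactly what distinguishes the trained robots from the single-image baseline described below.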

The researchers evaluated the approach by comparing trained robots against robots that did not use this visual “feedback” while grasping. They ran two tests, each limited to 100 grasp attempts: in the first, robots could drop items back into the bin, allowing them to re-select more convenient objects; in the second, they could not.

As a result, robots equipped with the convolutional neural network failed roughly half as often. In the first experiment their failure rate was 17.5 percent, while robots that relied on a single image to pick up an item failed 33.7 percent of the time. When items could not be returned to the bin, the gap between the failure rates was 23 percentage points.

The specialists also noticed interesting behavior of the gripper: in some cases, for example, the robot first pushed interfering objects aside. The video also shows the robot lining up before taking an object, which, according to the experts, lends its movements a certain “humanity”.

A convolutional neural network is a special artificial neural network architecture designed for efficient image recognition. The network is feedforward, meaning it has no feedback connections, and is typically multilayered.
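The building block that gives such networks their name is the convolution: a small kernel slid across the image, computing the same weighted sum at every position. A minimal sketch on plain nested lists (valid padding, stride 1; the kernel and image here are illustrative, not from the paper):

```python
def conv2d(image, kernel):
    """Slide a small kernel over a 2-D image and compute the weighted
    sum at every position -- the core operation of a convolutional layer."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# A vertical-edge kernel fires where brightness jumps from left to right.
image = [[0, 0, 1, 1]] * 4      # dark on the left, bright on the right
kernel = [[-1, 1]]
feature_map = conv2d(image, kernel)  # each row: [0, 1, 0]
```

Because the same kernel is reused everywhere in the image, the layer detects a feature (here, a vertical edge) regardless of where it appears, which is what makes the architecture well suited to recognizing objects at arbitrary positions in a camera frame.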
