Researchers from Laboratory X (formerly known as Google X) have developed and tested a system that lets robots learn a shared task faster through collective learning. The work is briefly described on the Google blog, with preprints available on arXiv.org (1, 2, 3).
Deep learning methods allow robots to master fairly complex actions, including motor skills. With a simulator or a ready-made dataset this process can be relatively quick, but independent reinforcement learning in the real world usually takes a single robot much longer.
To shorten this period, the Google developers set several manipulators with seven degrees of freedom the same task: learning to open a door on their own. The robots were combined into a single network with a central server, which performed the training and stored the current version of the neural network. Each robot had its own copy of the neural network and worked autonomously on the task of opening the door by its handle.
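The architecture described above can be sketched as a parameter server that holds the shared policy while each robot keeps a local copy. This is a minimal illustrative sketch, not the authors' implementation; the class names and the dict-of-parameters "network" are assumptions standing in for a real deep neural network.

```python
import copy

class ParameterServer:
    """Central server holding the shared policy (here a plain dict of
    parameters; the real system stores a deep neural network)."""
    def __init__(self, params):
        self.params = params

    def get_copy(self):
        # Each robot periodically pulls its own local copy of the policy.
        return copy.deepcopy(self.params)

class RobotWorker:
    """One manipulator running a local copy of the shared policy."""
    def __init__(self, server):
        self.server = server
        self.local_params = server.get_copy()

server = ParameterServer({"w": [0.0, 0.0]})
robots = [RobotWorker(server) for _ in range(4)]
# Every robot now holds an independent copy of the current parameters,
# so it can act autonomously between synchronisations with the server.
assert all(r.local_params == server.params for r in robots)
```

Keeping the copies independent is what lets each arm run in real time without waiting on the network; the server only has to be consulted when pulling fresh weights or pushing experience.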
In the first experiment, each robot worked with its own door, and the doors stood in different positions. Each robot's copy of the central neural network produced a sequence of planned actions, and at this stage the engineers deliberately added noise to the commands sent from the network to the robot, widening the range of actions explored. The robot then made another attempt to open the door.
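Perturbing commands with noise is a standard exploration trick in reinforcement learning: the robot tries actions near the policy's current best guess instead of repeating it exactly. A minimal sketch, assuming Gaussian noise on each joint command (the noise type and scale here are assumptions, not details from the papers):

```python
import random

def noisy_action(policy_action, noise_scale=0.1):
    """Perturb each commanded joint value with Gaussian noise so the
    robot explores actions close to what the policy proposes."""
    return [a + random.gauss(0.0, noise_scale) for a in policy_action]

# Seven joint commands for a 7-DoF manipulator (illustrative values).
command = [0.1, -0.2, 0.0, 0.3, 0.0, -0.1, 0.2]
perturbed = noisy_action(command)
# Same dimensionality, slightly different values each call.
assert len(perturbed) == len(command)
```

Occasionally the noisy variant works better than the unperturbed command, and that improvement is exactly the signal the training step can exploit.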
Information about the actions the neural network chose, the resulting physical movements, and the outcome of each attempt was sent back to the server. This data was used to further train the central neural network, after which the server sent each robot a copy of the new version, which performed slightly better than the previous one, and the whole cycle repeated.
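The server-side step of this cycle can be sketched as pooling the experience from all robots into one update and broadcasting the result. This is a toy gradient-averaging sketch under stated assumptions (a flat parameter list and precomputed per-robot gradients); the actual system trains a deep network with reinforcement learning.

```python
def server_update(params, experiences, lr=0.01):
    """One simplified training round: average a (hypothetical) gradient
    contributed by each robot's experience, then take a gradient step."""
    grad = [0.0] * len(params)
    for exp in experiences:
        for i, g in enumerate(exp["grad"]):
            grad[i] += g / len(experiences)
    # Gradient descent step on the shared parameters.
    return [p - lr * g for p, g in zip(params, grad)]

params = [0.5, -0.5]
# Experience reported back by two robots (illustrative gradients).
experiences = [{"grad": [0.2, -0.4]}, {"grad": [0.0, 0.2]}]
new_params = server_update(params, experiences)
# new_params would now be broadcast back to every robot.
```

Because every robot contributes experience to the same update, each round benefits from all attempts at once, which is why even two robots improve the shared network markedly faster than one.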
The experiments showed that even two robots train the neural network much more efficiently than one. Two robots reached a 100 percent success rate in two and a half hours, while a robot working alone had in that time only learned to move its manipulator to the door handle. Even after four hours, the single robot opened the door in only 20 percent of attempts.
The researchers also describe two other approaches to collective robot training on practically the same equipment. In one, the experiment above was supplemented with demonstrations: an operator physically guided the robot through the required action. Another publication describes collective learning with cameras; in this case, images from the robots' cameras are also used to train the shared central neural network and to predict the physical consequences of an action.
Specialists at Laboratory X had previously tested collective training of camera-equipped robots. In that work, the authors taught a neural network to predict the probability of a successful grasp from camera images, regardless of calibration or the robot's initial position.