In this research paper, we present a Deep Reinforcement Learning (DRL) framework, built on Gazebo and ROS, that simplifies the process of building modular robots and their corresponding tools. It also includes baseline implementations of the most common policy-based DRL techniques.
Source and extended article: “Evaluation of deep reinforcement learning methods for modular robots” by Risto Kojcev, Nora Etxezarreta, Alejandro Hernandez and Víctor Mayoral
Current robot systems are designed, built and programmed by teams with multidisciplinary skills. The traditional approach to program such systems is typically referred to as the robotics control pipeline and requires going from observations to final low-level control commands through:
State estimation -> modeling and prediction -> planning -> low-level control translation.
Every step in the pipeline requires fine-tuning, which adds considerable complexity.
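The stages above can be sketched as a chain of hand-engineered functions. This is an illustrative toy, not the implementation of any real controller: the stage functions, the averaging filter, the decay-style dynamics model and the proportional gain are all assumptions made purely to show how errors and tuning effort compound across the pipeline.

```python
# Toy sketch of the classical robotics control pipeline.
# Every stage below is hand-written and must be tuned separately,
# which is the complexity the text refers to.

def state_estimation(raw_observation):
    """Filter raw sensor readings into a state estimate (here: a simple average)."""
    return sum(raw_observation) / len(raw_observation)

def model_and_predict(state):
    """Predict the next state from a hand-written dynamics model (toy decay model)."""
    return state * 0.9

def plan(predicted_state, goal):
    """Plan a motion step that reduces the error to the goal (toy proportional plan)."""
    return goal - predicted_state

def low_level_control(plan_step, gain=0.5):
    """Translate the planned step into a low-level command, e.g. a torque-like value."""
    return gain * plan_step

def control_pipeline(raw_observation, goal):
    """Compose all four stages: observation in, control command out."""
    state = state_estimation(raw_observation)
    predicted = model_and_predict(state)
    step = plan(predicted, goal)
    return low_level_control(step)

command = control_pipeline([0.9, 1.1, 1.0], goal=2.0)
```

Note how each stage has its own knobs (the model coefficient, the controller gain): tuning any one of them in isolation can degrade the behaviour of the whole chain.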
In recent years, several DRL techniques have shown notable success in learning complex behaviour skills and solving challenging control tasks in high-dimensional state spaces.
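To make the policy-based family of methods concrete, here is a minimal REINFORCE (vanilla policy gradient) sketch on a two-armed bandit. This is a didactic toy under stated assumptions, not one of the baselines evaluated in the paper: the softmax policy over two action preferences, the fixed reward values and the learning rate are all illustrative choices.

```python
import math
import random

random.seed(0)  # fixed seed so the toy run is reproducible

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def sample(probs):
    """Sample an action index from a probability distribution."""
    r = random.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def train(steps=2000, lr=0.1):
    """REINFORCE on a two-armed bandit: arm 1 pays more than arm 0."""
    prefs = [0.0, 0.0]      # policy parameters (action preferences)
    rewards = [0.2, 1.0]    # deterministic toy rewards per arm
    for _ in range(steps):
        probs = softmax(prefs)
        a = sample(probs)
        r = rewards[a]
        # Policy-gradient update: d/dp_i log pi(a) = 1[i == a] - pi(i)
        for i in range(len(prefs)):
            grad = (1.0 if i == a else 0.0) - probs[i]
            prefs[i] += lr * r * grad
    return softmax(prefs)

probs = train()  # the learned policy should favour the better arm
```

The same gradient-of-log-probability idea, scaled up with neural-network policies and continuous action spaces, underlies the policy-based methods the framework provides baselines for.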
Modular robots can be extended seamlessly through modular components. This brings advantages for their construction, but training them with current DRL methods becomes cumbersome because:
- Every small change in the physical structure of the robot requires a new training run.
- Building the tools to train modular robots is a time consuming process.
- Transferring the results to the real robot is complex given the flexibility of these systems.
Our framework, built on Gazebo and ROS, addresses these issues by simplifying the process of building modular robots and their corresponding tools, and it includes baseline implementations of the most common policy-based DRL techniques.
Using this framework, we present configurations with 3 and 4 degrees-of-freedom (DoF), while performing the same task.
At Acutronic Robotics, we trained two modular robots, a SCARA 3DoF and a SCARA 4DoF, using this framework. The Gazebo simulator and the corresponding ROS packages convert the actions generated by each algorithm into trajectories the robot can execute.
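The action-to-trajectory step can be sketched as follows. This is a hypothetical, self-contained illustration: the function name, the linear interpolation scheme, and the dictionary layout (which mirrors the fields of ROS's `trajectory_msgs/JointTrajectory` message) are assumptions for the sake of the example, not the paper's actual ROS code.

```python
# Hypothetical sketch: turn a policy action (target joint positions) into a
# time-parameterised trajectory. The dict layout loosely mirrors the
# joint_names / points / time_from_start fields of
# trajectory_msgs/JointTrajectory in ROS.

def action_to_trajectory(current, action, joint_names, duration=1.0, steps=5):
    """Linearly interpolate from the current joint state to the commanded one."""
    assert len(current) == len(action) == len(joint_names)
    points = []
    for k in range(1, steps + 1):
        alpha = k / steps  # fraction of the motion completed at this waypoint
        positions = [c + alpha * (a - c) for c, a in zip(current, action)]
        points.append({
            "positions": positions,
            "time_from_start": alpha * duration,  # seconds from trajectory start
        })
    return {"joint_names": joint_names, "points": points}

# Example: a 3DoF arm moving its three joints toward the commanded targets.
traj = action_to_trajectory(
    current=[0.0, 0.0, 0.0],
    action=[0.3, -0.2, 0.1],
    joint_names=["shoulder", "elbow", "wrist"],
)
```

In a real setup, a message built this way would be published to the robot's joint trajectory controller; here the point is only that the learning algorithm outputs raw actions while the simulator-side tooling handles their translation into executable motion.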
You can find the experimental results in the original paper: “Evaluation of deep reinforcement learning methods for modular robots”.
Many challenges remain in applying Deep Reinforcement Learning to robotics, such as:
- the long robot training times
- the simulation-to-real robot transfer
- reward shaping and sample efficiency, or
- extending the behaviour to diverse tasks and robot configurations.
So far, our work with modular robots has focused on simple tasks like reaching a point in space. In order to have an end-to-end training framework (from pixels to motor torques) and to perform more complex tasks, we aim to integrate additional rich sensory input, such as vision.
We envision the future of robotics to be about modular robots where the trained network can generalize online to modifications in the robot, such as change of a component or dynamic obstacle avoidance.
Interested in knowing more about our research on AI? Check our most recent AI research articles.