Recent improvements to MARA's Gazebo driver allow trajectories to be executed smoothly. The improved driver handles interpolation much better, which completely eliminates the occasional shaky behavior of the previous one. Furthermore, this driver is almost identical to the one used on the real robot.

Executing a fully converged policy at 0.1 rad/s with the new driver
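
To give a feel for what interpolating between trajectory points involves, here is a minimal sketch in Python. It is not the driver's actual code: the function names are hypothetical, and the cubic blend with zero velocity at both ends is simply one common choice, assumed here for illustration.

```python
def cubic_interpolate(q_start, q_end, duration, t):
    """Joint position at time t on a cubic segment from q_start to q_end.

    Hypothetical helper for illustration only. The segment starts and
    ends with zero velocity, so the joint eases in and out of the motion.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Normalize time and clamp it to [0, 1] so the joint holds its
    # end position once the segment is finished.
    tau = min(max(t / duration, 0.0), 1.0)
    # Cubic blend: 0 at tau=0, 1 at tau=1, zero slope at both ends.
    blend = 3 * tau**2 - 2 * tau**3
    return q_start + (q_end - q_start) * blend


def sample_segment(q_start, q_end, duration, steps):
    """Sample the segment at steps + 1 evenly spaced times."""
    return [
        cubic_interpolate(q_start, q_end, duration, duration * i / steps)
        for i in range(steps + 1)
    ]


if __name__ == "__main__":
    # A 0.2 rad move at an assumed average speed of 0.1 rad/s takes 2 s.
    for position in sample_segment(0.0, 0.2, duration=2.0, steps=4):
        print(round(position, 4))
```

Because time is clamped, asking for a position after the segment has ended simply returns the target, which is one simple way to model a joint that stays still once it arrives.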

Not only do we eliminate the trembling during execution, but we are also able to bring the robot to a complete stop once the target is reached. The image below shows the previous driver, where a slight tremble is present.

Executing a fully converged policy at 0.1 rad/s with the old driver

Take a look at the open-source code of the driver at github.com/AcutronicRobotics/MARA, and feel free to raise any questions in the repository's GitHub issues section.

Check out ROS2Learn and gym-gazebo2 if you want to train policies like this one. You will also find environments where you can add collision and orientation terms to the reward, which lets you give the robot more customized behavior.
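
As a rough illustration of what adding collision and orientation to the reward can mean, the sketch below combines the three terms in plain Python. The function name, its parameters, and the default weights are assumptions made for this example; they are not taken from gym-gazebo2 or ROS2Learn.

```python
def shaped_reward(distance, orientation_error, collided,
                  orientation_weight=0.5, collision_penalty=10.0):
    """Hypothetical reward combining distance, orientation and collision.

    distance:          distance from the end effector to the target (m)
    orientation_error: angular error to the target orientation (rad)
    collided:          True if the robot collided during this step
    """
    # Closer to the target and better aligned means a higher reward.
    reward = -distance - orientation_weight * abs(orientation_error)
    # A collision is penalized with a fixed, assumed cost.
    if collided:
        reward -= collision_penalty
    return reward


if __name__ == "__main__":
    print(shaped_reward(0.10, 0.20, collided=False))
    print(shaped_reward(0.10, 0.20, collided=True))
```

Raising `orientation_weight` makes the policy care more about how the end effector is oriented when it arrives, while a larger `collision_penalty` pushes it toward more conservative paths.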