These are demonstrations of motion learning tasks.
The robot is a small humanoid robot with 17 degrees of freedom (DoF).
Experiments are performed in simulation using the dynamics simulator ODE (Open Dynamics Engine). The dynamics are computed with a time step of 0.2 [ms].
Two tasks are performed with this robot: a crawling task and a turning task.
The objective of the crawling task is to move forward along the x-axis as far as possible. Accordingly, the reward function consists of a moving reward proportional to the forward velocity, a small penalty for torque usage, and a penalty for falling down. Each episode begins from an initial state in which the robot is standing upright and stationary, and ends when t > 20 [s] or when the amount of reward falls below -40 (i.e. becomes too small).
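As a rough illustration of the reward structure described above, here is a minimal Python sketch. It is not the benchmark's actual code: the function names, the weight values, the quadratic form of the torque penalty, and the reading of "amount of reward" as the reward accumulated over the episode are all assumptions made for illustration.

```python
def crawling_reward(forward_velocity, torques, has_fallen,
                    w_forward=1.0, w_torque=1e-3, w_fall=1.0):
    """Per-step reward for the crawling task (illustrative only).

    All weights are hypothetical placeholders, not values from the benchmark.
    """
    # Moving reward: proportional to the forward (x-axis) velocity.
    moving = w_forward * forward_velocity
    # Small penalty for torque usage (assumed here: sum of squared torques).
    torque_penalty = w_torque * sum(tau * tau for tau in torques)
    # Penalty applied while the robot is judged to have fallen down.
    fall_penalty = w_fall if has_fallen else 0.0
    return moving - torque_penalty - fall_penalty


def episode_done(t, total_reward, max_time=20.0, min_reward=-40.0):
    """Episode ends if t > 20 [s] or the accumulated reward drops below -40.

    Treating the threshold as a bound on accumulated reward is an assumption.
    """
    return t > max_time or total_reward < min_reward
```

For scale, with the 0.2 [ms] time step a full 20 [s] episode corresponds to about 100,000 simulation steps.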
The objective of the turning task is to turn around the z-axis as fast as possible, so the dominant element of the reward is the rotational velocity. The rest of the setup is the same as in the crawling task, except that the robot is lying down in the initial state.
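The turning reward can be sketched in the same spirit. The Python snippet below is only an illustration under stated assumptions: the helper `yaw_rate`, the weights, the choice of counting positive rotation about the z-axis as reward, and the reuse of torque and fall penalties are hypothetical and not taken from the benchmark source.

```python
import math


def yaw_rate(yaw_prev, yaw_now, dt):
    """Estimate rotational velocity about the z-axis from two headings [rad].

    The difference is wrapped into [-pi, pi) so that crossing the +/-pi
    boundary does not produce a spurious jump. (Hypothetical helper.)
    """
    delta = (yaw_now - yaw_prev + math.pi) % (2.0 * math.pi) - math.pi
    return delta / dt


def turning_reward(rotational_velocity, torques, has_fallen,
                   w_turn=1.0, w_torque=1e-3, w_fall=1.0):
    """Per-step reward for the turning task (illustrative only).

    The rotational velocity term dominates; the penalty terms are assumed
    to mirror the crawling task. All weights are hypothetical.
    """
    torque_penalty = w_torque * sum(tau * tau for tau in torques)
    fall_penalty = w_fall if has_fallen else 0.0
    return w_turn * rotational_velocity - torque_penalty - fall_penalty
```

The wrap-around step matters for a task like this one, because a robot that keeps turning will repeatedly cross the heading discontinuity.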
Execute:
$ cd benchmarks/humanoid01
$ make
See Documentation/Installation Guide for details.