* Running Humanoid01 Demos [#h9e66c0b]
These are demonstrations of motion learning tasks.
** Overview [#d008b6b6]
The robot is a small size humanoid robot that has 17-DoF ...
#ref(humanoid-manoi01-6-full.png,center,zoom,300x0)
Experiments are performed in simulation using a dynamics ...
The dynamics simulation is calculated with a time step 0....
A crawling and a turning task are performed with this rob...
The objective of the crawling task is to move forward alo...
According to this objective, the reward function is desig...
moving reward that is proportional to forward velocity, s...
Each episode begins with the initial state where the robo...
By default, a 5-DoF (Degree of Freedom) constraint is use...
The objective of the turning task is to turn around the z...
So, the dominant element of the reward is the rotational ...
The other setup is the same as that of the crawling task ...
** Build the Demo Program [#x66169b9]
Execute:
$ cd benchmarks/humanoid01
$ make
See [[Documentation/Installation Guide]] for details.
** Running Command [#of22dc5e]
Please read [[Common Usage>../Common Usage]] in advance.
The command to run the demo is:
$ ./DEMO_PRG -path PATH_LIST -agent AGENT_FILE -outdir ...
The demo-specific elements are:
:DEMO_DIR| humanoid01 (benchmarks/humanoid01).
:DEMO_PRG| humanoid01.out
:PATH_LIST| Crawling task: ../cmn,m,m/cr and turning task...
:AGENT_FILE| Available agent scripts are listed below.
:OUT_DIR| Any directory can be used.
For example, execute the following:
$ mkdir -p result/rl1
$ ./humanoid01.out -path ../cmn,m,m/cr -agent ql_dcob1 ...
You can see that a window opens:
#ref(sim-window.jpg,center,zoom,300x0)
and also see the output in the terminal like (debug lines...
random seed = 1306390058
simulation is initialized.
start simulation..
episode 0...
simulation is initialized.
Simulation test environment v0.02
Ctrl-P : pause / unpause (or say `-pause' on command ...
Ctrl-O : single step when paused.
Ctrl-T : toggle textures (or say `-notex' on command ...
Ctrl-S : toggle shadows (or say `-noshadow' on comman...
Ctrl-V : print current viewpoint coordinates (x,y,z,h...
Ctrl-W : write frames to ppm files: frame/frameNNN.ppm
Ctrl-X : exit.
Change the camera position by clicking + dragging in th...
Left button - pan and tilt.
Right button - forward and sideways.
Left + Right button (or middle button) - sideways and...
episode 1...
simulation is initialized.
episode 2...
simulation is initialized.
episode 3...
simulation is initialized.
episode 4...
simulation is initialized.
episode 5...
simulation is initialized.
episode 6...
simulation is initialized.
To exit the program, press Ctrl+X in the simulation window.
The result files are stored in OUT_DIR (result/rl1).
For example, plot the learning curve with gnuplot:
$ gnuplot
gnuplot> plot 'result/rl1/log-eps-ret.dat' w l
#ref(rl1-eps-ret.png,center,zoom,300x0)
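The raw learning curve is usually noisy, so a smoothed version can be easier to read. The following is a minimal sketch, not part of the demo itself. It assumes that log-eps-ret.dat holds two whitespace-separated columns (episode index and return), which is what the gnuplot command above would plot by default; the file format is not documented here, so check it before relying on this. The function names load_returns and moving_average are hypothetical helpers.

```python
from statistics import mean


def load_returns(path):
    """Read (episode, return) pairs from a whitespace-separated log file.

    Assumption: column 1 is the episode index and column 2 is the return.
    """
    rows = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            # Skip blank lines and gnuplot-style comment lines.
            if not fields or fields[0].startswith("#"):
                continue
            rows.append((float(fields[0]), float(fields[1])))
    return rows


def moving_average(values, window):
    """Trailing moving average; the first entries average over fewer samples."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed
```

For example, load_returns('result/rl1/log-eps-ret.dat') followed by moving_average over the second column with a window of 10 gives a curve that can be written back to a file and plotted with gnuplot in the same way.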
** Agent Script [#ifbd8f68]
The following files can be specified as AGENT_FILE.
*** Crawling Task [#o6eac1d6]
:ql_grid1 | Q(lambda)-learning, Grid action space, linear...
:ql_dcob1 | Q(lambda)-learning, DCOB (action space), line...
:ql_dcob_q1 | Q(lambda)-learning, joint angle version of ...
:ql_gwf1 | Wire-fitting (grid init) updated by Q(lambda)-...
:ql_wfdcob1 | WF-DCOB, Q(lambda)-learning.
:fqi_grid1 | Fitted Q iteration (updated every 10 epis...
:qlfqi_grid1 | Q(lambda)-learning + Fitted Q iteration (u...
:qlfqi_dcob1 | Q(lambda)-learning + Fitted Q iteration (u...
:qlfqi_gwf1 | Q(lambda)-learning + Fitted Q iteration (up...
:qlfqi_wfdcob1 | Q(lambda)-learning + Fitted Q iteration ...
''Testing:''
:lspi_grid1 | LSPI (updated every 5 episodes), Grid act...
*** Turning Task [#sf8de422]
:ql_dcob2 | Q(lambda)-learning, DCOB, linear action value...
:ql_dcob_q2 | Q(lambda)-learning, joint angle version of ...
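To compare several of the agent scripts listed above, it can help to generate one command per agent with a separate output directory. The following is a rough sketch under stated assumptions: the command takes only the -path, -agent and -outdir options shown in the running command template, and -outdir is followed by OUT_DIR. The helper names build_command and run_all, and the result/AGENT directory layout, are hypothetical. This page does not say whether the program terminates on its own, so the sketch defaults to a dry run that only returns the commands.

```python
import os
import subprocess

PROGRAM = "./humanoid01.out"
CRAWL_PATH = "../cmn,m,m/cr"  # PATH_LIST for the crawling task


def build_command(agent, out_dir, path_list=CRAWL_PATH):
    """Return the argument list for one run of the demo program."""
    return [PROGRAM, "-path", path_list, "-agent", agent, "-outdir", out_dir]


def run_all(agents, result_root="result", dry_run=True):
    """Build (and optionally execute) one command per agent script."""
    commands = []
    for agent in agents:
        out_dir = f"{result_root}/{agent}"  # hypothetical layout: one directory per agent
        cmd = build_command(agent, out_dir)
        commands.append(cmd)
        if not dry_run:
            os.makedirs(out_dir, exist_ok=True)
            # Runs sequentially; each run blocks until the program exits.
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_all(['ql_grid1', 'ql_dcob1']) returns the two argument lists without starting anything; pass dry_run=False from within benchmarks/humanoid01 to actually execute them.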
** Miscellaneous [#k90e61ca]
*** Start with Pause [#m880f463]
Add the following option on the command line:
-pause 1
*** Capture the Frames [#g9863a62]
Press Ctrl+W in the simulation window. The frames are written as PPM files (frame/frameNNN.ppm).
*** Execute in Console Mode [#z67309eb]
Add the following option on the command line:
-console true
*** Change the DoF (Degree of Freedom) [#b27d38fe]
Add an agent script that defines a DoF configuration to the -agent option.
For example:
./humanoid01.out -path ../cmn,m,m/cr -agent ql_dcob1,do...
Provided DoF configurations are:
:(default) | 5-DoF. Some joints are coupled, which gives ...
:dof4asym | 4-DoF. Some joints are coupled, which gives a...
:dof6 | 6-DoF. Some joints are coupled, which gives a bil...
:dof7 | 7-DoF. Some joints are coupled, which gives a bil...
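When launching runs from a script, the -agent value can be composed from a learning agent and one of the DoF configurations above. This is a small sketch that assumes the two script names are joined with a comma, as the example command suggests; the helper name agent_with_dof is hypothetical, and the list of configuration names is copied from this page.

```python
# DoF configuration scripts listed on this page (the default 5-DoF needs no script).
DOF_CONFIGS = ("dof4asym", "dof6", "dof7")


def agent_with_dof(agent, dof=None):
    """Return the value for -agent, optionally with a DoF configuration appended.

    Assumption: agent scripts are combined as a comma-separated list.
    """
    if dof is None:
        return agent  # keep the default 5-DoF configuration
    if dof not in DOF_CONFIGS:
        raise ValueError(f"unknown DoF configuration: {dof!r}")
    return f"{agent},{dof}"
```

For instance, agent_with_dof('ql_dcob1', 'dof6') yields the string ql_dcob1,dof6, which can be passed after -agent.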