| Approach | Method | Functions learned | On/Off | Section |
| Value-based | SARSA | $Q(s,a)$ | On | \cref{sec:SARSA} |
| Value-based | Q-learning | $Q(s,a)$ | Off | \cref{sec:Qlearning} |
| Policy-based | REINFORCE | $\pi(a|s)$ | On | \cref{sec:REINFORCE} |
| Policy-based | A2C | $\pi(a|s)$, $V(s)$ | On | \cref{sec:A2C} |
| Policy-based | TRPO/PPO | $\pi(a|s)$, $A(s,a)$ | On | \cref{sec:PPO} |
| Policy-based | DDPG | $a=\pi(s)$, $Q(s,a)$ | Off | \cref{sec:DDPG} |
| Policy-based | Soft actor-critic | $\pi(a|s)$, $Q(s,a)$ | Off | \cref{sec:SAC} |
| Model-based | MBRL | $p(s'|s,a)$ | Off | \cref{sec:MBRL} |
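
To make the On/Off column concrete, the following sketch contrasts the standard tabular updates for the two value-based methods. The symbols $\eta$ (learning rate), $\gamma$ (discount factor), $r$ (reward), and $s'$ (next state) are assumed here for illustration and may differ from the notation used in the referenced sections.
\begin{align*}
\text{SARSA (on-policy):} \quad
  & Q(s,a) \leftarrow Q(s,a) + \eta \left[ r + \gamma\, Q(s',a') - Q(s,a) \right],
  \quad a' \sim \pi(\cdot \mid s') \\
\text{Q-learning (off-policy):} \quad
  & Q(s,a) \leftarrow Q(s,a) + \eta \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
\end{align*}
SARSA bootstraps from the action $a'$ that the current behavior policy actually takes, so its target depends on the policy generating the data. Q-learning bootstraps from the greedy action regardless of how the data was collected, which is why it can learn from transitions produced by a different policy.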