Off-Policy Maximum Entropy RL with Future State and Action Visitation Measures
cs.LG
stat.ML
Released Date: December 9, 2024
Authors: Adrien Bolland¹, Gaspard Lambrechts, Damien Ernst
Aff.: ¹University of Liege

| Parameter | Value |
|---|---|
| Neurons per network layer | |
| Policy layers | |
| Critic layers | |
| Policy learning rate | |
| Critic learning rate | |
| Maximum trajectory length | |
| Buffer size | |
| Batch size | |
| Critic target update weight | |
| Discount factor | |
| SAC | |
| Layers visitation model OPAC+CV | |
| Learning rate visitation model | |
| MaxEntRL | |
| Density model target update weight | |