
Published on December 9

cs.LG

stat.ML

Released Date: December 9, 2024

Authors: Adrien Bolland¹, Gaspard Lambrechts, Damien Ernst

Affiliation: ¹University of Liège

Parameter	Value
Neurons per network layer	$256$
Policy layers	$2$
Critic layers	$2$
Policy learning rate	$10^{-5}$
Critic learning rate	$10^{-4}$
Maximum trajectory length	$200$
Buffer size	$1000$
Batch size	$32$
Critic target update weight $\tau$	$0.1$
Discount factor $\gamma$	$0.98$
SAC $\lambda_{SAC}$	$0.002$
Visitation model layers (OPAC+CV)	$2$
Visitation model learning rate	$10^{-5}$
MaxEntRL $\lambda$	$0.01$
Density model target update weight $\tau$	$1$
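
As a rough illustration, the values in the table could be gathered into a single configuration object. The sketch below is not the authors' code: the class name `Hyperparameters`, all field names, and the helper `soft_update` are hypothetical. It also assumes the common Polyak convention for target networks, target <- tau * online + (1 - tau) * target, which the table does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from the table; field names are illustrative only."""
    hidden_units: int = 256            # neurons per network layer
    policy_layers: int = 2
    critic_layers: int = 2
    policy_lr: float = 1e-5
    critic_lr: float = 1e-4
    max_trajectory_length: int = 200
    buffer_size: int = 1000
    batch_size: int = 32
    critic_target_tau: float = 0.1     # critic target update weight
    gamma: float = 0.98                # discount factor
    sac_lambda: float = 0.002          # SAC lambda_SAC
    visitation_layers: int = 2         # visitation model, OPAC+CV
    visitation_lr: float = 1e-5
    maxent_lambda: float = 0.01        # MaxEntRL lambda
    density_target_tau: float = 1.0    # density model target update weight


def soft_update(target, online, tau):
    """Assumed Polyak rule: tau * online + (1 - tau) * target, per weight."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


if __name__ == "__main__":
    cfg = Hyperparameters()
    target, online = [0.0, 2.0], [1.0, 4.0]
    # Critic target moves a fraction tau of the way toward the online weights.
    print(soft_update(target, online, cfg.critic_target_tau))
    # With tau = 1 the target simply becomes a copy of the online weights.
    print(soft_update(target, online, cfg.density_target_tau))
```

Under this assumed convention, a weight of $\tau = 1$ for the density model means its target is overwritten by the current model at every update, whereas the critic target with $\tau = 0.1$ tracks the critic slowly.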