Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation
Release Date: November 25, 2024
Authors: Muhammad Burhan Hafez (1), Kerim Erekmen (2)
Affiliations: (1) School of Electronics and Computer Science, University of Southampton, Southampton, UK; (2) Department of Informatics, University of Hamburg, Hamburg, Germany

| Method | Task 1 | Task 2 | Task 3 | Task 4 | Task 5 |
| --- | ---: | ---: | ---: | ---: | ---: |
| First Visit | | | | | |
| Progress & Compress (Active-Col) | -4.25 | 327.41 | 732.12 | 1029.89 | 2468.09 |
| TAPD (Active-Col) | 14.86 | 439.62 | 1441.82 | 2479.65 | 2605.39 |
| Online EWC | 12.14 | 248.85 | 528.377 | 209.62 | 505.30 |
| Progressive Nets | 16.94 | 437.3 | 646.17 | 910.16 | 1279.05 |
| Second Visit | | | | | |
| Progress & Compress (Active-Col) | 14.56 | 335.97 | 877.66 | 963.57 | 2529.31 |
| TAPD (Active-Col) | 18.51 | 472.91 | 1537.52 | 2699.95 | 2580.15 |
| Online EWC | -14.71 | 275.79 | 606.64 | 460.08 | 433.57 |
| Progressive Nets | -3.4 | 618.7 | 697.6 | 612.37 | 1455.93 |
| Third Visit | | | | | |
| Progress & Compress (Active-Col) | 15.04 | 387.19 | 892.40 | 1034.01 | 2501.98 |
| TAPD (Active-Col) | 19.94 | 483.87 | 1550.30 | 2383.18 | 2506.41 |
| Online EWC | -17.19 | 326.31 | 641.08 | 592.97 | 390.29 |
| Progressive Nets | 20.33 | 457.5 | 864.68 | 1084.24 | 2312.06 |