TY - JOUR
T1 - Uncertainty-Based Experience Replay for Task-Agnostic Continual Reinforcement Learning
AU - Remonda, Adrian
AU - Terrell, Cole
AU - Veas, Eduardo
AU - Masana, Marc
N1 - Publisher Copyright:
© 2025, Transactions on Machine Learning Research. All rights reserved.
PY - 2025/2
Y1 - 2025/2
N2 - Model-based reinforcement learning uses a learned dynamics model to imagine actions and select those with the best expected outcomes. An experience replay buffer collects the outcomes of all actions executed in the environment and is then used to iteratively train the dynamics model. However, as the complexity and scale of tasks increase, training times and memory requirements can grow drastically without necessarily retaining useful experiences. Continual learning proposes a more realistic scenario in which tasks are learned in sequence, and the replay buffer can help mitigate catastrophic forgetting. However, it is not realistic to expect the buffer to grow indefinitely as the sequence advances. Furthermore, storing every single experience executed in the environment does not necessarily yield a more accurate model. We argue that the replay buffer should be kept at the minimal size necessary to retain relevant experiences covering both common and rare states. We therefore propose an uncertainty-based replay buffer filtering strategy that enables an effective implementation of continual learning agents using model-based reinforcement learning. We show that the combination of the proposed strategies leads to reduced training times, a smaller replay buffer, and less catastrophic forgetting, all while maintaining performance.
UR - http://www.scopus.com/inward/record.url?scp=85219417249&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85219417249
SN - 2835-8856
VL - 2025-February
SP - 1
EP - 26
JO - Transactions on Machine Learning Research
JF - Transactions on Machine Learning Research
ER -