What question did this study set out to answer?

This research aims to improve dark experience replay (DER) and reservoir sampling (RS) for better continual learning in autonomous agents.

February 22, 2026 | Open Access

Improvements to dark experience replay and reservoir sampling for better balance between consolidation and plasticity

Key Points

Improved DER through automatic adaptation of weights, blocking the replay of inconsistent data, and correction of past outputs.
Enhanced RS by generalizing the acceptance probability, stratifying multiple buffers, and intentionally omitting inconsistent data.
Evaluated the improvements on benchmarks covering regression, classification, and reinforcement learning problems.
Improvements led to a consistent enhancement in learning performance.
Achieved a better balance between memory consolidation and plasticity.
Showed effective retention of past outputs while acquiring new skills.
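
To make the "weights" in the key points concrete, the following is a minimal sketch of a DER-style objective with fixed, hand-set weights, which is the setting the study aims to replace with automatic adaptation. It is not the paper's method: the function name `der_style_loss`, the weight names `alpha` and `beta`, and the use of mean squared error between stored and current outputs are assumptions made for illustration.

```python
def der_style_loss(task_loss, current_outputs, stored_outputs,
                   replay_task_loss=0.0, alpha=0.5, beta=0.5):
    """Combine a new-data loss with two replay terms (illustrative only).

    task_loss        : loss on the incoming (new) data
    current_outputs  : model outputs now, for one replayed sample
    stored_outputs   : outputs recorded when that sample was stored
    replay_task_loss : ordinary loss on replayed data ("relearning")
    alpha, beta      : hand-set weights (assumed names); choosing them per
                       problem is the difficulty the study targets
    """
    if len(current_outputs) != len(stored_outputs):
        raise ValueError("output vectors must have the same length")

    # Consolidation term: penalize drift away from the stored past outputs.
    drift = sum((c - s) ** 2 for c, s in zip(current_outputs, stored_outputs))
    drift /= len(current_outputs)

    return task_loss + alpha * drift + beta * replay_task_loss
```

A larger `alpha` favors consolidation (past outputs are held in place), while a smaller one favors plasticity. If the stored outputs are themselves inconsistent, for example after a distribution shift, the consolidation term pulls the model toward stale targets, which is the motivation for blocking or correcting such samples.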

Abstract

Continual learning is one of the most essential abilities for autonomous agents, allowing them to incrementally learn daily-life skills even with limited computational resources. To achieve this goal, a simple yet powerful method called dark experience replay (DER) was recently proposed. DER mitigates catastrophic forgetting, where skills acquired in the past are unintentionally forgotten when learning new skills, by stochastically storing streaming data in a reservoir sampling (RS) buffer and either relearning them or retaining their past outputs. However, because DER considers multiple objectives, it does not function properly without appropriate weighting for each problem. In addition, the ability to retain past outputs inhibits learning if the past outputs are inconsistent owing to distribution shifts or other effects. This stems from the trade-off between memory consolidation and plasticity. The trade-off is hidden even in the RS buffer, which gradually stops storing new data for new skills as data are continuously passed to it. To alleviate this trade-off and achieve a better balance, this study proposes improvement strategies for both DER and RS. Specifically, DER is improved by the automatic adaptation of weights, blocking of replaying inconsistent data, and correction of past outputs. RS is also improved with the generalization of acceptance probability, stratification of multiple buffers, and intentional omission of inconsistent data. These improvements were verified using multiple benchmarks including regression, classification, and reinforcement learning problems. Consequently, the proposed methods achieved a steady improvement in learning performance by balancing memory consolidation and plasticity.
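
The remark that the RS buffer "gradually stops storing new data" can be seen in the textbook form of reservoir sampling, sketched below. This is the standard algorithm, not the generalized or stratified variant proposed in the study; the class name `ReservoirBuffer` and its methods are hypothetical names chosen for this example.

```python
import random


class ReservoirBuffer:
    """Fixed-capacity buffer filled by standard reservoir sampling."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of stream items offered so far
        self.rng = random.Random(seed)

    def acceptance_probability(self):
        """Probability that the next offered item will be stored."""
        return min(1.0, self.capacity / (self.seen + 1))

    def add(self, item):
        """Offer one stream item; return True if it was stored."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)  # buffer not yet full: always store
            return True
        # Draw a slot uniformly from [0, seen); only the first `capacity`
        # slots exist, so the item is kept with probability capacity / seen.
        slot = self.rng.randrange(self.seen)
        if slot < self.capacity:
            self.items[slot] = item  # overwrite a uniformly chosen entry
            return True
        return False
```

With a capacity of 10, the chance of storing the next item is 1 while the buffer fills, about 0.01 after roughly 1,000 items, and about 0.0001 after roughly 100,000. Every item seen so far is equally likely to be in the buffer, which helps consolidation, but data for a newly encountered skill is rarely admitted late in the stream, which limits plasticity.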
