Continual learning is one of the most essential abilities for autonomous agents, allowing them to incrementally learn everyday skills even with limited computational resources. To achieve this goal, a simple yet powerful method called dark experience replay (DER) was recently proposed. DER mitigates catastrophic forgetting, in which previously acquired skills are unintentionally lost when new skills are learned, by stochastically storing streaming data in a reservoir sampling (RS) buffer and then either relearning the stored data or retaining the outputs recorded for them. However, because DER combines multiple objectives, it does not function properly unless their weights are tuned appropriately for each problem. In addition, retaining past outputs inhibits learning when those outputs have become inconsistent owing to distribution shifts or other effects. This reflects the trade-off between memory consolidation and plasticity. The same trade-off is hidden in the RS buffer, which gradually stops storing data for new skills as more data are passed to it. To alleviate this trade-off and achieve a better balance, this study proposes improvement strategies for both DER and RS. Specifically, DER is improved by automatic adaptation of the weights, blocking the replay of inconsistent data, and correction of past outputs. RS is improved by generalization of the acceptance probability, stratification into multiple buffers, and intentional omission of inconsistent data. These improvements were verified on multiple benchmarks, including regression, classification, and reinforcement learning problems. As a result, the proposed methods achieved a steady improvement in learning performance by balancing memory consolidation and plasticity.
Taisuke Kobayashi (Thu,) studied this question.
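To make the two baseline mechanisms named in the abstract concrete, the following is a minimal Python sketch of a standard reservoir sampling buffer and a DER-style objective. It illustrates only the baselines, not the improvements proposed in this study. The names `ReservoirBuffer`, `acceptance_probability`, `der_loss`, and the weight `alpha` are hypothetical and chosen for illustration, and the output-retention term is assumed here to be a mean squared error between current and stored outputs.

```python
import random


class ReservoirBuffer:
    """Fixed-capacity buffer filled by standard reservoir sampling.

    Hypothetical illustration of the baseline RS buffer; it does not
    implement the generalized acceptance probability of the study.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.num_seen = 0  # number of streaming items offered so far
        self.rng = random.Random(seed)

    def acceptance_probability(self):
        """Probability that the next incoming item will be stored."""
        return min(1.0, self.capacity / (self.num_seen + 1))

    def add(self, item):
        """Offer one streaming item; return True if it was stored."""
        self.num_seen += 1
        if len(self.items) < self.capacity:
            # The buffer is not full yet, so every item is accepted.
            self.items.append(item)
            return True
        # Draw a slot uniformly from [0, num_seen); the item is kept only
        # if the slot falls inside the buffer (probability capacity / num_seen).
        slot = self.rng.randrange(self.num_seen)
        if slot < self.capacity:
            self.items[slot] = item
            return True
        return False


def der_loss(task_loss, current_outputs, stored_outputs, alpha):
    """Task loss plus a weighted penalty for drifting from stored outputs.

    Assumption: the penalty is the mean squared error, weighted by alpha.
    """
    squared_errors = [
        (current - stored) ** 2
        for current, stored in zip(current_outputs, stored_outputs)
    ]
    return task_loss + alpha * sum(squared_errors) / len(squared_errors)


buffer = ReservoirBuffer(capacity=10)
for step in range(1000):
    buffer.add(step)

next_probability = buffer.acceptance_probability()
example_loss = der_loss(1.0, [1.0, 2.0], [1.0, 0.0], alpha=0.5)
```

In this sketch the acceptance probability falls as capacity divided by the number of items seen, so after 1000 items a buffer of size 10 stores a new item only about 1% of the time. This is the loss of plasticity in the RS buffer that the abstract describes, and the fixed `alpha` is the kind of problem-dependent weight that would otherwise need manual tuning.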