March 18, 2024 | Open Access

Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters

Abstract

Continual learning can empower vision-language models to continuously acquire new knowledge without access to the entire historical dataset. However, mitigating performance degradation in large-scale models is non-trivial due to (i) parameter shifts throughout lifelong learning and (ii) the significant computational burden of full-model tuning. In this work, we present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models. Our approach dynamically expands a pre-trained CLIP model by integrating Mixture-of-Experts (MoE) adapters in response to new tasks. To preserve the zero-shot recognition capability of vision-language models, we further introduce a Distribution Discriminative Auto-Selector (DDAS) that automatically routes in-distribution and out-of-distribution inputs to the MoE adapter and the original CLIP, respectively. Through extensive experiments across various settings, our proposed method consistently outperforms previous state-of-the-art approaches while reducing the parameter training burden by 60%. Our code is available at https://github.com/JiazuoYu/MoE-Adapters4CL
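
The two ideas named in the abstract, a mixture-of-experts adapter and a selector that sends inputs either to the adapted branch or to the frozen model, can be illustrated with a small sketch. This is a toy illustration in plain Python and not the authors' implementation (see the repository linked above for that). The low-rank expert form, the top-k softmax gating, the residual connection, and the names `LowRankExpert`, `MoEAdapter`, `route`, `in_dist_score`, and `threshold` are all assumptions made for the example; the abstract does not specify how DDAS scores inputs.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class LowRankExpert:
    """One adapter expert: x -> up(relu(down(x))). Hypothetical form."""

    def __init__(self, dim, rank, rng):
        self.down = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(rank)]
        self.up = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(dim)]

    def __call__(self, x):
        hidden = [max(0.0, v) for v in matvec(self.down, x)]
        return matvec(self.up, hidden)


class MoEAdapter:
    """Residual adapter that mixes the top-k experts chosen by a linear router."""

    def __init__(self, dim, rank=2, num_experts=4, top_k=2, seed=0):
        rng = random.Random(seed)
        self.experts = [LowRankExpert(dim, rank, rng) for _ in range(num_experts)]
        self.router = [[rng.gauss(0, 0.1) for _ in range(dim)]
                       for _ in range(num_experts)]
        self.top_k = top_k

    def gate(self, x):
        """Return (expert_index, weight) pairs for the top-k experts."""
        logits = matvec(self.router, x)
        order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        top = order[:self.top_k]
        weights = softmax([logits[i] for i in top])
        return list(zip(top, weights))

    def __call__(self, x):
        out = list(x)  # residual path: start from the unmodified feature
        for idx, weight in self.gate(x):
            delta = self.experts[idx](x)
            out = [o + weight * d for o, d in zip(out, delta)]
        return out


def route(x, adapter, frozen_branch, in_dist_score, threshold):
    """Selector sketch: adapted branch for inputs scored as in-distribution,
    frozen branch otherwise. The scoring function is an assumption."""
    if in_dist_score(x) >= threshold:
        return "adapter", adapter(x)
    return "frozen", frozen_branch(x)
```

In this sketch only the experts and the router would be trained on a new task, while `frozen_branch` stands in for the unchanged pre-trained model, which is how the zero-shot path stays intact for inputs the selector judges to be out of distribution.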

Authors

Jiazuo Yu

Yunzhi Zhuge

Lu Zhang
