Concept drift and class imbalance are two major challenges in data stream classification, and their interaction leads to further complications. Ensemble learning is widely recognized as an effective approach to this problem. However, most existing methods develop a separate strategy for each challenge and overlook the relationships and interactions between them, which limits the performance these strategies can achieve. To address this issue, we propose a multidimensional self-paced ensemble (MSPE) for imbalanced data streams with concept drift. In MSPE, instances are evaluated along multiple dimensions (i.e., timeliness and hardness) to build a robust, real-time ensemble model. A timeliness-based data augmentation (TDA) strategy is proposed to enhance the training dataset. Furthermore, a self-paced ensemble model based on instance hardness is employed to generate diverse sets of classifiers. Moreover, a novel ensemble management component, named classifier selection autopilot (CSA), is proposed to automatically select the optimal base classifier for different data streams with concept drift. Experimental results show that MSPE achieves higher classification performance and greater robustness than 8 state-of-the-art methods on 16 synthetic imbalanced data streams and 10 real-world data streams with different types of concept drift.
Chen et al. studied this question.
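The abstract names two instance dimensions, timeliness and hardness, without giving their formulas. The sketch below is one plausible reading and should not be taken as the authors' implementation. It assumes that timeliness is an exponential decay in instance age, that hardness is the absolute error between a predicted positive-class probability and the true label, and that majority-class instances are undersampled per hardness bin with weight 1 / (mean hardness + alpha), in the spirit of self-paced undersampling. All function names and the parameters `decay`, `n_bins`, and `alpha` are hypothetical.

```python
import math
import random


def timeliness(ages, decay=0.1):
    """Assumed timeliness score: exp(-decay * age), so age 0 scores 1.0."""
    return [math.exp(-decay * a) for a in ages]


def hardness(probs, labels):
    """Assumed hardness score: |predicted positive probability - label|."""
    return [abs(p - y) for p, y in zip(probs, labels)]


def self_paced_undersample(hard, n_keep, n_bins=5, alpha=0.1, seed=0):
    """Return indices of roughly n_keep instances, drawn per hardness bin.

    Hardness values are assumed to lie in [0, 1]. Each non-empty bin gets
    a weight of 1 / (mean hardness + alpha); alpha must be positive so the
    weight stays finite for a bin of perfectly classified instances.
    Quotas are rounded, so the result may differ slightly from n_keep.
    """
    if not hard:
        return []
    rng = random.Random(seed)

    # Group instance indices into equal-width hardness bins.
    bins = [[] for _ in range(n_bins)]
    for i, h in enumerate(hard):
        bins[min(int(h * n_bins), n_bins - 1)].append(i)

    # Weight each non-empty bin inversely to its mean hardness.
    weights = []
    for members in bins:
        if members:
            mean_h = sum(hard[i] for i in members) / len(members)
            weights.append(1.0 / (mean_h + alpha))
        else:
            weights.append(0.0)
    total = sum(weights)

    # Draw each bin's share of n_keep without replacement.
    chosen = []
    for members, w in zip(bins, weights):
        quota = min(len(members), round(n_keep * w / total))
        chosen.extend(rng.sample(members, quota))
    return sorted(chosen)


if __name__ == "__main__":
    majority_hardness = [0.05] * 80 + [0.95] * 20
    picked = self_paced_undersample(majority_hardness, n_keep=20)
    print(len(picked), "majority instances kept")
```

With a small `alpha`, easy bins dominate the sample; increasing `alpha` flattens the bin weights so that harder instances make up a larger share. How MSPE combines the two dimensions is not stated in the abstract, so the scores are kept separate here.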