With the rapid advancement of Earth observation technologies and the growing demand for intelligent remote sensing applications, high-resolution remote sensing imagery provides critical data support for downstream applications such as land monitoring and disaster assessment. In this context, multi-label remote sensing image classification has become an important research task, because a single image may contain multiple ground-object categories with complex spatial distributions and semantic co-occurrence relationships. However, the coexistence of multiscale objects, complex semantic dependencies, and long-tail category distributions limits the feature representation capacity and class-balanced modeling of existing methods. To address these challenges, a Multiscale Dynamic Reasoning Network (MSDR-Net) is proposed. Unlike methods that optimize for a single challenge in isolation, MSDR-Net establishes a task-driven modeling framework that integrates multiscale feature extraction, label-aware semantic reasoning, and long-tail category optimization within an end-to-end architecture. The proposed network consists of three core components. The Multiscale Feature Enhancement (MSFE) module incorporates a Feature Pyramid Network-based fusion mechanism that integrates deep semantic information with shallow, detailed features to enhance the representation of multiscale objects. The Dynamic Semantic Reasoning (DSR) module introduces a Transformer-based global attention mechanism that models long-range dependencies among image features, enabling the capture of complex global semantic relationships. In the loss optimization stage, a Difficulty-Weighted Loss (DW-Loss) jointly incorporates category frequency weights and prior difficulty coefficients to dynamically regulate the contributions of rare classes and hard samples during training, thereby mitigating the bias induced by class imbalance. Experiments are conducted on the large-scale Detection in Optical Remote Sensing Images dataset. Ablation studies validate the effectiveness of each component, and comparative experiments show that MSDR-Net achieves a mean Average Precision of 95.88%, an improvement of approximately 1.74% over the strongest baseline, MSCA, with consistent advantages in the Overall F1 and Class-wise F1 metrics. By unifying multiscale feature extraction, global semantic reasoning, and balanced loss optimization within a single framework, MSDR-Net provides a robust and efficient solution for multi-label classification in complex remote sensing scenarios.
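The abstract does not give the exact form of DW-Loss, so the following is only a minimal sketch of the general idea it describes: a multi-label binary cross-entropy in which each class term is scaled by a frequency weight and a prior difficulty coefficient. The inverse-frequency normalization, the multiplicative combination of the two factors, the function names (`frequency_weights`, `dw_loss_terms`, `dw_loss`), and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
import math


def frequency_weights(class_counts):
    """Inverse-frequency class weights, rescaled so their mean is 1.

    Assumed weighting scheme; the paper's actual formula is not given
    in the abstract.
    """
    inverse = [1.0 / count for count in class_counts]
    mean_inverse = sum(inverse) / len(inverse)
    return [value / mean_inverse for value in inverse]


def dw_loss_terms(probs, targets, freq_w, difficulty, eps=1e-7):
    """Per-class weighted binary cross-entropy terms for one sample."""
    terms = []
    for p, y, w, d in zip(probs, targets, freq_w, difficulty):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # Assumption: frequency weight and difficulty coefficient multiply.
        terms.append(w * d * bce)
    return terms


def dw_loss(probs, targets, freq_w, difficulty):
    """Mean of the weighted per-class terms."""
    terms = dw_loss_terms(probs, targets, freq_w, difficulty)
    return sum(terms) / len(terms)


# Hypothetical three-class setup: common, medium, rare.
class_counts = [9000, 900, 100]
difficulty = [1.0, 1.0, 1.5]  # hypothetical prior difficulty coefficients
freq_w = frequency_weights(class_counts)

# The same under-confident prediction on a positive label for every class.
probs = [0.3, 0.3, 0.3]
targets = [1, 1, 1]
terms = dw_loss_terms(probs, targets, freq_w, difficulty)
loss = dw_loss(probs, targets, freq_w, difficulty)
```

Under these assumptions, an identical prediction error contributes more to the loss when it falls on a rarer or harder class, which is the rebalancing behavior the abstract attributes to DW-Loss.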
Sun et al. (Mon,) studied this question.