Multi-label disease diagnosis in chest X-rays requires considering global organ structures and local lesion characteristics simultaneously. However, current methods mostly use single-branch architectures and lack effective attention guidance, which makes it difficult to balance global context against local detail. Multi-label chest X-ray datasets also often suffer from significant class imbalance. We propose CR-MSNet, a dual-branch multi-scale attention network for multi-label chest X-ray classification. The global branch is built on CoAtNet-2-rw to capture holistic semantic representations, while the local branch uses a residual convolutional neural network to extract detailed lesion features. A cross-attention mechanism enables adaptive interaction and information exchange between the global and local representations. We additionally propose a Parallel Multi-Scale Channel-Spatial Attention (PMS-CSA) module that enhances both key semantic channels and potential lesion regions, increasing the discriminative power of the feature representations. A two-stage training strategy with an adjusted loss function mitigates the effect of class imbalance on model performance. Experiments show that CR-MSNet achieves a macro-average AUC of 0.847 on the ChestX-ray14 dataset, supporting its effectiveness for multi-label chest X-ray classification. By combining a dual-branch architecture with multi-scale attention mechanisms, the study indicates that attention-guided feature interaction plays an important role in reconciling global and local representations.
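The headline metric, macro-average AUC, is the unweighted mean of the per-label AUCs, so rare findings count as much as common ones. The following is a minimal sketch of that computation in plain Python, not the authors' evaluation code. The function names `binary_auc` and `macro_auc` are hypothetical, and the toy labels and scores are invented for illustration; they are not data or results from the paper.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one label: the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: Sequence[Sequence[int]],
              y_score: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of per-label AUCs over a multi-label prediction matrix
    (rows are images, columns are disease labels)."""
    n_labels = len(y_true[0])
    per_label = [
        binary_auc([row[j] for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy example (illustrative only): 4 images, 2 labels.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.7, 0.1]]
print(macro_auc(y_true, y_score))  # per-label AUCs are 0.75 and 1.0
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would be preferable at the scale of a full test set.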
Yu Wang
Caiyin Bao
Zichen Wang
Scientific Reports
Gansu University of Traditional Chinese Medicine
The 180th Hospital of PLA
Wang et al. (Mon,) studied this question.
www.synapsesocial.com/papers/69c37be2b34aaaeb1a67eb70 — DOI: https://doi.org/10.1038/s41598-026-44591-5