Abstract

Chest X-rays are an inexpensive and widely available imaging modality for diagnosing or monitoring a variety of medical conditions. Given their abundance, healthcare providers could greatly benefit from automated systems capable of screening healthy patients and supporting the diagnosis of pathological cases. Deep learning has become central to such decision-support systems, offering accurate and efficient image classification that can improve clinical workflows and reduce radiologist workload. However, despite the rapid evolution of general-purpose neural architectures, particularly attention-based models, their application to medical imaging remains constrained by limited incorporation of medical domain knowledge. Most existing attention mechanisms optimize only task-specific losses and disregard crucial anatomical and lesion-location priors, which can hinder generalization and interpretability. In this work, we introduce a fully automated, attention-guided classification framework that integrates medical priors through on-the-fly segmentation of the lungs, followed by a spatially aware attention loss that directs the network's focus toward clinically relevant regions. The method requires minimal physician input (only a single annotated X-ray indicating potential lesion areas at initialization) and generalizes effectively across patients without relying on absolute bounding-box coordinates. Gradient-based activation mapping is further employed to ensure alignment between attention and lesion-specific regions. Our approach is architecture-agnostic and integrates seamlessly into end-to-end pipelines. Experiments on two medical image datasets demonstrate that the proposed segmentation-enhanced attention loss improves both classification accuracy and representation interpretability compared to the standard cross-entropy loss. The code is available at: https://github.com/rcorizzo/cxr-segmentation-attention/.
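To make the idea of a segmentation-enhanced attention loss more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the attention map is a non-negative 2D grid (for example, a gradient-based activation map after rectification), that the lung segmentation is a binary mask of the same shape, and that the penalty is the fraction of attention mass falling outside the mask. The function names, the penalty form, and the weight `lam` are all hypothetical choices made for illustration; the loss used in the paper may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard cross-entropy for a single sample with an integer label."""
    return -math.log(softmax(logits)[label])


def attention_outside_mask(attention, mask):
    """Fraction of (non-negative) attention mass lying outside the binary mask.

    Hypothetical penalty: 0.0 when all attention is inside the mask,
    1.0 when all attention is outside it.
    """
    total = sum(sum(row) for row in attention)
    if total == 0:
        return 0.0
    outside = sum(
        a * (1 - m)
        for a_row, m_row in zip(attention, mask)
        for a, m in zip(a_row, m_row)
    )
    return outside / total


def combined_loss(logits, label, attention, mask, lam=0.5):
    """Cross-entropy plus a weighted spatial attention penalty (assumed form)."""
    return cross_entropy(logits, label) + lam * attention_outside_mask(attention, mask)
```

Under these assumptions, a model whose attention lies entirely within the segmented lungs incurs only the classification loss, while attention placed on irrelevant regions adds a penalty scaled by `lam`.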
Wu et al. (Sat,) studied this question.