Modern AI models use deep architectures that obscure how predictions are made. Without understanding how a model reaches its predictions, it is difficult to verify its reasoning, identify biases, or trust its reliability in high-stakes domains such as healthcare. Many COVID-19 chest X-ray (CXR) studies report high accuracy and present qualitative gradient-weighted class activation mapping (Grad-CAM) heatmaps, but they provide no quantitative evidence that the highlighted regions align with lung anatomy and rely instead on manual, subjective inspection. We introduce an automated quantitative pipeline that turns interpretability into objective, anatomy-grounded metrics by measuring the overlap between Grad-CAM heatmaps and lung masks. We evaluate six convolutional neural networks (CNNs), namely VGG16, VGG19, ResNet-101, NASNet-Mobile, NASNet-Large, and Xception, for both classification performance and anatomical interpretability in COVID-19 CXR detection. Classification accuracies ranged from 90% to 96%, with Xception achieving the highest accuracy (95.90%) and a balanced precision, recall, and F1-score of 95.92%. NASNet-Large and VGG19 followed at 94.87% accuracy, with VGG19 reaching the highest precision (98.89%). To assess model transparency, we automated the interpretability analysis by thresholding the Grad-CAM outputs and comparing the resulting binary maps to radiologist-annotated lung masks using the Intersection-over-Union (IoU) and Dice score metrics.
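The overlap step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `binarize_heatmap` and `iou_and_dice`, the min-max normalisation, and the 0.5 threshold are all assumptions made for illustration, since the abstract does not state how the heatmaps were thresholded. The IoU and Dice formulas themselves are the standard definitions for binary masks.

```python
import numpy as np


def binarize_heatmap(heatmap, threshold=0.5):
    """Min-max normalise a Grad-CAM heatmap to [0, 1] and threshold it.

    The 0.5 default is an assumed value, not one taken from the paper.
    """
    heatmap = np.asarray(heatmap, dtype=np.float64)
    span = heatmap.max() - heatmap.min()
    if span == 0:
        # A constant heatmap carries no localisation signal.
        return np.zeros(heatmap.shape, dtype=bool)
    normalised = (heatmap - heatmap.min()) / span
    return normalised >= threshold


def iou_and_dice(pred_mask, lung_mask):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    lung = np.asarray(lung_mask, dtype=bool)
    intersection = np.logical_and(pred, lung).sum()
    union = np.logical_or(pred, lung).sum()
    total = pred.sum() + lung.sum()
    # Both masks empty: report 0.0 rather than dividing by zero.
    iou = intersection / union if union else 0.0
    dice = 2 * intersection / total if total else 0.0
    return float(iou), float(dice)


# Toy 4x4 example: the heatmap activates a central 2x2 block,
# while the (hypothetical) lung mask is shifted one column to the right.
heatmap = np.zeros((4, 4))
heatmap[1:3, 1:3] = 1.0
lung_mask = np.zeros((4, 4), dtype=bool)
lung_mask[1:3, 2:4] = True

iou, dice = iou_and_dice(binarize_heatmap(heatmap), lung_mask)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")
```

In this toy case two of the four activated pixels fall inside the mask, so the intersection is 2 and the union is 6. A higher score on either metric indicates that more of the model's attention falls within the annotated lung region.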
Aiman Abd Saeed
Rasber Dhahir Rashid
SHILAP Revista de lepidopterología
Salahaddin University-Erbil
Saeed et al. studied this question.
www.synapsesocial.com/papers/69f6e5cf8071d4f1bdfc66d2 — DOI: https://doi.org/10.21271/zjpas.38.2.13