A multimodal model integrating echocardiogram images and electronic medical records achieved an AUC of 0.8147 for heart disease screening, outperforming an image-only baseline (AUC 0.7785).
Observational
No
Does a multimodal AI framework integrating echocardiogram images and EMRs improve screening performance for heart disease compared to single-modality models?
26,936 patients with linked echocardiogram (ECHO) and electronic medical record (EMR) records (1,470 with clinically confirmed heart disease and 25,466 without) in a retrospective single-center study.
Multimodal explainable artificial intelligence framework integrating echocardiogram images (4 imaging views) and EMR data (demographics, comorbidities, registration history) using an extreme gradient boosting classifier.
Image-only baseline model and EMR-only baseline model.
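The abstract's Methods describe the evaluation protocol as patient-level stratified 5-fold cross-validation, with non-HD cases in the training folds randomly downsampled to a 1:3 HD:non-HD ratio while held-out folds keep the original prevalence. A minimal sketch of that splitting logic follows. It assumes one label per patient and uses only the class counts reported in the study. The function names, seeds, and fold assignment scheme are illustrative assumptions, not the authors' implementation, and feature extraction and the gradient boosting classifier are omitted.

```python
import random
from collections import Counter

def stratified_folds(labels, k=5, seed=0):
    """Assign each patient index to one of k folds, class by class,
    so every fold has roughly the same HD:non-HD ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def downsample_negatives(train_idx, labels, ratio=3, seed=0):
    """Keep all HD cases and at most `ratio` times as many
    randomly chosen non-HD cases (training folds only)."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    keep = rng.sample(neg, min(len(neg), ratio * len(pos)))
    return pos + keep

# Toy cohort: one label per patient, using the reported class counts.
labels = [1] * 1470 + [0] * 25466
folds = stratified_folds(labels)

splits = []
for f, test_idx in enumerate(folds):
    # Training set = all other folds, then downsampled to 1:3.
    train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
    train_idx = downsample_negatives(train_idx, labels, seed=f)
    splits.append((train_idx, test_idx))  # test fold is left untouched
    tr = Counter(labels[i] for i in train_idx)
    te = Counter(labels[i] for i in test_idx)
    print(f"fold {f}: train HD={tr[1]} non-HD={tr[0]} | "
          f"test HD={te[1]} non-HD={te[0]}")
```

Downsampling only the training indices means the metrics computed on each held-out fold still reflect the cohort's roughly 5% prevalence, which matters for prevalence-dependent measures such as positive predictive value.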
Screening performance for heart disease (presence vs absence) measured by Area Under the Curve (AUC). Surrogate outcome.
Integrating echocardiographic images and EMR data using a multimodal AI framework improves screening performance for heart disease compared to using either modality alone.
Abstract
Background: Echocardiography is a fundamental imaging modality for the diagnosis of heart disease (HD), but its interpretation remains operator-dependent and lacks standardized, data-driven decision support. Although artificial intelligence has improved image-based diagnosis, the added value and interpretability of integrating routinely collected electronic medical records (EMRs) with echocardiogram (ECHO) for large-scale screening remain underexplored.
Objective: This study aims to develop an explainable artificial intelligence framework that integrates multimodal data, including ECHO images and EMRs, with derived temporal and clinical features, to enhance the screening and interpretability of heart disease diagnosis.
Methods: In this retrospective single-center study, we analyzed 26,936 patients with linked ECHO and EMR records (1,470 with clinically confirmed HD and 25,466 without). We constructed cross-sectional EMR features (demographics, comorbidities, and registration history) and extracted view-specific image features with a simplified Inception-v3 backbone. Five modality-specific feature vectors (4 imaging views + EMR) were concatenated and input to an extreme gradient boosting classifier within a patient-level stratified 5-fold cross-validation framework. To mitigate class imbalance during training, non-HD cases in the training folds were randomly downsampled to a 1:3 HD:non-HD ratio; held-out test folds preserved the original prevalence. Post hoc explainability was provided via Grad-CAM heatmaps for images and Shapley additive explanations analysis for EMR features.
Results: For the primary binary task (presence vs absence of HD), the multimodal model achieved an AUC of 0.8147 (SD 0.009) on held-out test folds, compared with 0.7785 (SD 0.014) for the image-only baseline and 0.7343 (SD 0.009) for the EMR-only baseline. At the clinically selected decision threshold, sensitivity, specificity, positive predictive value, and negative predictive value were 84.6% (SD 1.6%), 72.4% (SD 2%), 15% (SD 2.3%), and 98.8% (SD 0.3%) for the multimodal model, versus 82.3% (SD 1.9%), 73.5% (SD 2%), 15.2% (SD 2.1%), and 98.6% (SD 0.3%) for the image-only model, and 77.5% (SD 2.1%), 67% (SD 2.3%), 11.9% (SD 1.7%), and 98.1% (SD 0.4%) for the EMR-only model. Grad-CAM visualizations qualitatively highlighted anatomically and physiologically meaningful regions (eg, valvular structures and abnormal flow jets), while Shapley additive explanations analysis identified age, registration years, sex, and hypertension among the top EMR contributors, findings that align with established cardiovascular risk factors.
Conclusions: Integrating echocardiographic images and EMR data in an explainable multimodal framework yields improved and clinically plausible screening performance for heart disease. Future work should focus on external multicenter validation, quantitative assessment of visual explanations against expert annotations, and prospective studies assessing clinical use and workflow integration.
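The low positive predictive values alongside high negative predictive values follow from the cohort's low prevalence (1,470 of 26,936 patients). The sketch below recomputes both from the reported sensitivity and specificity using Bayes' rule. It assumes the overall cohort prevalence applies to every test fold and treats the fold-averaged percentages as point values, so it is a consistency check rather than a reproduction of the authors' per-fold calculation. The function name and the `reported` and `results` dictionaries are illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)

# Cohort prevalence from the reported class counts.
prevalence = 1470 / 26936

# (sensitivity, specificity) as reported at the selected threshold.
reported = {
    "multimodal": (0.846, 0.724),
    "image-only": (0.823, 0.735),
    "EMR-only": (0.775, 0.670),
}

results = {}
for name, (se, sp) in reported.items():
    ppv, npv = predictive_values(se, sp, prevalence)
    results[name] = (ppv, npv)
    print(f"{name}: PPV={ppv:.1%} NPV={npv:.1%}")
```

Under these assumptions, the recomputed values land within rounding of the reported predictive values for all three models, which illustrates that a positive predictive value near 15% is an expected consequence of screening a population with about 5% prevalence at this specificity.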
Bokai Yang
Yirong Qin
Ye Li
JMIR Medical Informatics
Yang et al. conducted an observational study of heart disease screening (n=26,936). Multimodal fusion of echocardiogram images and electronic medical records was compared with image-only and EMR-only models on detecting the presence vs absence of heart disease (AUC). The multimodal model achieved an AUC of 0.8147, outperforming the image-only baseline (AUC 0.7785).
www.synapsesocial.com/papers/6a080b4ea487c87a6a40d8ac — DOI: https://doi.org/10.2196/78949