Intraoperative frozen section analysis is critical for evaluating tumor malignancy, assessing surgical margins, and informing real-time clinical decisions. However, inconsistent slide quality, diagnostic variability, and time constraints limit its reliability in routine workflows. To address these issues, we collected a real-world dataset of 4,667 hematoxylin and eosin (H&E)-stained whole slide images (WSIs) from intraoperative consultations spanning diverse organ types. We developed a Vision Transformer (ViT)-based model enhanced with a Soft Mixture of Experts (Soft MoE) module to perform binary classification (benign vs. malignant) under weak supervision, using only slide-level labels. On the test set, the proposed model achieved an AUC of 0.957, a sensitivity of 0.817, and a specificity of 0.961, and it demonstrated consistent diagnostic utility across common and rare tissue types. Instance-level heatmap visualizations aligned with diagnostic tumor regions, supporting model interpretability. Importantly, the model supports local inference on standard clinical hardware (e.g., a GPU with 24 GB of memory), making real-world deployment feasible. These findings suggest that Soft MoE-ViT offers a practical and interpretable solution for AI-assisted intraoperative pathology.
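For readers less familiar with the reported metrics, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed for a binary benign-vs-malignant classifier. It is not the authors' evaluation code: the labels and scores are hypothetical toy values, the 0.5 decision threshold is an assumption, and the function names are illustrative only.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(labels: Sequence[int], preds: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = malignant, 0 = benign)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a malignant slide scores above a benign one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical slide-level labels and model scores (not data from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]

# Assumed decision threshold of 0.5 for turning scores into predictions.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

sens, spec = sensitivity_specificity(toy_labels, toy_preds)
toy_auc = auc(toy_labels, toy_scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values are usually reported together.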
Wu et al. studied this question.