Differentiating epileptic from functional seizures is a clinical challenge; smartphone videos can aid diagnosis, but they often require expert review, which causes delays. We evaluated the accuracy of four successive multimodal large language models (LLMs), Gemini 1.5 Pro, 2.0 Flash, 2.5 Flash, and 2.5 Pro, in differentiating seizure types from smartphone videos without clinical context. In this prospective diagnostic study at a tertiary epilepsy center, 24 videos from 15 patients were analyzed, with video-electroencephalography monitoring as the gold standard. Across the 24 events (19 epileptic, 5 functional), diagnostic accuracy was highest with the newest models: Gemini 1.5 Pro (33.3%), Gemini 2.0 Flash (25.0%), and both Gemini 2.5 Flash and Pro (54.2%). In exploratory pairwise comparisons, Gemini 2.5 Pro showed higher accuracy than Gemini 1.5 Pro (p = 0.01) and Gemini 2.0 Flash (p = 0.003). Performance was influenced by video features; for example, the Gemini 2.5 models were more accurate when videos focused on the upper body/face (80.0%–90.0%) than when they showed a whole-body view (28.6%–35.7%). All models reported high confidence scores (median 8.0–9.0) that were poorly calibrated and did not correlate with correctness. Successive LLMs show improved yet modest accuracy for seizure classification from video alone, highlighting the need for domain-specific fine-tuning before clinical implementation.
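The reported percentages can be checked against the 24-event denominator with a little arithmetic. The sketch below is not from the paper: it assumes each accuracy is a simple fraction of the 24 events, and the helper name `implied_correct` is hypothetical. It also computes, as a purely arithmetic reference point, the accuracy of always answering "epileptic" on a 19/5 split.

```python
N_EVENTS = 24      # 19 epileptic + 5 functional, per the abstract
N_EPILEPTIC = 19

# Accuracies as reported in the abstract (percent, one decimal place).
reported_accuracy = {
    "Gemini 1.5 Pro": 33.3,
    "Gemini 2.0 Flash": 25.0,
    "Gemini 2.5 Flash": 54.2,
    "Gemini 2.5 Pro": 54.2,
}


def implied_correct(pct, n):
    """Return the unique count k such that k/n rounds to pct, else None.

    Assumption (not stated in the source): accuracy = correct / n,
    rounded to one decimal place.
    """
    matches = [k for k in range(n + 1) if round(100 * k / n, 1) == pct]
    return matches[0] if len(matches) == 1 else None


counts = {model: implied_correct(pct, N_EVENTS)
          for model, pct in reported_accuracy.items()}

# Reference point only: accuracy of labelling every event "epileptic".
majority_baseline = round(100 * N_EPILEPTIC / N_EVENTS, 1)

for model, k in counts.items():
    print(f"{model}: {k}/{N_EVENTS} correct")
print(f"Always-epileptic baseline: {majority_baseline}%")
```

Under these assumptions the percentages correspond to whole numbers of correctly classified events, which is a quick consistency check on the abstract's figures rather than a reproduction of the study's analysis.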
Anshum Patel
Sai Krishna Vallamchetla
Adrian Safa
Scientific Reports
Mayo Clinic in Arizona
Mayo Clinic in Florida
Jacksonville College
Patel et al. studied this question.
www.synapsesocial.com/papers/69d893626c1944d70ce046e2 — DOI: https://doi.org/10.1038/s41598-026-46333-z