Objectives: To map applications of large language models (LLMs) and generative AI across the infectious disease spectrum and identify gaps in the evidence base.
Methods: This scoping review followed JBI methodology and PRISMA-ScR guidelines (protocol pre-registered on OSF). PubMed, Embase, Scopus, and Web of Science were searched in April 2026, supplemented by medRxiv/bioRxiv, OpenAlex, Semantic Scholar, and citation chaining. Studies evaluating LLMs or generative AI for any infectious disease task from November 2022 onward were eligible. Screening used an automated rule-based algorithm validated against a 20% stratified sample (92.8% agreement, Cohen's kappa = 0.70, PABAK = 0.86).
Results: From 42,030 records, 516 studies were included. Of these, 503 (97.5%) did not assess hallucination or safety, and only 10 (1.9%) reported clinical deployment. Publication volume grew from 11 (2022) to 245 (2025). COVID-19 dominated (132, 25.6%), followed by HIV/AIDS (54) and antimicrobial resistance (53). No studies addressed neglected tropical diseases. Diagnosis (116, 22.5%) and education (91, 17.6%) were the most common tasks. ChatGPT was the most evaluated model (162, 31.4%), though most studies did not specify the model version. Most studies (472, 91.5%) remained at proof-of-concept.
Discussion: Generative AI research in infectious disease is growing rapidly but remains concentrated on diseases prevalent in high-income settings, dominated by proprietary models with poor version reporting, and almost entirely pre-clinical. The near-complete absence of safety assessment and the zero-coverage gap for neglected tropical diseases are urgent concerns requiring minimum reporting standards, redirection toward high-burden diseases, and a shift from benchmark testing to implementation science.
Protocol registration: https://doi.org/10.17605/OSF.IO/QE629.
Data availability: The full extracted dataset, screening algorithms, and analysis code are available at https://osf.io/xbtne.
Version 2 changes (13 April 2026): Added Zhou et al. (2025) as comparator review; added PABAK (0.86) alongside Cohen's kappa; updated cross-reference to companion review (Zenodo DOI); minor wording improvements throughout.
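The screening-validation figures reported above (92.8% agreement, Cohen's kappa = 0.70, PABAK = 0.86) can be cross-checked with a short calculation. The sketch below is illustrative only and is not taken from the review's analysis code. It assumes a two-category (include/exclude) screening decision, for which PABAK reduces to 2*Po - 1, and it works from the rounded values in the abstract, so the results are approximate. The function names are hypothetical.

```python
# Illustrative cross-check of the reported agreement statistics.
# Assumption: screening decisions are binary (include/exclude), so
# PABAK = 2 * Po - 1, where Po is the observed proportion of agreement.

def pabak(observed_agreement: float) -> float:
    """Prevalence-adjusted bias-adjusted kappa for two categories."""
    return 2.0 * observed_agreement - 1.0


def implied_chance_agreement(observed_agreement: float, kappa: float) -> float:
    """Solve Cohen's kappa = (Po - Pe) / (1 - Pe) for the chance agreement Pe."""
    return (observed_agreement - kappa) / (1.0 - kappa)


po = 0.928     # observed agreement, as reported (rounded)
kappa = 0.70   # Cohen's kappa, as reported (rounded)

print(round(pabak(po), 2))                            # 0.86
print(round(implied_chance_agreement(po, kappa), 2))  # 0.76
```

Under these assumptions the reported PABAK follows directly from the observed agreement. The implied chance agreement of roughly 0.76 suggests a strongly imbalanced split between the two decisions, which is the situation in which Cohen's kappa tends to fall well below PABAK.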
Hayden Farquhar
Hayden Farquhar studied this question.
www.synapsesocial.com/papers/69df2c50e4eeef8a2a6b14f6 — DOI: https://doi.org/10.5281/zenodo.19549004