What question did this study set out to answer?

To assess the performance of a large language model in adjudicating surgical site infections.

April 15, 2026 · Open Access

Use of a large language model integrated within the electronic medical record for the evaluation of surgical site infections – Northern California, 2025

Key Points

Evaluated gpt-4o-mini in the context of electronic medical records.
Measured sensitivity and specificity for surgical site infection detection.
Compared workload reduction before and after implementation.
Achieved 100% sensitivity in detecting surgical site infections.
Recorded 69.4% specificity, indicating a high rate of false positives.
Reduced manual screening workload by 66% in the evaluation process.

Abstract

We evaluated a large language model (gpt-4o-mini) for surgical site infection (SSI) adjudication. The model achieved 100% sensitivity but only 69.4% specificity. Although it reduced the manual screening workload by 66%, it generated many false positives, underscoring the need for refined models that improve specificity without compromising accuracy.
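For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from case counts. The `ConfusionCounts` class and the counts in `example` are hypothetical and chosen only for illustration; they are not the study's data, and the study's own analysis code is not shown in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Case counts comparing model output with the reference adjudication."""
    true_pos: int   # SSI present, model flagged it
    false_neg: int  # SSI present, model missed it
    true_neg: int   # no SSI, model cleared it
    false_pos: int  # no SSI, model flagged it anyway


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true SSIs that the model flagged."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-SSI cases that the model correctly cleared."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts for illustration only; NOT taken from the study.
example = ConfusionCounts(true_pos=10, false_neg=0, true_neg=70, false_pos=30)

print(f"sensitivity = {sensitivity(example):.1%}")  # 100.0%
print(f"specificity = {specificity(example):.1%}")  # 70.0%
```

In this made-up example no infection is missed, so sensitivity is 100%, while 30 of 100 infection-free cases are still flagged, which is the pattern of perfect sensitivity with many false positives that the abstract describes.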

Authors

Eugenia Miranti

Timothy Keyes

Alvaro Ayala

Journals

Infection Control and Hospital Epidemiology

Institutions

Stanford University

Stanford Health Care
