March 23, 2026 | Open Access

Early detection of ampicillin susceptibility in Enterococcus faecium with MALDI-TOF MS and machine learning

Key Points

The study aims to determine if machine learning can predict ampicillin susceptibility directly from MALDI-TOF spectra of Enterococcus faecium.
Analyzed MALDI-TOF spectra from two clinical datasets: the Technical University of Munich (TUM) dataset and the publicly available MS-UMG dataset.
Evaluated logistic regression and LightGBM machine learning models for predictive accuracy.
Conducted external validation to assess model transferability across datasets.
Used LC-MS/MS to identify key MALDI-TOF peaks associated with ampicillin susceptibility.
LightGBM slightly outperformed logistic regression in identifying ampicillin-susceptible isolates in both datasets.
Achieved AUPRC of 0.907 ± 0.016 for LightGBM and 0.902 ± 0.030 for logistic regression in the TUM dataset.
LightGBM demonstrated good transferability, with an AUPRC of 0.869 ± 0.013 when trained on TUM plus 30% of MS-UMG and tested on the remaining 70% of MS-UMG.
SHAP analysis identified a MALDI-TOF peak at m/z ≈ 5091 as the most discriminative feature; LC-MS/MS mapped this peak to bacteriocin T8.
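
As a rough illustration of the AUPRC metric quoted in the key points, the sketch below computes a step-wise area under the precision-recall curve (average precision) and summarises per-fold values as mean ± standard deviation. It is not the study's code. Treating ampicillin-susceptible isolates as the positive class, the function names, and reading "±" as a standard deviation across folds are all assumptions made for illustration.

```python
from statistics import mean, stdev


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    y_true: 1 for the positive class (assumed here: ampicillin-susceptible),
            0 otherwise.
    scores: model scores, higher meaning "more likely positive".
    Assumes scores are not tied; ties are ranked in input order.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank isolates from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the rank where recall increases.
            total += true_positives / rank
    return total / n_pos


def summarise_folds(fold_values):
    """Return (mean, sample standard deviation) over per-fold AUPRC values."""
    return mean(fold_values), stdev(fold_values)
```

Unlike the area under the ROC curve, the AUPRC baseline equals the prevalence of the positive class, so values are best compared within a dataset rather than across datasets with different susceptibility rates.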

Abstract

• ML rapidly predicted ampicillin susceptibility from E. faecium MALDI-TOF spectra
• LightGBM performed best across two independent clinical datasets
• External validation suggested models transfer well between datasets
• LC-MS/MS linked the most discriminatory peak to bacteriocin T8

Enterococcus faecium can cause severe infections and is often resistant to the first-line antibiotic ampicillin. Consequently, clinicians usually prescribe broad-spectrum antibiotics, promoting the selection of multidrug-resistant bacteria. We investigated whether machine learning models can detect ampicillin susceptibility directly from MALDI-TOF mass spectra to enable earlier optimised treatment in ampicillin-susceptible E. faecium infections.

Two datasets of clinical E. faecium MALDI-TOF spectra and their resistance phenotype were analysed: our own Technical University of Munich (TUM) dataset and the publicly available MS-UMG dataset. We evaluated logistic regression (LR) and LightGBM models and explored transferability including a target-domain-adapted external validation. Discriminatory MALDI-TOF peaks were investigated using LC-MS/MS.

LightGBM slightly outperformed LR in identifying ampicillin-susceptible isolates in both datasets (area under the precision-recall curve (AUPRC) 0.907 ± 0.016 vs 0.902 ± 0.030 for TUM; 0.902 ± 0.029 vs 0.899 ± 0.054 for MS-UMG). Target-domain-adapted training demonstrated good transferability of LightGBM models (AUPRC of 0.869 ± 0.013 when trained on TUM plus 30% MS-UMG, tested on the remaining 70% MS-UMG). SHAP analysis consistently identified a MALDI-TOF spectral peak at m/z ≈ 5091 as most discriminative, which LC-MS/MS analysis mapped to bacteriocin T8.

LightGBM and LR models can identify ampicillin-susceptible E. faecium isolates from MALDI-TOF spectra and generalise well to unseen datasets. Bacteriocin T8 serves as a key discriminatory feature associated with ampicillin resistance. While clinical implementation currently still requires confirmatory testing, the addition of larger datasets will support the development of more robust machine learning models.
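
The target-domain-adapted validation described in the abstract (training on TUM plus 30% of MS-UMG, testing on the remaining 70% of MS-UMG) can be sketched as a simple data split. This is a minimal illustration, not the authors' pipeline: the function name, the seed, the plain random (unstratified) sampling, and the placeholder isolate identifiers are assumptions.

```python
import random


def target_domain_adapted_split(source, target, adapt_fraction=0.3, seed=0):
    """Split data for target-domain-adapted external validation.

    source: samples from the source dataset (e.g. TUM), all used for training.
    target: samples from the external dataset (e.g. MS-UMG).
    adapt_fraction: share of the target dataset added to the training set.
    Returns (train, test), where test holds only the unused target samples.
    """
    if not 0.0 <= adapt_fraction < 1.0:
        raise ValueError("adapt_fraction must be in [0, 1)")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(target)
    rng.shuffle(shuffled)
    n_adapt = round(adapt_fraction * len(shuffled))
    train = list(source) + shuffled[:n_adapt]
    test = shuffled[n_adapt:]
    return train, test


# Hypothetical isolate identifiers, used only to show the resulting sizes.
tum = [("TUM", i) for i in range(10)]
ms_umg = [("MS-UMG", i) for i in range(10)]
train, test = target_domain_adapted_split(tum, ms_umg)
print(len(train), len(test))
```

Keeping the test set strictly inside the target dataset means the reported score reflects performance on spectra from the external site, while the small adapted share lets the model see that site's acquisition characteristics during training.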
