• ML rapidly predicted ampicillin susceptibility from E. faecium MALDI-TOF spectra
• LightGBM performed best across two independent clinical datasets
• External validation suggested models transfer well between datasets
• LC-MS/MS linked the most discriminatory peak to bacteriocin T8

Enterococcus faecium can cause severe infections and is often resistant to the first-line antibiotic ampicillin. Consequently, clinicians usually prescribe broad-spectrum antibiotics, promoting the selection of multidrug-resistant bacteria. We investigated whether machine learning models can detect ampicillin susceptibility directly from MALDI-TOF mass spectra to enable earlier optimised treatment in ampicillin-susceptible E. faecium infections.

Two datasets of clinical E. faecium MALDI-TOF spectra and their resistance phenotypes were analysed: our own Technical University of Munich (TUM) dataset and the publicly available MS-UMG dataset. We evaluated logistic regression (LR) and LightGBM models and explored transferability, including a target-domain-adapted external validation. Discriminatory MALDI-TOF peaks were investigated using LC-MS/MS.

LightGBM slightly outperformed LR in identifying ampicillin-susceptible isolates in both datasets (area under the precision-recall curve (AUPRC) 0.907 ± 0.016 vs 0.902 ± 0.030 for TUM; 0.902 ± 0.029 vs 0.899 ± 0.054 for MS-UMG). Target-domain-adapted training demonstrated good transferability of LightGBM models (AUPRC of 0.869 ± 0.013 when trained on TUM plus 30% of MS-UMG and tested on the remaining 70% of MS-UMG). SHAP analysis consistently identified a MALDI-TOF spectral peak at m/z ≈ 5091 as the most discriminative, which LC-MS/MS analysis mapped to bacteriocin T8.

LightGBM and LR models can identify ampicillin-susceptible E. faecium isolates from MALDI-TOF spectra and generalise well to unseen datasets. Bacteriocin T8 serves as a key discriminatory feature associated with ampicillin resistance. Clinical implementation still requires confirmatory testing, but the addition of larger datasets will support the development of more robust machine learning models.
Pichl et al. (Sun,) studied this question.
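
The target-domain-adapted external validation described above (training on the source dataset plus 30% of the target dataset, testing on the remaining 70%) and the AUPRC metric can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `target_adapted_split` and `average_precision`, the random seed, and the use of step-wise average precision as the AUPRC estimate are all assumptions made for illustration, and tied scores are not handled specially.

```python
import random
from typing import List, Sequence, Tuple


def target_adapted_split(
    n_target: int, adapt_fraction: float = 0.3, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Split target-domain sample indices into an adaptation part and a test part.

    Hypothetical helper: the adaptation indices would be added to the
    source-domain training set; the test indices stay held out.
    """
    if not 0.0 < adapt_fraction < 1.0:
        raise ValueError("adapt_fraction must lie strictly between 0 and 1")
    indices = list(range(n_target))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_adapt = round(adapt_fraction * n_target)
    return sorted(indices[:n_adapt]), sorted(indices[n_adapt:])


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (positive class = 1).

    Assumes no tied scores; ties are broken by input order.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank samples from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            # Precision at the rank where this positive is retrieved.
            precision_sum += true_positives / rank
    return precision_sum / n_pos


if __name__ == "__main__":
    # Toy example: 10 target-domain isolates, 30% used for adaptation.
    adapt_idx, test_idx = target_adapted_split(10, adapt_fraction=0.3, seed=1)
    print(len(adapt_idx), len(test_idx))
    # Toy labels (1 = ampicillin-susceptible) and made-up model scores.
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

In a real pipeline, the spectra selected by the adaptation indices would be appended to the source-domain training data before fitting the classifier, and the metric would be computed only on the held-out indices. Repeating the split with different seeds is one way to obtain a mean and spread of the kind reported above.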