Objectives: This study evaluated the feasibility and performance of a multi-agent (MA) system designed to support early sepsis management in intensive care units. The system integrates three specialized agents (sepsis management, antibiotic recommendation, and guideline compliance) to provide evidence-based recommendations at T = 0 hours (before culture results are available), extending prior single-case findings to 10 diverse cases.

Methods: The MA system was powered by Palmyra-Med 70B (selected for its superior MedQA performance; average score, 85.9) and compared with GPT-3.5 Turbo and GPT-4o mini (all at a temperature of 0.25). It used retrieval-augmented generation (RAG) with a ChromaDB knowledge base comprising the 2021 Surviving Sepsis Campaign guidelines, more than 20 high-impact manuscripts and reviews on sepsis etiologies published 2018–2025, and other relevant sources. Retrieval used the BAAI/bge-base-en-v1.5 embedding model with cosine similarity (threshold, 0.75) and returned the top 5 chunks. Eight cases from the MIMIC-IV demo dataset and two cases from the literature were formatted as vignettes. Performance was assessed programmatically via TruLens (groundedness, approximately 0.62) and by two intensivists using a standardized questionnaire.

Results: The system generated guideline-compliant recommendations (e.g., prompt surgical debridement plus meropenem and vancomycin for necrotizing fasciitis). Hallucinations occurred in three of 10 cases (e.g., “altered mental status”). Expert agreement was quantified by a Cohen kappa of 0.26. Programmatic and expert assessments showed negligible correlation.

Conclusions: In this exploratory study, the MA system showed preliminary promise for early sepsis support but requires human oversight to mitigate hallucinations. Code is available on GitHub; further validation is needed.
Iapăscurtă et al. (Thu,) studied this question.
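
Two quantitative steps named in the abstract can be illustrated briefly: the retrieval filter (cosine similarity with a 0.75 threshold and the top 5 chunks) and the Cohen kappa used for expert agreement. The following is a minimal sketch in plain Python, not the authors' code. It does not call ChromaDB or the BAAI/bge-base-en-v1.5 model; the function names, the toy vectors, and the example ratings are all hypothetical and serve only to show the arithmetic.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, threshold=0.75, top_k=5):
    """Return up to top_k (score, text) pairs with similarity >= threshold.

    `chunks` is a list of (text, embedding) pairs. The defaults mirror the
    values stated in the abstract; everything else is an assumption.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return kept[:top_k]


def cohen_kappa(rater_1, rater_2):
    """Cohen kappa for two raters who labeled the same items."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_1)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    labels = set(counts_1) | set(counts_2)
    expected = sum(counts_1[l] * counts_2[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embeddings have far more dimensions.
    toy_chunks = [
        ("source control guidance", [1.0, 0.0]),
        ("empiric antibiotic guidance", [0.9, 0.1]),
        ("unrelated text", [0.0, 1.0]),
    ]
    for score, text in retrieve([1.0, 0.0], toy_chunks):
        print(f"{score:.3f}  {text}")

    # Hypothetical yes/no ratings from two reviewers on four items.
    print(cohen_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"]))
```

In the toy run, the unrelated chunk falls below the threshold and is dropped, so fewer than five chunks are returned. Filtering by threshold and then truncating to the top 5 gives the same result as the reverse order, so the sketch does not depend on which order the original system used.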