May 7, 2026 · Open Access

TRAG: A Rule Retrieval-Augmented Generation Framework for Emergency Triage

Key Points

The research aims to enhance emergency triage predictions by integrating structured clinical knowledge with language models.
The proposed TRAG framework combines rule-based knowledge retrieval with LLM reasoning for Emergency Severity Index (ESI) prediction.
Multiple LLMs were evaluated under three retrieval settings (Top-5, Top-10, Top-15).
Performance was compared with zero-shot baselines on 100 curated triage cases.
Performance was assessed using accuracy, precision, recall, F1-score, and Quadratic Weighted Kappa (QWK).
Retrieval-augmented prompting improved classification performance, particularly for lower-performing models.
GPT-3.5 accuracy increased from 0.45 to 0.68 and QWK from 0.67 to 0.82 under the Top-15 setting.
Under-triage was reduced in several configurations.
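
The evaluation metrics named above can be illustrated with a minimal sketch. It assumes ESI levels are integers from 1 (most urgent) to 5 (least urgent) and uses the standard definition of Cohen's kappa with quadratic weights. The function names are hypothetical, and the under-triage definition (a predicted level less urgent than the reference level) is an assumption rather than the study's stated procedure.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_level=1, max_level=5):
    """Cohen's kappa with quadratic weights for ordinal labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = max_level - min_level + 1
    total = len(y_true)

    # Observed confusion matrix: rows are reference levels, columns are predictions.
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_level][p - min_level] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty grows with the squared distance between levels.
            weight = ((i - j) ** 2) / ((n - 1) ** 2)
            expected = true_hist[i] * pred_hist[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        # All labels and predictions fall in one shared level: perfect agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


def under_triage_rate(y_true, y_pred):
    """Share of cases predicted as less urgent (higher ESI number) than the reference.

    Assumption: this is one common definition, not necessarily the study's.
    """
    under = sum(1 for t, p in zip(y_true, y_pred) if p > t)
    return under / len(y_true)
```

Because the quadratic weights penalize a two-level error four times as heavily as a one-level error, QWK rewards predictions that stay close to the reference level even when they are not exact, which is why it is often reported alongside accuracy for ordinal scales such as ESI.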

Abstract

Accurate emergency triage is critical for patient safety and efficient resource allocation. Large language models (LLMs) have shown promise in clinical reasoning tasks; however, their predictions may be inconsistent when not grounded in structured clinical knowledge. This study proposes TRAG (Triage Retrieval-Augmented Generation), a domain-specific framework that integrates rule-based knowledge retrieval with LLM reasoning to support Emergency Severity Index (ESI) prediction. TRAG retrieves triage rules encoded as if–then logic from an ESI knowledge base and incorporates them into the model prompt to guide ESI prediction. The framework was evaluated with multiple LLMs under different retrieval settings (Top-5, Top-10, Top-15) and compared with zero-shot baselines on 100 curated triage cases. Performance was assessed using accuracy, precision, recall, F1-score, and Quadratic Weighted Kappa (QWK). Results show that retrieval-augmented prompting improves classification performance, particularly for lower-performing models. For example, GPT-3.5 accuracy increased from 0.45 to 0.68 and QWK from 0.67 to 0.82 under the Top-15 setting. Under-triage was also reduced in several configurations, while higher-performing models showed more modest and configuration-dependent gains. These findings suggest that integrating structured clinical rules within a retrieval-augmented framework can enhance the consistency and reliability of LLM-based triage prediction. The proposed TRAG framework highlights the potential of combining structured clinical knowledge with generative models to support safer and more interpretable decision-making in emergency care.
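
The retrieve-then-prompt flow described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the retrieval method, so simple token-overlap (Jaccard) scoring stands in for it; the names `Rule`, `retrieve_top_k`, and `build_prompt` are hypothetical; and the sample rules are loose paraphrases for demonstration only, not the study's knowledge base and not clinical guidance.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A triage rule stored as if-then logic (hypothetical structure)."""
    condition: str
    esi_level: int

    def as_text(self) -> str:
        return f"IF {self.condition} THEN ESI {self.esi_level}"


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_top_k(case_text: str, rules: list, k: int) -> list:
    """Rank rules by Jaccard overlap with the case text and keep the best k.

    Assumption: token overlap is a stand-in for whatever retriever is used.
    """
    case_tokens = tokenize(case_text)

    def score(rule: Rule) -> float:
        rule_tokens = tokenize(rule.condition)
        union = case_tokens | rule_tokens
        return len(case_tokens & rule_tokens) / len(union) if union else 0.0

    return sorted(rules, key=score, reverse=True)[:k]


def build_prompt(case_text: str, retrieved: list) -> str:
    """Place the retrieved rules ahead of the case so they can guide the prediction."""
    rule_lines = "\n".join(f"- {rule.as_text()}" for rule in retrieved)
    return (
        "You are assisting with emergency triage.\n"
        "Relevant triage rules:\n"
        f"{rule_lines}\n\n"
        f"Patient case: {case_text}\n"
        "Answer with a single ESI level from 1 to 5."
    )


# Illustrative placeholder rules (not the study's knowledge base).
RULES = [
    Rule("patient requires immediate life-saving intervention", 1),
    Rule("patient is confused, lethargic, or disoriented", 2),
    Rule("patient needs many resources", 3),
    Rule("patient needs one resource", 4),
    Rule("patient needs no resources", 5),
]

CASE = "72-year-old patient is confused and lethargic after a fall"
PROMPT = build_prompt(CASE, retrieve_top_k(CASE, RULES, k=3))
```

Changing `k` corresponds to the Top-5, Top-10, and Top-15 settings: a larger `k` places more rules in the prompt, which may add useful context or noise depending on the model.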
