What question did this study set out to answer?

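The study assessed whether a retrieval-augmented generation (RAG) framework improves the accuracy, reliability and standardization of decision support for therapeutic plasma exchange (TPE) compared with conventional large language models (LLMs).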

May 8, 2026

An automatic consult reply system for therapeutic plasma exchange using retrieval‐augmented generation

Key Points

The study assessed whether a retrieval-augmented generation (RAG) framework improves decision support for therapeutic plasma exchange (TPE) compared with conventional LLMs.
The authors built a hybrid RAG pipeline combining BAAI/bge-base-en-v1.5 embeddings with Chroma and BM25.
Thirty de-identified consultation cases were converted into standardized queries and evaluated across six RAG and three non-RAG model configurations.
Performance was measured as item-level accuracy on six elements and as reproducibility of responses across repeated runs.
RAG configurations outperformed non-RAG baselines in accuracy, with the largest gains in plasma volume calculation and ASFA classification.
Reproducibility across repeated runs was markedly higher with RAG.
RAG GPT-4.1-mini achieved the best balance of high accuracy and low latency.
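
The hybrid retrieval idea in the key points (dense embeddings plus BM25) can be illustrated with a small sketch. The summary does not say how the two result lists are combined, so the example below assumes reciprocal rank fusion (RRF), a common choice; the function name, the constant `k=60`, and the document ids are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Assumption: RRF is used here only for illustration; the study does not
    state which fusion method its pipeline applies.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) to the document's score.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ranked results for one query from the two retrievers.
dense_ranking = ["asfa_ttp", "asfa_gbs", "hira_criteria"]
bm25_ranking = ["hira_criteria", "asfa_ttp", "plasma_volume_rule"]

fused = reciprocal_rank_fusion([dense_ranking, bm25_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the usual motivation for pairing a lexical retriever with an embedding-based one.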

Abstract

Background and Objectives: Large language models (LLMs) show promise for clinical decision support but remain vulnerable to factual errors. Retrieval‐augmented generation (RAG) mitigates this limitation by grounding outputs in authoritative domain knowledge. Therapeutic plasma exchange (TPE) requires consistent, guideline‐driven decisions based on the 2023 American Society for Apheresis (ASFA) recommendations. This study aimed to evaluate whether an RAG‐based framework could improve accuracy, reliability and standardization of decision support for TPE, compared to conventional LLMs.

Materials and Methods: We built a hybrid RAG pipeline combining BAAI/bge‐base‐en‐v1.5 embeddings with Chroma and BM25, coupled with structured prompts that encode ASFA categories and grades, Health Insurance Review and Assessment (HIRA) service criteria, and plasma volume computation rules. Thirty de‐identified real‐world consultation cases were converted into standardized queries. Across six RAG and three non‐RAG generative pre‐trained transformer (GPT)‐series model configurations, each case was answered five times (1,350 outputs). Performance was assessed by item‐level accuracy for six elements (diagnosis, ASFA category, grade, insurance applicability, plasma volume, and replacement fluid) and reproducibility on 14 disease‐name prompts. Response time and output length were also analyzed.

Results: RAG configurations consistently outperformed non‐RAG baselines across items, with the largest gains in plasma‐volume calculation and ASFA classification. Reproducibility was markedly higher with RAG across repeated runs. Among all configurations, RAG GPT‐4.1‐mini showed the most balanced and superior performance, delivering high accuracy with low latency.

Conclusion: A guideline‐grounded RAG approach substantially enhances the accuracy, stability and standardization of TPE consultation compared with conventional LLMs. This RAG‐TPE framework demonstrates the feasibility of reliable, clinically oriented decision support in transfusion medicine, warranting further evaluation in prospective clinical workflows.
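
The evaluation design described in the abstract can be sketched as simple bookkeeping. The example below checks the output count (nine configurations, thirty cases, five runs each) and shows one plausible way to compute item-level accuracy and reproducibility. The abstract does not define either metric precisely, so the definitions here (exact-match accuracy per item, and the share of prompts whose repeated runs all agree) are assumptions, and every name and toy value is hypothetical.

```python
ITEMS = (
    "diagnosis", "asfa_category", "grade",
    "insurance", "plasma_volume", "replacement_fluid",
)


def expected_outputs(n_configs, n_cases, n_runs):
    """Total number of generated answers in a full factorial design."""
    return n_configs * n_cases * n_runs


def item_accuracy(records, references):
    """Exact-match accuracy per item (assumed metric definition).

    records: list of (case_id, answers) pairs, one per generated output.
    references: dict mapping case_id to the reference answers.
    """
    correct = {item: 0 for item in ITEMS}
    for case_id, answers in records:
        for item in ITEMS:
            if answers.get(item) == references[case_id][item]:
                correct[item] += 1
    return {item: correct[item] / len(records) for item in ITEMS}


def reproducibility(runs_by_prompt):
    """Share of prompts whose repeated runs all agree (assumed definition)."""
    stable = sum(1 for runs in runs_by_prompt.values() if len(set(runs)) == 1)
    return stable / len(runs_by_prompt)


# Six RAG plus three non-RAG configurations, 30 cases, 5 runs per case.
total = expected_outputs(6 + 3, 30, 5)

# Toy data with placeholder labels; these are not results from the study.
reference = {item: "ref" for item in ITEMS}
wrong_volume = dict(reference, plasma_volume="other")
accuracy = item_accuracy(
    [("case_1", reference), ("case_1", wrong_volume)],
    {"case_1": reference},
)
repro = reproducibility({
    "prompt_1": ["A", "A", "A", "A", "A"],
    "prompt_2": ["A", "A", "B", "A", "A"],
})
print(total, accuracy["plasma_volume"], repro)
```

Scoring each of the six elements separately makes it possible to see where grounding helps most, which matches how the abstract reports the largest gains for plasma volume and ASFA classification.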

Authors

Jong Kwon Lee

Sooin Choi

Sholhui Park

Journals

Vox Sanguinis

Institutions

Sungkyunkwan University

University of Ulsan

Samsung Medical Center
