Given the knowledge-intensive and rapidly expanding nature of the medical field, accurately synthesizing and interpreting findings remains a major challenge for clinicians and medical students. Although Large Language Models (LLMs) have advanced automated summarization and response generation, their deployment is limited by hallucinations, outdated knowledge, and insufficient domain adaptation. Retrieval-Augmented Generation (RAG) addresses these issues by grounding LLMs in external knowledge bases. However, as the document corpus scales, maintaining RAG accuracy becomes increasingly difficult, which makes the retriever critical for contextual relevance. In this paper, we examined the efficiency of a modular RAG framework with a hybrid retrieval strategy that combines sparse retrieval (BM25) and dense retrieval (MedCPT) to extract the most relevant documents from the corpus, thereby providing contextual grounding for the LLM to improve medical responses. Evaluation was conducted on three benchmark healthcare datasets (PubMedQA, MedMCQA, and MedQA-US) using two LLMs, GPT-4o and BioGPT. Performance was assessed using retrieval metrics (context precision, context recall, F1-score) and generation metrics (BERTScore, RAG Assessment Score). The hybrid retriever achieved 92.14% recall, 74.36% precision, and an F1-score of 82.30%. GPT-4o with hybrid retrieval reached 89.4% faithfulness, 82.7% answer relevancy, and a BERTScore F1 (F1BERT) of 88.0% on PubMedQA. The results demonstrate that hybrid retrieval within a modular architecture substantially improves retrieval effectiveness and response quality. The proposed work offers a scalable, generalizable solution for high-stakes healthcare applications, supporting flexible retriever integration and robust evaluation to advance transparent QA systems.
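To make the hybrid retrieval idea concrete, the following is a minimal sketch of one common way to merge a sparse ranking and a dense ranking: reciprocal rank fusion (RRF). The abstract does not state which fusion method the authors used, so RRF, the constant `k=60`, the function names, and the document ids are all assumptions made for illustration. The second helper applies the standard harmonic-mean F1 formula to the precision and recall reported above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    Assumption: RRF is used here only as an illustrative fusion rule;
    the abstract does not say how BM25 and MedCPT results were combined.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranked outputs from a sparse and a dense retriever.
bm25_ranking = ["d3", "d1", "d7"]
dense_ranking = ["d1", "d5", "d3"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])

# Precision and recall as reported in the abstract (74.36% and 92.14%).
reported_f1 = f1_score(0.7436, 0.9214)
```

Documents returned by both retrievers accumulate score from each list, so they tend to rise to the top of the fused ranking. The F1 helper, applied to the reported precision and recall, gives a value consistent with the 82.30% stated in the abstract.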
Bushra Aljohani
Tawfeeq Alsanoosy
Aljohani et al. studied this question.
www.synapsesocial.com/papers/6980ffd6c1c9540dea812a02 — DOI: https://doi.org/10.3390/info17020133