Large language models can lighten the workload of clinicians and patients, yet their responses often include fabricated evidence, outdated knowledge, and insufficient medical specificity. We introduce a general retrieval-augmented question-answering framework that continuously gathers up-to-date, high-quality medical knowledge and generates evidence-traceable responses. Here we show that this approach significantly improves the evidence validity, medical expertise, and timeliness of large language model outputs, thereby enhancing their overall quality and credibility. Evaluation against 15,530 objective questions, together with two physician-curated clinical test sets covering evidence-based medical practice and medical order explanation, confirms the improvements. In blinded trials, resident physicians indicate meaningful assistance in 87.00% of evidence-based medical scenarios, and lay users find it helpful in 90.09% of medical order explanations. These findings demonstrate a practical route to trustworthy, general-purpose language assistants for clinical applications.

To fully realize LLMs' potential value in clinical applications, effective methods to enhance their quality and credibility are required. Here, the authors present LINS, a framework to enhance medical LLM responses by integrating up-to-date evidence and supporting clinical tasks, and validate it through new physician-curated datasets and large-scale user trials.
Sheng Wang
Fangyuan Zhao
Dan Bu
Nature Communications
Chinese Academy of Sciences
Peking University
University of Chinese Academy of Sciences
Wang et al. studied this question.
www.synapsesocial.com/papers/68efbd16d61273c8652d7f23 — DOI: https://doi.org/10.1038/s41467-025-64142-2