Large language models (LLMs) are rapidly advancing natural language processing and driving increasing interest across biomedical research, clinical care, and healthcare operations. This systematic review synthesizes current empirical evidence on LLM applications in healthcare and biomedical informatics, focusing on their capabilities, evaluation practices, and reported outcomes. Existing studies highlight promising uses in research, where LLMs assist with literature-informed genomic interpretation, functional annotation, and biological hypothesis generation, and support tasks related to protein and multiomics analysis. In clinical contexts, LLMs are primarily evaluated on natural language-driven tasks, including electronic health record summarization, clinical documentation support, medical question answering, and information extraction, rather than on autonomous diagnostic or therapeutic decision-making. Early investigations also describe potential value in healthcare operations, patient communication, clinician education, and drug discovery workflows, largely through knowledge retrieval, text generation, and semantic search. However, current evidence remains preliminary: most studies are retrospective, benchmark-based, simulation-driven, or limited to controlled research settings. Key challenges include hallucination and unreliable reasoning, bias and inequitable performance across populations, data privacy and security constraints, reproducibility limitations, and a lack of prospective clinical validation and regulatory guidance. Emerging strategies, such as domain-specific pretraining, retrieval augmentation, multimodal architectures, and privacy-preserving learning, aim to improve reliability, safety, and real-world applicability.
This review concludes by outlining the methodological, infrastructural, and governance requirements for responsibly integrating LLMs into biomedical workflows, emphasizing that clinical deployment remains exploratory and must be rigorously evaluated.
Andrew Hornback
Harinishree Sathu
Kyungbeom Kim
Innovation and Emerging Technologies
Georgia Institute of Technology
The Wallace H. Coulter Department of Biomedical Engineering
Hornback et al. studied this question.
www.synapsesocial.com/papers/69a67ec3f353c071a6f0a301 — DOI: https://doi.org/10.1142/s2737599426300011