What question did this study set out to answer?

The aim is to determine the feasibility of using large language models to extract structured advance care planning information from clinical notes.

May 29, 2026 | Open Access

Large Language Models for Summarizing Advance Care Planning Information From Goals of Care Notes in the EHR

Key Points

The aim is to determine the feasibility of using large language models to extract structured advance care planning information from clinical notes.
The sample comprised 100 de-identified Goals of Care notes annotated by clinicians.
Two large language models, Mistral 24.07 and LLaMA 3.1, were tested without domain-specific fine-tuning.
Model outputs were evaluated against human annotations using cosine similarity of BioBERT embeddings.
Mistral 24.07 showed high semantic similarity scores: 0.814 for Code Status, 0.781 for Documentation, and 0.770 for Patient Priorities.
Alignment was lower in the Decision Maker category, with a score of 0.609.
These results indicate that LLMs can extract structured advance care planning (ACP) information effectively, although further refinement is needed for certain categories.
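
The evaluation metric named in the points above, cosine similarity between embedding vectors, can be sketched in a few lines. This is a minimal illustration only, not the authors' code: the function name `cosine_similarity` and the toy three-dimensional vectors are assumptions standing in for the BioBERT embeddings of a model output and a clinician annotation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings (hypothetical values, not study data).
model_vec = [0.2, 0.7, 0.1]
human_vec = [0.25, 0.6, 0.2]
score = cosine_similarity(model_vec, human_vec)
```

A score near 1 means the two vectors point in nearly the same direction, which is how a value such as 0.814 would be read as close semantic agreement between model output and annotation.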

Abstract

Objectives: Embedding systematic, structured data extraction within electronic health records (EHR) is vital for improved real‐time insights into care delivery. This study evaluates the feasibility of using large language models (LLMs) to extract structured advance care planning (ACP) information from unstructured Goals of Care (GoC) clinical notes in the EHR.

Materials and Methods: A sample of 100 de‐identified GoC notes was manually annotated by clinicians across four ACP categories: Patient Priorities, Code Status, Decision Maker, and Documentation. Two LLMs (Mistral 24.07 and LLaMA 3.1) were prompted to extract structured outputs without domain‐specific fine‐tuning. Model outputs were compared to human annotations using cosine similarity of BioBERT embeddings.

Results: Mistral 24.07 achieved high semantic similarity in Code Status (0.814), Documentation (0.781), and Patient Priorities (0.770), but lower alignment in Decision Maker (0.609).

Conclusions: LLMs can effectively extract structured ACP information, particularly in well‐documented categories, suggesting potential for scalable, data‐driven feedback loops that improve the provision of care. However, accuracy challenges remain, and further refinement is needed for nuanced qualitative content categories.
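
One way to picture how per-category figures like those in the Results could be produced is to average note-level similarity scores within each of the four ACP categories. The sketch below assumes the aggregation is a simple mean, which the abstract does not state. The `mean_by_category` helper and the sample scores are hypothetical and are not data from the study.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

# The four annotation categories named in the abstract.
ACP_CATEGORIES = (
    "Patient Priorities",
    "Code Status",
    "Decision Maker",
    "Documentation",
)


def mean_by_category(scores: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average (category, score) pairs per ACP category (assumed aggregation)."""
    grouped = defaultdict(list)
    for category, score in scores:
        if category not in ACP_CATEGORIES:
            raise ValueError(f"unknown ACP category: {category}")
        grouped[category].append(score)
    return {category: mean(values) for category, values in grouped.items()}


# Hypothetical note-level similarity scores, for illustration only.
note_scores = [
    ("Code Status", 0.9),
    ("Code Status", 0.7),
    ("Decision Maker", 0.5),
]
summary = mean_by_category(note_scores)
```

Grouping by category before averaging keeps a weaker category such as Decision Maker visible, rather than hiding it inside a single overall score.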

Cite This Study

Ekbote et al. studied this question.

synapsesocial.com/papers/6a192c0ffab5b468c4414faa https://doi.org/10.1002/lrh2.70086
