What question did this study set out to answer?

May 9, 2026 | Open Access

Automated Approaches of Text Simplification of Patient Education Materials: Scoping Review

Key Points

This review aims to map evidence on using automated language processing technologies for simplifying patient education materials (PEMs) for laypeople.
Five bibliographic databases were systematically searched from 2019 to May 2025.
Eligible sources were empirical studies examining large language models and AI-supported tools for text simplification.
The review analyzed the linguistic quality and content fidelity of simplified PEMs.
A total of 31 eligible studies were reviewed, with GPT-4.0 demonstrating the best improvements in readability metrics.
Challenges remain in meeting target readability levels, especially at lower grades.
Content fidelity findings were mixed, showing high similarity scores but often compromised accuracy.

Abstract

Background: Patient education materials (PEMs) often exceed the American Medical Association’s (AMA) recommended sixth-grade reading grade level (RGL). While artificial intelligence (AI) offers potential for automated text simplification, concerns persist regarding linguistic quality, content fidelity, and the understandability of simplified PEMs by laypeople.

Objective: This scoping review maps existing evidence on automated language processing technologies for simplifying PEMs for laypeople.

Methods: Following the Joanna Briggs Institute (JBI) methodology and the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guideline, 5 bibliographic databases (Ovid MEDLINE, Embase, CINAHL, PsycInfo, and IEEE Xplore) were systematically searched from 2019 to May 2025, supplemented by reference screening and gray literature searches. Eligible sources were peer-reviewed empirical studies published in English that examined large language models (LLMs), AI-supported writing assistants, AI-based conversational agents, or AI-supported tools designed for automatic text simplification of PEMs. Targeted outcomes included linguistic quality (ie, linguistic comprehensibility, linguistic correctness) and content fidelity (ie, factual accuracy, factual completeness) of simplified PEMs. Excluded sources comprised rule-based systems, manual text simplification, non-laypeople target groups, and technology-focused performance metrics. Results were synthesized via thematic analysis across the domains of targeted outcomes. In accordance with JBI methodology, a risk-of-bias assessment was not performed.

Results: A total of 31 eligible studies met the inclusion criteria, examining various LLMs, including OpenAI’s GPT series, Gemini, Bard, Claude, Copilot, and Llama. Specifically, GPT-4.0 achieved the most consistent improvements in standardized readability metrics (eg, the Flesch-Kincaid Grade Level [FKGL]). However, achieving predefined target RGLs remained challenging across all LLMs, particularly at lower RGLs. Findings on content fidelity were inconsistent: despite high content similarity scores, content accuracy was often compromised.

Conclusions: This is the first scoping review to comprehensively synthesize evidence on automated technologies for text simplification in PEMs. The review identified 2 critical validation gaps. First, no study examined the linguistic correctness (eg, grammar and typographical errors) of automatically simplified PEMs. Second, and most notably, the understandability of the simplified PEMs was assessed exclusively by experts, with no empirical evaluation involving laypeople. Although LLMs effectively reduce text complexity as measured by objective readability metrics, reliance on these formulas represents a critical limitation, as they serve merely as structural proxies. Improvements in readability do not guarantee the maintenance of content accuracy or laypeople’s understandability. Current evidence is further limited by the lack of systematic prompt quality evaluation and the predominant focus on English-language PEMs in US contexts, restricting generalizability. This review provides a foundation for future research by highlighting the need for validated evaluation frameworks that encompass layperson testing and content verification. For clinical practice, LLMs should currently serve as assistive tools, with mandatory expert review remaining essential to verify content fidelity before disseminating LLM-simplified PEMs to laypeople.
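To illustrate why readability formulas act only as structural proxies, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. It is not taken from the reviewed studies. The function names `count_syllables` and `fkgl`, the sentence splitter, and the vowel-group syllable heuristic are assumptions made for illustration; validated readability tools typically use more careful tokenization and syllable counting.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level computed from surface features only."""
    # Naive sentence split on terminal punctuation (illustrative only).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    original = "Hypertension necessitates continuous pharmacological intervention."
    simplified = "High blood pressure needs daily medicine."
    # The score depends only on sentence length and syllable counts,
    # so it cannot detect whether the simplified text is still accurate.
    print(f"original:   {fkgl(original):.2f}")
    print(f"simplified: {fkgl(simplified):.2f}")
```

Because the score uses only word, sentence, and syllable counts, a simplified sentence that states something factually wrong can score just as well as a correct one, which is consistent with the review's call for separate content verification.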

Cite this study

Krenn et al. studied this question.

www.synapsesocial.com/papers/69fecfcdb9154b0b82876d4e — DOI: https://doi.org/10.2196/88365

Authors

Cornelia Krenn

Christine Loder

Natalie Berger

Journals

Journal of Medical Internet Research
