Abstract: This study evaluated responses from three publicly available large language models (LLMs) to common patient questions regarding the treatment of hepatocellular carcinoma (HCC) by interventional radiology, focusing on embolization and ablation. A standardized set of ten questions addressing procedure indications, risks, benefits, and outcomes was developed by the research team. Three LLMs were prompted to generate responses to the questions: ChatGPT 4o Mini (OpenAI), Gemini (Google), and Copilot (Microsoft). Two attending interventional radiologists independently evaluated the responses using a web-based survey instrument, rating accuracy, comprehensiveness, readability, compassion, and overall quality on a numerical scale from 0 to 100. Comparisons between models in each domain were made using one-way ANOVA, and the survey provided opportunities for qualitative comments. The LLMs provided readable, generally accurate responses, with no statistically significant difference between models in any of the evaluated domains (p > 0.05). Qualitative analysis revealed inconsistencies in how LLM responses addressed procedure subtypes, techniques, and clinical nuances of HCC treatment. While LLMs show promise as an adjunct tool for preprocedural patient education, current limitations highlight the necessity for professional oversight. Future studies incorporating patient feedback are essential to assess the impact of LLM responses on patient comprehension and satisfaction.
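The between-model comparison described in the abstract can be illustrated with a minimal sketch of the one-way ANOVA F statistic for a single domain. This is not the authors' analysis code: the function name `one_way_anova_f` and the scores below are invented placeholders for illustration, not study data, and the sketch stops at the F statistic and degrees of freedom rather than computing a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    Each inner list holds the ratings for one model in one domain.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical 0-100 ratings for one domain; NOT data from the study.
placeholder_scores = {
    "ChatGPT": [80, 85, 90],
    "Gemini": [78, 88, 86],
    "Copilot": [82, 84, 80],
}

f_stat, df_b, df_w = one_way_anova_f(list(placeholder_scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A small F statistic relative to the critical value for the given degrees of freedom corresponds to the "no statistically significant difference" outcome reported above. In practice a statistics library would typically be used to obtain the p-value as well.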
Harrison Blume
DE Williams
Arvind Dev
Digestive Disease Interventions
University of California, Los Angeles
Albert Einstein College of Medicine
Montefiore Medical Center
Blume et al. studied this question.
www.synapsesocial.com/papers/69d895206c1944d70ce060ef — DOI: https://doi.org/10.1055/a-2835-3036