What question did this study set out to answer?

The goal was to evaluate whether the GPT algorithm can improve its differential diagnoses when given feedback based on simple principles of clinical reasoning.

April 19, 2026 · Open Access

Generative pretrained transformer (GPT) algorithm can be taught to improve the task of differential diagnosis according to simple principles of clinical reasoning. The case of GPT-4o and a lesson to apply in medical education

Key Points

The goal was to evaluate how the GPT algorithm improves differential diagnoses using clinical reasoning.
GPT-4o analyzed three common clinical scenarios.
Iterative feedback was provided after each discussion.
Responses were documented to track improvements in accuracy.
GPT's performance improved significantly after the initial feedback.
The third and fourth discussions showed even greater accuracy gains.
GPT-generated diagnoses reached 100% concordance with the intended diagnoses.

Abstract

Background and objective: As the use of artificial intelligence (AI) in medicine expands, applications of GPT (generative pretrained transformer) are being assimilated into the world of medical education. Our objective was to rigorously evaluate the capacity of the GPT algorithm to enhance its methodology for generating differential diagnoses. We hypothesized that the success of GPT would contribute significantly to the advancement of pedagogical strategies in medical education.

Methods: ChatGPT-4o was provided with three common clinical scenarios and was asked to give three lists of four differential diagnoses. Through iterative feedback and targeted instruction, we systematically documented the GPT responses as they demonstrated progressive improvements in the accuracy of differential diagnoses. The study comprised four discussions, each followed by feedback and assessment of ChatGPT's responses. The study took place in the education authority of the Chaim Sheba Medical Center, Israel's largest hospital, during a 1-month period.

Results: For all four clinical scenarios, GPT performance improved significantly right after the initial human feedback, with more notable advancement and implementation of feedback in the third and fourth discussions. GPT effectively assimilated the technical recommendations, resulting in differential diagnoses that achieved complete concordance with the intended diagnoses (100% accuracy).

Conclusion: GPT-4o demonstrates a robust capacity for learning and operating appropriate methodologies within the clinical reasoning process essential for accurate differential diagnosis. Our findings will guide future directives that will be taught to human medical students.
