Key points are not available for this paper at this time.
The application of the Multi-modal Large Language Models (MLLMs) in medical clinical scenarios remains underexplored. Previous benchmarks only focus on the capacity of the MLLMs in medical visual question-answering (VQA) or report generation and fail to assess the performance of the MLLMs on complex clinical multi-modal tasks. In this paper, we propose a novel Medical Personalized Multi-modal Consultation (Med-PMC) paradigm to evaluate the clinical capacity of the MLLMs. Med-PMC builds a simulated clinical environment where the MLLMs are required to interact with a patient simulator to complete the multi-modal information-gathering and decision-making task. Specifically, the patient simulator is decorated with personalized actors to simulate diverse patients in real scenarios. We conduct extensive experiments to access 12 types of MLLMs, providing a comprehensive view of the MLLMs' clinical performance. We found that current MLLMs fail to gather multimodal information and show potential bias in the decision-making task when consulted with the personalized patient simulators. Further analysis demonstrates the effectiveness of Med-PMC, showing the potential to guide the development of robust and reliable clinical MLLMs. Code and data are available at https://github.com/LiuHC0428/Med-PMC.
Building similarity graph...
Analyzing shared references across papers
Loading...
Liu et al. (Fri,) studied this question.
www.synapsesocial.com/papers/68e5bfa3b6db6435875573e5 — DOI: https://doi.org/10.48550/arxiv.2408.08693
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:
Hongcheng Liu
Yusheng Liao
Siqv Ou
Building similarity graph...
Analyzing shared references across papers
Loading...