Abstract Background Coronary computed tomography angiography (CCTA) is pivotal in the evaluation of suspected coronary heart disease (CHD). Various pretest probability (PTP) assessment methods have been employed to enhance the diagnostic performance of CCTA for obstructive CHD in patients with suspected CHD. Nonetheless, significant challenges persist, including substantial variability among assessment methods and limited clinical applicability. We aimed to develop and validate a framework that integrates large language models (LLMs) with domain-specific smaller models for PTP assessment.
Methods The collaborative framework (BRIDGE-CHD) was built from four sequential modules. In the questioning module, structured clinical interviews were guided by four standardized clinical information categories to optimize natural language interactions. In the option extraction module, a prompt-driven LLM interface converted user responses into JSON-formatted options with confidence scores. In the follow-up module, one of five adaptive clarification strategies (BASIC, Binary, Numerical, Scale and Rationale) was selected based on prior experimental validation to resolve missing or ambiguous inputs. In the conclusion module, small-model outputs were aggregated via LLM synthesis for the final CHD probability estimation. To validate the efficiency of BRIDGE-CHD, 100 cases from the multi-center C-Strat cohort (N=30039) were used to evaluate the choice of base LLM and the prompt designs in both the option extraction and follow-up modules. The selected configuration was then tested on 1000 cases from the C-Strat cohort, focusing on the accuracy of information extraction and outcome prediction. A prospective real-world cohort study was conducted to further assess the effectiveness of the approach.
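The hand-off from the option extraction module to the follow-up module can be illustrated with a minimal sketch. It is not the authors' implementation: the JSON schema, the field names, and the confidence threshold of 0.7 are all assumptions made for illustration, since the abstract does not specify them.

```python
import json

# Hypothetical field names and threshold; the abstract does not give the schema.
REQUIRED_FIELDS = ("age", "sex", "chest_pain_type", "diabetes")
CONFIDENCE_THRESHOLD = 0.7


def fields_needing_follow_up(raw_json, required=REQUIRED_FIELDS,
                             threshold=CONFIDENCE_THRESHOLD):
    """Return required fields that are missing, empty, or low-confidence.

    `raw_json` is assumed to map each field name to an object of the form
    {"value": ..., "confidence": float in [0, 1]}.
    """
    options = json.loads(raw_json)
    pending = []
    for name in required:
        entry = options.get(name)
        if entry is None or entry.get("value") is None:
            # Field absent or left empty by the extraction step.
            pending.append(name)
        elif entry.get("confidence", 0.0) < threshold:
            # Field present but the extraction was ambiguous.
            pending.append(name)
    return pending


# Illustrative extraction output (invented for this sketch, not study data).
example_response = json.dumps({
    "age": {"value": 58, "confidence": 0.95},
    "sex": {"value": "male", "confidence": 0.99},
    "chest_pain_type": {"value": "atypical", "confidence": 0.42},
})

pending_fields = fields_needing_follow_up(example_response)
print(pending_fields)
```

In this sketch, any field returned by `fields_needing_follow_up` would be passed to whichever clarification strategy is selected, while fields with sufficient confidence would go directly to the small models.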
Results In the pre-experiment (C-Strat, N=100), GPT-4o achieved the highest option extraction accuracy compared with Qwen-Max, DeepSeek-R1 and Gemini 2.0 Pro (0.926 versus 0.899, 0.923 and 0.866, respectively; P<0.050). The integration of GPT-4o with the Numerical clarification strategy achieved an information extraction accuracy of 0.830, a correct questioning rate of 0.810 and an overall system outcome prediction accuracy of 0.820, surpassing the performance of small models alone. In the larger experiment (C-Strat, N=1000), BRIDGE-CHD demonstrated an overall accuracy of 0.750. In the prospective validation (real-world cohort, N=83), the diagnostic accuracy of BRIDGE-CHD surpassed that of clinician judgments, standalone LLMs and small models alone (0.860 vs. 0.680, 0.610 and 0.710, respectively; P<0.050). User evaluations indicated strong usability, with a mean score of 4.4 out of 5, reflecting the interface's intuitiveness and workflow efficiency.
Conclusion BRIDGE-CHD demonstrates superior diagnostic consistency with CCTA for obstructive CHD and excels in PTP assessment and information extraction, outperforming clinician judgments, standalone LLMs and small models alone.
Overview of design and main results
Xin et al. (Sat,) studied this question.