The TRIAGE AI system accurately determined trial-level eligibility from real-world EHR data, achieving 78.3% sensitivity, 98.5% specificity, and 92.6% PPV at the predefined 0.40 match threshold.
Observational (n=827)
No
Does an AI-enabled trial matching solution (TRIAGE) accurately predict clinical trial eligibility compared to expert clinical research coordinator adjudication in patients with cancer?
An AI-enabled trial matching system accurately determined clinical trial eligibility from real-world EHR data, demonstrating high specificity and strong agreement with expert coordinators.
Effect estimate (0.40 trial match threshold): Sensitivity 78.3%, Specificity 98.5%, PPV 92.6%, NPV 94.8%, Cohen's κ 0.68
1501 Background: Under-enrollment in cancer trials remains a major barrier to advancing cancer research, in part because few patients are offered clinical trials and pre-screening is manual and laborious. One solution involves artificial intelligence (AI)-enabled centralized screening of patients, but this requires robust validation to facilitate trust and adoption.

Methods: This is a retrospective study from a large community-academic hybrid cancer center comparing the performance of an AI system, Trial Recommendations using Intelligent Assessment to Guide Eligibility and Enrollment (TRIAGE), to real-world enrollment using longitudinal electronic health records (EHRs), full versioned trial protocols, and expert clinical research coordinator (CRC) adjudication at the trial and criterion levels. The training set comprised 629 patients and 4,094 patient-trial pairs. A large language model (LLM)-only solution was used to predict successful patient enrollment based only on answers to individual eligibility criteria. Next, a machine learning-based approach was used to train the model on top of LLM responses to yield robust, consistent decision rules across patients, with priorities set by the real-world behavior of CRCs. The test set comprised 198 patients with breast, lung, and pancreatic cancers and 793 patient-trial pairs. A stratified random sample of 100 patient-trial pairs (83 patients, 21 trials) underwent manual CRC adjudication. The primary outcomes were sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s κ for trial-level eligibility using a predefined trial match threshold of 0.40, and accuracy for criterion-level decisions. Performance evaluations were based on a binary classification: eligible or potentially eligible vs. ineligible.

Results: In the training set, TRIAGE demonstrated 72% sensitivity and 90% specificity for trial enrollment.
The table summarizes performance metrics in the test set for trial-level eligibility across 3 trial match thresholds. Criterion-level evaluation across 1,770 adjudications showed 93.1% raw agreement, improving to 94.2% after structured re-adjudication; 10/25 (40%) initial CRC discordances were overturned in favor of the AI decision.

Conclusions: TRIAGE accurately determined trial-level eligibility from real-world EHR data with high performance and strong criterion-level agreement for oncology protocols. The system surfaced potential missed enrollment opportunities and supports adjustable trial-level decision thresholds. Prospective studies of TRIAGE implementation into research workflows are ongoing.

Trial match threshold      Sensitivity  Specificity  PPV     NPV     Cohen’s κ
0.13 (max sensitivity)     98.7%        97.6%        91.2%   99.7%   0.84
0.40 (study threshold)     78.3%        98.5%        92.6%   94.8%   0.68
0.62 (max specificity)     39.3%        100%         100%    87.0%   0.45
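The trial-level metrics reported above follow from a 2x2 comparison of thresholded match scores against CRC adjudication. The sketch below is illustrative only and is not the TRIAGE implementation: the names `Confusion`, `confusion_at_threshold`, and `summarize` are hypothetical, the toy scores and labels are invented for demonstration and are not study data, and treating a score equal to the threshold as a match is an assumption. It also ignores any weighting that the stratified sampling may have required.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Confusion:
    tp: int  # predicted eligible, adjudicated eligible
    fp: int  # predicted eligible, adjudicated ineligible
    fn: int  # predicted ineligible, adjudicated eligible
    tn: int  # predicted ineligible, adjudicated ineligible


def confusion_at_threshold(
    scores: Sequence[float], eligible: Sequence[bool], threshold: float
) -> Confusion:
    """Binarize match scores at `threshold` and tally the 2x2 table."""
    tp = fp = fn = tn = 0
    for score, truth in zip(scores, eligible):
        predicted = score >= threshold  # assumption: threshold is inclusive
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def summarize(c: Confusion) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and Cohen's kappa for one 2x2 table."""
    n = c.tp + c.fp + c.fn + c.tn
    observed = _ratio(c.tp + c.tn, n)  # raw agreement
    # Agreement expected by chance, from the marginal totals of each rater.
    expected = _ratio(
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn), n * n
    )
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "kappa": _ratio(observed - expected, 1 - expected),
    }


# Toy data for illustration only (NOT study data).
toy_scores = [0.05, 0.20, 0.30, 0.45, 0.70, 0.90]
toy_eligible = [False, True, False, True, True, False]

# Sweep the three thresholds named in the table.
results = {
    t: summarize(confusion_at_threshold(toy_scores, toy_eligible, t))
    for t in (0.13, 0.40, 0.62)
}
```

Sweeping the threshold in this way shows the trade-off visible in the table: lowering it raises sensitivity at the cost of specificity, and raising it does the reverse.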
Patel et al. conducted an observational study in cancer (breast, lung, pancreatic) (n=827). The TRIAGE AI system was evaluated against expert clinical research coordinator (CRC) adjudication on sensitivity, specificity, PPV, NPV, and Cohen's κ for trial-level eligibility at a predefined trial match threshold of 0.40 (sensitivity 78.3%, specificity 98.5%, PPV 92.6%, NPV 94.8%, Cohen's κ 0.68).