A hybrid AI adjudication approach replicated the EUCLID primary treatment effect (HR 1.04; 95% CI 0.94-1.15) compared to human adjudication (HR 1.02; 95% CI 0.93-1.13).
Does an adaptive AI algorithm (ADAPT-CEC) or a hybrid AI-human approach accurately adjudicate cardiovascular endpoints compared to human adjudication?
13,885 suspected cardiovascular endpoint events from the EUCLID trial
ADAPT-CEC artificial intelligence algorithm and a hybrid approach (AI + human adjudication for the 30% of suspected events with lowest prediction certainty)
Human adjudication (gold standard) and direct GPT 4.0 adjudication
F1 score for classification of cardiovascular endpoints (CV death, MI, stroke, bleeding) and replication of the EUCLID primary treatment effect
A hybrid AI-human adjudication approach can accurately classify cardiovascular endpoints and replicate trial treatment effects, potentially reducing the time and cost of clinical endpoint classification in trials.
Background: Clinical endpoint classification (CEC) is the gold standard for cardiovascular endpoint measurement in clinical trials, but it adds time and cost. We developed and validated an artificial intelligence (AI) algorithm (ADAPT-CEC) that adjudicates multiple cardiovascular endpoints and adapts to new definitions.
Methods: ADAPT-CEC was derived on myocardial infarction (MI), stroke, and heart failure from the ODYSSEY OUTCOMES trial and externally validated on MI, stroke, bleeding, and CV death from the EUCLID trial after adaptation with 20 EUCLID suspected events per endpoint. ADAPT-CEC was compared via F1 score with direct generative pretrained transformer (GPT) 4.0 adjudication and with a hybrid approach in which the 30% of suspected events with the lowest AI prediction certainty were adjudicated by humans. The EUCLID primary endpoint of CV death, MI, or stroke was re-estimated for all three adjudication strategies.
Results: Among 13,885 suspected EUCLID primary endpoint events, the ADAPT-CEC, hybrid, and GPT 4.0 strategies correctly classified 86.4%, 95.6%, and 76.3% of all endpoints and 99.4%, 99.6%, and 99.8% of all non-endpoints compared with human adjudication, respectively. Hybrid adjudication F1 metrics were the highest: CV death (0.94, 95% CI 0.92 – 0.96), MI (0.80, 95% CI 0.77 – 0.82), stroke (0.82, 95% CI 0.78 – 0.86), and bleeding (0.83, 95% CI 0.82 – 0.85). ADAPT-CEC F1 metrics for CV death, MI, and stroke were lower than hybrid but similar to GPT 4.0, while its bleeding F1 (0.78, 95% CI 0.77 – 0.79) was superior to GPT 4.0. The EUCLID primary treatment effect was similar by human (HR 1.02, 95% CI 0.93 – 1.13), hybrid (HR 1.04, 95% CI 0.94 – 1.15), ADAPT-CEC (HR 0.98, 95% CI 0.88 – 1.09), and GPT 4.0 (HR 1.06, 95% CI 0.95 – 1.19) adjudication.
Conclusions: After brief adaptation, an AI algorithm derived from a single trial can adjudicate similar endpoints (MI and stroke) and new endpoints (CV death and bleeding) in a second trial and replicate the EUCLID primary outcome treatment effect. A hybrid approach, with humans adjudicating the 30% of suspected events with the lowest ADAPT-CEC prediction certainty, was superior to ADAPT-CEC alone or GPT 4.0 alone and replicated the EUCLID primary outcome treatment effect. Prospective studies of adaptive AI adjudication are needed to determine future trial implementation.
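The hybrid rule described in the abstract (send the 30% of suspected events with the lowest AI prediction certainty to human adjudicators, keep the AI label for the rest) and the per-endpoint F1 comparison can be illustrated with a minimal Python sketch. This is not the ADAPT-CEC implementation: the function names `hybrid_adjudicate` and `f1_score`, the way certainty is represented, the tie handling, and the ten toy events are all assumptions made for illustration, and none of the numbers come from EUCLID.

```python
from typing import List, Sequence


def hybrid_adjudicate(
    ai_labels: Sequence[str],
    ai_certainty: Sequence[float],
    human_labels: Sequence[str],
    human_fraction: float = 0.30,
) -> List[str]:
    """Use the human label for the lowest-certainty fraction of events.

    Hypothetical helper: the abstract gives the 30% threshold but not how
    certainty is computed or how ties are broken.
    """
    n = len(ai_labels)
    n_human = round(n * human_fraction)
    # Event indices ordered from least to most certain AI prediction.
    by_certainty = sorted(range(n), key=lambda i: ai_certainty[i])
    to_human = set(by_certainty[:n_human])
    return [human_labels[i] if i in to_human else ai_labels[i] for i in range(n)]


def f1_score(truth: Sequence[str], predicted: Sequence[str], positive: str) -> float:
    """F1 for one endpoint class, treating `truth` as the reference standard."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    fp = sum(t != positive and p == positive for t, p in zip(truth, predicted))
    fn = sum(t == positive and p != positive for t, p in zip(truth, predicted))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Invented toy data: ten suspected events, human adjudication as reference.
human = ["MI", "MI", "none", "stroke", "MI", "none", "none", "MI", "stroke", "none"]
ai = ["MI", "none", "none", "stroke", "none", "none", "MI", "MI", "none", "none"]
certainty = [0.95, 0.55, 0.90, 0.88, 0.60, 0.97, 0.52, 0.91, 0.85, 0.99]

hybrid = hybrid_adjudicate(ai, certainty, human)

for endpoint in ("MI", "stroke"):
    print(
        endpoint,
        "AI-only F1:", round(f1_score(human, ai, endpoint), 2),
        "hybrid F1:", round(f1_score(human, hybrid, endpoint), 2),
    )
```

In this toy set the three least certain AI calls are all wrong, so human review corrects them, while one confident but incorrect AI call (a missed stroke) is never reviewed. That is the trade-off a certainty threshold makes: it catches errors the model is unsure about, not errors it is confident in.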
“The challenge is how we are going to translate this into a product and an output that is going to be acceptable by regulatory agencies. How are we going to do it in a way that can be ethical [and] that meets all the standards for transparency and traceability?”
Sreekanth Vemulapalli
Karla Pena Guerra
Daniel Wojdyla
Circulation
Stanford University
Cornell University
Inserm
Vemulapalli et al. compared ADAPT-CEC and hybrid AI adjudication with human adjudication and GPT 4.0 on 13,885 suspected cardiovascular events from the EUCLID trial. The hybrid approach replicated the EUCLID primary treatment effect for CV death, MI, or stroke (HR 1.04; 95% CI 0.94-1.15) compared with human adjudication (HR 1.02; 95% CI 0.93-1.13).
www.synapsesocial.com/papers/69ccb62016edfba7beb87d38 — DOI: https://doi.org/10.1161/circulationaha.126.080072