We read with great interest the article by Tanaka et al.1 This is a valuable contribution, as few prediction models in stroke have undergone external validation. The model’s simplicity and its discriminative performance in late-window patients are notable strengths. However, several methodological aspects warrant attention.

First, the authors assessed calibration by grouping scores and comparing them with observed outcome frequencies in a bar chart. This limits interpretation across the full risk spectrum. Converting scores into predicted probabilities and plotting a smooth calibration curve would better reveal miscalibration, thereby informing model updating.2

Second, recalibration was not examined. Systematic over- or underestimation is common when applying models to new populations. Given that late-treated patients have worse outcomes than early-treated ones (e.g., mRS 0-2: 32.2% in AURORA and 46.6% in HERMES), recalibration is essential to avoid biased risk estimates in future patients.3

Finally, subgroup analyses of discriminative performance are difficult to interpret. C statistics reflect both model validity and the heterogeneity of the validation cohort,4 making it unclear whether observed changes indicate true performance differences or merely differences in heterogeneity between subgroups. The wide confidence intervals further suggest that these analyses were underpowered and their results uncertain.

This study is commendable, and we hope these comments support further refinement of this promising model.
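To illustrate the recalibration raised in the second point, the following is a minimal sketch of logistic recalibration, in which a calibration intercept and slope are estimated by regressing the observed outcome on the logit of the predicted probability. It is not the method of Tanaka et al. or of any cited study. The function names (`fit_recalibration`, `recalibrate`) are hypothetical, and the data are synthetic: the "true" intercept of -0.6 and slope of 0.8 are arbitrary assumptions chosen to mimic a model that overestimates good outcomes in a new population, and are not derived from AURORA or HERMES.

```python
import math
import random


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_recalibration(pred_probs, outcomes, n_iter=25):
    """Fit logit(P(y = 1)) = a + b * logit(p_hat) by Newton-Raphson.

    a is the calibration intercept and b the calibration slope;
    a = 0 and b = 1 would indicate no need for recalibration.
    """
    x = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0  # start from "perfectly calibrated"
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            mu = expit(a + b * xi)
            w = mu * (1.0 - mu)
            r = yi - mu
            g0 += r            # gradient with respect to a
            g1 += r * xi       # gradient with respect to b
            h00 += w           # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def recalibrate(p, a, b):
    """Apply the fitted intercept and slope to an original prediction."""
    return expit(a + b * logit(p))


# Synthetic validation cohort (assumed values, for illustration only).
rng = random.Random(0)
pred = [rng.uniform(0.05, 0.95) for _ in range(5000)]
true_prob = [expit(-0.6 + 0.8 * logit(p)) for p in pred]
y = [1 if rng.random() < tp else 0 for tp in true_prob]

a, b = fit_recalibration(pred, y)
updated = [recalibrate(p, a, b) for p in pred]

observed_rate = sum(y) / len(y)
mean_original = sum(pred) / len(pred)
mean_updated = sum(updated) / len(updated)
print("calibration intercept:", round(a, 3), "slope:", round(b, 3))
print("observed rate:", round(observed_rate, 3))
print("mean prediction before:", round(mean_original, 3),
      "after:", round(mean_updated, 3))
```

In this sketch, a negative intercept indicates that the original predictions are systematically too high, and a slope below 1 indicates that they are too extreme. After recalibration, the mean predicted probability matches the observed event rate in the validation cohort, which is the property that protects future patients from biased risk estimates.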
Xi Li (ORCID: 0009-0006-7899-8239)
Bob Roozenbeek
Hester Lingsma
Neurology, Inc
Faculty of Public Health
Li et al. (Wed,) studied this question.
www.synapsesocial.com/papers/69e7138bcb99343efc98d080 — DOI: https://doi.org/10.1212/wnl.0000000000213796#letters-section