We read with great interest the article by Tanaka et al.¹ This is a valuable contribution, as few prediction models in stroke have undergone external validation. The model's simplicity and its discriminative performance in late-window patients are notable strengths. However, several methodological aspects warrant attention.

First, the authors assessed calibration by grouping scores and comparing them with observed outcome frequencies in a bar chart. This limits interpretation across the full risk spectrum. Converting scores into predicted probabilities and plotting a smooth calibration curve would better reveal miscalibration, thereby informing model updating.²

Second, recalibration was not examined. Systematic over- or underestimation is common when applying models to new populations. Given that late-treated patients have worse outcomes than early-treated ones (e.g., mRS 0-2: 32.2% in AURORA and 46.6% in HERMES), recalibration is essential to avoid biased risk estimates in future patients.³

Finally, subgroup analyses of discriminative performance are difficult to interpret. C statistics reflect both model validity and the heterogeneity of the validation cohort,⁴ making it unclear whether observed changes indicate true performance differences or heterogeneity in subgroups. Wide confidence intervals further suggest underpowered and uncertain results.

This study is commendable, and we hope these comments support further refinement of this promising model.
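To make the recalibration point concrete, the following is a minimal sketch of logistic recalibration, in which the observed outcome is regressed on the logit of the predicted probability to obtain a calibration intercept and slope. It is an illustration under stated assumptions, not the method of the original study: the data are simulated, the intercept shift of -0.6 is an arbitrary value chosen to mimic systematic overestimation, and the function names (`fit_recalibration`, `recalibrate`) are hypothetical.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_recalibration(pred_probs, outcomes, n_iter=25):
    """Fit logit(P(y=1)) = a + b * logit(p_pred) by Newton-Raphson.

    Returns (a, b): the calibration intercept and calibration slope.
    """
    lp = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0  # start from "perfectly calibrated"
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(lp, outcomes):
            mu = expit(a + b * x)
            w = mu * (1.0 - mu)
            r = y - mu
            g0 += r            # score for the intercept
            g1 += r * x        # score for the slope
            h00 += w           # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


def recalibrate(p, a, b):
    """Map an original predicted probability to its recalibrated value."""
    return expit(a + b * logit(p))


# Simulated validation cohort (assumption: not real patient data).
rng = random.Random(0)
TRUE_SHIFT = -0.6  # arbitrary: outcomes are worse than the model predicts
pred = [rng.uniform(0.05, 0.95) for _ in range(5000)]
obs = [1 if rng.random() < expit(TRUE_SHIFT + logit(p)) else 0 for p in pred]

a_hat, b_hat = fit_recalibration(pred, obs)
recal = [recalibrate(p, a_hat, b_hat) for p in pred]

print("calibration intercept:", round(a_hat, 3))
print("calibration slope:", round(b_hat, 3))
print("mean predicted (original):", round(sum(pred) / len(pred), 3))
print("mean predicted (recalibrated):", round(sum(recal) / len(recal), 3))
print("observed event rate:", round(sum(obs) / len(obs), 3))
```

In this simulation, an intercept well below zero with a slope near one would indicate systematic overestimation of good outcome without loss of spread in the predictions, which is the pattern one might expect when a model is applied to a population with worse outcomes overall.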
Li et al. (Wed,) studied this question.
www.synapsesocial.com/papers/69e7138bcb99343efc98d080 — DOI: https://doi.org/10.1212/wnl.0000000000213796#letters-section
Xi Li (ORCID: 0009-0006-7899-8239)
Bob Roozenbeek
Hester Lingsma
Neurology, Inc
Faculty of Public Health