This study demonstrates that ML models with similar overall performance can yield substantially divergent predictions at both the individual and subgroup levels, and that no single algorithm consistently outperforms others across all patient subgroups. These findings highlight the limitations of relying solely on global performance metrics and underscore the need for context-aware evaluation of ML models in heterogeneous clinical populations.
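As a rough illustration of the point about global metrics, the minimal sketch below compares two hypothetical classifiers on made-up toy data. The labels, subgroup names, predictions, and helper functions are all assumptions for illustration and do not come from the study; plain accuracy stands in for whatever performance metrics the authors actually used.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def subgroup_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately within each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: accuracy(ts, ps) for g, (ts, ps) in buckets.items()}


def disagreement_rate(pred_a, pred_b):
    """Fraction of individuals for whom the two models predict differently."""
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Toy data (hypothetical, not from the paper): 8 patients in two subgroups.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]  # errors fall only in subgroup B
model_b = [0, 1, 1, 0, 1, 0, 1, 0]  # errors fall only in subgroup A

overall_a = accuracy(y_true, model_a)
overall_b = accuracy(y_true, model_b)
by_group_a = subgroup_accuracy(y_true, model_a, groups)
by_group_b = subgroup_accuracy(y_true, model_b, groups)
disagreement = disagreement_rate(model_a, model_b)

print("overall:", overall_a, overall_b)
print("model_a by subgroup:", by_group_a)
print("model_b by subgroup:", by_group_b)
print("individual-level disagreement:", disagreement)
```

In this toy setup both models have the same overall accuracy, yet each one is the better choice in a different subgroup and they disagree on half of the individuals, which is the kind of divergence a single global metric would hide.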
Júlia Chaves Neuenschwander Magalhães
Alexandre Dias Porto Chiavegatto Filho
PLoS ONE
Universidade de São Paulo
Magalhães et al. studied this question.
www.synapsesocial.com/papers/69ada8dfbc08abd80d5bc4ab — DOI: https://doi.org/10.1371/journal.pone.0344354