Traditional regression models struggle with non-linear data, limiting their ability to predict health outcomes such as activities of daily living (ADL) in stroke survivors. Machine learning (ML) methods offer improved accuracy but often lack interpretability. Explainable ML (XML) algorithms, including SHapley Additive exPlanations (SHAP), promise both precision and interpretability. This study aimed to develop interpretable XML-based prediction models for discharge ADL in subacute stroke survivors and to identify key prognostic factors. We retrospectively analyzed 980 subacute stroke survivors admitted to the Tokyo Bay Rehabilitation Hospital between March 2015 and September 2019. Thirty-three explanatory variables, including age, Functional Independence Measure (FIM) subitems, physical function measures, and nutritional assessments, were used to predict discharge FIM scores and FIM gain (motor, cognitive, and total). Predictive models were developed using six approaches (Elastic Net, Extra Trees, LightGBM, Random Forest, XGBoost, and combinations of these models, termed Mixed), and accuracy was evaluated using the coefficient of determination (R²) and root mean squared error (RMSE). Variable importance was quantified using SHAP scores. Maximum R² values for discharge motor, cognitive, and total FIM scores were 0.776, 0.630, and 0.758, respectively. The corresponding values for motor, cognitive, and total FIM gain were 0.604, 0.413, and 0.536, respectively. SHAP scores highlighted the importance of age, trunk function, and grip strength. Furthermore, low-difficulty FIM motor subitems were important predictors of total scores, whereas high-difficulty subitems were critical for FIM gain. XML achieved predictive performance comparable to that of previously reported parsimonious ML models, while providing more granular clinical interpretability. By using SHAP values, this approach overcomes the ‘black box’ nature of standard ML, providing transparent and actionable insights into how individual functional deficits contribute to outcomes.
Notably, the predictive accuracy for FIM gain was substantially higher than that reported in previous studies. Even without explicitly incorporating domain knowledge about the relationships between variables, XML generated clinically relevant prediction models that accounted for these relationships. Thus, examining ADL prognostic factors with XML is clinically meaningful and has promising implications for real-world clinical applications.
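The evaluation metrics and the attribution idea behind SHAP can be illustrated with a minimal, standard-library-only sketch. It is not the study's pipeline: the toy model, its coefficients, the feature values, and the function names below are all hypothetical, chosen only to show how R² and RMSE are computed and how Shapley values split a prediction among features. Practical SHAP implementations use approximations and a background dataset rather than the brute-force enumeration against a single baseline shown here.

```python
from itertools import permutations
from math import sqrt


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Averages each feature's marginal contribution over every feature
    ordering; a feature that is "absent" takes its baseline value.
    Cost grows factorially, so this is only feasible for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]            # switch feature i to its real value
            updated = model(current)
            phi[i] += updated - previous  # marginal contribution of feature i
            previous = updated
    return [value / len(orderings) for value in phi]


def toy_model(features):
    """Hypothetical score model (invented coefficients, not from the study)."""
    age, trunk, grip = features
    return 60.0 - 0.3 * age + 2.0 * trunk + 0.5 * grip + 0.01 * trunk * grip


# Hypothetical observed and predicted discharge scores.
observed = [50, 60, 70, 80]
predicted = [52, 58, 71, 79]
print("R2:", r2_score(observed, predicted))
print("RMSE:", rmse(observed, predicted))

# Hypothetical patient and reference (baseline) feature values:
# age, trunk function score, grip strength.
patient = [80, 15, 30]
reference = [70, 10, 20]
contributions = shapley_values(toy_model, patient, reference)
print("Shapley values (age, trunk, grip):", contributions)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the property that lets SHAP scores be read as each variable's share of an individual prediction.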
Miyazaki et al. (Thu,) studied this question.