Background: Parkinson’s disease (PD) progression is clinically heterogeneous, which complicates predictive modeling and personalized monitoring. We developed a protein-peptide sequential convolutional neural network (pSCNN) enhanced with kernel density estimation (KDE) to capture nonlinear relationships between cerebrospinal fluid (CSF) biomarkers, clinical features, and longitudinal Unified Parkinson’s Disease Rating Scale (UPDRS) scores.

Methods: Data from 248 patients in the Accelerating Medicines Partnership–PD (AMP-PD) cohort, including CSF biomarker measurements and visit metadata, were aligned by arithmetically averaging UniProt-level technical replicates within each visit. The averaged values were pivoted into wide-format matrices and merged on visit identifiers to create unified multimodal profiles. Aligned features underwent Box-Cox transformation, filtering, and KDE-based distributional modeling. The trained pSCNN was evaluated using the mean absolute error (MAE) and the symmetric mean absolute percentage error (SMAPE).

Findings: Relative to baseline models, the pSCNN reduced average SMAPE from 125.4% to 92.6% and MAE from 6.60 to 5.35 across UPDRS subscales, with the largest gains observed for motor symptoms (UPDRS-III: SMAPE 125.5% to 72.9%). SHapley Additive exPlanations (SHAP) analysis identified medication status (upd23b) and temporal progression (visit month) as the strongest predictors, while space-time analysis demonstrated close alignment between predicted and observed molecular trajectories.

Interpretation: KDE-enhanced multimodal biomarker integration improves regression-based modeling of PD progression by capturing medication-dependent distributional shifts that linear methods do not detect. The pSCNN offers an interpretable and reproducible framework for biomarker-driven disease tracking. However, external validation in independent cohorts with documented disease staging is required before clinical deployment.
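The two evaluation metrics named above can be made concrete with a minimal sketch. The abstract does not state which SMAPE formula was used; the version below assumes the common variant bounded at 200%, which is at least consistent with the reported values above 100%. The function names and the sample scores are illustrative and are not taken from the study.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between observed and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Symmetric MAPE in percent, assuming the 0-200% variant.

    Each term is |t - p| / ((|t| + |p|) / 2). Pairs where both values
    are zero are counted as zero error (an assumption of this sketch).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        if denom > 0:
            total += abs(t - p) / denom
    return 100.0 * total / len(y_true)


# Hypothetical UPDRS-like scores, for illustration only.
observed = [0, 10, 20, 5]
predicted = [2, 8, 25, 5]

print(f"MAE:   {mae(observed, predicted):.2f}")
print(f"SMAPE: {smape(observed, predicted):.2f}%")
```

Under this definition, any nonzero prediction for an observed score of zero contributes the maximum 200% term. This is one reason SMAPE can be large on rating scales that include many zero scores even when MAE is modest.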
Saba et al. (Thu,) studied this question.