March 3, 2026 · Open Access

SKALE: An Interpretable Multiscale Machine Learning Model for Decoding Phase‐Specific Protein Aggregation in Neurodegenerative Proteinopathies

Key Points

SKALE identifies latent aggregation hotspots that conventional tools miss, matching high-performing neural baselines while remaining computationally efficient.
In ALS-linked SOD1 G86R, the model isolates a risk region at residues 72–91, where preserved β-sheet geometry coincides with weakened hydrogen bonding.
The framework integrates sequence motifs, AlphaFold-derived structural descriptors, and experimental kinetics to reveal the structural factors driving protein misfolding.
Interpretability analysis shows that β-sheet propensity is a shared determinant of aggregation, while hydrogen-bond dynamics define mutation-specific routes to nucleation.

Abstract

Protein aggregation drives proteinopathies ranging from ALS to systemic amyloidosis, yet the multiscale determinants bridging sequence, structure, and kinetics remain elusive. We present SKALE, an interpretable machine learning framework that integrates sequence motifs, AlphaFold‐derived structural descriptors, and experimental kinetics to decode aggregation mechanisms. SKALE identifies latent hotspots that evade conventional tools and matches high‐performing neural baselines while preserving computational efficiency. In ALS‐linked SOD1 G86R, the model isolates a risk region at residues 72–91 where preserved β‐sheet geometry coincides with weakened hydrogen bonding to drive nucleation. Similarly, analysis of TDP‐43 S332N reveals that a locally unwound helix increases surface exposure, a prediction validated by showing that targeted deletion of model‐identified regions significantly reduces cellular aggregation. The framework generalizes to Tau P301L and PRNP variants where it uncovers distal aggregation‐prone regions to discriminate pathogenic drivers from neutral mutations. Interpretability analysis further disentangles global from mutation‐local mechanisms to reveal that β‐sheet propensity acts as a shared determinant while hydrogen bond dynamics define specific routes to nucleation. These findings establish SKALE as a scalable, disease‐agnostic engine that combines high‐fidelity prediction with biophysical resolution to decode the molecular logic of misfolding and guide therapeutic design.
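The abstract does not describe how SKALE localizes a risk region, so the following is only an illustrative sketch of one generic way a contiguous hotspot could be read off a per-residue score: a fixed-width sliding window that keeps the stretch with the highest mean. The function name `top_window`, the 20-residue width, the sequence length of 150, and the synthetic scores are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple


def top_window(scores: List[float], width: int) -> Tuple[int, int, float]:
    """Return (start, end, mean) of the highest-mean window.

    Residue positions are 1-based and inclusive. Hypothetical helper,
    not part of SKALE.
    """
    if width <= 0 or width > len(scores):
        raise ValueError("width must be between 1 and len(scores)")
    # Running sum so each shift of the window costs O(1).
    window_sum = sum(scores[:width])
    best_sum, best_start = window_sum, 0
    for i in range(1, len(scores) - width + 1):
        window_sum += scores[i + width - 1] - scores[i - 1]
        if window_sum > best_sum:
            best_sum, best_start = window_sum, i
    return best_start + 1, best_start + width, best_sum / width


# Synthetic scores (assumption): flat baseline with one elevated stretch
# placed at residues 72..91 purely to mirror the region named in the abstract.
toy_scores = [0.1] * 150
for pos in range(72, 92):
    toy_scores[pos - 1] = 0.9

start, end, mean_score = top_window(toy_scores, width=20)
print(start, end)  # prints: 72 91
```

A fixed window is the simplest choice here; a real pipeline would likely need a variable-length or thresholded segmentation, and the scores themselves would come from the model rather than being hand-set.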

Authors

Wei Xuan Wilson Loo

Jia Shen Sio

Keyin Yap

Journals

SHILAP Revista de lepidopterología

Aggregate

Institutions

Monash University Malaysia

Chinese Institute for Brain Research

Sunway University
