What question did this study set out to answer?

The aim is to improve the prediction of protein-peptide binding sites by integrating sequence-structure information using a novel framework.

May 8, 2026

MGAPep: LLM-Augmented Multimodal Graph Attention for Protein-Peptide Binding Site Prediction and Cross-Domain Transfer

Key Points

Aimed to improve protein-peptide binding site prediction by integrating sequence and structure information in a single framework.
Introduced MGAPep, which fuses large language model embeddings with protein sequence and structural descriptors through a residual graph attention backbone.
Applied self-supervised pre-training, transfer learning, and task-specific fine-tuning to learn transferable representations.
Benchmarked extensively against baseline methods.
Reported state-of-the-art accuracy in protein-peptide binding site prediction, with robust generalization to unseen proteins and peptides.
Outperformed most baselines on protein-nucleic acid binding site prediction without architecture changes.
Presented evidence that graph-enhanced LLMs improve biomolecular binding modeling.

Abstract

Protein-peptide interactions drive peptide therapeutics, precision design, and biomarker discovery, yet most predictors underuse complementary sequence-structure information. LLM-augmented multimodal approaches offer a promising solution to these limitations. We introduce MGAPep, which fuses pre-trained large language model embeddings with protein sequence and structural descriptors via a residual graph attention backbone and a multi-head dual-attention module to capture fine-grained interface patterns. Leveraging large-scale corpora of protein fragment-peptide interaction data, MGAPep employs self-supervised pre-training, transfer learning, and task-specific fine-tuning to obtain rich, transferable representations. Extensive benchmarking shows consistent state-of-the-art accuracy for protein-peptide binding site prediction, with robust generalization to unseen proteins and peptides. The framework also transfers effectively across modalities, yielding superior performance to most baselines on protein-nucleic acid binding site prediction without architecture changes, underscoring broad applicability. Together with evidence that graph-enhanced LLMs improve biomolecular binding modeling, these results establish MGAPep as a general paradigm for protein-biomolecule interaction prediction.
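
The abstract describes fusing per-residue language model embeddings with sequence and structural descriptors and passing them through a residual graph attention backbone. The following is a minimal sketch of that general idea, not the authors' implementation: the function names, the 8 angstrom contact cutoff, the single attention head, the tanh activation, and the random weights are all assumptions made for illustration, and the dual-attention module and pre-training stages are omitted.

```python
import numpy as np

def contact_graph(coords, cutoff=8.0):
    """Boolean adjacency: residues closer than `cutoff` (assumed value) are linked."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return dists <= cutoff  # diagonal distance is 0, so self-loops are included

def residual_gat_layer(h, adj, W, a_src, a_dst):
    """One single-head graph attention layer with a residual connection."""
    z = h @ W                                             # (N, d) projected features
    scores = (z @ a_src)[:, None] + (z @ a_dst)[None, :]  # (N, N) pairwise scores
    scores = np.where(scores > 0, scores, 0.2 * scores)   # LeakyReLU
    scores = np.where(adj, scores, -np.inf)               # mask non-neighbours
    scores = scores - scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return h + np.tanh(weights @ z)                       # residual update

def predict_binding_probs(llm_emb, descriptors, coords, params):
    """Fuse the two feature sets, run the graph layers, score each residue."""
    x = np.concatenate([llm_emb, descriptors], axis=1)    # simple concatenation fusion
    h = x @ params["W_in"]
    adj = contact_graph(coords)
    for W, a_src, a_dst in params["layers"]:
        h = residual_gat_layer(h, adj, W, a_src, a_dst)
    logits = h @ params["w_out"]
    return 1.0 / (1.0 + np.exp(-logits))                  # per-residue probability

# Toy inputs: 6 residues, 16-dim embeddings, 4 descriptors, hidden size 8.
rng = np.random.default_rng(0)
n_res, d_llm, d_desc, d_hid = 6, 16, 4, 8
llm_emb = rng.normal(size=(n_res, d_llm))       # stand-in for LLM embeddings
descriptors = rng.normal(size=(n_res, d_desc))  # stand-in for structural descriptors
coords = rng.normal(scale=5.0, size=(n_res, 3)) # stand-in for C-alpha coordinates
params = {
    "W_in": rng.normal(scale=0.1, size=(d_llm + d_desc, d_hid)),
    "layers": [
        (rng.normal(scale=0.1, size=(d_hid, d_hid)),
         rng.normal(scale=0.1, size=d_hid),
         rng.normal(scale=0.1, size=d_hid))
        for _ in range(2)
    ],
    "w_out": rng.normal(scale=0.1, size=d_hid),
}
probs = predict_binding_probs(llm_emb, descriptors, coords, params)
```

In this sketch each residue attends only to its spatial neighbours, and the residual term keeps the fused input features available to deeper layers. With untrained random weights the resulting probabilities carry no meaning; they only show the shape of the computation.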

Authors

Xiangzheng Fu

Xiaowen Li

Bosheng Song

Journals

IEEE Journal of Biomedical and Health Informatics

Institutions

Hunan University

Shenzhen University

University of North Carolina at Charlotte
