UPM Institutional Repository

Graph attention network integrating sequence semantics and structural constraint information for robust antiviral peptide prediction


Citation

Chongjun, Yuan and Mohammad Latif, Muhammad Alif and Abdul Rahman, Mohd Basyaruddin and Tejo, Bimo A. (2026) Graph attention network integrating sequence semantics and structural constraint information for robust antiviral peptide prediction. Artificial Intelligence Chemistry, 4 (1). art. no. 100114. pp. 1-14. ISSN 2949-7477

Abstract

Viral infections remain a major global health challenge, with current antiviral therapies often limited by drug resistance and high development costs. Antiviral peptides (AVPs) are promising alternatives to traditional antiviral drugs owing to their safety and broad-spectrum activity. Most existing AVP prediction classifiers rely solely on sequence-derived features while neglecting three-dimensional structural information, which limits their generalization ability under highly imbalanced virtual screening conditions. To overcome these limitations, we propose a graph-based deep learning framework that explicitly integrates residue-level three-dimensional structural information with sequence semantics for AVP identification. Residue-level graphs are constructed from ESMFold-predicted structures and enriched with embeddings from the ESMC protein language model, enabling the model to capture both spatially proximal and sequentially distant interactions that are inaccessible to sequence-only approaches. These graphs are processed using a graph attention network with multiscale pooling to learn structure-aware representations. Evaluated on a large, imbalanced, and independent test set, our model demonstrates substantially improved robustness to class imbalance and structural variability, outperforming state-of-the-art sequence-based predictors. Notably, the proposed framework reduces false positives by 54% relative to Stack-AVP, improves the Matthews correlation coefficient by 29%, and achieves an accuracy of 84.1% and specificity of 84.8%. Furthermore, Grad-CAM-based interpretability analysis provides residue-level mechanistic insights, highlighting structurally and functionally relevant amino acids driving antiviral activity. By unifying sequence semantics with explicit structural constraints, this work advances AVP prediction beyond sequence-only paradigms and provides a practical, interpretable tool for antiviral peptide discovery under realistic, imbalanced conditions.
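
The residue-level graph idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8 Å C-alpha distance cutoff, the function names `build_contact_graph` and `long_range_edges`, and the toy hairpin coordinates are all assumptions made for illustration, and node features such as language-model embeddings are omitted. The sketch only shows how a distance-based graph can link residues that are close in space yet far apart in sequence.

```python
import math

def build_contact_graph(ca_coords, cutoff=8.0):
    """Return undirected edges (i, j), i < j, for residue pairs whose
    C-alpha atoms lie within `cutoff` angstroms (cutoff is an assumption)."""
    edges = []
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                edges.append((i, j))
    return edges

def long_range_edges(edges, min_separation=4):
    """Keep only edges whose endpoints are at least `min_separation`
    positions apart in the sequence (hypothetical helper)."""
    return [(i, j) for i, j in edges if j - i >= min_separation]

# Hypothetical hairpin-like C-alpha trace, NOT taken from a predicted structure:
# residues 0-3 run along one strand, residues 4-7 fold back 5 angstroms away.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (11.4, 5.0, 0.0),
    (7.6, 5.0, 0.0),
    (3.8, 5.0, 0.0),
    (0.0, 5.0, 0.0),
]

edges = build_contact_graph(toy_coords)
print(long_range_edges(edges))
```

In this toy trace the first and last residues are seven positions apart in sequence but only 5 Å apart in space, so the graph connects them directly. A sequence-only model would have to infer that relationship without an explicit edge.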


Download File

123977.pdf - Published Version
Available under License Creative Commons Attribution.

Additional Metadata

Item Type: Article
Subject: Artificial Intelligence
Subject: Computer Science Applications
Subject: Chemistry (all)
Divisions: Faculty of Science; Centre for Foundation Studies in Science of Universiti Putra Malaysia
DOI Number: https://doi.org/10.1016/j.aichem.2026.100114
Publisher: Elsevier
Keywords: Antiviral peptides classifier; ESMFold; Grad-CAM; Graph neural network; Machine learning
Depositing User: Ms. Siti Radziah Mohamed@mahmod
Date Deposited: 02 Apr 2026 08:59
Last Modified: 03 Apr 2026 07:25
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1016/j.aichem.2026.100114
URI: http://psasir.upm.edu.my/id/eprint/123977
