iACP-SEI: An Anticancer Peptide Identification Method Incorporating Sequence Evolutionary Information
DOI:
https://doi.org/10.47852/bonviewJDSIS52024821
Keywords:
anticancer peptides, ESM, deep representation learning, feature selection, ensemble learning
Abstract
Anticancer peptides (ACPs) are a promising focus in clinical oncology because they can inhibit tumor cell proliferation with minimal side effects. However, large-scale, rapid, and effective identification of ACPs is hindered by the high cost and time demands of conventional wet-lab experiments. We therefore introduce iACP-SEI, a method that identifies ACPs using sequence evolutionary information. The iACP-SEI method uses the ESM2 protein language model, which is based on the Transformer architecture, to extract feature vectors that encapsulate evolutionary information from peptide sequences. These vectors undergo feature selection via the light gradient boosting machine and are then used in an ensemble learning approach. On the AntiCP2.0 main and alternate datasets, the iACP-SEI model achieved independent test accuracies of 77.78% and 94.82%, respectively. Furthermore, it outperformed current methods on an imbalanced dataset, achieving a cross-validation accuracy of 90.39%, which demonstrates improved robustness in handling imbalanced class samples. Although iACP-SEI demonstrated higher predictive performance and robustness than other methods, its limitations are also discussed.
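The data flow described in the abstract (embedding vectors, then feature selection, then an ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the embeddings and per-feature importance scores have already been computed elsewhere (for example by ESM2 and a light gradient boosting machine), that selection keeps the top-k features by importance, and that the ensemble combines base models by soft voting. The abstract states none of these details, and the function names `select_top_k`, `project`, and `soft_vote` and all numeric values are hypothetical.

```python
from statistics import fmean


def select_top_k(importances, k):
    """Return the (sorted) indices of the k features with the largest importance."""
    ranked = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


def project(vector, keep):
    """Keep only the selected feature columns of one embedding vector."""
    return [vector[i] for i in keep]


def soft_vote(probabilities, threshold=0.5):
    """Average the positive-class probabilities of several base models.

    Returns the mean probability and the resulting 0/1 label.
    """
    mean_p = fmean(probabilities)
    return mean_p, int(mean_p >= threshold)


# Hypothetical toy values: a 6-dimensional "embedding" and made-up importance scores.
embedding = [0.12, -0.40, 0.93, 0.05, -0.77, 0.31]
importances = [0.02, 0.30, 0.25, 0.01, 0.40, 0.02]

keep = select_top_k(importances, k=3)
reduced = project(embedding, keep)

# Hypothetical positive-class probabilities from three base classifiers.
mean_p, label = soft_vote([0.75, 0.75, 0.25])
```

In a real pipeline the reduced vectors would be used to train the base classifiers. Here the probabilities are supplied directly so that only the selection and voting steps are shown.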
Received: 18 November 2024 | Revised: 8 January 2025 | Accepted: 23 January 2025
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
Data are available from the corresponding author upon reasonable request.
Author Contribution Statement
Bowen Zheng: Validation, Formal analysis, Investigation, Resources, Writing – original draft, Writing – review & editing, Supervision, Project administration. Rujun Li: Investigation, Data curation, Visualization. Haotian Wang: Validation, Investigation, Visualization. Sheng Wang: Investigation. Shiyu Peng: Data curation. Mingxin Li: Formal analysis. Liangzhen Jiang: Writing – review & editing. Zhibin Lv: Conceptualization, Methodology, Software, Supervision, Project administration, Funding acquisition.
License
Copyright (c) 2025 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.
Funding data
- Chengdu Science and Technology Bureau, Grant number 2024-YF08-00022-GX
- National Natural Science Foundation of China, Grant numbers 62371318; 32302083