iACP-SEI: An Anticancer Peptide Identification Method Incorporating Sequence Evolutionary Information

Authors

  • Bowen Zheng, College of Biomedical Engineering, Sichuan University, China
  • Rujun Li, College of Biomedical Engineering, Sichuan University, China
  • Haotian Wang, College of Biomedical Engineering, Sichuan University, China
  • Sheng Wang, College of Biomedical Engineering, Sichuan University, China
  • Shiyu Peng, College of Biomedical Engineering, Sichuan University, China
  • Mingxin Li, College of Biomedical Engineering, Sichuan University, China
  • Liangzhen Jiang, College of Food and Biological Engineering, Chengdu University, and Country Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, China
  • Zhibin Lv, College of Biomedical Engineering, Sichuan University, China

DOI:

https://doi.org/10.47852/bonviewJDSIS52024821

Keywords:

anticancer peptides, ESM, deep representation learning, feature selection, ensemble learning

Abstract

Anticancer peptides (ACPs) are a promising focus in clinical oncology because they can inhibit tumor cell proliferation with minimal side effects. However, large-scale, rapid, and effective identification of ACPs is hindered by the high cost and time demands of conventional wet-lab experiments. We therefore introduce iACP-SEI, a method that identifies ACPs using sequence evolutionary information. iACP-SEI uses the ESM2 protein language model, which is based on the Transformer architecture, to extract feature vectors that encapsulate evolutionary information from peptide sequences. These vectors undergo feature selection via the light gradient boosting machine (LightGBM) and are then used in an ensemble learning approach. On the AntiCP2.0 main and alternate datasets, iACP-SEI achieved independent test accuracies of 77.78% and 94.82%, respectively. It also outperformed current methods on an unbalanced dataset, achieving a cross-validation accuracy of 90.39%, which indicates improved robustness to class imbalance. Although iACP-SEI demonstrated higher predictive performance and robustness than other methods, its limitations are also discussed.
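
The three stages named in the abstract (sequence embedding, feature selection, ensemble classification) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. Every name here is hypothetical, and each stage uses a plainly swapped-in stand-in: a seeded random lookup table with mean pooling replaces ESM2 embeddings, a class-mean-difference score replaces LightGBM feature importance, and a majority vote over bootstrapped nearest-centroid classifiers replaces the paper's ensemble, whose composition the abstract does not specify. The peptides and labels are made-up toy data, not entries from AntiCP2.0.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def make_residue_table(dim, seed=0):
    """Hypothetical per-residue vectors; a real pipeline would obtain these from ESM2."""
    rng = random.Random(seed)
    return {aa: [rng.gauss(0.0, 1.0) for _ in range(dim)] for aa in AMINO_ACIDS}

def embed(seq, table):
    """Mean-pool per-residue vectors into one fixed-length vector (pooling is an assumption)."""
    dim = len(next(iter(table.values())))
    rows = [table[aa] for aa in seq if aa in table]
    if not rows:
        return [0.0] * dim
    return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

def centroid(X):
    """Column-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(X) for col in zip(*X)]

def select_features(X, y, k):
    """Keep the k features with the largest |mean(pos) - mean(neg)|.
    Stand-in for ranking by LightGBM feature importance."""
    pos = centroid([x for x, label in zip(X, y) if label == 1])
    neg = centroid([x for x, label in zip(X, y) if label == 0])
    scores = [abs(p - n) for p, n in zip(pos, neg)]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def project(X, idx):
    """Restrict every vector to the selected feature indices."""
    return [[x[j] for j in idx] for x in X]

def fit_centroids(X, y):
    """Nearest-centroid base learner: one centroid per class (index 0 = negative, 1 = positive)."""
    return (centroid([x for x, label in zip(X, y) if label == 0]),
            centroid([x for x, label in zip(X, y) if label == 1]))

def predict_one(model, x):
    """Assign the class whose centroid is closer in squared Euclidean distance."""
    d = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in model]
    return 1 if d[1] < d[0] else 0

def fit_ensemble(X, y, n_models=5, seed=0):
    """Train base learners on bootstrap resamples that contain both classes."""
    rng = random.Random(seed)
    models, n = [], len(X)
    while len(models) < n_models:
        ids = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in ids]
        if 0 in ys and 1 in ys:
            models.append(fit_centroids([X[i] for i in ids], ys))
    return models

def predict(models, X):
    """Hard majority vote across the base learners."""
    return [1 if 2 * sum(predict_one(m, x) for m in models) > len(models) else 0
            for x in X]

# Toy peptides with made-up labels (1 = ACP, 0 = non-ACP), for illustration only.
peptides = ["KKLLKKLLKK", "GLFDIVKKVV", "KWKLFKKIGA",
            "AAGGSSTTPP", "DDEEDDEEGG", "SSTTNNQQGG"]
labels = [1, 1, 1, 0, 0, 0]

table = make_residue_table(dim=16)
X = [embed(p, table) for p in peptides]
keep = select_features(X, labels, k=4)
models = fit_ensemble(project(X, keep), labels)
preds = predict(models, project(X, keep))
print(preds)
```

The sketch predicts on its own training peptides only to show the data flow. An evaluation comparable to the one reported in the abstract would need held-out independent test sets and cross-validation.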

 

Received: 18 November 2024 | Revised: 8 January 2025 | Accepted: 23 January 2025

 

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

 

Data Availability Statement

Data are available from the corresponding author upon reasonable request.

 

Author Contribution Statement

Bowen Zheng: Validation, Formal analysis, Investigation, Resources, Writing – original draft, Writing – review & editing, Supervision, Project administration. Rujun Li: Investigation, Data curation, Visualization. Haotian Wang: Validation, Investigation, Visualization. Sheng Wang: Investigation. Shiyu Peng: Data curation. Mingxin Li: Formal analysis. Liangzhen Jiang: Writing – review & editing. Zhibin Lv: Conceptualization, Methodology, Software, Supervision, Project administration, Funding acquisition.

 


Published

2025-03-04

Section

Research Articles

How to Cite

Zheng, B., Li, R., Wang, H., Wang, S., Peng, S., Li, M., Jiang, L., & Lv, Z. (2025). iACP-SEI: An Anticancer Peptide Identification Method Incorporating Sequence Evolutionary Information. Journal of Data Science and Intelligent Systems. https://doi.org/10.47852/bonviewJDSIS52024821