Identification of Key Gene Modules and Novel Transcription Factors in Tetralogy of Fallot Using Machine Learning and Network Topological Features

Authors

  • Sona Charles, Department of Bioinformatics, Bharathiar University and Crop Improvement and Biotechnology Division, ICAR-Indian Institute of Spices Research, India https://orcid.org/0000-0002-6781-7440
  • Jeyakumar Natarajan, Department of Bioinformatics, Bharathiar University, India

DOI:

https://doi.org/10.47852/bonviewMEDIN32021554

Keywords:

Tetralogy of Fallot, machine learning, network features, transcription factor, gene modules, WGCNA

Abstract

Tetralogy of Fallot (TOF) is a combinatorial congenital abnormality comprising ventricular septal defect (VSD), pulmonary valve stenosis, a misplaced aorta, and a thickened right ventricular wall. Identifying biologically relevant modules from transcriptome data may be treated as a binary classification problem. We used publicly accessible mRNA expression data to extract differentially expressed genes (DEGs) and then applied weighted gene co-expression network analysis (WGCNA) to identify ten modules in TOF. Network topological properties of modular and non-modular genes served as features for binary classification. We applied support vector machine (SVM), Random Forest, Decision Tree, k-nearest neighbors (KNN), and Naïve Bayes algorithms to these network features. The Random Forest and Decision Tree algorithms achieved accuracies of 99.1% and 98%, respectively. Together, all the methods predicted 71 common genes, which were used to construct a gene regulatory network. The network was then expanded to include 30 miRNAs targeting these genes. Interestingly, 39 of the 71 genes were transcription factors, of which ELN, SOX6, and FOXO3 are novel candidates in TOF. The work also provides a sub-module of genes and miRNAs, supported by statistical models, as prospective biomarker candidates.
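The feature-and-consensus idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say which topological properties were used, so degree and local clustering coefficient are assumptions chosen for illustration. The gene names, the toy network, and the per-classifier predictions are hypothetical placeholders, and the function names topological_features and consensus_genes are invented here.

```python
from itertools import combinations


def topological_features(adjacency):
    """Return {node: (degree, clustering_coefficient)} for an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    """
    features = {}
    for node, neighbours in adjacency.items():
        degree = len(neighbours)
        # Count edges that connect two neighbours of this node.
        links = sum(
            1 for a, b in combinations(sorted(neighbours), 2) if b in adjacency[a]
        )
        possible = degree * (degree - 1) / 2
        clustering = links / possible if possible else 0.0
        features[node] = (degree, clustering)
    return features


def consensus_genes(predictions):
    """Return the genes that every classifier labelled as modular."""
    gene_sets = [set(genes) for genes in predictions.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy undirected co-expression network (placeholder names, not study data).
toy_network = {
    "G1": {"G2", "G3", "G4"},
    "G2": {"G1", "G3"},
    "G3": {"G1", "G2"},
    "G4": {"G1"},
}
features = topological_features(toy_network)

# Hypothetical sets of genes predicted as modular by three classifiers.
toy_predictions = {
    "svm": {"G1", "G2", "G3"},
    "random_forest": {"G1", "G2"},
    "knn": {"G1", "G2", "G4"},
}
common = consensus_genes(toy_predictions)
```

In a real analysis, the feature tuples would be passed to trained classifiers such as those named in the abstract, and the intersection step would then be applied to their actual predictions.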

 

Received: 18 August 2023 | Revised: 27 September 2023 | Accepted: 8 October 2023

 

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

 

Data Availability Statement

The data that support this work are available from the corresponding author upon reasonable request.


Published

2023-10-10

How to Cite

Charles, S., & Natarajan, J. (2023). Identification of Key Gene Modules and Novel Transcription Factors in Tetralogy of Fallot Using Machine Learning and Network Topological Features. Medinformatics, 1(1), 27–34. https://doi.org/10.47852/bonviewMEDIN32021554

Section

Research Articles