Identification of Key Gene Modules and Novel Transcription Factors in Tetralogy of Fallot Using Machine Learning and Network Topological Features
Keywords: Tetralogy of Fallot, machine learning, network features, transcription factor, gene modules, WGCNA
Tetralogy of Fallot (TOF) is a combined congenital abnormality comprising a ventricular septal defect (VSD), pulmonary valve stenosis, a misplaced aorta and a thickened right ventricular wall. Identifying biologically relevant modules from transcriptome data can be framed as a binary classification problem. We used publicly accessible mRNA expression data to extract differentially expressed genes (DEGs) and then applied weighted gene co-expression network analysis (WGCNA) to identify ten modules in TOF. Network topological properties of modular and non-modular genes were used as features for binary classification. We applied support vector machine (SVM), Random Forest, Decision Tree, k-nearest neighbours (KNN) and Naïve Bayes algorithms to these network features. The Random Forest and Decision Tree algorithms achieved accuracies of 99.1% and 98%, respectively. In combination, the methods predicted 71 common genes, which were used to construct a gene regulatory network. The network was expanded to include 30 miRNAs targeting these genes. Interestingly, 39 of the 71 genes were transcription factors, of which ELN, SOX6 and FOXO3 are novel candidates in TOF. The work also provides a sub-module of genes and miRNAs, supported by statistical models, as prospective biomarker candidates.
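The idea of classifying genes as modular or non-modular from network topology can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the abstract does not say which topological properties were used, so node degree and local clustering coefficient stand in for them; the gene names `G1` to `G8`, the edge list and the labels are hypothetical toy data; and a small hand-written KNN classifier stands in for the five algorithms named above. It is not the authors' pipeline.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical toy co-expression network (placeholder gene names, not study data).
EDGES = [
    ("G1", "G2"), ("G1", "G3"), ("G2", "G3"),
    ("G1", "G4"), ("G2", "G4"), ("G3", "G4"),
    ("G4", "G5"), ("G5", "G6"), ("G6", "G7"), ("G7", "G8"),
]

# Hypothetical labels: 1 = gene belongs to a module, 0 = non-modular.
LABELS = {"G1": 1, "G2": 1, "G3": 1, "G4": 1,
          "G5": 0, "G6": 0, "G7": 0, "G8": 0}


def build_adjacency(edges):
    """Return an undirected adjacency map: node -> set of neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def clustering(adj, node):
    """Local clustering coefficient: fraction of neighbour pairs that are linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def features(adj, node):
    """Assumed feature vector: (degree, local clustering coefficient)."""
    return (float(len(adj[node])), clustering(adj, node))


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


adj = build_adjacency(EDGES)

# Hold out G3 and G7 as test genes; train on the remaining six.
train_genes = ["G1", "G2", "G4", "G5", "G6", "G8"]
train = [(features(adj, g), LABELS[g]) for g in train_genes]

for gene in ("G3", "G7"):
    print(gene, features(adj, gene), "predicted:", knn_predict(train, features(adj, gene)))
```

In a real analysis the feature vector would carry more properties (for example betweenness or closeness centrality), accuracy would be estimated with cross-validation, and the genes predicted as modular by every classifier could be intersected to obtain a consensus set like the one the abstract describes.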
Copyright (c) 2023 Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.