Gene Signatures for Autism Classification: Mining Biological Markers for Autism from Gene Expression Data
DOI:
https://doi.org/10.47852/bonviewMEDIN42024698
Keywords:
machine learning, Naive Bayes classification, feature selection, autism detection, gene markers
Abstract
Autism spectrum disorders are among the most intriguing neurodevelopmental conditions, and uncovering their possible causes has been a topic of intense research in recent years. Many studies trace the origin of autism to gene mutations. However, analyzing the large number of genes in the human genome reportedly demands extensive resources in terms of expertise, capital, and time. This has paved the way for computational investigations of gene expression data, which are themselves challenging owing to the very large number of attributes and the relatively small number of instances available to train machine learning models. This research work therefore explores the use of automated machine learning, deep learning, and traditional machine learning models to detect the gene signatures that contribute most to characterizing the presence of autism. The results suggest that a Bayesian classifier combined with correlation feature filtering yielded higher accuracy than the other models explored, a result reported for the first time on this gene expression data. The proposed Bayesian machine learning model achieved an accuracy of ∼87% with a minimal yet optimal gene signature that ranks a subset of 22 genes as significant gene markers out of a total of 9454 genes.
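The combination described in the abstract, a correlation-based feature filter feeding a Naive Bayes classifier, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the author's pipeline: the data are synthetic (not GDS4431), the filter is a plain univariate Pearson ranking against the class label (the article's exact correlation filter is not specified here), the classifier is a small hand-written Gaussian Naive Bayes, and the names `make_synthetic`, `top_k_by_correlation`, `SimpleGaussianNB`, and `cross_validated_accuracy` are hypothetical. The filter is refit inside each cross-validation fold so that test samples do not influence which genes are kept.

```python
import numpy as np


def make_synthetic(n_samples=100, n_genes=500, n_informative=22, shift=1.5, seed=0):
    """Synthetic stand-in for an expression matrix: few samples, many genes."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)          # balanced binary labels
    X = rng.normal(size=(len(y), n_genes))
    X[:, :n_informative] += shift * y[:, None]     # only the first genes carry signal
    return X, y


def top_k_by_correlation(X, y, k):
    """Indices of the k features with the largest |Pearson r| against the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = (Xc * yc[:, None]).sum(axis=0) / np.where(denom == 0, 1.0, denom)
    return np.argsort(-np.abs(r))[:k]


class SimpleGaussianNB:
    """Gaussian Naive Bayes: per-class, per-feature normal likelihoods."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant keeps variances strictly positive.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        # log p(c) + sum_j log N(x_j | mean_cj, var_cj), evaluated for every class c
        diff2 = (X[:, None, :] - self.mean_[None]) ** 2
        log_lik = -0.5 * (np.log(2 * np.pi * self.var_)[None] + diff2 / self.var_[None]).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_prior_[None], axis=1)]


def cross_validated_accuracy(X, y, k_features=22, n_folds=5, seed=0):
    """k-fold accuracy with the feature filter refit on each training split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        genes = top_k_by_correlation(X[train_idx], y[train_idx], k_features)
        model = SimpleGaussianNB().fit(X[train_idx][:, genes], y[train_idx])
        correct += int((model.predict(X[test_idx][:, genes]) == y[test_idx]).sum())
    return correct / len(y)


X, y = make_synthetic()
selected = top_k_by_correlation(X, y, 22)
accuracy = cross_validated_accuracy(X, y)
print(f"selected {len(selected)} genes, cross-validated accuracy = {accuracy:.2f}")
```

Because the synthetic signal is strong by construction, the accuracy printed here says nothing about the ∼87% reported for the real data; the sketch only shows how a small gene subset can be selected and evaluated when attributes far outnumber instances.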
Received: 31 October 2024 | Revised: 5 June 2025 | Accepted: 18 July 2025
Conflicts of Interest
Shomona Gracia Jacob is an Associate Editor for Medinformatics and was not involved in the editorial review or the decision to publish this article. The author declares that she has no conflicts of interest to this work.
Data Availability Statement
The data that support the findings of this study are openly available in the NCBI database at www.ncbi.nlm.nih.gov.2011 and http://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS4431.
Author Contribution Statement
Shomona Gracia Jacob: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration.
License
Copyright (c) 2025 Author

This work is licensed under a Creative Commons Attribution 4.0 International License.