Machine Learning in Genomics: Applications in Whole Genome Sequencing, Whole Exome Sequencing, Single-Cell Genomics, and Spatial Transcriptomics
DOI:
https://doi.org/10.47852/bonviewMEDIN42024120
Keywords:
machine learning, genomics, whole genome sequencing (WGS), whole exome sequencing (WES), spatial transcriptomics, metagenomics, epigenomics
Abstract
The application of machine learning (ML) to genomics has transformed the analysis and interpretation of large-scale, complex datasets, leading to important breakthroughs in our knowledge of biological systems. This review provides a comprehensive overview of ML applications in key genomic areas: whole genome sequencing (WGS), whole exome sequencing (WES), single-cell genomics, and spatial transcriptomics. In WGS and WES, ML techniques are employed for variant calling, genome-wide association studies, rare variant analysis, and pathogenicity prediction. In single-cell genomics, ML facilitates clustering, trajectory inference, and cell type identification, while in spatial transcriptomics it aids in deciphering spatial patterns of gene expression and tissue heterogeneity. The review further explores the application of ML in related omics fields, including proteomics, transcriptomics, metagenomics, epigenomics, and microbiome research. These applications encompass protein structure prediction, functional annotation, microbial community profiling, and the analysis of epigenetic modifications. We discuss the challenges posed by high dimensionality, variability in the data, and the need for interpretable ML models when dealing with genomic data. Emerging technologies such as explainable AI and federated learning are highlighted for their potential to address these challenges. The review also covers ethical considerations, data privacy issues, and the necessity for standardized protocols in ML applications. This comprehensive examination underscores the transformative impact of ML in genomics and highlights its potential to drive future innovations in personalized medicine and biological research.
Received: 17 August 2024 | Revised: 17 October 2024 | Accepted: 31 October 2024
Conflicts of Interest
The authors declare that they have no conflicts of interest related to this work.
Data Availability Statement
The data that support this work are available upon reasonable request to the corresponding author.
Author Contribution Statement
Saheed Adegbola Adeyanju: Conceptualization, Methodology, Validation, Data curation, Writing - original draft, Writing - review & editing, Supervision. Taiwo Temitope Ogunjobi: Methodology, Investigation, Writing - original draft, Writing - review & editing.
License
Copyright (c) 2024 Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.