Toward a Self-Supervised Architecture for Semen Quality Prediction Using Environmental and Lifestyle Factors
Keywords: unsupervised learning, machine learning, decision support, public health, dimensionality reduction, bioinformatics
Male fertility has been reported to be declining, prompting the need for more effective and accessible means of assessment. Artificial intelligence methods have proven effective at predicting semen quality from questionnaire-based data covering factors that the medical literature has linked to semen quality. Prior work has applied supervised learning to this prediction task, but because supervised learning requires class labels, it depends on an external source of intelligence to supply them, which can translate into additional cost and resources in practical settings. In contrast, unsupervised learning methods partition data into clusters according to an objective function without relying on class labels, which can allow a prediction platform to run as a fully automated workflow. In this paper, we apply three unsupervised learning models with different architectures, namely the Gaussian mixture model (GMM), K-means, and spectral clustering (SC), alongside three low-dimensional embedding methods: sparse autoencoder (SAE), principal component analysis (PCA), and robust PCA. The best results were obtained by combining the SAE with the SC algorithm, likely because SC makes no specific assumption about cluster shape and can therefore accommodate arbitrarily shaped clusters. Further work will explore unsupervised learning algorithms with a framework similar to SC, to investigate the extent to which various clusters can be learned with maximal accuracy.
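The embed-then-cluster workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy are available, uses PCA as a stand-in for the sparse autoencoder, and runs on synthetic data rather than questionnaire responses. The function names, the feature count, and all hyperparameters are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def make_synthetic_questionnaire(n_per_group=50, n_features=9, seed=0):
    """Return synthetic data: two Gaussian groups standing in for responses.

    Assumption: the feature count and group separation are arbitrary and
    are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    group_a = rng.normal(loc=0.0, scale=1.0, size=(n_per_group, n_features))
    group_b = rng.normal(loc=3.0, scale=1.0, size=(n_per_group, n_features))
    return np.vstack([group_a, group_b])


def cluster_after_embedding(X, n_clusters=2, n_components=2, seed=0):
    """Embed X in a low-dimensional space, then cluster without labels.

    PCA is used here in place of the sparse autoencoder (an assumption
    made to keep the sketch short). Returns a dict mapping each model
    name to its array of cluster assignments.
    """
    X_scaled = StandardScaler().fit_transform(X)
    Z = PCA(n_components=n_components, random_state=seed).fit_transform(X_scaled)

    models = {
        "kmeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "gmm": GaussianMixture(n_components=n_clusters, random_state=seed),
        "spectral": SpectralClustering(
            n_clusters=n_clusters, affinity="rbf", random_state=seed
        ),
    }
    # No class labels are passed at any point: each model only sees Z.
    return {name: model.fit_predict(Z) for name, model in models.items()}


if __name__ == "__main__":
    X = make_synthetic_questionnaire()
    for name, labels in cluster_after_embedding(X).items():
        # Print how many samples each model assigns to each cluster.
        print(name, np.bincount(labels))
```

In this sketch the three clustering models receive the same embedding, so any difference in their assignments comes from the clustering assumptions alone. K-means and GMM favor compact, roughly convex clusters, whereas spectral clustering works from a pairwise affinity graph and is less tied to a particular cluster shape.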
Copyright (c) 2022 Authors
This work is licensed under a Creative Commons Attribution 4.0 International License.