Towards Predicting the Quality of Red Wine Using Novel Machine Learning Methods for Classification, Data Visualization, and Analysis

Jovial Niyogisubizo; Jean de Dieu Ninteretse; Eric Nziyumva; Marc Nshimiyimana; Evariste Murwanashyaka; Erneste Habiyakare

doi:10.47852/bonviewAIA42021999

Authors

Jovial Niyogisubizo, Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, China (ORCID: https://orcid.org/0000-0001-6595-0101)
Jean de Dieu Ninteretse, Department of Construction and Real Estate, Southeast University, China
Eric Nziyumva, Fujian Provincial Universities Key Laboratory of Industrial Control and Data Analysis, Fujian University of Technology, China
Marc Nshimiyimana, School of Civil Engineering, Southeast University, China
Evariste Murwanashyaka, Institute of Rock and Soil Mechanics, University of Chinese Academy of Sciences, China
Erneste Habiyakare, School of Geosciences and Info-Physics, Central South University, China

Keywords:

machine learning, random forest, decision trees, gradient boosting, red wine quality, stacking ensemble

Abstract

Consumers and the wine industry are increasingly concerned about wine quality. Traditionally, wine experts determined quality through tasting, which is time-consuming. There is therefore a need to predict wine quality from specific key features in order to streamline this task. Machine learning approaches have replaced human assessment with computational methods, but some of these methods have been criticized for their low accuracy and lack of interpretability. This paper introduces a stacking ensemble method that demonstrates superior predictive performance compared with other classification techniques such as logistic regression, decision trees, gradient boosting, adaptive boosting (AdaBoost), and random forest. All models are evaluated under the same conditions using the classification metrics accuracy, precision, recall, and F1-score. Additionally, outlier detection algorithms were employed to identify exceptional or subpar wines, though their results did not match the accuracy of the classification approaches. Lastly, a feature analysis study was conducted to assess the significance of each feature for the model’s performance.
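The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses a synthetic binary dataset with 11 numeric features as a stand-in for the red wine data, and picks the base learners, meta-learner, split, and weighted averaging of the metrics purely for illustration. The names `base_models`, `stack`, `evaluate`, and `results` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in data (11 numeric features, binary label),
# not the UCI red wine dataset used in the paper.
X, y = make_classification(n_samples=600, n_features=11, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Baseline classifiers named in the abstract (hyperparameters are assumptions).
base_models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stacking: the tree-based models act as base learners, and a logistic
# regression meta-learner combines their cross-validated predictions.
# This particular combination is an assumption, not taken from the paper.
stack = StackingClassifier(
    estimators=[(name, model) for name, model in base_models.items()
                if name != "logistic_regression"],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

def evaluate(model):
    """Fit on the training split and score on the held-out test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="weighted", zero_division=0)
    return {"accuracy": accuracy_score(y_test, pred),
            "precision": precision, "recall": recall, "f1": f1}

# Every model sees the same split, so the scores are directly comparable.
results = {name: evaluate(model)
           for name, model in {**base_models, "stacking": stack}.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Because the data here are synthetic, the printed scores say nothing about the paper's results; the sketch only shows how a stacking ensemble and its baselines can be scored with the same four metrics under identical conditions.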

Received: 3 November 2023 | Revised: 21 February 2024 | Accepted: 28 April 2024

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

The data that support the findings of this study are openly available in the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/wine+quality.

The technical details, code, and data can be accessed by visiting the link: https://github.com/jovialniyo93/red_wine_quality_prediction.

Author Contribution Statement

Jovial Niyogisubizo: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Writing - original draft, Writing - review & editing, Project administration.
Jean de Dieu Ninteretse: Validation, Formal analysis, Writing - original draft, Writing - review & editing.
Eric Nziyumva: Validation, Formal analysis, Resources, Writing - original draft, Writing - review & editing, Supervision.
Marc Nshimiyimana: Validation, Formal analysis, Investigation, Writing - original draft, Writing - review & editing, Visualization.
Evariste Murwanashyaka: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Data curation, Writing - original draft, Writing - review & editing, Visualization, Supervision.
Erneste Habiyakare: Validation, Formal analysis, Investigation, Resources, Writing - original draft, Writing - review & editing, Visualization, Supervision.