Identifying Risk Factors for Heart Failure: A Case Study Employing Data Mining Algorithms

Authors

  • Vitória S. Souza, Laboratory of Intelligent Computing and Robotics, Federal Institute of Triangulo Mineiro, Campus Patrocínio, Brazil. https://orcid.org/0000-0001-9057-9738
  • Danielli A. Lima, Laboratory of Intelligent Computing and Robotics, Federal Institute of Triangulo Mineiro, Campus Patrocínio, Brazil. https://orcid.org/0000-0003-0324-6690

DOI:

https://doi.org/10.47852/bonviewJDSIS32021386

Keywords:

data mining, machine learning, cardiology, heart failure, receiver operating characteristic curve, artificial intelligence, random forest learner

Abstract

Heart diseases, which affect the heart and blood vessels, are increasingly common and can be fatal. In this article, we analyze an open, public heart failure database comprising a sample of 299 people and 12 attributes. We present a preprocessing technique based on area under the curve (AUC) filters, which improves the efficiency of the algorithms by reducing the number of input attributes, leading to lower memory usage and computational cost. To obtain more robust and reliable results, we used a methodology involving 102 simultaneous validations. In addition, we used the receiver operating characteristic (ROC) curve to evaluate the performance of each attribute. We trained a set of nine classification algorithms, among which the random forest learner stood out with an accuracy of 87.21% when using a filter that kept attributes with AUC greater than 0.4. The fuzzy rules learner also proved effective, achieving an accuracy of 84.45% with a filter limit of 0.6, which retained the ejection fraction, serum sodium, and time attributes, together with the death event class. This analysis demonstrates that these algorithms can make accurate predictions using a reduced number of attributes.
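The AUC filter described in the abstract can be illustrated with a minimal sketch. It assumes that each attribute is used directly as a score for the positive class and that an attribute is kept when its AUC is strictly greater than the filter limit; the abstract does not specify these details, so the function names (`attribute_auc`, `filter_by_auc`), the attribute names, and the toy values below are hypothetical and are not taken from the article or its dataset.

```python
from typing import Dict, List, Sequence


def attribute_auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """AUC of a single attribute used directly as a score for class 1.

    Equals the fraction of (positive, negative) pairs in which the positive
    sample has the larger value; ties count as half a pair.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def filter_by_auc(columns: Dict[str, Sequence[float]],
                  labels: Sequence[int],
                  limit: float) -> List[str]:
    """Return the names of attributes whose AUC is strictly above `limit`."""
    return [name for name, values in columns.items()
            if attribute_auc(values, labels) > limit]


# Synthetic toy data for illustration only (not from the heart failure dataset).
toy_labels = [1, 1, 0, 0]
toy_columns = {
    "attr_a": [4.0, 3.0, 2.0, 1.0],  # higher values for the positive class
    "attr_b": [1.0, 2.0, 3.0, 4.0],  # lower values for the positive class
    "attr_c": [1.0, 1.0, 1.0, 1.0],  # constant, carries no information
}

kept_04 = filter_by_auc(toy_columns, toy_labels, 0.4)
kept_06 = filter_by_auc(toy_columns, toy_labels, 0.6)
```

Under this assumption, a stricter limit keeps fewer attributes, and an attribute that is inversely related to the outcome scores below 0.5 and is dropped. Whether the article scores such attributes by their raw or direction-adjusted AUC is not stated in the abstract.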

 

Received: 20 July 2023 | Revised: 28 August 2023 | Accepted: 14 September 2023

 

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

 

Data Availability Statement

The data that support the findings of this study are openly available in the UC Irvine Machine Learning Repository at https://doi.org/10.24432/C5Z89R.


Published

2023-09-20

Section

Research Articles

How to Cite

Souza, V. S., & Lima, D. A. (2023). Identifying Risk Factors for Heart Failure: A Case Study Employing Data Mining Algorithms. Journal of Data Science and Intelligent Systems, 2(3), 161-173. https://doi.org/10.47852/bonviewJDSIS32021386