Heart disease classification using data mining tools and machine learning techniques
ORIGINAL PAPER
Ilias Tougui 1 & Abdelilah Jilbab 1 & Jamal El Mhamdi 1
Received: 14 February 2020 / Accepted: 11 May 2020
© IUPESM and Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
In the healthcare industry, data analysis can save lives by improving medical diagnosis. Advances in software engineering have made a range of data mining tools available to researchers for conducting studies and experiments. In this work, we compare six common data mining tools (Orange, Weka, RapidMiner, Knime, Matlab, and Scikit-Learn) using six machine learning techniques (Logistic Regression, Support Vector Machine, K Nearest Neighbors, Artificial Neural Network, Naïve Bayes, and Random Forest) on the task of classifying heart disease. The dataset used in this study has 13 features, one target variable, and 303 instances, of which 139 are subjects with cardiovascular disease and 164 are healthy subjects. Three performance measures were used to compare the techniques in each tool: accuracy, sensitivity, and specificity. The results showed that Matlab was the best performing tool and that Matlab's Artificial Neural Network model was the best performing technique. We conclude by plotting the receiver operating characteristic (ROC) curve for Matlab and by giving several recommendations on which tool to choose, taking into account the user's experience in the field of data mining.
Keywords Data mining tools · Machine learning techniques · Heart disease classification · Performance measures
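The three performance measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names (`ConfusionCounts`, `confusion_counts`, `accuracy`, `sensitivity`, `specificity`) and the label encoding (1 = diseased, 0 = healthy) are assumptions made for illustration. Only the class counts, 139 diseased and 164 healthy subjects, come from the paper. The hypothetical baseline below labels every subject as healthy, which shows why accuracy alone can be misleading on this dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # diseased subjects predicted as diseased
    fn: int  # diseased subjects predicted as healthy
    tn: int  # healthy subjects predicted as healthy
    fp: int  # healthy subjects predicted as diseased


def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fn=fn, tn=tn, fp=fp)


def accuracy(c):
    """Fraction of all subjects classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c):
    """Fraction of diseased subjects detected (true positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of healthy subjects correctly cleared (true negative rate)."""
    return c.tn / (c.tn + c.fp)


# Class counts from the paper; the label encoding is an assumption.
y_true = [1] * 139 + [0] * 164
# Hypothetical baseline, not a model from the paper: predict "healthy" for all.
y_pred = [0] * len(y_true)

counts = confusion_counts(y_true, y_pred)
print(f"accuracy={accuracy(counts):.3f}")
print(f"sensitivity={sensitivity(counts):.3f}")
print(f"specificity={specificity(counts):.3f}")
```

Under these assumptions the baseline is right on every healthy subject and wrong on every diseased one, so its accuracy equals the healthy share of the data (164/303) while its sensitivity is zero. Reporting sensitivity and specificity alongside accuracy exposes this kind of failure. The sketch does not guard against empty classes, which would cause a division by zero.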
1 Introduction
Heart disease refers to a number of conditions that affect the heart, including but not limited to blood vessel disease, heart attack, stroke, heart failure, and arrhythmia. The terms "heart disease" and "cardiovascular disease" are often confused; the latter refers to conditions that may cause a heart attack, chest pain, or stroke. In 2016, a study by the World Health Organization [1] estimated that over 17 million individuals died from cardiovascular disease, representing more than 30% of deaths globally. The same study indicated that over 70% of the deaths take place in low- and middle-income countries. As an example, in 2018, a study by the Ministry of Health of Morocco [2] indicated that high blood pressure is the main factor behind strokes, cardiovascular morbidity, and mortality in Morocco. Another study [3], conducted by the World Health Organization, estimated that 38% of
* Ilias Tougui [email protected]
1 ENSET, E2SN, Mohammed V University, 10500 Rabat, Morocco
yearly deaths in Morocco are caused by cardiovascular disease and 14% by cancer, a number that is expected to grow dramatically by 2030 [4] if no precautions are taken. The good news is that cardiovascular disease can be prevented by avoiding dangerous factors such as unhealthy diets and lifestyles, lack of physical activity that causes overweight a