Comparative study on classification performance between support vector machine and logistic regression


ORIGINAL ARTICLE

Comparative study on classification performance between support vector machine and logistic regression
Abdallah Bashir Musa

Received: 4 September 2011 / Accepted: 2 January 2012 / Published online: 24 January 2012 / © Springer-Verlag 2012

Abstract Support vector machine (SVM) is a comparatively new machine learning algorithm for classification, while logistic regression (LR) is an old standard statistical classification method. Although many comprehensive studies have compared SVM and LR, many improvements, such as bagging and ensemble methods, have been applied to both since those studies were carried out. Recently, bagging and ensemble learning have become hot topics, widely used to improve the generalization performance of a single learning algorithm. Therefore, comparing the classification performance of SVM and LR using bagging and ensemble is an interesting issue. The average-of-estimated-probabilities strategy was used for combining classifiers in this paper. Different evaluation metrics assess different characteristics of a machine learning algorithm. It is possible for a learning method to perform well on one metric but be suboptimal on others. Therefore, this study includes a variety of criteria to evaluate the classification performance of the learning methods: accuracy, sensitivity, specificity, precision, F-score and the area under the receiver operating characteristic curve. The last of these has not been included in previous studies of SVM, because SVM did not support estimated probabilities at that time. Other metrics used in medical diagnosis, such as Youden's index (γ), positive and negative likelihoods (ρ+, ρ−) and diagnostic odds ratio, were evaluated to convey and compare the qualities of the two algorithms. This study is distinguished by its inclusion of a comprehensive statistical analysis of the results of the SVM and LR algorithms on various data sets.
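As an illustration of the combination strategy and the threshold-based metrics named in the abstract, the following Python sketch averages the positive-class probabilities of several classifiers and derives the metrics from the resulting confusion matrix. The function names, the 0.5 decision threshold, and the toy probabilities are assumptions made for illustration and are not taken from the paper; the formulas are the standard textbook definitions. The area under the ROC curve is omitted because it depends on the ranking across all thresholds rather than on a single confusion matrix.

```python
def average_probabilities(prob_lists):
    """Average the positive-class probabilities of several classifiers, sample by sample."""
    n_models = len(prob_lists)
    return [sum(column) / n_models for column in zip(*prob_lists)]


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) after thresholding the probabilities (threshold is an assumption)."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard definitions; assumes every denominator below is non-zero."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    lr_positive = sensitivity / (1 - specificity)   # rho+
    lr_negative = (1 - sensitivity) / specificity   # rho-
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": 2 * precision * sensitivity / (precision + sensitivity),
        "youden_index": sensitivity + specificity - 1,  # gamma
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "diagnostic_odds_ratio": lr_positive / lr_negative,
    }


# Hypothetical labels and per-model probabilities, invented for this example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.6, 0.3, 0.4, 0.2, 0.7, 0.1]
model_b = [0.8, 0.7, 0.5, 0.4, 0.3, 0.1, 0.6, 0.2]
model_c = [0.7, 0.9, 0.7, 0.2, 0.2, 0.3, 0.8, 0.3]

combined = average_probabilities([model_a, model_b, model_c])
counts = confusion_counts(y_true, combined)
metrics = diagnostic_metrics(*counts)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a bagging setting, each probability list would come from a base classifier trained on a different bootstrap sample; here the lists are fixed by hand so that the arithmetic can be followed directly.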

A. B. Musa (✉) Faculty of Mathematical Sciences and Computer, University of Gezira, Wad Madani 20, Sudan e-mail: [email protected]

Keywords Support vector machine (SVM) · Logistic regression (LR) · Machine learning algorithm · Bagging · Ensemble · Statistical analysis

1 Introduction
Logistic regression (LR) [1, 2] is a multivariable method devised for dichotomous outcomes. It is a standard statistical classification method that is particularly appropriate for models involving disease state (healthy/diseased), decision making (yes/no), or mortality (dead/living). It is widely used in binary classification problems in applied sciences such as medicine, biology and epidemiology, owing to its simplicity and great interpretability. Logistic regression places specific requirements on the data under consideration, such as little or no collinearity among the independent variables and linearity of the independent variables with the logit. In contrast, SVM [3, 4, 5] has recently become a very popular machine learning tool for classification. It is easy and uncomplicated compared with LR. Nowadays
