An experimental study on symbolic extreme learning machine



ORIGINAL ARTICLE

Jinga Liu¹ · Muhammed J. A. Patwary² · XiaoYun Sun¹ · Kai Tao¹

Received: 4 June 2018 / Accepted: 4 September 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Abstract

With the advent of the big data era, the volume and complexity of data have increased exponentially, and the variety of data types has also grown considerably. Among the different types of data, symbolic data plays an important role in the study of machine learning models. It has been shown that the feed-forward neural network (FNN) handles numeric data well but is relatively clumsy with symbolic data. In this paper, a special type of FNN called the extreme learning machine (ELM) is discussed for handling symbolic data. Experimental results demonstrate that, unlike the traditional back-propagation-based FNN, ELM performs better than C4.5, which is generally acknowledged as one of the best algorithms for symbolic data classification problems. In this performance comparison, key evaluation criteria such as generalization ability, time complexity, the effect of training sample size, and noise-resistance ability are taken into account.

Keywords  Big data · FNN · Symbolic ELM · C4.5 algorithm

1 Introduction

In the study of machine learning, the classification problem plays an important role because it has a rigorous mathematical definition and theoretical framework. Many classification algorithms have been proposed by machine learning researchers; the literature includes classical algorithms such as the decision tree, rough sets, Bayesian models, and the feed-forward neural network (FNN). The mathematical model of an FNN includes an objective function and a learning function, and the goal is to minimize the difference between the two on the training dataset. The most common mechanism for dealing with a classification problem is to use label-known data to train a classifier and then use the trained classifier to predict unlabeled data. This technique is well known as supervised learning.

* Corresponding author: XiaoYun Sun, [email protected]
  Muhammed J. A. Patwary, [email protected]

1 College of Electrical and Electronic Engineering, Shijiazhuang Tiedao University, Shijiazhuang, China

2 Big Data Institute, College of Computer Science and Software Engineering, Shenzhen University, Guangdong, China
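As background for the ELM discussed in this paper, the supervised-learning mechanism described above (train on labeled data, then predict unlabeled data) can be sketched with a minimal single-hidden-layer ELM. This is an illustrative sketch only, not the authors' implementation; in particular, the `one_hot` encoding of symbolic attributes and the class `ELM` are hypothetical names chosen here for illustration:

```python
import numpy as np

def one_hot(column):
    """Encode a symbolic (categorical) column as 0/1 indicator vectors."""
    categories = sorted(set(column))
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(column), len(categories)))
    for row, value in enumerate(column):
        out[row, index[value]] = 1.0
    return out

class ELM:
    """Basic extreme learning machine: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, T):
        # Hidden-layer weights and biases are chosen randomly and never trained.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))  # sigmoid hidden layer
        # Output weights come from the Moore-Penrose pseudoinverse (least squares).
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))
        return np.argmax(H @ self.beta, axis=1)
```

Because only the output weights are solved analytically, no iterative back-propagation is needed, which is the source of ELM's speed advantage noted later in the comparison.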

Scholars of supervised learning have developed many classification algorithms to better mine knowledge from data. Two of the most prominent are the feed-forward neural network and the decision tree. Owing to its strong nonlinear mapping ability, the feed-forward neural network has been applied successfully in many fields [1–5]; the back-propagation (BP) algorithm is the foundation of feed-forward neural network training. The decision tree was first introduced by Quinlan in 1986 [6]; it applied information theory successfully, and information entropy was selected as a criterion t
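The entropy criterion underlying Quinlan's decision tree can be made concrete with a short sketch. This is an illustrative computation of Shannon entropy and information gain under standard definitions, not code from the paper; `information_gain` and its dictionary-based rows are names chosen here for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) = -sum_i p_i * log2(p_i) over the class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Gain(S, A) = H(S) - sum_v |S_v|/|S| * H(S_v), where v ranges over
    the values of attribute A and S_v is the subset taking value v."""
    n = len(labels)
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(label)
    return entropy(labels) - sum(
        len(sub) / n * entropy(sub) for sub in by_value.values()
    )
```

At each node, the attribute with the largest gain is chosen to split on; an attribute that partitions the samples into pure subsets yields the maximum possible gain.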