Automatic optimized support vector regression for financial data prediction

ORIGINAL ARTICLE

Dana Simian¹ · Florin Stoica¹ · Alina Bărbulescu²

Received: 18 July 2017 / Accepted: 25 April 2019
© Springer-Verlag London Ltd., part of Springer Nature 2019

Abstract

The aim of this article is to introduce a hybrid approach, optimal multiple kernel–support vector regression (OMK–SVR), for time series prediction, and to compare its performance against that of support vector regression with a single RBF kernel (RBF-SVR), gene expression programming (GEP) and extreme learning machine (ELM). The comparison uses the financial series formed by the monthly and weekly values of the Bursa Malaysia KLCI Index and the monthly values of the Dow Jones Industrial Average Index (DJIA) and of the New York Stock Exchange. Our method provides an optimal multiple kernel and optimal parameters for support vector regression, improving the accuracy of prediction. The proposed approach is structured on two levels. The macro-level uses a breeder genetic algorithm to choose the optimal multiple kernel and the optimal SVR parameters. The fitness function of each chromosome is computed at the micro-level using an SVR algorithm. The regression model based on the optimal multiple kernel and optimal parameters is then validated and used for forecasting. The experimental results show that OMK–SVR performs better than GEP, RBF-SVR and ELM in predicting the future behavior of the studied series. A sensitivity study was also conducted with respect to the number of kernels in the multiple kernel used by OMK–SVR and to the ratio between the training and testing data sets.

Keywords: Prediction methods · Support vector regression · Evolutionary computation · Financial forecasting · Genetic algorithms
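The two-level structure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the multiple kernel is taken to be a weighted sum of an RBF and a polynomial kernel, kernel ridge regression stands in for the SVR solver at the micro-level, the macro-level is a simplified truncation-selection loop in the spirit of a breeder genetic algorithm, and the data are a noisy sine series rather than the financial series. All function and parameter names (`multi_kernel`, `fitness`, `evolve`, `w`, `gamma`, `lam`) are hypothetical.

```python
import numpy as np


def multi_kernel(A, B, w, gamma, degree=2):
    """Hypothetical multiple kernel: convex mix of an RBF and a polynomial kernel."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    rbf = np.exp(-gamma * sq_dist)
    poly = (1.0 + A @ B.T) ** degree
    return w * rbf + (1.0 - w) * poly


def fitness(chrom, X_tr, y_tr, X_val, y_val):
    """Micro level: fit a kernel regressor and return its validation RMSE."""
    w, log_gamma, log_lam = chrom
    gamma, lam = 10.0 ** log_gamma, 10.0 ** log_lam
    K = multi_kernel(X_tr, X_tr, w, gamma)
    # Assumption: kernel ridge regression stands in for the SVR solver.
    alpha = np.linalg.solve(K + lam * np.eye(len(y_tr)), y_tr)
    pred = multi_kernel(X_val, X_tr, w, gamma) @ alpha
    return float(np.sqrt(np.mean((pred - y_val) ** 2)))


def evolve(X_tr, y_tr, X_val, y_val, pop_size=20, generations=15, seed=0):
    """Macro level: truncation selection, intermediate recombination, mutation."""
    rng = np.random.default_rng(seed)
    low = np.array([0.0, -3.0, -6.0])   # bounds for w, log10(gamma), log10(lam)
    high = np.array([1.0, 1.0, 0.0])
    pop = rng.uniform(low, high, size=(pop_size, 3))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(c, X_tr, y_tr, X_val, y_val) for c in pop])
        order = np.argsort(scores)
        best, best_score = pop[order[0]].copy(), float(scores[order[0]])
        history.append(best_score)
        parents = pop[order[: pop_size // 2]]   # keep the better half
        children = [best.copy()]                # elitism: best survives unchanged
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            child = a + rng.uniform(0.0, 1.0, 3) * (b - a)   # blend two parents
            child += rng.normal(0.0, 0.1, 3) * (high - low)  # Gaussian mutation
            children.append(np.clip(child, low, high))
        pop = np.array(children)
    return best, best_score, history


# Toy data (assumption): a noisy sine series, embedded with a window of d = 3.
data_rng = np.random.default_rng(1)
series = np.sin(0.2 * np.arange(120)) + 0.05 * data_rng.normal(size=120)
d = 3
X = np.stack([series[i : i + d] for i in range(len(series) - d)])
y = series[d:]

best, score, history = evolve(X[:80], y[:80], X[80:], y[80:])
print("w, log10(gamma), log10(lam):", best, "validation RMSE:", score)
```

Each chromosome here encodes only a kernel weight and two log-scaled parameters. The point of the sketch is the division of labor: the outer loop proposes kernel configurations, and the inner regressor scores each one on held-out data.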

✉ Dana Simian
[email protected]
http://web.ulbsibiu.ro/dana.simian/index.html

1 Lucian Blaga University of Sibiu, 10 Victoriei Bd., 550024 Sibiu, Romania

2 Ovidius University of Constanta, 124 Mamaia Bd., 900527 Constanța, Romania

1 Introduction

Time series modeling and prediction are active research topics in many areas, such as meteorology, ecology, finance, signal processing, dynamical systems and statistics. A time series is composed of a finite set of elements observed sequentially over time. The problem of time series prediction consists in finding a function $f$ that predicts the future values $x_{t+p}$ of the data series $\{x_t\}_{t=1}^{N}$ using the past values $X_t = (x_t, x_{t-s}, \ldots, x_{t-(d-1)s})$, where $s$ is the time delay, $d$ is the embedding dimension (or time window) and $p$ is the prediction horizon. Consequently, the predicted value is given by $x_{t+p} = f(X_t)$. In general, statistical prediction methods [1] cannot capture the nonlinearity of the data. Therefore, nonlinear methods such as artificial neural networks (ANNs) [2], support vector regression (SVR) [3–5], gene expression programming (GEP) [6] and extreme learning machine (ELM) [7, 8] are used. Another problem facing the time series forecast is that the pr