
RESEARCH ARTICLE

Double decomposition and optimal combination ensemble learning approach for interval-valued AQI forecasting using streaming data

Zicheng Wang 1 & Liren Chen 2 & Jiaming Zhu 3 & Huayou Chen 1 & Hongjun Yuan 4

Received: 18 April 2020 / Accepted: 25 June 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract To forecast possible future environmental risks, numerous models have been developed to predict the hourly values or daily averages of air pollutant concentrations using streaming data (a kind of big data collected from the Internet). On the one hand, real-time hourly data is massive and redundant, making it difficult to process. On the other hand, daily averages cannot reflect the fluctuations of air pollutant concentrations throughout the day. Therefore, a double decomposition and optimal combination ensemble learning approach is proposed in this paper for interval-valued AQI (air quality index) forecasting. In the first decomposition, considering the strong seasonality of AQI, the original data of each year is decomposed into four seasonal subseries on the basis of the Chinese calendar. Subsequently, the data of the same season in different years is reconstructed into a new seasonal series to reduce the interference of seasonal changes with AQI forecasting. In the second decomposition, because interval-valued AQI time series are nonlinear and irregular, BEMD (bivariate empirical mode decomposition) is employed to decompose the interval-valued signals into a finite number of complex-valued IMF (intrinsic mode function) components and one complex-valued residue component with different frequencies, which reduces the complexity of the interval time series. An interval multilayer perceptron (iMLP) is used to model the lower and upper bounds of all components simultaneously and obtain the corresponding forecasting results, which are merged by an optimal combination ensemble method to produce the final interval-valued output. Empirical results show that, across different datasets and forecasting horizons, the proposed model forecasts significantly better than the other models considered.

Keywords Air quality index · Interval forecasting · Bivariate empirical mode decomposition · Optimal combination ensemble · Seasonality
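
The first decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: daily intervals are taken as the minimum and maximum of the hourly readings, the Chinese-calendar seasons are approximated by fixed month groups, and each interval is encoded as a complex number (lower bound as the real part, upper bound as the imaginary part) as one possible input form for BEMD. All function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# ASSUMPTION: fixed month groups stand in for the Chinese-calendar seasons;
# the exact season boundaries used in the paper are not given in the abstract.
SEASON_OF_MONTH = {
    2: "spring", 3: "spring", 4: "spring",
    5: "summer", 6: "summer", 7: "summer",
    8: "autumn", 9: "autumn", 10: "autumn",
    11: "winter", 12: "winter", 1: "winter",
}


def daily_intervals(hourly):
    """Collapse (timestamp, AQI) hourly readings into daily (day, low, high).

    ASSUMPTION: the interval bounds are the daily minimum and maximum.
    """
    by_day = defaultdict(list)
    for ts, aqi in hourly:
        by_day[ts.date()].append(aqi)
    return [(day, min(v), max(v)) for day, v in sorted(by_day.items())]


def seasonal_series(intervals):
    """Concatenate the same season across years into one series per season."""
    series = defaultdict(list)
    for day, low, high in intervals:  # input is in chronological order
        series[SEASON_OF_MONTH[day.month]].append((day, low, high))
    return dict(series)


def to_complex(series):
    """Encode each interval as low + i*high (assumed complex form for BEMD)."""
    return [complex(low, high) for _, low, high in series]


# Toy data: two spring days from different years and one summer day.
hourly = [
    (datetime(2018, 3, 1, 0), 80), (datetime(2018, 3, 1, 12), 120),
    (datetime(2018, 7, 1, 0), 40), (datetime(2018, 7, 1, 12), 65),
    (datetime(2019, 3, 1, 0), 90), (datetime(2019, 3, 1, 12), 150),
]
seasons = seasonal_series(daily_intervals(hourly))
spring = to_complex(seasons["spring"])
```

In this toy example the two spring days from 2018 and 2019 end up adjacent in one spring series, which is the regrouping step the abstract describes. The subsequent BEMD, iMLP, and optimal combination steps are not sketched here because the abstract gives no detail on them.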

Introduction

Responsible editor: Marcus Schulz

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11356-020-09891-x) contains supplementary material, which is available to authorized users.

* Hongjun Yuan
  [email protected]

1 School of Mathematical Sciences, Anhui University, Hefei 230601, China
2 School of Environmental Science and Engineering, Tianjin University, Tianjin 300350, China
3 School of Internet, Anhui University, Hefei 230039, China
4 School of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu 233030, China

With the acceleration of industrialization and urbanization, serious air pollution problems h