Social Data Mining and Seasonal Influenza Forecasts: The FluOutlook Platform




1 MOBS, Northeastern University, Boston, MA, USA
{qi.zhang,n.perra,a.vespignani}@neu.edu
2 ISI Foundation, Turin, Italy
{corrado.gioannini,daniela.paolotti,daniela.perrotta,marco.quaggiotto,michele.tizzoni}@isi.it

Abstract. FluOutlook is an online platform where multiple data sources are integrated to initialize and train a portfolio of epidemic models for influenza forecast. During the 2014/15 season, the system has been used to provide real-time forecasts for 7 countries in North America and Europe.

Keywords: Real-time forecasting · Epidemic modeling · Data mining

1 Introduction

The real-time monitoring and modeling of infectious diseases is being redefined by the new availability of large-scale social media and digital surveillance data. Several methods use social data, such as search engine queries and tweets, as inputs for time series analysis; Google Flu Trends (GFT) [1] is probably the best-known example. Unfortunately, most current approaches are unable to capture the disease transmission dynamics and its long-term trends, and they suffer from several issues related to biases and statistical sampling [2]. Here we present FluOutlook (http://fluoutlook.org/), an online platform exposing real-time seasonal influenza forecasts. It integrates current and historical surveillance data, social data mining and several forecast models. Along with standard statistical regression models, FluOutlook includes stochastic generative models that simulate the disease progression at the level of single individuals. The platform reports influenza intensity in real time with a lead time of up to four weeks, as well as the main indicators of the epidemic season at its early stages. FluOutlook provides a description of seasonal influenza that public health agencies could use to guide their decision-making process, as well as to compare and assess the performance of different forecast approaches.
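To make the notion of a stochastic generative model at the level of single individuals concrete, the following minimal sketch simulates a susceptible-infectious-recovered (SIR) process in a well-mixed population. It is an illustration only and not one of the FluOutlook models: the function name `simulate_sir`, the parameters `beta` and `gamma`, and all default values are assumptions chosen for the example.

```python
import random


def simulate_sir(n=1000, initial_infected=5, beta=0.3, gamma=0.1,
                 days=120, seed=0):
    """Individual-level stochastic SIR in a well-mixed population.

    All parameter values are hypothetical. Returns a list of (S, I, R)
    counts, one entry per day, starting with day 0.
    """
    rng = random.Random(seed)
    # One state per individual: "S", "I" or "R".
    state = ["I"] * initial_infected + ["S"] * (n - initial_infected)
    history = []
    for _ in range(days + 1):
        s = state.count("S")
        i = state.count("I")
        history.append((s, i, n - s - i))
        # Daily probability that a given susceptible is infected by at
        # least one of the i infectious individuals.
        p_inf = 1.0 - (1.0 - beta / n) ** i
        new_state = list(state)
        for k, st in enumerate(state):
            if st == "S" and rng.random() < p_inf:
                new_state[k] = "I"
            elif st == "I" and rng.random() < gamma:
                new_state[k] = "R"
        state = new_state
    return history


if __name__ == "__main__":
    history = simulate_sir()
    # Day with the largest number of infectious individuals.
    peak_day = max(range(len(history)), key=lambda d: history[d][1])
    print("peak day:", peak_day, "infectious at peak:", history[peak_day][1])
```

Running the simulation with many different seeds yields a distribution of epidemic curves rather than a single trajectory, which is what makes it possible to attach uncertainty to quantities such as the peak week.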

© Springer International Publishing Switzerland 2015. A. Bifet et al. (Eds.): ECML PKDD 2015, Part III, LNAI 9286, pp. 237–240, 2015. DOI: 10.1007/978-3-319-23461-8_21


Q. Zhang et al.

2 Methodology

The FluOutlook platform consists of two parts: a computational framework that produces the predictions and a user-friendly website that visualizes them. The system architecture, shown in Fig. 1, comprises three main components. The first component mines and assimilates the social and surveillance data needed to initialize the modeling approaches. The second component is the computational system that generates the numerical output of the modeling approaches. The third component is the statistical pipeline that compares the models’ output with the ground truth currently available, in order to define the forecast ensemble that is eventually exposed on the platform. The website of the platform runs as a Python Flask application with a PostgreSQL database, served through the Apache web server. On the landing page, maps show the current influenza activity level in each country and indicate the observed trend. The forecasting page p