Response-Guided Community Detection: Application to Climate Index Discovery

1 North Carolina State University, Raleigh, NC, USA
[email protected]
2 University of Minnesota, Minneapolis, MN, USA
3 Oak Ridge National Laboratory, Oak Ridge, TN, USA

Abstract. Discovering climate indices (time series that summarize spatiotemporal climate patterns) is a key task in the climate science domain. In this work, we approach this task as a problem of response-guided community detection; that is, identifying communities in a graph associated with a response variable of interest. To this end, we propose a general strategy for response-guided community detection that explicitly incorporates information about the response variable during the community detection process, and introduce a graph representation of spatiotemporal data that leverages information from multiple variables. We apply our proposed methodology to the discovery of climate indices associated with seasonal rainfall variability. Our results suggest that our methodology is able to capture the underlying patterns known to be associated with the response variable of interest and to improve its predictability compared to existing methodologies for data-driven climate index discovery and official forecasts.

Keywords: Community detection · Spatiotemporal data · Climate index discovery · Seasonal rainfall prediction

1 Introduction

Detecting communities in real-world networks is a key task in many scientific domains. Oftentimes, domain scientists are particularly concerned with finding communities associated with a response variable of interest that can be used to analyze or predict this response variable. For example, in climate science, such communities may represent spatiotemporal climate patterns associated with a particular weather event [24], while in biology, they may represent groups of functionally associated genes linked to a particular phenotype [12]. However, community detection techniques are traditionally unsupervised learning methods, and thus do not take into account the variability of the response variable of interest. Therefore, the communities identified may not necessarily be associated with this response variable. Furthermore, even though semi-supervised methods have been proposed to incorporate prior knowledge into the community detection process, these methods do not consider a response variable either and require partial information about the community memberships, which may not be available [6]. For this reason, we introduce the problem of response-guided community detection, that is, identifying communities in a graph associated with a response variable of interest, and study its application to the discovery of climate indices, an important task in the climate science domain.

© Springer International Publishing Switzerland 2015. A. Appice et al. (Eds.): ECML PKDD 2015, Part II, LNAI 9285, pp. 736–751, 2015. DOI: 10.1007/978-3-319-23525-7_45

Climate indices are time series that summarize spatiotemporal patterns in the global climate system. These patterns are often associated wi