Improving query performance on dynamic graphs

  • PDF / 3,717,862 Bytes
  • 31 Pages / 595.276 x 790.866 pts Page_size
  • 15 Downloads / 202 Views

DOWNLOAD

REPORT


REGULAR PAPER

Improving query performance on dynamic graphs Gala Barquero1 · Javier Troya2

· Antonio Vallecillo1

Received: 30 January 2020 / Revised: 23 September 2020 / Accepted: 30 September 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Querying large models efficiently often imposes high demands on system resources such as memory, processing time, disk access or network latency. The situation becomes more complicated when data are highly interconnected, e.g. in the form of graph structures, and when data sources are heterogeneous, partly coming from dynamic systems and partly stored in databases. These situations are now common in many existing social networking applications and geo-location systems, which require specialized and efficient query algorithms in order to make informed decisions on time. In this paper, we propose an algorithm to improve the memory consumption and time performance of this type of queries by reducing the amount of elements to be processed, focusing only on the information that is relevant to the query but without compromising the accuracy of its results. To this end, the reduced subset of data is selected depending on the type of query and its constituent filters. Three case studies are used to evaluate the performance of our proposal, obtaining significant speedups in all cases. Keywords Data stream processing · Dynamic graphs · Performance optimization · Precomputing systems · Data queries

1 Introduction Nowadays, many information processing systems need to handle the data flows that are constantly generated from different sources, such as social networks, geo-location systems or e-commerce applications [16,17]. Companies and organizations use this information to make informed decisions or detect situations of interest in real time. For example, the Spanish BBVA bank studied the economic impact of Barcelona’s 2012 Mobile World Congress by analysing all credit card transactions during two weeks [13]. Another example of the need to process data flows to extract useful information is the detection of possible terrorist attacks analysing social networks and Web access logs [50,71]. In Communicated by Daniel Varro.

B

Javier Troya [email protected] http://www.lsi.us.es/~jtroya Gala Barquero [email protected] Antonio Vallecillo [email protected]

1

ITIS Software, Universidad de Málaga, Málaga, Spain

2

SCORE Lab, I3US Institute, Universidad de Sevilla, Seville, Spain

these cases, an efficient query system needs to process the information as fast as possible to identify those situations of interest without delay. However, the large volume of data to be efficiently processed represents a significant challenge to current information processing systems. To speed up the processing of the information, these systems usually apply different mechanisms, such as storing the information in clusters to be processed in parallel [35,49,67], or selecting a subset of the data that is finally queried [10,24, 26,61]. These approaches usually deal with streams of data that represe