Top-k term publish/subscribe for geo-textual data streams
REGULAR PAPER
Lisi Chen1 · Shuo Shang1 · Christian S. Jensen2 · Jianliang Xu3 · Panos Kalnis4 · Bin Yao5 · Ling Shao6
Received: 19 January 2019 / Revised: 15 November 2019 / Accepted: 21 February 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract Massive amounts of data that contain spatial, textual, and temporal information are being generated at a rapid pace. With streams of such data available, including check-ins and geo-tagged tweets, users may be interested in being kept up to date on which terms are popular in the streams in a particular region of space. To enable this functionality, we aim at efficiently processing two types of general top-k term subscriptions over streams of spatio-temporal documents: region-based top-k spatial-temporal term (RST) subscriptions and similarity-based top-k spatio-temporal term (SST) subscriptions. RST subscriptions continuously maintain the top-k most popular trending terms within a user-defined region. SST subscriptions free users from defining a region and maintain top-k locally popular terms based on a ranking function that combines term frequency, term recency, and term proximity. To solve the problem, we propose solutions that are capable of supporting real-life location-based publish/subscribe applications that process large numbers of SST and RST subscriptions over a realistic stream of spatio-temporal documents. The performance of our proposed solutions is studied in extensive experiments using two spatio-temporal datasets.
Keywords Publish · Subscribe · Spatio-temporal · Keyword · Stream
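To make the semantics of an RST subscription concrete, the following is a minimal brute-force sketch, not the solution proposed in this paper. It assumes a rectangular region, a fixed-length sliding time window, and plain document frequency as the popularity measure; the names `Doc`, `Region`, `top_k_terms`, and the `window` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """A spatio-temporal document: terms, a point location, and a timestamp."""
    terms: tuple
    lat: float
    lon: float
    ts: float


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle (an assumed region shape)."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, doc):
        return (self.min_lat <= doc.lat <= self.max_lat
                and self.min_lon <= doc.lon <= self.max_lon)


def top_k_terms(docs, region, k, now, window):
    """Brute-force top-k: count terms of documents inside `region` whose
    timestamp lies within the last `window` time units before `now`."""
    counts = Counter()
    for doc in docs:
        if region.contains(doc) and now - window <= doc.ts <= now:
            counts.update(set(doc.terms))  # count each term once per document
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Illustrative data only (assumed values, not taken from the paper's datasets).
stream = [
    Doc(("concert", "jazz"), 41.88, -87.63, ts=10.0),
    Doc(("concert", "traffic"), 41.90, -87.65, ts=12.0),
    Doc(("jazz",), 40.71, -74.00, ts=13.0),    # outside the region
    Doc(("traffic",), 41.85, -87.60, ts=1.0),  # outside the time window
]
chicago = Region(41.6, -87.9, 42.1, -87.5)
result = top_k_terms(stream, chicago, k=2, now=15.0, window=10.0)
print(result)
```

Recomputing the result from scratch for every subscription on every arriving document, as this sketch does, does not scale to large numbers of subscriptions, which is the setting the proposed solutions target.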
Shuo Shang: jedi.shang@gmail.com
Lisi Chen: lchen012@e.ntu.edu.sg
Christian S. Jensen: csj@cs.aau.dk
Jianliang Xu: xujl@comp.hkbu.edu.hk
Panos Kalnis: panos.kalnis@kaust.edu.sa
Bin Yao: yaobin@cs.sjtu.edu.cn
Ling Shao: ling.shao@ieee.org
1 University of Electronic Science and Technology of China, Chengdu, China
2 Aalborg University, Aalborg, Denmark
3 Hong Kong Baptist University, Kowloon Tong, Hong Kong
4 King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
5 Shanghai Jiao Tong University, Shanghai, China
6 Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
1 Introduction
Very large volumes of spatio-temporal documents are being generated at a rapid pace by social media users. For example, Twitter has more than 300 million monthly active users who post 500 million tweets per day [69]. All tweets are associated with a timestamp that indicates their arrival time, and many tweets are associated with locations, which may be either coordinates (latitude and longitude) or semantic locations (e.g., “Chicago, IL, USA”). Beyond Twitter, location-based social networking services (e.g., Foursquare, Yelp, Booking.com) allow users to publish check-ins or reviews that contain text descriptions, geographical information and timestamps. Such spatio-temporal documents that arrive continuously in data streams often offer first-hand information about local events of diffe