k -dominant Skyline query algorithm for dynamic datasets

  • PDF / 494,140 Bytes
  • 9 Pages / 612.284 x 802.205 pts Page_size
  • 20 Downloads / 179 Views

DOWNLOAD

REPORT


k-dominant Skyline query algorithm for dynamic datasets Zhiyun ZHENG, Ke RUAN, Mengyao YU, Xingjin ZHANG, Ning WANG, Dun LI School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China c Higher Education Press 2020 

Abstract At present, most k-dominant Skyline query algorithms are oriented to static datasets, this paper proposes a kdominant Skyline query algorithm for dynamic datasets. The algorithm is recursive circularly. First, we compute the dominant ability of each object and sort objects in descending order by dominant ability. Then, we maintain an inverted index of the dominant index by k-dominant Skyline point calculation algorithm. When the data changes, it is judged whether the update point will affect the k-dominant Skyline point set. So the kdominant Skyline point of the new data set is obtained by inserting and deleting algorithm. The proposed algorithm resolves maintenance issue of a frequently updated database by dynamically updating the data sets.The experimental results show that the query algorithm can effectively improve query efficiency. Keywords multi-objective decision, Skyline queries, kdominant Skyline queries, dynamic datasets

1

Introduction

The Skyline query problem is also called Pareto optimal problem, has been widely used in multi-objective optimization query, decision support, data mining and so on [1]. The main idea of Skyline query is to select a set of tuples that are not dominated by other points in a given set of d-dimensional attribute tuples [2]. Data point p(p1 , p2 , . . . , pd ) dominates the data points q(q1 , q2 , . . . , qd ), meaning that the values of the data points p in all dimensions are not worse than data points q, and the data points p is superior to the data point q in at least one dimension [3]. Traditional Skyline query requires that the value of all dimensions on the dominant point cannot be inferior to the nondominant point [4]. In practical applications, data is often highdimensional. As the data dimension increases, the probability that one data point dominates another data point becomes lower. The Skyline query has a dimension curse problem, resulting in a large number of data points being returned in the Skyline query, which does not provide useful information for user decisions [5–7]. In order to find important data points in the highdimensional space and reduce the returned result sets, Chan et al. proposed the concept of query, which reduced the definition of domination and made it easier to generate dominance relationships between data points, thus to narrow the result set to Received July 5, 2019; accepted March 3, 2020 E-mail: [email protected]

an appropriate range. While it can reduce the number of retrieved objects, k-dominant Skyline objects are hard to maintain if the database is updated. This paper studies the maintenance and query of k-dominant Skyline objects in frequently updated databases. In order to reduce the excessive meaningless comparison between data points, this paper proposes a pre-sorting algorithm, which mak