Multi-objective materialized view selection using NSGA-II
ORIGINAL ARTICLE
Jay Prakash¹ • T. V. Vijay Kumar¹
Received: 17 July 2019 / Revised: 17 July 2019
© The Society for Reliability Engineering, Quality and Operations Management (SREQOM), India and The Division of Operation and Maintenance, Lulea University of Technology, Sweden 2020
Abstract A data warehouse is constructed with the purpose of supporting decision making. Decision-making queries, being long and complex, take a long time to process against a continuously growing data warehouse. View materialization is one way of improving the response time of such analytical queries. It involves selecting and materializing views that minimize analytical query response times while adhering to resource constraints. This is referred to as the view selection problem, which is NP-hard. The view selection problem is concerned with simultaneously minimizing the cost of evaluating materialized and non-materialized views. Being a bi-objective optimization problem, it is addressed in this paper using NSGA-II. The proposed approach aims to achieve an acceptable trade-off between these two objectives.
Keywords Data warehouse · OLAP · Materialized view selection · Multi-objective optimization · NSGA-II
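To make the notion of a trade-off between two minimized costs concrete, the following minimal Python sketch filters a set of candidate view selections down to the non-dominated ones. This is the Pareto-dominance comparison that NSGA-II's non-dominated sorting builds on; it is not the paper's implementation. The candidate names (S1 to S4), the cost figures, and the helper names `dominates` and `pareto_front` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation: (cost of evaluating materialized views,
#                          cost of evaluating non-materialized views).
# Both components are to be minimized.
Costs = Tuple[float, float]


def dominates(a: Costs, b: Costs) -> bool:
    """True if a is no worse than b in both objectives and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Dict[str, Costs]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, costs in candidates.items():
        dominated = any(
            dominates(other_costs, costs)
            for other_name, other_costs in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical candidate view selections with made-up cost values.
candidates = {
    "S1": (10.0, 90.0),
    "S2": (40.0, 50.0),
    "S3": (45.0, 60.0),  # worse than S2 in both objectives
    "S4": (80.0, 20.0),
}

front = pareto_front(candidates)
print(front)
```

In this toy data, S3 is discarded because S2 is cheaper in both objectives, while S1, S2, and S4 remain as alternative trade-offs from which a decision maker could choose.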
✉ T. V. Vijay Kumar [email protected]
¹ School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110067, India
1 Introduction
Commercial organizations make business decisions by analyzing their performance using historical business transaction data. This data is usually spread across multiple disparate databases, as the offices of an organization are spread across many locations in the world. Because decision-making queries are analytical in nature and the required data is spread across multiple disparate data sources, processing such queries is time consuming. Processing these analytical queries requires the data to be retrieved from multiple data sources, using either the on-demand approach or the in-advance approach to querying (Widom 1995). Data warehousing is based on the latter approach, where data is retrieved in advance from multiple disparate sources and accumulated in a central repository within the organization, referred to as a data warehouse. The purpose of a data warehouse is to answer analytical queries in order to facilitate management decisions. A data warehouse, which stores time-variant and non-volatile data, is always available for querying, even when remote data sources are inaccessible (Inmon 2003; Kimball and Ross 2002).
A data warehouse is designed with the purpose of supporting decision making (Inmon 2003). Since decision-making queries are long and complex, processing them against a data warehouse is time consuming. A data warehouse addresses this problem by pre-computing and storing the relevant and required data. This data is stored in the form of materialized views (Roussopoulos 1997). For an n-dimensional data set, the p