Mining frequent weighted closed itemsets using the WN-list structure and an early pruning strategy


Huong Bui 1,2 & Bay Vo 3 & Tu-Anh Nguyen-Hoang 1,2 & Unil Yun 4

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract The problem of mining frequent weighted itemsets (FWIs) is an extension of mining frequent itemsets (FIs) that considers not only how frequently items occur but also their relative importance in a dataset. However, like mining FIs, mining FWIs usually produces a large result set, which creates redundancy and makes it difficult to extract rules. Mining frequent weighted closed itemsets (FWCIs) has been proposed as a solution to this issue: it produces a smaller result set while preserving sufficient information to extract rules. The weighted node-list (WN-list) structure is currently considered the state-of-the-art structure for mining FWIs. In this study, we first propose the definition of the WN-list ancestral operation and a theorem that serves as the theoretical basis for eliminating unsatisfactory candidates. We then propose an efficient algorithm, named NFWCI, for mining FWCIs using the WN-list and an early pruning strategy. Experimental results on many sparse and dense datasets show that the proposed algorithm outperforms the state-of-the-art algorithm for mining FWCIs.

Keywords Data mining · Frequent weighted closed itemsets · Weighted support · WN-list structure

* Bay Vo

1 University of Information Technology, Ho Chi Minh City, Vietnam
2 Vietnam National University, Ho Chi Minh City, Vietnam
3 Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam
4 Department of Computer Engineering, Sejong University, Seoul, Republic of Korea

1 Introduction

Data mining [1–6] focuses on finding anomalies, patterns, and correlations in large datasets to predict outcomes. Retail and financial companies often use it to analyze data and predict customer demand in order to increase revenues, cut costs, improve customer relationships, and reduce risk. Mining frequent patterns is a fundamental research area in data mining, which focuses on finding patterns that occur frequently in a large dataset. The problem of mining frequent patterns includes subproblems such as mining frequent itemsets (FIs) [7, 8], mining frequent sequences [9, 10], and mining frequent subgraphs [11]. Mining FIs is the problem of finding, in a transactional dataset, the sets of items that appear together a number of times, called the support, that is greater than or equal to a given threshold, called the minimum support. The set of FIs found is used to mine association rules [1, 4, 12] to analyze and predict trends and customer needs. However, mining FIs has two drawbacks when it comes to real-life applications. The first drawback is that all items are considered equally important, while in fact, items are often of di
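The definition of support and minimum support given above can be illustrated with a minimal sketch. It is a brute-force enumeration over a hypothetical toy dataset, not the NFWCI algorithm or the WN-list structure, and it uses plain (unweighted) support. The function names `support` and `frequent_itemsets` and the sample transactions are assumptions introduced only for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Brute force over all item combinations; exponential in the number of
    items, so suitable only as an illustration of the definition.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


# Hypothetical toy transactional dataset (not taken from the paper).
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
]

fis = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(fis.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With this toy data and a minimum support of 2, the three single items and the three pairs qualify, while {a, b, c}, which appears in only one transaction, does not.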
