High utility itemset mining: a Boolean operators-based modified grey wolf optimization algorithm


METHODOLOGIES AND APPLICATION

N. Pazhaniraja1 • S. Sountharrajan1 • B. Sathis Kumar2

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract In data mining, high utility itemset mining (HUIM) is a recent thrust area for which several approaches have been proposed. Over the past decade, evolutionary algorithms have become a common strategy for addressing optimization problems because they converge towards an optimal solution within a stipulated time. On various optimization problems, evolutionary algorithms are far more effective than exhaustive approaches with respect to computational time. The high utility itemset (HUI) problem is to discover sets of items in a transactional database that possess a high level of utility compared with other distinct sets. The problem becomes harder as the number of items in the database grows, because the computational time of an exhaustive search becomes exponential in the number of items in the transaction database. In this paper, an optimization model based on the biological behaviour of the grey wolf is proposed; the model, namely the grey wolf optimization algorithm, is used to solve the HUI problem using five different Boolean operations. The proposed model is evaluated using standard performance metrics over synthetic and real-world datasets. Its results are then compared with those of recent HUIM models to show its significance.

Keywords High utility itemset • Boolean operators • Grey wolf optimization algorithm

Communicated by V. Loia.

1 School of Computing Science and Engineering, VIT Bhopal University, Sehore, Madhya Pradesh, India

2 School of Computer Science and Engineering, VIT University, Chennai, India

1 Introduction Data mining plays a vital role in knowledge extraction from collections of data that are either unstructured or too large to process directly. Mining knowledge from data is a typical procedure for extracting valuable and meaningful information from a wide variety of repositories (Wu et al. 2014; Zaki 2014). Two basic approaches that initially shaped the data mining field are Association Rule Mining (ARM) (Benites and Sapozhnikova 2014; Zhang and Zhang 2002) and Frequent Itemset Mining (FIM) (Agrawal and Srikant 1994; Song et al. 2008). These models are used not only in the data mining process but also in other applications owing to their scalability (Quadrana et al. 2015; Tran et al. 2017). FIM addresses the problem of identifying the sets of items whose number of occurrences in a transaction database is not less than a user-specified threshold value. In general, an FIM model identifies the itemsets that occur frequently in the transactions. However, the participating items need not be of