High utility itemset mining using path encoding and constrained subset generation



Vamsinath Javangula 1 & Suvarna Vani Koneru 2 & Haritha Dasari 3

Received: 24 June 2020 / Accepted: 4 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

This paper proposes a two-phase approach for high utility itemset mining. In the first phase, potential high utility itemsets are generated from potential high utility maximal supersets. The transaction weighted utility measure is used to identify the potential high utility itemsets. The maximal supersets are obtained from high utility paths ending in the items in the transaction database, and they are constructed without using any tree structures. The prefix information of an item in a transaction is stored in the form of binary codes; thus, the prefix information of a path in a transaction is encoded as binary codes and stored in the node containing the item information. The potential high utility itemsets are generated from the maximal supersets using a modified set enumeration tree. The high utility itemsets are then obtained from the set enumeration tree by calculating the actual utility through a scan of the transaction database. The experiments highlight the superior performance of the system compared to other similar systems in the literature.

Keywords Itemset mining · High utility itemset mining · Mining weighted frequent patterns · Transaction utility mining · Utility mining
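The two-phase idea summarized in the abstract can be illustrated with a minimal Python sketch. It relies on the definitions commonly used in the high utility itemset mining literature: the utility of an item in a transaction is its purchased quantity times its unit profit, and the transaction weighted utility (TWU) of an itemset is the total utility of all transactions that contain it. The toy data, the names `PROFIT`, `TRANSACTIONS`, `twu`, `utility`, and `two_phase`, and the brute-force candidate enumeration are assumptions made for illustration only; the sketch does not implement the path encoding or the maximal superset construction proposed in this paper.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item, and item -> quantity per transaction.
PROFIT = {"a": 5, "b": 2, "c": 1}
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "c": 1},
]


def transaction_utility(transaction):
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * PROFIT[item] for item, qty in transaction.items())


def twu(itemset):
    """Transaction weighted utility: total utility of transactions containing itemset."""
    return sum(
        transaction_utility(t) for t in TRANSACTIONS if itemset.issubset(t)
    )


def utility(itemset):
    """Actual utility: utility of the itemset's own items in transactions containing it."""
    return sum(
        sum(t[item] * PROFIT[item] for item in itemset)
        for t in TRANSACTIONS
        if itemset.issubset(t)
    )


def two_phase(min_util):
    """Phase 1 keeps itemsets with TWU >= min_util; phase 2 checks actual utility."""
    items = sorted(PROFIT)
    # Phase 1 (assumption: brute-force enumeration stands in for the paper's
    # maximal-superset and set-enumeration-tree machinery).
    candidates = [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
        if twu(frozenset(combo)) >= min_util
    ]
    # Phase 2: compute the exact utility of each candidate from the transactions.
    return {c: utility(c) for c in candidates if utility(c) >= min_util}
```

With `min_util = 11`, phase 1 keeps five candidates and phase 2 discards `{c}`, whose TWU is 22 but whose actual utility is only 4. This shows why TWU serves as an upper bound for pruning and why a second pass over the database is still needed.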

This article is part of the Topical Collection: Special Issue on Network In Box, Architecture, Networking and Applications
Guest Editor: Ching-Hsien Hsu

* Vamsinath Javangula
Suvarna Vani Koneru
Haritha Dasari

1 CSE Department, P.B.R V I T S, Kavali, Andhra Pradesh, India
2 CSE Department, Velagapudi Ramakrishna Siddhartha Engineering College, Kanuru, Andhra Pradesh, India
3 CSE Department, University College of Engineering College, JNTUK, Kakinada, Andhra Pradesh, India

1 Introduction

Frequent itemset mining is one of the most widely investigated areas of data mining. It was primarily used for constructing rules in association rule mining [1–3]. In market basket analysis, it is a technique for identifying items that are frequently purchased together [4, 5]. This information can be utilized for improving product sales by formulating proper marketing strategies [6]. Frequent itemsets are mined using a measure called support, which counts the occurrences of an itemset in the transactions performed. Frequent itemsets represent items that appear in a high volume of transactions; they do not convey anything about the profitability of those transactions. Businesses are often interested in transactions that are highly profitable to them. To measure the profitability of the items in a transaction, a measure called utility was introduced into the frequent itemset mining process. This process of mining frequent itemsets that exhibit high utility is called high utility itemset (HUI) mining. The utility of an itemset i