High Average Utility Itemset Mining: A Survey
Abstract The past two decades have seen a great deal of research in the area of “Data Mining”. “Frequent Itemset Mining” is one application of “Data Mining”, used to mine “Frequent Patterns” from databases. Its limitation is that it extracts only patterns that occur frequently, and items that are frequent may or may not be profitable to an organization. To overcome this limitation, “High Utility Itemset Mining” was introduced. It mines profitable patterns from databases, which can help organizations market profitable itemsets. “High Utility Itemset Mining” has its own disadvantage: the length of an itemset is not taken into consideration during mining. The longer the itemset, the more utility it tends to accumulate, so raw utility does not give a realistic representation of the value or profit of an itemset. To get over this problem, “High Average Utility Itemset Mining” was introduced.
Keywords High Average Utility Itemset Mining · Transaction utility · Transaction maximum utility
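To make the length bias concrete, the following minimal sketch compares the utility of an itemset with its average utility. It assumes the standard definition from the high average utility literature, in which the average utility of an itemset is its utility divided by the number of items it contains; the abstract does not spell this formula out. The toy database, the unit profits, and the function names `utility` and `average_utility` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List

Transaction = Dict[str, int]  # item -> purchased quantity

# Hypothetical toy database and unit profits, invented for illustration only.
TRANSACTIONS: List[Transaction] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
]
UNIT_PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset: Iterable[str],
            transactions: List[Transaction] = TRANSACTIONS,
            unit_profit: Dict[str, int] = UNIT_PROFIT) -> int:
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    items = set(itemset)
    total = 0
    for t in transactions:
        if all(i in t for i in items):
            total += sum(t[i] * unit_profit[i] for i in items)
    return total


def average_utility(itemset: Iterable[str],
                    transactions: List[Transaction] = TRANSACTIONS,
                    unit_profit: Dict[str, int] = UNIT_PROFIT) -> float:
    """Assumed definition: utility divided by the number of items in the itemset."""
    items = set(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    return utility(items, transactions, unit_profit) / len(items)


if __name__ == "__main__":
    for candidate in ({"a"}, {"a", "c"}):
        print(sorted(candidate), utility(candidate), average_utility(candidate))
```

In this toy data the longer itemset {a, c} has a higher utility (19) than {a} alone (15), yet its average utility (9.5) is lower, which is the kind of length effect that averaging is meant to correct.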
M. J. Kenny Kumar (B) · D. Rana, Sardar Vallabhbhai National Institute of Technology, Surat, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. N. Chaki et al. (eds.), Proceedings of International Conference on Computational Intelligence and Data Engineering, Lecture Notes on Data Engineering and Communications Technologies 56, https://doi.org/10.1007/978-981-15-8767-2_30
1 Introduction The main idea of “Data Mining” [1, 2] is to discover knowledge in very large amounts of data. One way to discover such knowledge is “Frequent Itemset Mining”, whose algorithms find the itemsets that occur frequently in a database. These frequent itemsets are then used to mine association rules [3]. For example, in a supermarket setting, milk and bread are bought together more frequently than milk and diapers; therefore, milk and bread is a frequent itemset, as its support exceeds the minimum support [3]. Many contributions have been made to date for “Frequent Itemset Mining” [3–9]. A critical drawback of “Frequent Itemset Mining” is that it mines only “Frequent Itemsets” and not the itemsets that are profitable to an organization. For example, the itemset Home automation system (1), Laptop (1) is more profitable than Bread (10), Milk (10). In this example, although the home automation system and laptop have a frequency of only 1, the itemset is more profitable than bread and milk, which have a frequency of 10. To overcome this drawback, “High Utility Itemset Mining” (HUIM) [10–13] was proposed. “Utility” represents the profit or weight of an itemset. Unlike “Frequent Itemset Mining”, HUIM does not adhere to the “Downward Closure Property”: a subset of a high utility itemset need not itself have high utility, so a low utility itemset cannot simply be pruned together with all of its supersets. Developing upper bounds to eliminate itemsets is therefore a challenging task in HUIM. Applications of “Utility Itemset Mining” i
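The contrast between frequency and utility, and the failure of the downward closure property for utility, can be sketched with the bread, milk, laptop, and home automation example from the text. The unit profits, the threshold `MIN_UTIL`, and the function names below are assumptions made for illustration; the text gives only the frequencies (10 and 1).

```python
from typing import Dict, Iterable, List

Transaction = Dict[str, int]  # item -> purchased quantity

# Ten baskets of bread and milk, one basket with a home automation system and a laptop.
DATABASE: List[Transaction] = (
    [{"bread": 1, "milk": 1} for _ in range(10)]
    + [{"home_automation": 1, "laptop": 1}]
)
# Hypothetical unit profits (not given in the text).
UNIT_PROFIT: Dict[str, int] = {
    "bread": 1, "milk": 1, "home_automation": 200, "laptop": 300,
}
MIN_UTIL = 400  # hypothetical minimum utility threshold


def support(itemset: Iterable[str], db: List[Transaction] = DATABASE) -> int:
    """Number of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(1 for t in db if all(i in t for i in items))


def utility(itemset: Iterable[str],
            db: List[Transaction] = DATABASE,
            profit: Dict[str, int] = UNIT_PROFIT) -> int:
    """Total profit of the itemset over the transactions that contain it."""
    items = set(itemset)
    return sum(
        sum(t[i] * profit[i] for i in items)
        for t in db
        if all(i in t for i in items)
    )


if __name__ == "__main__":
    for candidate in ({"bread", "milk"}, {"home_automation", "laptop"}, {"laptop"}):
        print(sorted(candidate), support(candidate), utility(candidate))
    # Support never grows when items are added, but utility can:
    # {"laptop"} is below MIN_UTIL while its superset is above it.
    print(utility({"laptop"}) >= MIN_UTIL,
          utility({"home_automation", "laptop"}) >= MIN_UTIL)
```

Under these assumed profits, {bread, milk} is frequent but earns little, while {home automation, laptop} occurs once and earns far more. The itemset {laptop} falls below the threshold although its superset exceeds it, so pruning by utility alone would discard a profitable itemset. This is why HUIM algorithms rely on upper bounds rather than on the utility value itself.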