Association Rules Mining by Improving the Imperialism Competitive Algorithm (ARMICA)

Abstract. Many algorithms, such as Apriori, have been proposed for Association Rules Mining (ARM). However, such algorithms often have a downside for real-world use: they rely on users to set two parameters manually, namely minimum Support and Confidence. In this paper, we propose Association Rules Mining by improving the Imperialism Competitive Algorithm (ARMICA), a novel ARM method, based on the heuristic Imperialism Competitive Algorithm (ICA), for finding frequent itemsets and extracting rules from datasets, whilst setting support automatically. In contrast to many ARM algorithms, its structure allows it to produce only the strongest and most frequent rules, thus alleviating the need to define minimum support and confidence. Experimental results indicate that ARMICA generates accurate rules faster than Apriori.

Keywords: Association rules mining · Data mining · Knowledge engineering

© IFIP International Federation for Information Processing 2016. Published by Springer International Publishing Switzerland 2016. All Rights Reserved.
L. Iliadis and I. Maglogiannis (Eds.): AIAI 2016, IFIP AICT 475, pp. 242–254, 2016. DOI: 10.1007/978-3-319-44944-9_21

1 Introduction

With the dramatic increase in the amount of available data, searching for information in databases manually with queries is possible, but it takes a long time and is not always efficient [1]. Alternatively, data mining, and particularly Association Rules Mining (ARM), has been proposed to analyze databases and extract frequent patterns and rules. A well-known family of ARM algorithms employs a two-step approach: find the frequent list, then extract rules from that list [2].

However, many ARM algorithms still face challenges, as they require the user to define two parameters, Minimum Support and Minimum Confidence [3–5]. Another challenge for Apriori-style ARM algorithms is that they use a time- and resource-consuming process named pruning, which checks the feasibility of a rule and removes weak rules [5]. ARM algorithms could be improved by approaches that automatically determine the values of parameters like minimum support and confidence [6–8], as well as by replacing pruning with more efficient methods. Although research has addressed this topic, more needs to be done.

In this paper, we propose Association Rules Mining by improving the Imperialism Competitive Algorithm (ARMICA), a novel dynamic method based on the heuristic Imperialism Competitive Algorithm (ICA) [9]. ARMICA customizes itself according to the database that it operates on, and eliminates the dependence on initial parameters like minimum support and confidence. It automatically selects the strongest rules from the database. Moreover, in contrast with algorithms like Apriori, it does not involve time-consuming pruning: because it eliminates all weak items, pruning is no longer needed. ARMICA improves ICA in that it selects imperialists among the most powerful countries, rather than randomly, thus alleviating the need for a