Alternative Method for Incrementally Constructing the FP-Tree

The FP-tree is an effective data structure that facilitates the mining of frequent patterns from transactional databases. But, transactional databases are dynamic in general, and hence modifications on the database must be reflecting onto the FP-tree. Con

PDF / 494,172 Bytes
17 Pages / 439.18 x 665.994 pts Page_size
32 Downloads / 224 Views

DOWNLOAD

REPORT

Department of Computer Science, University of Calgary, Calgary, AB, Canada Department of Computer Science, Global University, Beirut, Lebanon

Summary. The FP-tree is an eﬀective data structure that facilitates the mining of frequent patterns from transactional databases. But, transactional databases are dynamic in general, and hence modiﬁcations on the database must be reﬂecting onto the FP-tree. Constructing the FP-tree from scratch and incrementally updating the FP-tree are two possible choices. However, from scratch construction turns unfeasible as the database size increases. So, this chapter addresses incremental update by extending the FP-tree concepts and manipulation process. Our new approach is capable of handling all kinds of changes, include additions, deletions and modiﬁcations. The target is achieved by constructing and incrementally dealing with the complete FP-tree, i.e., with one as the minimum support threshold. Constructing the complete FP-tree has the other advantage that it provides the freedom of mining for lower minimum support values without the need to reconstruct the tree. However, directly reﬂecting the changes onto the FP-tree may invalidate the basic FP-tree structure. Thus, we apply a sequence of shuﬄing and merging operations to validate and maintain the modiﬁed tree. The experiments conducted on synthetic and real datasets clearly highlight advantages of the proposed incremental approach over constructing the FP-tree from scratch.

1 Introduction Data mining is the process of discovering and predicting hidden and unknown knowledge by analyzing known databases. It is diﬀerent from querying in the sense that querying is a retrieval process, while mining is a discovery process. Data mining has received considerable attention over the past decades and a number of eﬀective mining techniques already exist. They are well investigated and documented in the literature. However, existing and emerging applications of data mining motivated the development of new techniques and the extension of existing ones to adapt to the change. Data mining has several applications, including market analysis, pattern recognition, gene expression data analysis, spatial data analysis, among others. Muhaimenul et al.: Alternative Method for Incrementally Constructing the FP-Tree, Studies in Computational Intelligence (SCI) 109, 361–377 (2008) c Springer-Verlag Berlin Heidelberg 2008 www.springerlink.com

362

Muhaimenul et al.

Association rules mining is a technique that investigates the correction between items within the transactions of a given database. Explicitly, given a database of transactions, such that each transaction contains a set of items, the association rules mining process determines correlations of the form X ⇒ Y , such that X and Y are disjoint sets of items from the investigated database. A correlation X ⇒ Y is characterized by support and conﬁdence. The former refers to the percentage of transactions that contain all items in X ∪Y by considering all the transactions in the database; and th

Data Loading...

Alternative Method for Incrementally Constructing the FP-Tree

Recommend Documents

Method for constructing theoretical phase diagrams for tetramethylammonium family crystals

An alternative method for the recovery of lithium from spodumene

Evolutionary Method of Constructing Artificial Intelligence Systems

Constructing Mixed Algorithms on the Basis of Some Bundle Method

Quality-Days Incrementally Gained: A New Approach for Short-Term Evaluations Based on the Extended Q-TWiST Method Illust

An alternative method for penetration depth determination in nanoindentation measurements

Clustering by Intent: A Semi-Supervised Method to Discover Relevant Clusters Incrementally

Method of Constructing Node Map in Wireless Mesh Sensor Network

Accuracy Estimation for an Incrementally Learning Cooperative Inventory Assistant Robot

Approximately Bisimilar Symbolic Models for Incrementally Stable Switched Systems

Constructing PICES

Technologies for Constructing Intelligent Systems 1 Tasks