Uncertain maximal frequent subgraph mining algorithm based on adjacency matrix and weight

PDF / 2,077,167 Bytes
11 Pages / 595.276 x 790.866 pts Page_size
40 Downloads / 211 Views

ORIGINAL ARTICLE

Uncertain maximal frequent subgraph mining algorithm based on adjacency matrix and weight Di Wu1 · Jiadong Ren2 · Long Sheng1

Received: 17 May 2016 / Accepted: 20 February 2017 © Springer-Verlag Berlin Heidelberg 2017

Abstract How to mine many interesting subgraphs in uncertain graph has become an important research field in data mining. In this paper, a novel algorithm Uncertain Maximal Frequent Subgraph Mining Algorithm Based on Adjacency Matrix and Weight (UMFGAMW) is proposed. The definition of the adjacency matrix and the standard matrix coding for uncertain graph are presented. The correspondence between the adjacency matrix and uncertain graph is established. A new vertex ordering policy for computing the standard coding of uncertain graph adjacency matrix is designed. The complexity of uncertain graph standard coding is reduced, and the matching speed of uncertain subgraph standard coding is improved. The definition of the weight of uncertain graph and the mean weight of uncertain edge is proposed. The importance of the uncertain subgraphs that meet the minimum support threshold in the graph dataset is fully considered. Finally, a depth-first search weighted uncertain maximal frequent subgraph mining algorithm is discussed. According to the limiting condition of the uncertain maximum frequent subgraph and weighed uncertain edge, the number of mining results is reduced effectively. Experimental results demonstrate that the UMFGAMW algorithm has higher efficiency and better scalability. Keywords Adjacency matrix · Weight · Uncertain graph · Maximum frequent subgraph · Frequent subgraph mining

* Di Wu [email protected] 1

Hebei University of Engineering, Handan, Hebei, China

2

Yanshan University, Qinhuangdao, Hebei, China

1 Introduction Recently, more areas of science attempt to describe complex structure objects by graph structure [1]. In the field of biology and medicine, scientists often use the graph structure to express the internal structure of macromolecules. In the internet field, the graph structure is used to describe the link relationships and views between websites. Since graph data have been more widely applied, graph data mining has become a hot topic in data mining research [2]. In practical applications, graph data have the following characteristics [3–5]: First, there is a large quantity amount of data. Second, the data have a fast data growth rate. The update of graph data is highly frequent. Because of these features and immature technical conditions, the accuracy of obtained graph data is poor, and uncertainty exists. For example, the field of molecular biology utilizes graph data to describe protein and its interaction network. Protein is often denoted by vertices, and the relationship of proteins is represented by the connection between vertices. Due to the restriction of detection methods during experiments, certain parts can not be accurately detected, creating uncertainty. The graph data shows the presence of the edge based on a certain probability. In su

Data Loading...

Uncertain maximal frequent subgraph mining algorithm based on adjacency matrix and weight

Recommend Documents

Mining Maximal Frequent Subtrees with Lists-Based Pattern-Growth Method

A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs

Efficient Mining of Weighted Frequent Itemsets in Uncertain Databases

A Note on the Practicality of Maximal Planar Subgraph Algorithms

A Frequent Concept Lattice Algorithm Based on FP-tree for Mining Association Rules

Maximal Itemset Mining

Frequent Itemset Mining

Constrained Frequent Itemset Mining

Comprehensive mining of frequent itemsets for a combination of certain and uncertain databases

The Maximum Weight Connected Subgraph Problem

Frequent Set Mining with Constraints

Managing and Mining Uncertain Data