Improved community structure discovery algorithm based on combined clique percolation method and K-means algorithm

  • PDF / 2,423,178 Bytes
  • 10 Pages / 595.276 x 790.866 pts Page_size
  • 37 Downloads / 246 Views

DOWNLOAD

REPORT


Improved community structure discovery algorithm based on combined clique percolation method and K-means algorithm Zhou Zhou 1 & Zhuopeng Xiao 2 & WeiHong Deng 2 Received: 23 August 2019 / Accepted: 28 February 2020 # Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Research on the community structure of networks is beneficial for understanding the structure of networks, analyzing their characteristics and discovering the rules hidden in these networks. To address issues from previous community mining algorithms, such as the low rate of convergence and high time complexity, this study proposes an improved community structure discovery algorithm named CPMK-Means algorithm. The main idea of this algorithm can be summarised as follows. The clique percolation method (CPM) algorithm generates the maximum number of cliques by combining depth-first search with breadthfirst search so that the number of cluster centres is determined. Then, the k centres are selected based on the principle of the maximum degree of centres and minimum similarity between different centres. Afterwards, nodes in the network are assigned to the communities formed by the k centres, and the iterations are performed repeatedly until the centres become stable. Finally, the overlapping communities are merged. Experiments are carried out on standard data sets Football and Collins to evaluate the performance of the CPMK-Means algorithm. Results indicate that the CPMK-Means algorithm can achieve better community mining and higher execution efficiency compared with other algorithms. Furthermore, it is superior to other algorithms in terms of precision, recall, accuracy, F-measure and separation. Keywords Big data . Parallel computing . Community structure . Execution efficiency . Mining algorithm

1 Introduction In the era of mobile Internet, social software is being widely used for efficient interpersonal communication, generating massive mutually interactive social network data. According to statistics [1] (http://finance.sina.com.cn/roll/2017-11-07/ doc-ifynmvuq9346566.shtml), the number of monthly active Facebook users reached 2.07 billion as of the end of September 2017, corresponding to a year-on-year growth rate of 16%. Moreover, the number of daily active Facebook users reached 1.37 billion. Similarly, the number of monthly active users of Sina Weibo in September 2017 increased by approximately 79.00 million, compared with that in September 2016 (376.00 million). Therefore, mining for a meaningful Zhou Zhou and Zhuopeng Xiao are co-first authors * WeiHong Deng [email protected] 1

Department of Mathematics and Computer Science, Changsha University, Changsha 410083, China

2

Information Engineering Department, Zhangjiajie Institute of Aeronautical Engineering, Zhangjiajie 427000, China

community structure from social network data is one of the main objectives of the current research [2, 3]. Investigating the community structure of networks is beneficial for understanding the structure of networks, analyzing their char