A robust spectral clustering algorithm based on grid-partition and decision-graph

  • PDF / 1,761,493 Bytes
  • 12 Pages / 595.276 x 790.866 pts Page_size
  • 20 Downloads / 191 Views

DOWNLOAD

REPORT


ORIGINAL ARTICLE

A robust spectral clustering algorithm based on grid‑partition and decision‑graph Lijuan Wang1,3 · Shifei Ding1,2   · Yanru Wang1 · Ling Ding1 Received: 16 April 2020 / Accepted: 2 November 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Spectral clustering (SC) transforms the dataset into a graph structure, and then finds the optimal subgraph by the way of graph-partition to complete the clustering. However, SC algorithm constructs the similarity matrix and feature decomposition for overall datasets, which needs high consumption. Secondly, k-means is taken at the clustering stage and it selects the initial cluster centers randomly, which leads to the unstable performance. Thirdly, SC needs prior knowledge to determine the number of clusters. To deal with these issues, we propose a robust spectral clustering algorithm based on grid-partition and decision-graph (PRSC) to reduce the amount of calculation and improve the clustering efficiency. In addition, a decisiongraph method is added to identify the cluster centers quickly to improve the algorithm stability without any prior knowledge. A numerical experiments validate that PRSC algorithm can effectively improve the efficiency of SC. It can quickly obtain the stable performance without any prior knowledge. Keywords  Spectral clustering · Grid-partition method · Decision-graph method · Stable performance

1 Introduction In the big data era, clustering is widely used in data mining, pattern recognition, image segmentation and maybe it is the most important technique of unsupervised learning [1–3]. The mainly goal of clustering is to divide the original data points into different clusters according to their similarity and the similarity of the same cluster is larger and the dissimilarity between different clusters is larger [4]. However, the performance of most well-known traditional clustering algorithms, such as the k-means algorithm, the affinity propagation (AP) algorithm, and the density-based spatial clustering of applications with noise (DBSCAN) algorithm are

* Shifei Ding [email protected] 1



School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China

2



Mine Digitization Engineering Research Center of Ministry of Education of the People’s Republic of China, Xuzhou 221116, China

3

School of Information and Electrical Engineering, Xuzhou College of Industrial Technology, Xuzhou 221400, China



powerless when processing the datasets with cross-classes and complex distribution structure [5–7]. Spectral clustering (SC) algorithm is a clustering method based on graph theory [8], which is a classical kernel-based method. For a given dataset clustering, it constructs an undirected weighted graph, where the vertices of the graph represent data points, and each edge of the graph has a weight to describe the similarity between the vertices [9]. Aggregating data points into multiple clusters is equivalent to dividing the graph into several subgraphs, so that the conn