An improved density-based adaptive p -spectral clustering algorithm

  • PDF / 2,533,786 Bytes
  • 12 Pages / 595.276 x 790.866 pts Page_size
  • 108 Downloads / 261 Views

DOWNLOAD

REPORT


ORIGINAL ARTICLE

An improved density‑based adaptive p‑spectral clustering algorithm Yanru Wang1 · Shifei Ding1,2   · Lijuan Wang1 · Ling Ding1 Received: 30 May 2020 / Accepted: 5 November 2020 © Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract As a generalization algorithm of spectral clustering, p-spectral clustering has gradually attracted extensive attention of researchers. Gaussian kernel function is generally used in traditional p-spectral clustering to construct the similarity matrix of data. However, the Gaussian kernel function based on Euclidean distance is not effective when the data-set is complex with multiple density peaks or the density distribution is uniform. In order to solve this problem, an improved Density-based adaptive p-spectral clustering algorithm (DAPSC) is proposed, the prior information is considering to adjust the similarity between sample points and strengthen the local correlation between data points. In addition, by combining the density canopy method to update the initial clustering center and the number of clusters, the algorithm sensitivity of the original p-spectral clustering caused by the two is weakened. By experiments on four artificial data-sets and 8F UCI data-sets, we show that the proposed DAPSC has strong adaptability and more accurate compared with the four baseline methods. Keywords  Spectral clustering · p-laplacian matrix · Similarity matrix · Density canopy

1 Introduction As an unsupervised data mining technology, clustering has been widely applied in fields such as medicine, computer vision, and social network [1–3]. Traditional clustering algorithms work in convex spherical spaces, for instance K-means and EM algorithms. Otherwise, they will easily fall into local optimality. In recent years, spectral clustering algorithm has attracted much attention in the field of machine learning and pattern recognition due to its ability to deal with nonlinear separable problems and well-defined mathematical frameworks. Spectral clustering is a clustering method to transform cluster analysis into graph cutting problems. The goal of graph cutting is to maximize the difference between the sub-graphs and the similarity within the sub-graphs [4]. Compared with the traditional clustering algorithm, spectral clustering has obvious advantages. It

* Shifei Ding [email protected] 1



School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China



Mine Digitization Engineering Research Center of Ministry of Education of the People’s Republic of China, Xuzhou 221116, China

2

can cluster in any space and converge to the global optimal solution, which is especially suitable for non-convex datasets. Therefore, it is applicable to many practical application fields [5]. Recently, a series of improved spectral clustering algorithms and their applications have been proposed [6–8]. According to the improvement strategy, these algorithms are mainly divided into two categories. One is to enhance the adaptability of the meth