McDPC: multi-center density peak clustering


ORIGINAL ARTICLE

Yizhang Wang1,2 • Di Wang3,4 • Xiaofeng Zhang5 • Wei Pang6 • Chunyan Miao3,4,7 • Ah-Hwee Tan3,7 • You Zhou1,2



Received: 11 July 2019 / Accepted: 23 January 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract Density peak clustering (DPC) is a recently developed density-based clustering algorithm that achieves competitive performance in a non-iterative manner. DPC effectively handles clusters with a single density peak (single center); that is, under DPC's hypothesis, one and only one data point is chosen as the center of any cluster. However, DPC may fail to identify clusters with multiple density peaks (multi-centers) and may not be able to identify natural clusters whose centers have relatively low local density. To address these limitations, we propose a novel clustering algorithm based on a hierarchical approach, named multi-center density peak clustering (McDPC). First, based on the widely adopted hypothesis that potential cluster centers are relatively far away from each other, McDPC obtains the centers of the initial micro-clusters (named representative data points) whose minimum distance to the other higher-density data points is relatively large. Second, the representative data points are autonomously categorized into different density levels. Finally, McDPC deals with the micro-clusters at each level and, if necessary, merges the micro-clusters at a specific level into one cluster to identify multi-center clusters. To evaluate the effectiveness of McDPC, we conduct experiments on both synthetic and real-world datasets and benchmark its performance against other state-of-the-art clustering algorithms. We also apply McDPC to image segmentation and facial recognition to further demonstrate its capability in dealing with real-world applications. The experimental results show that our method achieves promising performance.

Keywords Density peak clustering • Multi-center cluster • Image segmentation
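To make the quantities named in the abstract concrete, the following is a minimal sketch of the two per-point values on which standard DPC is built: the local density and the minimum distance to any higher-density point. It is not the McDPC procedure itself. The function name `dpc_quantities`, the cutoff-kernel density, the cutoff parameter `d_c`, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def dpc_quantities(points, d_c):
    """Return (rho, delta) per point, using the usual DPC definitions.

    rho[i]   : number of other points closer than the cutoff d_c.
    delta[i] : distance to the nearest point of strictly higher density;
               a point with no denser point gets its largest distance.
    Both the cutoff kernel and d_c are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distance matrix of shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Cutoff-kernel density; subtract 1 so a point does not count itself.
    rho = (dist < d_c).sum(axis=1) - 1

    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            delta[i] = dist[i].max()
    return rho, delta


# Made-up data: two well-separated groups, each with one dense center.
pts = [[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1],
       [5.0, 5.0], [5.1, 5.0], [4.9, 5.0], [5.0, 5.1]]
rho, delta = dpc_quantities(pts, d_c=0.12)

# Candidate centers are the points whose delta is relatively large.
candidates = np.argsort(delta)[-2:]
print(sorted(candidates.tolist()))
```

In this toy example the two group centers are the only points that lie far from every denser point, which is the "centers are relatively far away from each other" hypothesis in its simplest form. How McDPC then groups such representative points into density levels and merges micro-clusters is not covered by this sketch.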

✉ You Zhou [email protected]

1 College of Computer Science and Technology, Jilin University, Changchun, China
2 Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
3 Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore, Singapore
4 Joint NTU-WeBank Research Centre on FinTech, Nanyang Technological University, Singapore, Singapore
5 Department of Computer Science, Harbin Institute of Technology (Shenzhen), Shenzhen, China
6 School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK
7 School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore

1 Introduction

Density-based clustering has been widely adopted in the literature [1–3]. Moreover, it recently attracted an increasing amount of attention in data mining and patte