A survey of density based clustering algorithms


Panthadeep BHATTACHARJEE, Pinaki MITRA

Department of Computer Science and Engineering, Indian Institute of Technology, Guwahati 781039, India

© Higher Education Press 2020

Abstract Density based clustering algorithms (DBCLAs) rely on the notion of density to identify clusters of arbitrary shapes and sizes with varying densities. Existing surveys on DBCLAs cover only a selected set of algorithms. They do not provide extensive information about the variety of DBCLAs proposed to date, nor do they offer a taxonomy of the algorithms. In this paper we present a comprehensive survey of the DBCLAs proposed over the last two decades, along with their classification. We group the DBCLAs under four categories: density definition, parameter sensitivity, execution mode and nature of data, and further divide them into classes within each category. In addition, we compare the DBCLAs through their common features and through variations in citation and conceptual dependencies. We identify application areas of DBCLAs in domains such as astronomy, earth sciences, molecular biology, geography and multimedia. Our survey also identifies probable future directions in which density based methods may lead to favorable results.

Keywords clustering, density based clustering, survey, classification, common properties, applications

1 Introduction

Clustering is an unsupervised learning task that groups data objects or patterns based on similarity measures. Such objects may exist as data points in a d-dimensional space R^d. Entities belonging to the same cluster are more similar to each other than to entities belonging to a different cluster [1–3]. Cluster analysis is done with the objective of summarizing the data or improving the understanding of it in context, e.g., grouping related documents for browsing, finding protein structures and genes with analogous functions, or compressing data [4]. A large number of clustering techniques have been developed for pattern analysis, grouping, decision making, document retrieval, image segmentation and data mining, yet many significant challenges remain in determining the clusters correctly. Clustering approaches are broadly classified into partitional, hierarchical and density based methods (refer to Fig. 1) [1]. Partitional methods produce a single partition of the data rather than a clustering structure. Partitional approaches include squared-error methods (e.g., the K-means algorithm), graph-theoretic clustering, mixture resolving (e.g., the EM algorithm) and mode-seeking methods [1]. Hierarchical clustering produces a dendrogram that represents the nested grouping of patterns, e.g., Chameleon [5]. Hierarchical methods adopt either an agglomerative or a divisive approach to determine the clustering. Density based clustering relies on the notion of the density of a region. The objective of DBCLAs is to find clu

Received February 17, 2019; accepted September 9, 2019
