A New Robust Fuzzy Clustering Approach: DBKIFCM


Anjana Gosain¹ · Sonika Dahiya²

Accepted: 1 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
A clustering algorithm, Distance Based Gaussian Kernelized Intuitionistic Fuzzy C Means (DBKIFCM), is proposed. The algorithm is based on a Gaussian kernel, outlier identification, and intuitionistic fuzzy sets. It is intended to address the presence of outliers and the problem of sensitivity to initialization (STI), and is motivated by the good performance of Radial Based Kernelized Intuitionistic Fuzzy C Means (KIFCM-RBF). Experiments are performed on standard 2D data sets such as Diamond (D12 and D15) and Dunn, and on real-world high-dimensional data sets such as Fisheriris, Wisconsin breast cancer, and Wine. DBKIFCM is compared with Fuzzy C Means (FCM), Intuitionistic Fuzzy C Means (IFCM), KIFCM-RBF, and Density Oriented Fuzzy C Means (DOFCM). The proposed approach is observed to significantly outperform these earlier algorithms with respect to outlier identification, effect of noise, STI, and clustering error.

Keywords Fuzzy clustering · Outlier identification · Kernel function · FCM · IFCM · KIFCM · DOFCM

Sonika Dahiya [email protected]
¹ USICT, GGSIP University, Sector 16C, Dwarka, Delhi 110075, India
² CSE, Delhi Technological University, Main Bawana Road, Shahbad Daulatpur, Delhi 110042, India

1 Introduction
Clustering is useful for identifying natural boundaries in a data set, whereas fuzzy clustering is useful where cluster boundaries are not sharp. Fuzzy clustering deals with fuzziness, uncertainty, and vagueness in data. It is therefore used extensively in domains such as medical imaging [1], astronomy, finance, marketing, and robust design [2], and in applications such as object recognition, customer segmentation, fault diagnosis [3], pattern recognition, and image segmentation [4–6].

In 1965, Lotfi Zadeh introduced fuzzy sets [7]. In 1974, Dunn proposed the ISODATA algorithm [8] by incorporating fuzzy logic into clustering, and in 1981 J.C. Bezdek proposed FCM (Fuzzy C-Means) [9], an extension of ISODATA that is widely accepted and successfully applied in fuzzy clustering applications. However, the performance of FCM is affected by the presence of noise and outliers. To smooth the effect of noise, Tolias and Panas [10] suggested post-processing of the membership function introduced in FCM. To enforce spatial constraints, Acton and Mukherjee [11, 12] incorporated multiscale information. In 1993, Krishnapuram et al. proposed Possibilistic C Means (PCM) [12, 13], which interprets clustering as a possibilistic partition. However, PCM is sensitive to initialization, and some data objects are allocated to more than one cluster, which causes cluster identity or overlap issues. In 2005, Nikhil R. Pal et al. proposed Possibilistic Fuzzy C Means [12, 14] to overcome the identical cluster problem of PCM by gen