Intra-cluster Similarity Index Based on Fuzzy Rough Sets for Fuzzy C-Means Algorithm



Abstract. Cluster validity indices have been used to evaluate the quality of fuzzy partitions. In this paper, we propose a new index, which uses concepts of Fuzzy Rough sets to evaluate the average intra-cluster similarity of fuzzy clusters produced by the fuzzy c-means algorithm. Experimental results show that, compared with several well-known cluster validity indices, the proposed index yields more desirable estimates of the cluster number.

Keywords: Fuzzy c-means algorithm, Fuzzy Rough sets, Intra-cluster similarity, Cluster validity index.

1 Introduction

Cluster analysis, which aims to reveal the structure existing in a given data set (a set of patterns), can be viewed as the problem of dividing the data set into a few compact subsets. The fuzzy c-means (FCM) algorithm [1] for cluster analysis has been the dominant approach in both theoretical and practical applications of fuzzy techniques for the last two decades. The aim of FCM is to partition a given set of data points (patterns) $X = \{x_1, x_2, \cdots, x_n\} \subset \mathbb{R}^p$ into $c$ clusters represented as fuzzy sets $F_1, F_2, \cdots, F_c$. The FCM objective function has the form

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, \|x_j - v_i\|^2, \tag{1}$$

where $v_i$ is the centroid of the fuzzy cluster $F_i$, $\|\cdot\|$ is a certain distance function, the exponent $m > 1$ is a fuzzifier, $u_{ij} = F_i(x_j)$ is the membership value of $x_j$ belonging to $F_i$, satisfying $\sum_{i=1}^{c} u_{ij} = 1$ $(j = 1, 2, \cdots, n)$ and $0 < \sum_{j=1}^{n} u_{ij} < n$ $(i = 1, 2, \cdots, c)$, $U = [u_{ij}]$ is the partition matrix, and $V = \{v_1, v_2, \cdots, v_c\}$ is the set of all cluster centroids. FCM iteratively updates $U$ and $V$ to minimize $J_m(U, V)$ until a certain termination criterion has been satisfied. In FCM, a fuzzy partition is denoted as $(U, V)$.

In FCM, if $c$ is not known a priori, a cluster validity index must be used to evaluate the quality of fuzzy partitions for different values of $c$ in order to find the optimal cluster number. In the most cited indices, e.g. the Xie-Beni index [2] and the Fukuyama-Sugeno index [3], the intra-cluster similarity of a fuzzy partition is estimated by using distances between data points and cluster centroids. But this approach is not effective for large values of $c$, because $\lim_{c \to n} \|x_j - v_i\|^2 = 0$ (see [4,5]). To overcome this shortcoming, the Kwon index [5] was proposed, and another kind of index has been proposed in recent years [6,7]. This kind of index only considers the inter-cluster proximity, which is evaluated by the membership values of each data point belonging to all fuzzy clusters, whereas the distance function is not taken into account.

In this paper, we propose a new method to assess the intra-cluster similarity of a fuzzy cluster by using the concepts of Fuzzy Rough sets. The intra-cluster similarity index of a fuzzy partition obtained from FCM is defined as the average intra-cluster similarity of all fuzzy clusters. Experimental results indicate that the

G. Wang et al. (Eds.): RSKT 2008, LNAI 5009, pp. 316–323, 2008. © Springer-Verlag Berlin Heidelberg 2008
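As an illustration of the FCM objective (1) and the alternating updates of U and V described in this section, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the Euclidean norm as the distance function, uses the standard FCM update rules (which are not written out in the text above), and initializes the centroids from randomly chosen data points. All function names (`fcm_objective`, `update_memberships`, `update_centroids`, `fcm`) and the parameters `eps`, `tol`, `max_iter`, and `seed` are hypothetical choices made for this example.

```python
import numpy as np


def fcm_objective(X, U, V, m=2.0):
    """J_m(U, V) = sum_i sum_j u_ij^m * ||x_j - v_i||^2 (Euclidean norm assumed)."""
    # d2[i, j] is the squared distance between centroid v_i and point x_j.
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return float(((U ** m) * d2).sum())


def update_memberships(X, V, m=2.0, eps=1e-12):
    """Standard FCM membership update; every column of U sums to 1."""
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, eps)  # assumption: clamp to avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)


def update_centroids(X, U, m=2.0):
    """Standard FCM centroid update: means weighted by u_ij^m."""
    W = U ** m
    return (W @ X) / W.sum(axis=1, keepdims=True)


def fcm(X, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Alternate the two updates until the memberships stop changing."""
    rng = np.random.default_rng(seed)
    # Assumption: initial centroids are c distinct data points.
    V = X[rng.choice(len(X), size=c, replace=False)].astype(float)
    U = update_memberships(X, V, m)
    for _ in range(max_iter):
        V = update_centroids(X, U, m)
        U_new = update_memberships(X, V, m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two small synthetic blobs in R^2 (made-up data for illustration only).
    X = np.vstack([rng.normal(0.0, 0.3, (6, 2)), rng.normal(5.0, 0.3, (6, 2))])
    for c in (2, 4, len(X)):
        U, V = fcm(X, c)
        print(c, fcm_objective(X, U, V))
```

On the synthetic data in the demo, the objective value for c = n is far smaller than for c = 2, because every point can sit on its own centroid. This is the behaviour that makes purely distance-based intra-cluster measures uninformative for large c, as noted in the discussion of the limit above.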
