Evidential evolving C-means clustering method based on artificial bee colony algorithm with variable strings and interactive evaluation mode

Zhi-gang Su1 · Hong-yu Zhou1,2 · Yong-sheng Hao1

Accepted: 24 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

The Evidential C-Means algorithm provides a global treatment of ambiguity and uncertainty in memberships when partitioning attribute data, but, like most existing clustering methods, it still requires the number of clusters to be fixed a priori. In practice, however, users usually do not know the exact number of clusters in advance, particularly in engineering applications. To relax this requirement, this paper proposes an Evidential Evolving C-Means (E2CM) clustering method in the framework of evolutionary computation: cluster centers are encoded in a population of variable strings (or particles) so that the optimal number and locations of clusters are searched for simultaneously. To solve this joint optimization problem effectively, an artificial bee colony algorithm with variable strings and an interactive evaluation mode is proposed. It will be shown that E2CM can automatically create appropriate credal partitions given only an upper bound on the number of clusters rather than the exact number. More interestingly, from a theoretical point of view there are no restrictions on this upper bound. Numerical experiments and a practical application in thermal power engineering validate these conclusions.

Keywords Belief functions · Evidential clustering · Soft clustering · Evolutionary computation · Artificial bee colony

This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 52076037 and 51876035. This paper is a revised and extended version of a short paper presented at the BELIEF 2018 conference (Su et al. 2018).

Zhi-gang Su [email protected]

1 School of Energy and Environment, Southeast University, Nanjing, Jiangsu, China

2 NARI Technology Co. Ltd., 19 Chengxin Ave, Nanjing, Jiangsu, China

1 Introduction

Clustering is one of the most important tasks in data mining and machine learning. It aims to find clusters of objects that are similar to one another but dissimilar to objects in other clusters. Clustering algorithms built on different philosophies have been developed; see, for example, Denœux and Kanjanatarakul (2016), Saxena et al. (2017) and the references therein. Among them, evidential clustering has shown a strong ability to reveal data structure and has attracted growing attention in the artificial intelligence community.

Evidential clustering describes the ambiguity and uncertainty in the membership of objects to clusters using Dempster–Shafer mass functions (Shafer 1976). A mass function can be seen as a collection of (focal) sets with corresponding masses. A collection of such mass functions for n objects is called a credal partition, which extends the existing concepts of hard, fuzzy, possibilistic and rough partitions (Denœux and Kanjanatarakul