A Clustering Algorithm for Triangular Fuzzy Normal Random Variables

Ye Li¹ • Yiyan Chen² • Qun Li³

Received: 16 April 2019 / Revised: 29 June 2020 / Accepted: 1 August 2020
© Taiwan Fuzzy Systems Association 2020

Ye Li, Yiyan Chen and Qun Li contributed equally to this work and should be considered co-first authors.

✉ Yiyan Chen [email protected]

¹ University of Chinese Academy of Social Sciences (Graduate School), Beijing 102488, China

² School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China

³ Institute of Quantitative & Technical Economics, Chinese Academy of Social Sciences, Beijing 100732, China

Abstract Most clustering algorithms cannot handle samples that carry uncertain information. Drawing on fuzzy set theory and probability theory, we first define the fuzzy-probability binary measure space and triangular fuzzy normal random variables. We then build on the strengths of the k-means algorithm (simple principle, few parameters, fast convergence, good clustering effect and good scalability) to propose a clustering algorithm for samples containing multiple triangular fuzzy normal random variables, which we call the TFNRV-k-means algorithm. The algorithm uses our proposed Euclidean random comprehensive absolute distance (ERCAD) as its distance measure. Under the fuzzy measure, the lower bound, the principal value and the upper bound of the triangular fuzzy normal random variables are each updated by taking means, and the cluster centers are updated until they no longer change. We then analyze the time complexity of the proposed algorithm and test it on different sample sets by random simulation experiments. The highest clustering accuracy obtained is 99.00% and the maximum Kappa coefficient is 0.9850, from which we conclude that the TFNRV-k-means clustering algorithm has a good clustering effect. Finally, we summarize the article, list the advantages and disadvantages of the TFNRV-k-means clustering algorithm, and propose corresponding improvements, which provide ideas for further research on TFNRV-k-means.

Keywords Fuzzy-probability binary measure space · Triangular fuzzy normal random variables · Euclidean random synthesis absolute distance · TFNRV-k-means clustering algorithm
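The iteration described in the abstract can be illustrated with a minimal sketch. It assumes each feature of a sample is stored as a triple (lower bound, principal value, upper bound) and that centers are updated by averaging each of the three components separately. The ERCAD distance is not defined in this excerpt, so the sketch substitutes a plain Euclidean distance over all components as a stand-in; the names `triple_distance`, `mean_center` and `tfn_kmeans` are hypothetical and are not taken from the paper.

```python
import math
import random


def triple_distance(a, b):
    """Stand-in distance (an assumption, NOT the paper's ERCAD):
    Euclidean distance over the lower, principal and upper values
    of every feature."""
    return math.sqrt(
        sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    )


def mean_center(members):
    """Average the lower bound, principal value and upper bound
    separately for each feature of the cluster members."""
    n = len(members)
    n_features = len(members[0])
    return [
        tuple(sum(s[j][c] for s in members) / n for c in range(3))
        for j in range(n_features)
    ]


def tfn_kmeans(samples, k, max_iter=100, seed=0):
    """k-means-style loop on samples whose features are
    (lower, principal, upper) triples. Hypothetical helper."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)  # initial centers drawn from the data
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the stand-in distance.
        new_labels = [
            min(range(k), key=lambda c: triple_distance(s, centers[c]))
            for s in samples
        ]
        if new_labels == labels:
            break  # assignments (and hence centers) are stable
        labels = new_labels
        # Update step: component-wise means; an empty cluster keeps its center.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centers[c] = mean_center(members)
    return labels, centers
```

Calling `tfn_kmeans(samples, k)` with a list of samples, each a list of `(lower, principal, upper)` tuples, returns one label per sample and the final centers. Replacing `triple_distance` with an implementation of ERCAD, once its definition is available, would leave the rest of the loop unchanged.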

1 Introduction

Clustering analysis is an important data mining method that has been widely used in fields such as pattern recognition, image analysis, market research, customer relationship management and web document classification [1]. At present, there is no generally accepted definition of clustering in academia. Everitt [2] gives the following definition: entities within a cluster are similar, and entities in different clusters are not similar; a cluster is an aggregation of points in the data space such that the distance between any two points in the same cluster is less than the distance between any point in the cluster and any point outside it. A cluster can be described as a connected domain of a
