Multiple clustering and selecting algorithms with combining strategy for selective clustering ensemble
FOUNDATIONS
Tinghuai Ma1 · Te Yu1 · Xiuge Wu1 · Jie Cao2 · Alia Al-Abdulkarim3 · Abdullah Al-Dhelaan3 · Mohammed Al-Dhelaan3
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract  Clustering ensemble can overcome the instability of clustering and improve clustering performance. With the rapid development of clustering ensembles, it has become clear that not all clustering solutions contribute effectively to the final result. In this paper, we focus on the selection strategy in selective clustering ensembles. We propose a multiple clustering and selecting approach (MCAS), which is based on different original clustering solutions. Furthermore, we present two combining strategies, direct combining and clustering combining, to combine the solutions selected by MCAS. These combining strategies merge the results of MCAS and yield a more refined subset of solutions than traditional selective clustering ensemble algorithms and single clustering and selecting algorithms. Experimental results on UCI machine learning datasets show that multiple clustering and selecting algorithms with a combining strategy perform well on most datasets and outperform most selective clustering ensemble algorithms.

Keywords  Selective clustering ensemble · Clustering solution · Multiple clustering and selecting algorithms · Combining strategy
Communicated by A. Di Nola.

Corresponding author: Tinghuai Ma, [email protected]
Te Yu, [email protected]
Alia Al-Abdulkarim, [email protected]
Abdullah Al-Dhelaan, [email protected]
Mohammed Al-Dhelaan, [email protected]

1 School of Computer, Nanjing University of Information Science and Technology, Nanjing 210044, China
2 School of Economics and Management, Nanjing University of Information Science and Technology, Nanjing 210044, China
3 Computer Science Department, College of Computer and Information Science, King Saud University, Riyadh 11362, Saudi Arabia

1 Introduction

Clustering is one of the most important tools in data mining. The major goal of clustering is to seek a grouping in which intra-group similarity is large and inter-group similarity is small. However, applying different methods, or the same method with different parameters, to the same dataset can produce different results. A basic challenge in clustering is therefore choosing a suitable algorithm for a given dataset. Strehl and Ghosh (2003) proposed the clustering ensemble, which combines independent clustering results rather than selecting the best one. Clustering ensemble, also known as clustering aggregation or consensus clustering, is characterized by high robustness, stability, novelty, scalability and parallelism (Yu et al. 2014; Lv et al. 2016; Jia et al. 2011; Ma et al. 2018). In addition, clustering ensemble has advantages in privacy protection and knowledge reuse. It only needs access to the clustering solutions rather than the original data, so it protects the privacy of the original data (Akbari et al. 2015). Clustering ensemble uses
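To make the clustering-ensemble idea above concrete, the following is a minimal sketch, not the MCAS method proposed in this paper, that generates several base k-means solutions and merges them through a co-association matrix; the dataset, parameter values and scikit-learn usage are our own illustrative assumptions.

# Minimal co-association clustering-ensemble sketch (illustrative only,
# not the authors' MCAS algorithm). Assumes scikit-learn and the Iris data.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import load_iris

X = load_iris().data
n, n_members, k = X.shape[0], 10, 3

# 1. Generate diverse base clustering solutions by varying the random seed.
labels_list = [KMeans(n_clusters=k, n_init=1, random_state=s).fit_predict(X)
               for s in range(n_members)]

# 2. Build the co-association matrix: the fraction of base solutions that
#    place each pair of samples in the same cluster.
co_assoc = np.zeros((n, n))
for labels in labels_list:
    co_assoc += (labels[:, None] == labels[None, :])
co_assoc /= n_members

# 3. Consensus clustering on the co-association matrix, treating
#    (1 - co_assoc) as a precomputed distance
#    (older scikit-learn versions use affinity= instead of metric=).
consensus = AgglomerativeClustering(n_clusters=k, metric="precomputed",
                                    linkage="average").fit_predict(1.0 - co_assoc)
print(consensus[:10])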