Performance Assessment of Kernel-Based Clustering



Abstract Kernel methods replace the inner product with a positive definite function and thereby implicitly perform a nonlinear mapping of the input data into a high-dimensional feature space. Many kernel-based clustering methods have been studied, and the Gaussian kernel in particular has been found to be useful. In this paper, we investigate the role of the kernel function in clustering by applying different kernel functions to kernel-based hybrid c-means clustering and discussing the numerical results. Several synthetic data sets and a real-life data set are used for the analysis. The experimental results show that there exist other robust kernel functions that perform comparably to the Gaussian kernel.







Keywords Clustering · Kernel function · Gaussian kernel · Hyper-tangent kernel · Log kernel
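
As an illustration of the kernel substitution described in the abstract, the following minimal Python sketch computes the squared distance in feature space using only kernel evaluations, via the identity ||phi(x) - phi(v)||^2 = K(x, x) - 2K(x, v) + K(v, v). The function names, the sample points, and the bandwidth convention exp(-||x - y||^2 / sigma^2) for the Gaussian kernel are assumptions made for this sketch and are not taken from the paper.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; the exp(-d^2 / sigma^2) convention is an assumption."""
    return math.exp(-sq_dist(x, y) / sigma ** 2)


def kernel_distance_sq(kernel, x, v):
    """Squared feature-space distance, computed from kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)


# Hypothetical sample point and cluster prototype.
x, v = [0.0, 0.0], [1.0, 1.0]
d2 = kernel_distance_sq(gaussian_kernel, x, v)

# For the Gaussian kernel K(x, x) = 1, so the distance is 2 * (1 - K(x, v)).
assert math.isclose(d2, 2.0 * (1.0 - gaussian_kernel(x, v)))
```

Because the Gaussian kernel satisfies K(x, x) = 1, the kernel-induced distance reduces to 2(1 - K(x, v)). Passing a different kernel function to kernel_distance_sq yields a different metric, which is the sense in which each kernel defines its own clustering algorithm.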



1 Introduction Fuzzy clustering has emerged as an important tool for discovering the structure of data. Kernel methods have been applied to fuzzy clustering, and the kernelized version is referred to as kernel-based fuzzy clustering. The kernel-based classification in the feature space not only preserves the inherent structure of groups in the input space, but also simplifies the associated structure of the data [1]. Since

M. Tushir (✉), Maharaja Surajmal Institute of Technology, New Delhi, India, e-mail: [email protected]
S. Srivastava, Netaji Subash Institute of Technology, New Delhi, India, e-mail: [email protected]

G. S. S. Krishnan et al. (eds.), Computational Intelligence, Cyber Security and Computational Models, Advances in Intelligent Systems and Computing 246, DOI: 10.1007/978-81-322-1680-3_16, © Springer India 2014


Girolami first developed the kernel k-means clustering algorithm for unsupervised classification [2], several studies have demonstrated the superiority of kernel clustering algorithms over other approaches to clustering [3–5]. A key question in kernel-based data partitioning is the choice of the kernel function that defines the nonlinear mapping. Clearly, the choice of kernel is data specific; however, in the specific case of data partitioning, a kernel with universal approximation properties, such as the RBF kernel, is most appropriate. In [6], we proposed kernel-based hybrid c-means clustering (KPFCM), which uses the Gaussian kernel function, as an improvement over possibilistic fuzzy c-means clustering [8]. Most papers use the Gaussian kernel function. Different kernels induce different metrics and therefore lead to new clustering algorithms. Very few papers have studied other kernel functions, such as the hyper-tangent kernel function [7, 9]. In this paper, we investigate the effect of different kernel functions on the clustering results. To our knowledge, this is the first such comparison of kernel clustering algorithms using different kernel functions for general-purpose clustering. The paper is organized as fol