Computational complexity of kernel-based density-ratio estimation: a condition number analysis
Takafumi Kanamori · Taiji Suzuki · Masashi Sugiyama
Received: 2 March 2011 / Accepted: 13 September 2012 / Published online: 12 December 2012 © The Author(s) 2012
Abstract In this study, the computational properties of a kernel-based least-squares density-ratio estimator are investigated from the viewpoint of condition numbers. The condition number of the Hessian matrix of the loss function is closely related to the convergence rate of optimization and to numerical stability. We use smoothed analysis techniques to theoretically demonstrate that the kernel least-squares method has a smaller condition number than other M-estimators, which implies that it possesses desirable computational properties. In addition, an alternative formulation of the kernel least-squares estimator with an even smaller condition number is presented. The validity of the theoretical analysis is verified through numerical experiments.

Keywords Kernel · Density-ratio · Condition number · Smoothed analysis
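As a point of reference for the analysis summarized above, the condition number of a positive-definite Hessian H is the ratio of its largest to smallest eigenvalue, κ(H) = λ_max(H)/λ_min(H); for gradient-type optimizers, a smaller κ(H) gives a faster linear convergence rate and better numerical stability. The snippet below is a minimal sketch, not taken from the paper, that computes κ for a regularized Gaussian-kernel Gram matrix; the sample, kernel width, and regularization parameter are arbitrary illustrative choices.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gaussian-kernel Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))      # illustrative sample
K = gaussian_gram(X, sigma=1.0)
lam = 0.1                          # illustrative regularization parameter
H = K + lam * np.eye(len(X))       # regularized, symmetric positive definite

eig = np.linalg.eigvalsh(H)        # eigenvalues in ascending order
kappa = eig[-1] / eig[0]           # condition number = lambda_max / lambda_min
print(f"condition number: {kappa:.3g}")
```

Increasing lam shrinks κ (the spectrum is shifted away from zero), which is one intuition behind why regularized least-squares formulations can be well conditioned.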
1 Introduction

In this section, we introduce the background of the problem addressed in this study.
Editor: Massimiliano Pontil.

T. Kanamori (✉) Department of Computer Science and Mathematical Informatics, Nagoya University, Furocho, Chikusaku, Nagoya 464-8601, Japan
e-mail: [email protected]

T. Suzuki Department of Mathematical Informatics, University of Tokyo, 3-1 Hongo 7-Chome, Bunkyo-ku, Tokyo 113-8656, Japan
e-mail: [email protected]

M. Sugiyama Department of Computer Science, Tokyo Institute of Technology, 2-12-1 O-okayama, Meguro-ku, Tokyo 152-8552, Japan
e-mail: [email protected]
1.1 Density-ratio estimation

Recently, methods that directly estimate the ratio of two probability densities without going through density estimation have been developed. These methods can be used to solve various machine learning tasks such as importance sampling, divergence estimation, mutual information estimation, and conditional probability estimation (Sugiyama et al. 2009, 2012b). The kernel mean matching (KMM) method (Gretton et al. 2009) directly yields density-ratio estimates by efficiently matching the two distributions using a special property of universal reproducing kernel Hilbert spaces (RKHSs) (Steinwart 2001). Another approach is the M-estimator (Nguyen et al. 2010), which is based on the non-asymptotic variational characterization of the ϕ-divergence (Ali and Silvey 1966; Csiszár 1967); see Sugiyama et al. (2008a) for a similar algorithm that uses the Kullback-Leibler divergence. Non-parametric convergence properties of the M-estimator in RKHSs have been elucidated under the Kullback-Leibler divergence (Nguyen et al. 2010; Sugiyama et al. 2008b). A squared-loss version of the M-estimator for linear density-ratio models, called unconstrained Least-Squares Importance Fitting (uLSIF), has also been developed (Kanamori et al. 2009).
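To make the least-squares approach concrete, the following is a minimal sketch of a uLSIF-style estimator with a Gaussian kernel model, using the standard closed-form solution θ̂ = (Ĥ + λI)⁻¹ ĥ of the unconstrained squared-loss objective. The function names, the choice of numerator samples as kernel centers, and the default values of sigma and lam are illustrative assumptions, not the paper's notation or reference implementation.

```python
import numpy as np

def gaussian_kernel(X, C, sigma):
    """Basis matrix Phi[i, l] = exp(-||x_i - c_l||^2 / (2 sigma^2))."""
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(C**2, axis=1)[None, :] - 2.0 * X @ C.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def ulsif_fit(X_nu, X_de, sigma=1.0, lam=0.1):
    """Least-squares fit of the density ratio r(x) = p_nu(x) / p_de(x).

    Returns a function x -> estimated ratio.  Numerator samples serve as
    kernel centers; sigma and lam are illustrative defaults (in practice
    they would be chosen by cross-validation).
    """
    C = X_nu                                   # kernel centers (assumption)
    Phi_de = gaussian_kernel(X_de, C, sigma)   # basis on denominator samples
    Phi_nu = gaussian_kernel(X_nu, C, sigma)   # basis on numerator samples
    H = Phi_de.T @ Phi_de / len(X_de)          # Hessian of the squared loss
    h = Phi_nu.mean(axis=0)
    theta = np.linalg.solve(H + lam * np.eye(len(C)), h)  # closed form
    return lambda X: gaussian_kernel(X, C, sigma) @ theta

# usage: two Gaussian samples with shifted means
rng = np.random.default_rng(0)
X_nu = rng.normal(0.0, 1.0, size=(200, 1))
X_de = rng.normal(0.5, 1.0, size=(200, 1))
ratio = ulsif_fit(X_nu, X_de)
print(ratio(np.zeros((1, 1))))  # estimated p_nu/p_de at the origin
```

Note that the matrix H + λI in the solve step is exactly the Hessian whose condition number the paper analyzes; the regularization term λI bounds its smallest eigenvalue away from zero.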