Principal Component Projection with Low-Degree Polynomials

  • PDF / 1,313,901 Bytes
  • 25 Pages / 439.37 x 666.142 pts Page_size
  • 62 Downloads / 185 Views

DOWNLOAD

REPORT


Principal Component Projection with Low-Degree Polynomials Stephen D. Farnham1 · Lixin Shen1 · Bruce W. Suter2 Received: 16 February 2019 / Revised: 5 October 2020 / Accepted: 8 October 2020 / Published online: 12 November 2020 © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract In this paper, we consider approximations of principal component projection (PCP) without explicitly computing principal components. This problem has been studied in several recent works. The main feature of existing approaches is viewing the PCP matrix as a matrix function. This underlying function is the composition of a step function with a rational function. To find an approximate PCP, the step function is approximated by a polynomial while the rational function is evaluated by a fast ridge regression solver. In this work, we further improve this process by replacing the rational function with carefully constructed polynomials of low degree. We characterize the properties of polynomials that are suitable for approximating PCP, and then establish an optimization problem to select the optimal one from those polynomials. We show theoretically and confirm numerically that the resulting approximate PCP approach with optimal polynomials is indeed effective for approximations of principal component projection. Keywords Principal component projection · Chebyshev polynomial · PCA

1 Introduction Principal component projection (PCP) is a mathematical procedure in machine learning and statistics that projects high dimensional data onto a lower dimensional subspace. This subspace is defined by the principal components with the highest variance in the training data. Mathematically, PCP can be achieved by finding the top principal components of a matrix, through any principal component analysis (PCA) solver, and then projecting the underlying vector onto their span. Unfortunately, PCA solvers demand a running time that at

B

Lixin Shen [email protected] Stephen D. Farnham [email protected] Bruce W. Suter [email protected]

1

Department of Mathematics, Syracuse University, Syracuse, NY 13244, USA

2

Air Force Research Laboratory, Rome, NY, USA

123

52

Page 2 of 25

Journal of Scientific Computing (2020) 85:52

least linearly scales with the number of top principal components chosen for the projection [1], and computing the principal components of a matrix itself is an expensive task. This naturally raises the question of how one can efficiently project a vector onto the span of the top principal components of a matrix without performing PCA. This question has recently been addressed in [1,4] through an iterative algorithm based on black-box calls to any ridge regression routine. We briefly review the main idea behind this algorithm. Given a general matrix A ∈ Rm×d and a vector y ∈ Rd , we want to compute the projection of y onto the span of the eigenvectors of A A corresponding to eigenvalues above a threshold λ. We denote this projection, i.e., the principal component projection of y, P(A,λ) y. The key observation i