Incremental Construction of Low-Dimensional Data Representations

Skolkovo Institute of Science and Technology, Moscow, Russia
{A.Kuleshov,a.bernstein}@skoltech.ru
Kharkevich Institute for Information Transmission Problems RAS, Moscow, Russia

Abstract. Various dimensionality reduction algorithms transform initial high-dimensional data into lower-dimensional representations that preserve chosen properties of the initial data. Typically, such algorithms require solving large-dimensional optimization problems, and incremental versions have been designed for many popular algorithms to reduce their computational complexity. Under the manifold assumption about high-dimensional data, advanced manifold learning algorithms should preserve the Data manifold and its differential properties such as tangent spaces, the Riemannian tensor, etc. An incremental version of the Grassmann&Stiefel Eigenmaps manifold learning algorithm, which has asymptotically minimal reconstruction error, is proposed in this paper; it has significantly smaller computational complexity than the original algorithm.

Keywords: Machine learning · Dimensionality reduction · Manifold learning · Tangent bundle manifold learning · Incremental learning

1 Introduction

The general goal of data analysis is to extract previously unknown information from a given dataset. Many data analysis tasks, such as pattern recognition, classification, clustering, prognosis, and others, deal with real-world data that are presented in high-dimensional spaces, and the 'curse of dimensionality' phenomena are often an obstacle to the use of many methods for solving these tasks. Fortunately, in many applications, especially in pattern recognition, real high-dimensional data occupy only a very small part of the high-dimensional 'observation space' Rp; this means that the intrinsic dimension q of the data is small compared to the dimension p (usually, q ≪ p).

Let {εi > 0} denote the algorithm parameters whose values are chosen depending on the sample size n (εi = εi,n) and tend to zero as n → ∞ at the rate O(n−1/(q+2)).

Step S1: Neighborhoods (Construction and Description). The necessary preliminary calculations are performed at the first step S1.

Euclidean Kernel. Introduce the Euclidean kernel KE(X, X′) = I{|X′ − X| < ε1} on the Data manifold (DM) at points X, X′ ∈ M, where I{·} is the indicator function.

Grassmann Kernel. Applying Principal Component Analysis (PCA) [27] to the points of the set Un(X, ε1) = {X′ ∈ Xn: |X′ − X| < ε1} ∪ {X} yields a p × q orthogonal matrix QPCA(X) whose columns are the PCA principal eigenvectors corresponding to the q largest PCA eigenvalues. These matrices determine q-dimensional linear spaces LPCA(X) = Span(QPCA(X)) in Rp which, under certain conditions, approximate the tangent spaces L(X):

LPCA(X) ≈ L(X).  (10)
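To make Step S1 concrete, below is a minimal NumPy sketch (not the authors' implementation) that selects the ε1-ball Un(X, ε1) prescribed by the Euclidean kernel and applies PCA to it, producing the orthogonal basis QPCA(X) of the estimated tangent space LPCA(X). The function names, the constant c in the radius schedule ε1,n = c·n^(−1/(q+2)), and the synthetic data in the usage example are illustrative assumptions rather than choices made in the paper.

# Minimal sketch of Step S1: eps1-neighborhoods (Euclidean kernel) and local PCA
# tangent bases Q_PCA(X). Assumes data are the rows of an (n, p) NumPy array and
# that the intrinsic dimension q is known; the constant c below is illustrative.
import numpy as np


def neighborhood_radius(n, q, c=1.0):
    # Illustrative radius schedule with the stated rate eps_{1,n} = O(n^(-1/(q+2))).
    return c * n ** (-1.0 / (q + 2))


def local_pca_bases(X, q, eps1):
    # For each sample point X_i, return a p x q orthogonal matrix Q_PCA(X_i) whose
    # columns span L_PCA(X_i), the PCA approximation of the tangent space L(X_i).
    n, p = X.shape
    bases = np.empty((n, p, q))
    for i in range(n):
        # Euclidean kernel K_E(X_i, X') = I{|X' - X_i| < eps1}: select the ball
        # U_n(X_i, eps1); the point X_i itself is always included (its distance is 0).
        mask = np.linalg.norm(X - X[i], axis=1) < eps1
        U = X[mask]
        # Local PCA: eigenvectors of the neighborhood covariance matrix.
        centered = U - U.mean(axis=0)
        cov = centered.T @ centered / max(len(U) - 1, 1)
        _, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
        bases[i] = eigvecs[:, ::-1][:, :q]        # columns for the q largest eigenvalues
    return bases


# Usage example on synthetic data: n = 1000 points near a 2-dimensional plane in R^10.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(1000, 2))                    # latent 2-D coordinates
A = np.linalg.qr(rng.normal(size=(10, 2)))[0]           # orthonormal 10 x 2 embedding map
X = coords @ A.T + 0.01 * rng.normal(size=(1000, 10))   # embedded points plus small noise
eps1 = neighborhood_radius(len(X), q=2)                 # about 0.18 for n = 1000, q = 2
Q = local_pca_bases(X, q=2, eps1=eps1)                  # Q[i] spans L_PCA(X_i)

In practice a k-d tree or similar index would replace the brute-force distance computation; the explicit loop above is kept only to mirror the kernel definitions.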

In what follows, we assume that the sample size n is large enough to ensure a positive value of the q-th PCA eigenvalue at the sample points and to provide the proximities (10). To provide a trade-off between the 'statistical error' (which depends on the number n(X) of sample points in the set Un(X, ε1)) and the 'curvature error' (caused by the deviation of the manifold-valued