Incremental Construction of Low-Dimensional Data Representations

Skolkovo Institute of Science and Technology, Moscow, Russia
{A.Kuleshov,a.bernstein}@skoltech.ru
Kharkevich Institute for Information Transmission Problems RAS, Moscow, Russia

Abstract. Various dimensionality reduction algorithms transform initial high-dimensional data into lower-dimensional representations that preserve chosen properties of the initial data. Typically, such algorithms require solving large-dimensional optimization problems, and incremental versions have been designed for many popular algorithms to reduce their computational complexity. Under the manifold assumption about high-dimensional data, advanced manifold learning algorithms should preserve the Data manifold and its differential properties such as tangent spaces, the Riemannian tensor, etc. An incremental version of the Grassmann&Stiefel Eigenmaps manifold learning algorithm, which has asymptotically minimal reconstruction error, is proposed in this paper; it has significantly smaller computational complexity than the original algorithm.

Keywords: Machine learning · Dimensionality reduction · Manifold learning · Tangent bundle manifold learning · Incremental learning

1 Introduction

The general goal of data analysis is to extract previously unknown information from a given dataset. Many data analysis tasks, such as pattern recognition, classification, clustering, prognosis, and others, deal with real-world data that are presented in high-dimensional spaces, and the 'curse of dimensionality' phenomena are often an obstacle to the use of many methods for solving these tasks. Fortunately, in many applications, especially in pattern recognition, real high-dimensional data occupy only a very small part of the high-dimensional 'observation space' Rp; this means that the intrinsic dimension q of the data is small compared to the dimension p (usually, q ≪ p).

Let {εi > 0} denote the algorithm parameters whose values are chosen depending on the sample size n (εi = εi,n) and tend to zero as n → ∞ at the rate O(n−1/(q+2)).

Step S1: Neighborhoods (Construction and Description). The necessary preliminary calculations are performed at the first step S1.

Euclidean Kernel. Introduce the Euclidean kernel KE(X, X′) = I{|X′ − X| < ε1} on the Data manifold (DM) at points X, X′ ∈ M, where I{·} is the indicator function.

Grassmann Kernel. Applying Principal Component Analysis (PCA) [27] to the points of the set Un(X, ε1) = {X′ ∈ Xn: |X′ − X| < ε1} ∪ {X} yields a p × q orthogonal matrix QPCA(X) whose columns are the PCA principal eigenvectors corresponding to the q largest PCA eigenvalues. These matrices determine q-dimensional linear spaces LPCA(X) = Span(QPCA(X)) in Rp which, under certain conditions, approximate the tangent spaces L(X):

LPCA(X) ≈ L(X).  (10)
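To make Step S1 concrete, below is a minimal NumPy sketch (not the authors' implementation) that selects the ε1-ball Un(X, ε1) prescribed by the Euclidean kernel and applies PCA to it, producing the orthogonal basis QPCA(X) of the estimated tangent space LPCA(X). The function names, the constant c in the radius schedule ε1,n = c·n^(−1/(q+2)), and the synthetic data in the usage example are illustrative assumptions rather than choices made in the paper.

# Minimal sketch of Step S1: eps1-neighborhoods (Euclidean kernel) and local PCA
# tangent bases Q_PCA(X). Assumes data are the rows of an (n, p) NumPy array and
# that the intrinsic dimension q is known; the constant c below is illustrative.
import numpy as np


def neighborhood_radius(n, q, c=1.0):
    # Illustrative radius schedule with the stated rate eps_{1,n} = O(n^(-1/(q+2))).
    return c * n ** (-1.0 / (q + 2))


def local_pca_bases(X, q, eps1):
    # For each sample point X_i, return a p x q orthogonal matrix Q_PCA(X_i) whose
    # columns span L_PCA(X_i), the PCA approximation of the tangent space L(X_i).
    n, p = X.shape
    bases = np.empty((n, p, q))
    for i in range(n):
        # Euclidean kernel K_E(X_i, X') = I{|X' - X_i| < eps1}: select the ball
        # U_n(X_i, eps1); the point X_i itself is always included (its distance is 0).
        mask = np.linalg.norm(X - X[i], axis=1) < eps1
        U = X[mask]
        # Local PCA: eigenvectors of the neighborhood covariance matrix.
        centered = U - U.mean(axis=0)
        cov = centered.T @ centered / max(len(U) - 1, 1)
        _, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
        bases[i] = eigvecs[:, ::-1][:, :q]        # columns for the q largest eigenvalues
    return bases


# Usage example on synthetic data: n = 1000 points near a 2-dimensional plane in R^10.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(1000, 2))                    # latent 2-D coordinates
A = np.linalg.qr(rng.normal(size=(10, 2)))[0]           # orthonormal 10 x 2 embedding map
X = coords @ A.T + 0.01 * rng.normal(size=(1000, 10))   # embedded points plus small noise
eps1 = neighborhood_radius(len(X), q=2)                 # about 0.18 for n = 1000, q = 2
Q = local_pca_bases(X, q=2, eps1=eps1)                  # Q[i] spans L_PCA(X_i)

In practice a k-d tree or similar index would replace the brute-force distance computation; the explicit loop above is kept only to mirror the kernel definitions.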

In what follows, we assume that the sample size n is large enough to ensure a positive value of the q-th PCA eigenvalue at the sample points and to provide the proximities (10). To provide a trade-off between the 'statistical error' (which depends on the number n(X) of sample points in the set Un(X, ε1)) and the 'curvature error' (caused by the deviation of the manifold-valued