Smoothed analysis for tensor methods in unsupervised learning


Aditya Bhaskara¹ · Aidao Chen² · Aidan Perreault² · Aravindan Vijayaraghavan²

Received: 3 December 2019 / Accepted: 7 October 2020

© Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2020

Abstract

Smoothed analysis is a powerful paradigm for overcoming worst-case intractability in high-dimensional data analysis and unsupervised learning. While polynomial-time smoothed analysis guarantees have been obtained for worst-case intractable problems like tensor decomposition and learning mixtures of Gaussians, such guarantees have been hard to obtain for several other important problems in data analysis. A core technical challenge in analyzing algorithms is obtaining lower bounds on the least singular value of random matrix ensembles with dependent entries that are given by low-degree polynomials of a few underlying base random variables. In this work, we address this challenge by obtaining high-confidence lower bounds on the least singular value of new classes of structured random matrix ensembles of the above kind (see the illustrative sketch below). We then use these bounds to design algorithms with polynomial-time smoothed analysis guarantees for the following three important problems in high-dimensional data analysis:

• Higher-order tensor decompositions, where we generalize and analyze the so-called FOOBI algorithm of Cardoso. This allows us to obtain polynomially robust decomposition algorithms for 2ℓ'th-order tensors with rank O(n^ℓ).
• Learning overcomplete hidden Markov models, where the size of the state space is any polynomial in the dimension of the observations. This gives the first polynomial-time guarantees for learning overcomplete HMMs in the smoothed analysis model.
• Robust subspace recovery, when the fraction α of inliers in the d-dimensional subspace T ⊂ R^n is at least α > (d/n)^ℓ for any constant ℓ ∈ Z+. This contrasts with the known worst-case intractability when α < d/n, and the previous smoothed analysis result, which needed α > d/n [33].
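To make the phrase "random matrix ensembles with dependent entries given by low-degree polynomials of a few underlying base random variables" concrete, here is a minimal numerical sketch (not from the paper; it assumes Python with numpy). Each column of M is the flattened outer product ã_i ⊗ ã_i of a ρ-perturbed vector ã_i, so every entry of M is a degree-2 polynomial of the base Gaussian perturbations. The paper's structural results lower-bound σ_min(M) with high probability; the sketch simply observes this empirically in an overcomplete setting (m > n).

```python
# A minimal sketch of the smoothed-analysis setup, assuming numpy.
# Columns of M are flattened tensor powers a_i (x) a_i of rho-perturbed
# vectors, i.e., each entry of M is a degree-2 polynomial of the base
# Gaussian perturbations.
import numpy as np

rng = np.random.default_rng(0)
n, m, rho = 20, 60, 0.1  # dimension, number of columns (n < m <= n^2), noise level

A = rng.standard_normal((n, m))                    # arbitrary (worst-case) directions
A_tilde = A + rho * rng.standard_normal((n, m))    # smoothed: add small Gaussian noise

# Khatri-Rao (column-wise Kronecker) product: column i is a_i (x) a_i in R^{n^2}
M = np.einsum('im,jm->ijm', A_tilde, A_tilde).reshape(n * n, m)

# Least singular value of the n^2 x m flattening (singular values sorted descending)
sigma_min = np.linalg.svd(M, compute_uv=False)[-1]
print(f"least singular value of the n^2 x m flattening: {sigma_min:.4f}")
```

Without the perturbation (ρ = 0), an adversarial choice of A can drive σ_min(M) to zero, e.g., by repeating a column; smoothing rules out such degenerate configurations with high probability, which is exactly the phenomenon the least-singular-value bounds formalize.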

A preliminary version of this paper appeared in the 60th IEEE Symposium on Foundations of Computer Science (FOCS 2019) [10]. Aidao Chen, Aidan Perreault, and Aravindan Vijayaraghavan were supported by the National Science Foundation (NSF) under Grants No. CCF-1652491 and CCF-1637585. Additionally, the third author was supported by an undergraduate research grant from Northwestern University. Extended author information is available on the last page of the article.



Mathematics Subject Classification 68Q32 Computational learning theory · 49Q12 Sensitivity analysis for optimization problems on manifolds · 15A69 Multilinear algebra, tensor calculus

1 Introduction

Several basic computational problems in high-dimensional data analysis, like tensor decompositions, and in unsupervised learning, like learning probabilistic models, are intractable in the worst case. Yet practitioners have had remarkable success in designing heuristics that work well on real-world instances. Bridging this di