Robust high dimensional expectation maximization algorithm via trimmed hard thresholding
Di Wang1 · Xiangyu Guo1 · Shi Li1 · Jinhui Xu1

Received: 16 April 2020 / Revised: 21 July 2020 / Accepted: 17 October 2020
© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2020
Abstract
In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space (i.e., d ≫ n), where the underlying parameter is assumed to be sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization, which adds a trimming-gradients step and a hard thresholding step to the Expectation step (E-step) and the Maximization step (M-step), respectively. We show that under some mild assumptions and with an appropriate initialization, the algorithm is corruption-proof and converges geometrically to the (near) optimal statistical rate when the fraction of corrupted samples 𝜖 is bounded by Õ(1/√n). Moreover, we apply our general framework to three canonical models: mixture of Gaussians, mixture of regressions, and linear regression with missing covariates. Our theory is supported by thorough numerical results.

Keywords Robust statistics · High dimensional statistics · Gaussian mixture model · Expectation maximization · Iterative hard thresholding
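To make the two ingredients above concrete, the following is a minimal NumPy sketch of one update of such a trimmed-gradient scheme. The function name trimmed_gradient_em_step, its arguments, and the simple symmetric coordinate-wise trimming rule are illustrative choices of ours, not code taken from the paper.

import numpy as np

def trimmed_gradient_em_step(grads, beta, eta, s, eps):
    # grads: (n, d) per-sample gradients of the Q-function at the current
    #        estimate beta (the output of the E-step); beta: (d,) estimate;
    #        eta: step size; s: sparsity level; eps: assumed corruption fraction.
    n, d = grads.shape
    k = int(np.ceil(eps * n))                 # samples trimmed at each tail

    # Trimming step: for every coordinate, discard the k largest and k smallest
    # gradient values, then average the rest (robust to the corrupted samples).
    sorted_grads = np.sort(grads, axis=0)
    trimmed_mean = sorted_grads[k:n - k].mean(axis=0)

    # Gradient ascent step followed by hard thresholding: keep only the s
    # largest-magnitude coordinates of the updated estimate.
    beta_new = beta + eta * trimmed_mean
    out = np.zeros(d)
    support = np.argsort(np.abs(beta_new))[-s:]
    out[support] = beta_new[support]
    return out

In each iteration, grads would be recomputed from the data at the new estimate, so the trimming acts on the gradients of the Q-function rather than on the raw samples.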
D. Wang and X. Guo have contributed equally.
Editors: Kee-Eung Kim, Vineeth N. Balasubramanian.
* Jinhui Xu [email protected]
Di Wang [email protected]
Xiangyu Guo [email protected]
Shi Li [email protected]
1 Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA
1 Introduction

As one of the most popular techniques for maximum likelihood estimation in mixture models and incomplete data problems, the Expectation Maximization (EM) algorithm has been widely applied to many areas such as genomics (Laird 2010), finance (Faria and Gonçalves 2013), and crowdsourcing (Dawid and Skene 1979). Although the EM algorithm is well known to converge to an empirically good local estimator (Wu et al. 1983), finite sample statistical guarantees for its performance were not established until recent studies (Balakrishnan et al. 2017b; Zhu et al. 2017; Wang et al. 2015; Yi and Caramanis 2015). Specifically, the first local convergence theory and finite sample statistical rate of convergence for the classical EM algorithm and its gradient ascent variant (gradient EM) were established in Balakrishnan et al. (2017b). Later, Wang et al. (2015) extended the classical EM and gradient EM algorithms to the high dimensional sparse setting; the key idea in their methods is an additional truncation step after the M-step, which exploits the intrinsic sparse structure of high dimensional latent variable models. Yi and Caramanis (2015) also studied the high dimensional sparse EM algorithm and proposed a method that uses a regularized M-estimator in the M-step.
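As a concrete picture of that truncation step (a minimal sketch; the function name truncate and the small example are ours, not code from the cited papers), the operator keeps the s largest-magnitude coordinates of the M-step output and zeroes out the rest:

import numpy as np

def truncate(beta, s):
    # Hard-thresholding (truncation) operator: keep the s entries of beta with
    # the largest absolute value and set all remaining entries to zero.
    beta = np.asarray(beta, dtype=float)
    out = np.zeros_like(beta)
    support = np.argsort(np.abs(beta))[-s:]
    out[support] = beta[support]
    return out

# Only the two largest-magnitude coordinates survive:
print(truncate([0.1, -2.0, 0.3, 1.5, 0.0], s=2))   # [ 0.  -2.   0.   1.5  0. ]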