Alternating maximization: unifying framework for 8 sparse PCA formulations and efficient parallel codes

Peter Richtárik¹ · Majid Jahani² · Selin Damla Ahipaşaoğlu³ · Martin Takáč²

Received: 30 October 2019 / Revised: 29 April 2020 / Accepted: 7 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

MT was partially supported by National Science Foundation Grants CCF-1618717, CMMI-1663256 and CCF-1740796.
* Martin Takáč, [email protected]. Extended author information is available on the last page of the article.

Abstract

Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain as much of the variance in the data as possible, while controlling the number of nonzero loadings in these combinations. In this paper we consider 8 different optimization formulations for computing a single sparse loading vector: we employ two norms for measuring variance (L2, L1) and two sparsity-inducing norms (L0, L1), which are used in two ways (constraint, penalty). Three of our formulations, notably the one with the L0 constraint and L1 variance, have not been considered in the literature. We give a unifying reformulation which we propose to solve via the alternating maximization (AM) method. We show that AM is equivalent to GPower for all formulations. Besides this, we provide 24 efficient parallel SPCA implementations: 3 codes (multi-core, GPU and cluster) for each of the 8 problems. Parallelism in the methods is aimed at (1) speeding up computations (our GPU code can be 100 times faster than an efficient serial code written in C++), (2) obtaining solutions explaining more variance, and (3) dealing with big data problems (our cluster code can solve a 357 GB problem in a minute).

Keywords: Sparse PCA · Alternating maximization · GPower · Big data analytics · Unsupervised learning
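For concreteness, two members of this 2 × 2 × 2 family of single-unit formulations, written here purely as an illustration (the sparsity level s and penalty weight γ are our own symbols, not notation taken from the paper), are

max{‖Ax‖₂ ∶ ‖x‖₂ ≤ 1, ‖x‖₀ ≤ s}        (L2 variance, L0 constraint)

max{‖Ax‖₂ − γ‖x‖₁ ∶ ‖x‖₂ ≤ 1}          (L2 variance, L1 penalty)

The remaining formulations are obtained by measuring variance with the L1 norm instead of the L2 norm and by exchanging the roles of constraint and penalty for the sparsity-inducing norm.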

1 Introduction

Principal component analysis (PCA) is an indispensable tool used for dimension reduction in virtually all areas of science and engineering, from machine learning, statistics, genetics and finance to computer networks (Jolliffe 1986). Let A ∈ 𝐑^{n×p} denote a data matrix encoding n samples (observations) of p variables (features). PCA aims to extract a few linear combinations of the columns of A, called principal components (PCs), pointing in mutually orthogonal directions, which together explain as much variance in the data as possible. If the columns of A are centered, the problem of extracting the first PC can be written as

max{‖Ax‖ ∶ ‖x‖₂ ≤ 1},                                    (1.1)

where ‖ ⋅ ‖ is a suitable norm for measuring variance. The solution x of this optimization problem is called the loading vector, and Ax (normalized) is the first PC. Further PCs can be obtained in the same way, with A replaced by a new matrix in a process called deflation (Mackey 2008). Classical PCA employs the L2 norm in the objective; using the L1 norm instead may alleviate problems caused by outliers in the data and hence leads to a robust
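To make (1.1) and the deflation step concrete, the short Python sketch below (our own illustration, unrelated to the multi-core, GPU and cluster codes provided with the paper; the function names are hypothetical) computes the first loading vector for the classical L2 norm via the SVD of A, and then deflates A before extracting a second component.

import numpy as np

def first_loading(A):
    # Solve max{ ||Ax||_2 : ||x||_2 <= 1 } for the L2 norm: the maximizer
    # is the top right singular vector of A.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[0]                        # loading vector x with ||x||_2 = 1

def deflate(A, x):
    # Projection deflation: remove the component Ax from the column space of A,
    # so the next call to first_loading recovers the next principal component.
    p = A @ x
    p = p / np.linalg.norm(p)
    return A - np.outer(p, p @ A)

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))
A -= A.mean(axis=0)                     # center the columns, as assumed in (1.1)

x1 = first_loading(A)
print("variance term ||A x1||_2 =", np.linalg.norm(A @ x1))
A2 = deflate(A, x1)
x2 = first_loading(A2)
print("variance term ||A2 x2||_2 =", np.linalg.norm(A2 @ x2))

A sparse loading vector would additionally restrict the number (or the L1 norm) of the nonzero entries of x, which is exactly what the 8 SPCA formulations studied in this paper control.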