Unconstrained representation of orthogonal matrices with application to common principal components



Luca Bagnato¹ · Antonio Punzo²

Received: 8 May 2020 / Accepted: 7 October 2020
© The Author(s) 2020

Abstract
Many statistical problems involve the estimation of a (d × d) orthogonal matrix Q. Such estimation is often challenging because of the orthonormality constraints on Q. To cope with this problem, we use the well-known PLU decomposition, which factorizes any invertible (d × d) matrix as the product of a (d × d) permutation matrix P, a (d × d) unit lower triangular matrix L, and a (d × d) upper triangular matrix U. Thanks to the QR decomposition, we derive the form of U when the PLU decomposition is applied to Q. We call the result the PLR decomposition; it produces a one-to-one correspondence between Q and the d(d − 1)/2 entries below the diagonal of L, which are advantageously unconstrained real values. Thus, once the decomposition is applied, regardless of the objective function under consideration, any classical unconstrained optimization method can be used to find the minimum (or maximum) of the objective function with respect to L. For illustrative purposes, we apply the PLR decomposition in common principal components analysis (CPCA) for the maximum likelihood estimation of the common orthogonal matrix when a multivariate leptokurtic-normal distribution is assumed in each group. Compared to the commonly used normal distribution, the leptokurtic-normal has an additional parameter governing the excess kurtosis; this makes the estimation of Q in CPCA more robust against mild outliers. The usefulness of the PLR decomposition in leptokurtic-normal CPCA is illustrated by two biometric data analyses.

Keywords Orthogonal matrix · LU decomposition · QR decomposition · Common principal components · FG algorithm · Leptokurtic-normal distribution
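The counting argument behind the abstract can be checked numerically. The sketch below is not the authors' PLR implementation; it is a minimal illustration, assuming SciPy's standard PLU routine `scipy.linalg.lu`, that factorizes an orthogonal matrix as Q = P L U and verifies that L carries exactly d(d − 1)/2 strictly sub-diagonal entries, matching the number of free parameters of a (d × d) orthogonal matrix.

```python
import numpy as np
from scipy.linalg import lu, qr

d = 4
rng = np.random.default_rng(0)

# Build a random (d x d) orthogonal matrix via the QR decomposition
# of a matrix with i.i.d. standard normal entries.
Q, _ = qr(rng.standard_normal((d, d)))

# PLU decomposition: Q = P @ L @ U, with P a permutation matrix,
# L unit lower triangular, and U upper triangular.
P, L, U = lu(Q)
assert np.allclose(P @ L @ U, Q)

# The strictly sub-diagonal entries of L are the unconstrained
# parameters: there are d(d - 1)/2 of them (6 for d = 4).
free = L[np.tril_indices(d, k=-1)]
print(free.size, d * (d - 1) // 2)
```

In an optimization setting one would run the map in the opposite direction, from the d(d − 1)/2 unconstrained entries of L back to Q, which is what the paper's PLR construction provides via the QR-based form of U.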

Corresponding author: Luca Bagnato, [email protected]

¹ Dipartimento di Scienze Economiche e Sociali, Università Cattolica del Sacro Cuore, Piacenza, Italy
² Dipartimento di Economia e Impresa, Università di Catania, Catania, Italy



1 Introduction

With the term orthogonal matrix we refer to a (d × d) matrix Q whose columns are mutually orthogonal unit vectors (i.e., orthonormal vectors). As highlighted by Banerjee and Roy (2014, p. 209), one might, perhaps more properly, call Q an “orthonormal” matrix, but the more conventional name is an “orthogonal” matrix, and we will adopt it hereafter. For further characterizations, properties, and details about orthogonal matrices see, e.g., Lütkepohl (1996, Chapter 9.10), Healy (2000, Chapter 3.5), Schott (2016, Chapter 1.10), and Searle and Khuri (2017, Chapter 5.4). Orthogonal matrices are used extensively in statistics, especially in linear models and multivariate analysis (see, e.g., Graybill 1976, Chapter 11 and James 1954). The d² elements of Q are subject to d(d + 1)/2 (orthonormality) constraints. It is therefore not surprising that they can be represented by only d² − d(d + 1)/2 = d(d − 1)/2 ind