13 Signal Processing and Optimization

In the real world, signals are mostly stochastic. Signal processing makes use of their stochastic properties to find the hidden structure we want to know about. The present chapter begins with principal component analysis (PCA), which studies the correlational structure of signals to find the principal components, that is, the directions along which the signals are most widely spread. Orthogonal transformations are used to decompose signals into uncorrelated principal components. However, “no correlation” does not mean “independence” except in the special case of Gaussian distributions. Independent component analysis (ICA) is a technique for decomposing signals into independent components. Information geometry, in particular the theory of semi-parametrics, plays a fundamental role here. It has stimulated the rise of new techniques of positive matrix decomposition and sparse component analysis, which we also touch upon. The optimization problem under convex constraints and a game-theoretic approach are briefly discussed in this chapter from the information geometry point of view. The Hyvärinen scoring method indicates an attractive direction for further study from the standpoint of information geometry.

13.1 Principal Component Analysis

13.1.1 Eigenvalue Analysis

Let x be a vector random variable, which has already been preprocessed such that its expectation is 0,

    E[x] = 0.    (13.1)


Then, its covariance matrix is

    V_X = E[x x^T].    (13.2)

If we transform x into s by using an orthogonal matrix O,

    s = O^T x,    (13.3)

the covariance matrix of s is given by

    V_S = E[s s^T] = O^T V_X O.    (13.4)

Let us consider the eigenvalue problem of V_X,

    V_X o = λ o.    (13.5)

Then, we have n eigenvalues λ_1, ..., λ_n, λ_1 > λ_2 > ... > λ_n > 0, and the corresponding n unit eigenvectors o_1, ..., o_n, where we assume that there are no multiple eigenvalues. (When there exist multiple eigenvalues, rotational indefiniteness appears. We do not treat such a case here.) Let O be the orthogonal matrix consisting of the eigenvectors,

    O = [o_1 ... o_n].    (13.6)

Then, V_S is the diagonal matrix

    V_S = diag(λ_1, ..., λ_n),    (13.7)

and the components of s are uncorrelated,

    E[s_i s_j] = 0,  i ≠ j.    (13.8)
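The diagonalization in (13.2)–(13.8) can be verified numerically. The following is a minimal sketch, assuming NumPy; the Gaussian toy data, the dimension n = 3 and the names X, V_X, O, S are illustrative choices and not part of the text.

import numpy as np

rng = np.random.default_rng(0)

# Draw samples and center them so that E[x] = 0 holds approximately  (13.1)
X = rng.multivariate_normal(mean=np.zeros(3),
                            cov=[[4.0, 1.0, 0.0],
                                 [1.0, 2.0, 0.5],
                                 [0.0, 0.5, 1.0]],
                            size=10_000)
X -= X.mean(axis=0)

# Sample covariance matrix V_X = E[x x^T]  (13.2)
V_X = X.T @ X / X.shape[0]

# Eigenvalue problem V_X o = λ o  (13.5); eigh returns eigenvalues in ascending order
lam, O = np.linalg.eigh(V_X)
idx = np.argsort(lam)[::-1]          # reorder so that λ_1 > λ_2 > ... > λ_n
lam, O = lam[idx], O[:, idx]

# Transformed signal s = O^T x  (13.3); its covariance is V_S = O^T V_X O  (13.4)
S = X @ O
V_S = S.T @ S / S.shape[0]
print(np.round(V_S, 2))              # approximately diag(λ_1, ..., λ_n); off-diagonals ≈ 0, cf. (13.7), (13.8)

Here np.linalg.eigh is used because V_X is symmetric; reordering its output gives the descending eigenvalues assumed in the text.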

13.1.2 Principal Components, Minor Components and Whitening

Signal x is decomposed into a sum of uncorrelated components as

    x = ∑_{i=1}^{n} s_i o_i.    (13.9)
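The decomposition (13.9) simply expresses x in the orthonormal eigenvector basis. A small self-contained illustration follows (NumPy assumed; the 2-dimensional rotation matrix and the sample x are made-up numbers, not taken from the text).

import numpy as np

# Illustrative orthonormal eigenvectors o_1, o_2 (columns of O) and a centered sample x
O = np.array([[np.cos(0.3), -np.sin(0.3)],
              [np.sin(0.3),  np.cos(0.3)]])
x = np.array([1.5, -0.7])

s = O.T @ x                                      # components s_i = o_i^T x, as in (13.3)
x_rec = sum(s[i] * O[:, i] for i in range(2))    # x = ∑ s_i o_i, Eq. (13.9)
print(np.allclose(x, x_rec))                     # True: the decomposition recovers x exactly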


[Fig. 13.1 Principal components s_1, s_2, ...: data points scattered in the (x_1, x_2) plane, with the principal directions s_1 and s_2 indicated.]

Since the variance of s_i is λ_i, s_1 has the largest magnitude on average, s_2 the second largest, and finally s_n has the smallest magnitude. See Fig. 13.1. We call s_1 the (first) principal component of x, which is obtained by projecting x onto o_1. The k largest components are given by s_1, ..., s_k. We call the subspace spanned by the k eigenvectors o_1, ..., o_k the k-dimensional principal subspace.
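In practice, projecting onto the k-dimensional principal subspace amounts to keeping only the first k eigenvectors. A rough sketch under the same NumPy assumption; the toy data and the choice k = 2 are illustrative only.

import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5000, 4)) @ np.diag([3.0, 2.0, 1.0, 0.3])   # centered toy data
X -= X.mean(axis=0)

# Descending eigen-decomposition of the sample covariance, as in Sect. 13.1.1
lam, O = np.linalg.eigh(X.T @ X / len(X))
idx = np.argsort(lam)[::-1]
lam, O = lam[idx], O[:, idx]

k = 2
O_k = O[:, :k]                     # o_1, ..., o_k span the principal subspace
X_proj = X @ O_k @ O_k.T           # orthogonal projection of each sample onto that subspace

# The mean squared residual equals the discarded variance λ_{k+1} + ... + λ_n
print(lam[k:].sum(), np.mean(np.sum((X - X_proj) ** 2, axis=1)))

The printed residual matches the sum of the discarded eigenvalues, which is the usual justification for retaining only the leading eigenvectors.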