3 Review of Linear Projection Methods
In this chapter, we briefly review the background of projection methods.
3.1 Linear Projection Methods
Exploratory data analysis is a set of methods with which we try to extract as much information as possible from a data set of high dimension and huge volume. However, the analysis of complex data usually involves a large number of variables; an analysis with many variables generally requires a large amount of memory and computational power and may generalize poorly to new samples. Many techniques therefore change the basis of the considered data space by projecting the data to a lower-dimensional space. The basic idea is to find a suitable function $\varphi : \mathbb{R}^D \to \mathbb{R}^d$, $d \ll D$, which maps an original data sample $x \in \mathbb{R}^D$ into a $d$-dimensional manifold by $\varphi(x) = y$, where $y \in \mathbb{R}^d$. In this section, we review several projection methods in detail.
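As a concrete illustration, the sketch below applies one such linear map to toy data. The orthonormal basis $W$ here is random and purely hypothetical; it merely stands in for the principled choices of basis that the methods reviewed below provide.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))        # 500 samples in D = 10 dimensions

# A linear instance of phi: R^D -> R^d with d = 2. W is a random
# orthonormal basis (a hypothetical placeholder; the methods reviewed
# below each give a principled way to choose it).
W, _ = np.linalg.qr(rng.normal(size=(10, 2)))

Y = X @ W                             # row-wise y = phi(x), now 2-dimensional
print(Y.shape)                        # (500, 2)
```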
3.1.1 Principal Component Analysis
Principal component analysis (PCA) is a well-known statistical technique for multivariate data analysis. The goal of principal component analysis is to find an orthogonal basis such that the elements of the projection of the data into the subspace $\mathbb{R}^d$ become uncorrelated; the method thus focuses on the first- and second-order statistics. The variances of the projections of the data are maximized by finding a set of filters $W$, so that the first principal component (PC) accounts for the maximal variance based on the first filter $w_1$, the second principal component, in the direction orthogonal to the first PC, corresponds to most of the remaining variance based on $w_2$, and so on. Figure 3.1 shows the first principal component of a two-dimensional data cloud; the second principal component would be orthogonal to the first.
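As a numerical companion to the situation in Fig. 3.1, the following sketch builds an elongated two-dimensional data cloud and recovers the direction of maximal variance. NumPy's eigendecomposition of the sample covariance is used here as a stand-in for the derivation that follows in this subsection.

```python
import numpy as np

rng = np.random.default_rng(0)
# An elongated 2-D data cloud, as in Fig. 3.1: most of the variance
# lies along a direction 30 degrees above the horizontal axis.
t = np.deg2rad(30)
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
X = rng.normal(size=(1000, 2)) * [3.0, 0.5] @ R.T
X -= X.mean(axis=0)

C = X.T @ X / len(X)                 # sample covariance matrix
eigvals, V = np.linalg.eigh(C)
w1 = V[:, -1]                        # first PC: direction of maximal variance
# Roughly 30 degrees (the sign of an eigenvector is arbitrary, so
# the result may also come out as -150 degrees).
print(w1, np.rad2deg(np.arctan2(w1[1], w1[0])))
```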
Fig. 3.1. The first principal component of a two-dimensional data cloud; the first principal component is in the direction with maximal variance
In mathematical terms, we may think of PCA as a linear combination

$$y_i = \sum_{k=1}^{D} w_{ik} x_k = w_i^T x, \qquad (3.1)$$
where $x_k$ is the $k$-th component of the data vector $x$ and $w_1, \ldots, w_d$ is a set of orthogonal unit-norm weight vectors, or filters. The factor $y_i$ represents the $i$-th principal component, where $y_1$ is called the first principal component and thus the variance of $y_1$ is maximally large. Each $y_i$ is constrained to be of maximal variance subject to $y_i$ being uncorrelated with all the previously found principal components:

$$E_x\{y_i y_k\} = 0, \quad k = 1, \ldots, i-1, \qquad (3.2)$$
where we have assumed zero-mean data. Since the variance of $y_i$ depends on the weight vector $w_i$, we look for the weight vector maximizing the PCA criterion

$$J_i(w_i) = E\{y_i^2\} = E\{(w_i^T x)^2\} = w_i^T E\{xx^T\} w_i = w_i^T C_x w_i, \qquad (3.3)$$
where $\|w_i\| = 1$ and the matrix $C_x$ is the $D \times D$ covariance matrix of the data $x$,

$$C_x = E\{xx^T\}, \qquad (3.4)$$
if the data $x$ is zero-mean. It has been shown [128] that the maximization occurs when the filters $w_i$ are the unit-norm eigenvectors of $C_x$ associated with its $d$ largest eigenvalues.
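A minimal NumPy sketch of this maximization, under the assumptions above (zero-mean data, unit-norm filters): the top eigenvector of $C_x$ attains the maximal value of $J$, random unit vectors stay below it, and the resulting components satisfy the uncorrelatedness constraint (3.2).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4)) @ rng.normal(size=(4, 4))
X -= X.mean(axis=0)                  # zero-mean, so C_x = E{x x^T}
C = X.T @ X / len(X)                 # the D x D covariance matrix (3.4)

def J(w, C):
    """PCA criterion (3.3): variance of the projection w^T x."""
    w = w / np.linalg.norm(w)        # enforce ||w|| = 1
    return w @ C @ w

eigvals, V = np.linalg.eigh(C)       # eigenvalues in ascending order
w_star = V[:, -1]                    # eigenvector of the largest eigenvalue
print(J(w_star, C), eigvals[-1])     # the maximum of J equals lambda_max
print(max(J(rng.normal(size=4), C)   # random filters never exceed it
          for _ in range(100)))

# The components y_i = w_i^T x are pairwise uncorrelated, matching (3.2):
Y = X @ V[:, ::-1]
print(np.round(Y.T @ Y / len(Y), 6)) # off-diagonal entries vanish
```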