Statistic Feature Extraction





5.1 Introduction

Feature extraction is commonly applied before diagnosis and prognosis when a number of measures, or features, have been taken from a set of objects in a typical statistical pattern recognition or trending reasoning task. The goal is to define a mapping from the original representation space into a new space in which the classes are more easily separable. This reduces the complexity of the classifier or predictor and, in most cases, increases accuracy. Accurate data-driven PHM/CBM needs multiple sensors to obtain detailed condition information, which produces a large amount of raw data; many features are therefore calculated, as described in the previous section, to preserve as much of the information in the data as possible. Too many features, however, can cause the curse of dimensionality and the peaking phenomenon, which greatly degrade classification accuracy. A large number of features can also cause data transmission congestion or storage problems. So what are the curse of dimensionality and the peaking phenomenon, and how can they be handled?

The performance of a data-driven PHM/CBM system depends on the interrelationship between sample size, number of features, and algorithm complexity. Consider a very simple, naive table-lookup technique that partitions the feature space into cells and associates a class label with each cell. This technique requires a number of training data points that is exponential in the dimension of the feature space (Bishop 1995). This phenomenon is termed the curse of dimensionality, and one of its consequences in classifier design is the peaking phenomenon.
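The exponential growth behind the table-lookup argument can be made concrete with a short sketch. The function names, the choice of 10 bins per feature, and the requirement of at least one training point per cell are illustrative assumptions, not taken from the text.

```python
# Illustrative sketch (assumed setup): each feature axis is split into the
# same number of equal-width bins, so the feature space becomes a grid.

def cells_in_grid(bins_per_feature: int, n_features: int) -> int:
    """Number of cells when every feature axis is split into bins_per_feature bins."""
    return bins_per_feature ** n_features


def min_training_points(bins_per_feature: int, n_features: int,
                        points_per_cell: int = 1) -> int:
    """Training points needed so that every cell holds points_per_cell samples.

    This is a lower bound that assumes the samples fall perfectly evenly
    over the cells, which real data will not do.
    """
    return points_per_cell * cells_in_grid(bins_per_feature, n_features)


# With 10 bins per feature (an arbitrary choice), the requirement grows
# by a factor of 10 for every feature that is added.
for n_features in (1, 2, 5, 10, 20):
    print(n_features, min_training_points(10, n_features))
```

Even under this optimistic assumption, 20 features at 10 bins each would call for 10^20 training points, which is why a table-lookup classifier becomes impractical well before the feature count reaches what a multi-sensor system typically produces.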
The peaking phenomenon is a paradoxical effect, which can be seen as follows. It is well known that the probability of misclassification of a decision rule does not increase as the number of features considered increases, as long as the class-conditional densities are known (or, alternatively, the number of training samples is arbitrarily large and representative of the underlying densities).

© Springer Science+Business Media Singapore and Science Press, Beijing, China 2017 G. Niu, Data-Driven Technology for Engineering Systems Health Management, DOI 10.1007/978-981-10-2032-2_5


However, it has often been observed in practice that increasing the number of features considered by a classifier may degrade its performance if the number of training examples used to design the classifier is small relative to the number of features. This paradoxical behavior is termed the peaking phenomenon (Raudys and Jain 1991). The explanation is as follows. The most commonly used parametric classifiers estimate the unknown parameters and plug the estimates in for the true parameters in the class-conditional densities. For a fixed sample size, as the number of features increases, so does the number of unknown parameters to be estimated from the sample, and the reliability of the parameter estimates decreases. As