
ORIGINAL ARTICLE

Probabilistic DEAR models

Yanhong Cui • Renkuan Guo • Danni Guo

Received: 18 August 2011 / Accepted: 18 May 2012 / Published online: 21 June 2012 © Springer-Verlag 2012

Y. Cui • R. Guo (✉)
Department of Statistical Sciences, University of Cape Town, Rondebosch, Private Bag, Cape Town 7701, South Africa
e-mail: [email protected]

D. Guo
South African National Biodiversity Institute, Kirstenbosch Research Center, Claremont, Private Bag X7, Cape Town 7735, South Africa
e-mail: [email protected]

Abstract Differential equation associated regression (DEAR) is a flexible and powerful data mining modeling approach intended to capture the first-order nonlinear trend (i.e., regularity) governing the behavior of the data under investigation. DEAR modeling is a formal mathematical–statistical representation of the so-called grey differential equation model. DEAR models were originally proposed on a random fuzzy theoretical foundation. Nevertheless, they can be defined on any measure-theoretic platform, for example a probabilistic, fuzzy, or uncertain measure foundation, as long as the two constituent components, the model and the approximation, are appropriately specified. In this paper, we re-examine the compositional elements of DEAR models and the potential model selection portfolio in the development of statistical machine learning (SML) algorithms. Differential-equation-backed DEAR may contribute significantly to SML algorithms, particularly in developing robot movement systems, where the laws of motion are expressed directly by a set of differential equations. Under a statistical decision-theoretic framework, we develop a DEAR model constituted by a random function with a linear difference-equation-wise regression as the central tendency and a variance bound specified by Gaussian error analysis theory, in which the prior distribution is facilitated by a Gaussian process so that replicated sampling for estimating the weight matrix is avoided. We address not only the model selection compositional elements of the SML algorithm but also the optimization scheme, called the k-global optimization scheme, which aims to make DEAR learning one of the fastest, most efficient, and most accurate SML algorithms.

Keywords SML algorithm • Grey differential equation • Coupling principle • DEAR models • Gaussian process DEAR models • k-Global optimization scheme
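The abstract does not spell out the grey differential equation model that DEAR formalizes. As a rough illustration only, the sketch below uses the classic GM(1,1) form, which is an assumption here and not necessarily the authors' formulation: the first-order equation dx1/dt + a*x1 = b is paired with the linear difference-equation regression x0[k] = -a*z1[k] + b, where x1 holds the partial sums of the data x0 and z1 is the mean of adjacent partial sums. The function names `fit_gm11` and `predict_gm11` are hypothetical, and the input series is synthetic.

```python
import math


def fit_gm11(x0):
    """Estimate (a, b) in the regression x0[k] = -a * z1[k] + b.

    x0 is a list of positive observations; z1[k] is the mean of the
    k-th and (k-1)-th partial sums of x0 (classic GM(1,1), assumed here).
    """
    n = len(x0)
    if n < 3:
        raise ValueError("need at least three observations")
    # Partial sums of the data (accumulated generating operation).
    x1 = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Regressor z and response y for k = 1, ..., n-1.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])
    m = n - 1
    z_bar = sum(z) / m
    y_bar = sum(y) / m
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    slope = szy / szz  # ordinary least-squares slope of y on z, equal to -a
    a = -slope
    b = y_bar - slope * z_bar  # least-squares intercept
    return a, b


def predict_gm11(x0_first, a, b, steps):
    """Fitted values of x0 from the solution of dx1/dt + a * x1 = b."""
    if abs(a) < 1e-12:
        raise ValueError("a is too close to zero for the exponential solution")
    c = b / a
    # Solution for the partial sums with initial value x1(0) = x0_first.
    x1_hat = [(x0_first - c) * math.exp(-a * k) + c for k in range(steps)]
    # Difference the partial sums to return to the original scale.
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]


# Synthetic series with a first-order exponential trend (illustrative only).
series = [2.0 * 1.1 ** k for k in range(8)]
a_hat, b_hat = fit_gm11(series)
fitted = predict_gm11(series[0], a_hat, b_hat, len(series))
print(a_hat, b_hat)
print(fitted)
```

In this sketch the regression supplies only the central tendency. The variance bound and the Gaussian process prior mentioned in the abstract are not modeled.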

1 Introduction

Since machine learning, a field merging computer science and statistics, was initiated by a group of computer scientists 50 years ago, it has become one of the fastest developing and most successful scientific fields, with many magnificent industrial applications, for example genetic engineering, speech recognition, face recognition, computer vision, bio-surveillance, and robot control. The new computational statistics, i.e., the statistical machine learning (abbreviated as SML) algorithm, has accelerated the developments of many empirical sc