Testing the Order of a Normal Mixture in Mean



A tribute to Professor Xiru Chen

Jiahua Chen (1,2) · Pengfei Li (3)

Received: 1 December 2015 / Accepted: 9 December 2015 / Published online: 14 March 2016 © School of Mathematical Sciences, University of Science and Technology of China and Springer-Verlag Berlin Heidelberg 2016

Abstract There has been rapid progress in designing valid and effective statistical hypothesis tests for the order of a finite mixture model. In particular, the EM-test for the order of a mixture model has been developed and found effective when the component distribution contains a single parameter. The EM-test has also been found particularly effective and elegant for the order of a normal mixture in both mean and variance, and the idea behind it is widely applicable. In this paper, we investigate the use of the EM-test for the order of a finite normal mixture in the mean parameter with equal but unknown component variances. We show that for any positive integer $m_0 \geq 2$, the limiting distribution of the EM-test for the order $m_0$ against the higher-order alternative is $\chi^2_{m_0-1}$. A genetic example is used to illustrate the application of the EM-test.

Keywords EM-test · Homogeneity · Mixture model · Modified likelihood ratio test · Structural parameter

Mathematics Subject Classification 62F03 · 62F05

Jiahua Chen [email protected]

1 School of Mathematics and Statistics, Yunnan University, Kunming, Yunnan, China

2 Department of Statistics, University of British Columbia, Vancouver, BC, Canada

3 Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada


1 Introduction

Normal mixture models are arguably the most important mixture models and are widely used in many disciplines. For example, they are often used for quantitative traits influenced by major genes [24,25], and for heterogeneity in the age of onset for male and female schizophrenia patients [12]. They play a fundamental role in cluster analysis [23,28] and in the study of the false discovery rate [2,11,22,27]. In statistical finance, they are used for daily stock returns [15].

The order m of a normal mixture is an important parameter in scientific applications. In genetics, if a quantitative trait is determined by a simple gene with two alleles, then m = 2 implies that the mode of inheritance is dominant, whereas m ≥ 3 implies that the mode of inheritance is additive or more complex [5,24]. A natural question is whether or not the gene is dominant. Testing the order of a normal mixture model also provides a natural way to quantify the evidence for a more parsimonious model.

Due to non-regularity, a straightforward likelihood ratio test for the order of a finite mixture is difficult to implement: the corresponding statistic diverges to infinity if the parameter space is not restricted [1,14], and has a complex limiting distribution otherwise [3,9,13,17–19]. There have been substantial recent developments in valid and effective likelihood-based tests. When the com