Priors via imaginary training samples of sufficient statistics for objective Bayesian hypothesis testing


D. Fouskakis

Received: 28 March 2019 / Accepted: 21 September 2019
© Sapienza Università di Roma 2019

Abstract The expected-posterior prior (EPP) and the power-expected-posterior (PEP) prior are based on random imaginary observations and offer several advantages in objective Bayesian hypothesis testing. This paper investigates the use of sufficient statistics, when these exist, as a way to redefine the EPP and PEP priors. In this way the dimensionality of the problem can be reduced, by generating samples of sufficient statistics instead of full sets of imaginary data. On the theoretical side, it is proved that the new EPP and PEP definitions based on imaginary training samples of sufficient statistics are equivalent to the standard definitions based on individual training samples. This equivalence provides a strong justification and generalization of the definition of both the EPP and the PEP prior, since the criteria coincide whether they are derived from the individual samples or from the sufficient statistics. This avoids potential inconsistencies or paradoxes when only sufficient statistics are available. The applicability of the new definitions in different hypothesis testing problems is explored, including the case of an irregular model. Calculations are simplified, and it is shown that when testing the mean of a normal distribution the EPP and PEP priors can be expressed as a beta mixture of normal priors. The paper concludes with a discussion of the interpretation and the benefits of the proposed approach.

Keywords Bayesian hypothesis testing · Expected-posterior priors · Imaginary training samples · Objective priors · Power-expected-posterior priors · Sufficient statistics

1 Introduction

This paper focuses on testing statistical hypotheses from a Bayesian perspective. Let $Y = (Y_1, \ldots, Y_n)^T$ be a random vector for which two competing models are proposed, under the following two hypotheses

$$
\begin{aligned}
H_0 &: \text{model } M_0 : f(y \mid \theta_0, M_0), \quad \theta_0 \in \Theta_0 \\
H_1 &: \text{model } M_1 : f(y \mid \theta_1, M_1), \quad \theta_1 \in \Theta_1,
\end{aligned}
\tag{1}
$$

Author affiliation: D. Fouskakis, Department of Mathematics, National Technical University of Athens, Athens, Greece


where $\theta_0$ and $\theta_1$ are unknown, model-specific parameters and $f(y \mid \theta_\ell, M_\ell)$ denotes the sampling distribution under model $M_\ell$, $\ell \in \{0, 1\}$. For the rest of the paper it is further assumed that $M_0$ is nested in $M_1$; see, for example, Consonni et al. [8] for a brief discussion of this point. On some occasions, a particular value of the parameter $\theta$ is proposed under the null hypothesis; see, for instance, Bernardo and Rueda [7]. For example, in the variable selection problem, when we wish to test the effect of a particular covariate, the null hypothesis states that its coefficient is equal to zero. The Bayesian approach requires specification of prior densities for the unknown model parameters and also specification of prior model probabilities. For the observed data $y = (y_1, \ldots, y_n)^T$, the