The variation of the posterior variance and Bayesian sample size determination


ORIGINAL PAPER

Jörg Martin¹ · Clemens Elster¹

Accepted: 20 July 2020 · © The Author(s) 2020

Abstract

We consider Bayesian sample size determination using a criterion that utilizes the first two moments of the posterior variance. We study how the resulting sample size depends on the chosen prior and explore the success rate for bounding the posterior variance below a prescribed limit under the true sampling distribution. Compared with sample size determination based on the average of the posterior variance, the proposed criterion leads to an increase in sample size and significantly improved success rates. Generic asymptotic properties are proven, such as an asymptotic expression for the sample size and a kind of phase transition. Our study is illustrated using two real-world datasets with Poisson and normally distributed data. Based on our results, some recommendations are given.

Jörg Martin [email protected] · Clemens Elster [email protected]
¹ Physikalisch-Technische Bundesanstalt, Abbestraße 2, 10587 Berlin, Germany

1 Introduction

Sample size determination (SSD) is the attempt to estimate the amount of data needed to meet a certain criterion (Desu 2012). This task is usually performed at a planning stage, before any data are measured or recorded, so that a careful SSD becomes indispensable, especially when collecting data is expensive or time-consuming. In the design of, say, animal experiments or clinical trials, SSD can even have an ethical dimension (Charan and Kantharia 2013; Dell et al. 2002).

In this article we study a Bayesian method for SSD that limits the expected fluctuations of the uncertainty of the result. By "uncertainty" we here mean (the square root of) the posterior variance. For $n$ data points $\mathbf{x}_n = (x_1, \ldots, x_n)$ drawn from a sampling distribution $p(\mathbf{x}_n \mid \theta)$ with parameter $\theta$, the posterior distribution is defined by

$$p(\theta \mid \mathbf{x}_n) \propto p(\theta) \cdot p(\mathbf{x}_n \mid \theta), \tag{1}$$

where $p(\theta)$ denotes the prior for the parameter $\theta$. The posterior variance is then given as

$$u_n^2 := \operatorname{Var}_{\theta \sim p(\theta \mid \mathbf{x}_n)}(\theta). \tag{2}$$
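As a concrete illustration of (1) and (2), the following minimal Python sketch assumes a conjugate model that is not taken from the text: Poisson distributed data with a Gamma prior on $\theta$ (parameterized by shape and rate), for which the posterior is again a Gamma distribution and $u_n^2$ is available in closed form. The function name, the prior parameters and the example counts are hypothetical and serve only to make the definitions tangible.

```python
def posterior_variance_poisson_gamma(data, shape, rate):
    """Return u_n^2 = Var(theta | x_n) for Poisson data and a Gamma prior.

    Assumption (illustrative, not from the paper): the prior on theta is
    Gamma(shape, rate), so the posterior is Gamma(shape + sum(data), rate + n).
    """
    if shape <= 0 or rate <= 0:
        raise ValueError("shape and rate must be positive")
    n = len(data)
    post_shape = shape + sum(data)  # prior shape plus total observed count
    post_rate = rate + n            # prior rate plus number of observations
    # The variance of a Gamma(shape, rate) distribution is shape / rate**2.
    return post_shape / post_rate ** 2


if __name__ == "__main__":
    # Two hypothetical datasets of equal size n = 4.
    low_counts = [0, 1, 0, 2]
    high_counts = [5, 7, 6, 8]
    for counts in (low_counts, high_counts):
        u2 = posterior_variance_poisson_gamma(counts, shape=2.0, rate=1.0)
        print(f"n={len(counts)}, sum={sum(counts)}, u_n^2={u2:.4f}")
```

Both hypothetical datasets have the same size, yet the resulting posterior variances differ (0.2 versus 1.12 under this assumed prior), which shows that $u_n^2$ depends on the observed data and not only on $n$.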

In practice, a scientist performing an experiment might wish to report the result together with its uncertainty, say $\hat{\theta} \pm u_n$, with $u_n$ being the square root of $u_n^2$ as defined in (2) and $\hat{\theta}$ being the posterior mean. For this result to be precise enough, the scientist might want to fulfill a condition such as $u_n < \varepsilon$ or, equivalently,

$$u_n^2 < \varepsilon^2 \tag{3}$$

for some small, positive $\varepsilon$ that is chosen a priori. As the posterior distribution depends on the data $\mathbf{x}_n$, so does $u_n^2$. Choosing an appropriate sample size $n$ so that (3) is guaranteed before $\mathbf{x}_n$ is known is only possible in a few restricted scenarios, for instance Bernoulli distributed samples (Pham-Gia and Turkkan 1992; Joseph and Bélisle 2019). A more generally applicable criterion is to require, instead of (3),

$$\overline{u_n^2} = \mathbb{E}_{\mathbf{x}_n \sim m(\mathbf{x}_n)}[u_n^2] < \varepsilon^2, \tag{4}$$

where $m(\mathbf{x}_n) = \int p(\mathbf{x}_n \mid \theta)\, p$