The variation of the posterior variance and Bayesian sample size determination
ORIGINAL PAPER
Jörg Martin¹ · Clemens Elster¹
Accepted: 20 July 2020 / © The Author(s) 2020
Abstract

We consider Bayesian sample size determination using a criterion that utilizes the first two moments of the posterior variance. We study how the resulting sample size depends on the chosen prior and explore the success rate for bounding the posterior variance below a prescribed limit under the true sampling distribution. Compared with sample size determination based on the average of the posterior variance, the proposed criterion leads to an increase in sample size and significantly improved success rates. Generic asymptotic properties are proven, such as an asymptotic expression for the sample size and a sort of phase transition. Our study is illustrated using two real-world datasets with Poisson and normally distributed data. Based on our results, some recommendations are given.
¹ Physikalisch-Technische Bundesanstalt, Abbestraße 2, 10587 Berlin, Germany

1 Introduction

Sample size determination (SSD) is the attempt to estimate the data size that is needed in order to meet a certain criterion (Desu 2012). This task is usually performed at a planning stage, before any data are measured or recorded, so that a careful SSD becomes indispensable, especially when data collection is expensive or time-consuming. In the design of, say, animal experiments or clinical trials, SSD can even have an ethical dimension (Charan and Kantharia 2013; Dell et al. 2002).

In this article we study a Bayesian method for SSD that limits the expected fluctuations of the uncertainty of the result. By "uncertainty" we here mean (the square root of) the posterior variance. For $n$ data points $x_n = (x_1, \ldots, x_n)$ drawn from a sampling distribution $p(x_n \mid \theta)$ with parameter $\theta$, the posterior distribution is defined by

$$p(\theta \mid x_n) \propto p(\theta)\, p(x_n \mid \theta), \tag{1}$$

where $p(\theta)$ denotes the prior for the parameter $\theta$. The posterior variance is then given as

$$u_n^2 := \mathrm{Var}_{\theta \sim p(\theta \mid x_n)}(\theta). \tag{2}$$

In practice, a scientist performing an experiment might wish to report a result together with its uncertainty, say $\hat{\theta} \pm u_n$, with $u_n$ being the square root of $u_n^2$ as defined in (2) and $\hat{\theta}$ being the posterior mean. For this result to be precise enough, the scientist might require a condition such as

$$u_n < \varepsilon \quad \text{or, equivalently,} \quad u_n^2 < \varepsilon^2 \tag{3}$$

for some small, positive $\varepsilon$ that is chosen a priori. As the posterior distribution depends on the data $x_n$, so does $u_n^2$. Choosing a sample size $n$ such that (3) is guaranteed before $x_n$ is known is only possible in a few restricted scenarios, for instance Bernoulli distributed samples (Pham-Gia and Turkkan 1992; Joseph and Bélisle 2019). A more generally applicable criterion is to require, instead of (3),

$$\overline{u_n^2} = \mathbb{E}_{x_n \sim m(x_n)}\left[u_n^2\right] < \varepsilon^2, \tag{4}$$

where $m(x_n) = \int p(x_n \mid \theta)\, p$
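To make criterion (4) concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes Poisson data with a conjugate Gamma prior (shape `a`, rate `b`) and reads $m(x_n)$ as the marginal (prior predictive) distribution of the data. Under these assumptions the posterior is Gamma with shape $a + s$ and rate $b + n$, where $s = \sum_i x_i$, so $u_n^2 = (a + s)/(b + n)^2$ and the average over $m(x_n)$ is $a / (b\,(b + n))$. The function names, the prior values and the target $\varepsilon$ are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def avg_posterior_variance(n, a, b):
    """Closed-form average of u_n^2 over the prior predictive (assumed model)."""
    # Posterior is Gamma(a + s, rate = b + n) with s = sum of the counts,
    # so u_n^2 = (a + s) / (b + n)^2, and E[s] = n * a / b under m(x_n).
    return a / (b * (b + n))


def smallest_n(eps, a, b):
    """Smallest n >= 0 whose average posterior variance is below eps^2."""
    # Start just below the analytic bound n > a / (b * eps^2) - b and step up,
    # so that small rounding errors in the bound do not skip the minimum.
    n = max(0, math.floor(a / (b * eps**2) - b) - 1)
    while avg_posterior_variance(n, a, b) >= eps**2:
        n += 1
    return n


def simulate_posterior_variances(n, a, b, draws=200_000, seed=0):
    """Monte Carlo draws of u_n^2 with data simulated from the prior predictive."""
    rng = np.random.default_rng(seed)
    theta = rng.gamma(shape=a, scale=1.0 / b, size=draws)  # NumPy uses scale = 1/rate
    s = rng.poisson(n * theta)  # sum of n independent Poisson(theta) counts
    return (a + s) / (b + n) ** 2


if __name__ == "__main__":
    a, b, eps = 3.0, 2.0, 0.2  # hypothetical prior parameters and target
    n = smallest_n(eps, a, b)
    u2 = simulate_posterior_variances(n, a, b)
    print("n =", n)
    print("closed form :", avg_posterior_variance(n, a, b))
    print("Monte Carlo :", u2.mean())
    print("share of data sets with u_n^2 < eps^2:", (u2 < eps**2).mean())
```

Because (4) constrains only the mean of $u_n^2$, a noticeable share of the simulated data sets can still violate (3) at the returned sample size. Note that the data here are drawn from the prior predictive, not from a fixed true sampling distribution, so the printed share is only a rough analogue of the success rate discussed in the abstract.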