Inference for a Population Proportion

Statistical inference is drawing conclusions about an entire population based on data in a sample drawn from that population. From both frequentist and Bayesian perspectives, there are three main goals of inference: estimation, hypothesis testing, and prediction. Estimation and hypothesis testing deal with drawing conclusions about unknown and unobservable population parameters. Prediction is estimating the values of potentially observable but currently unobserved quantities. For example, we might want to predict the number of “yesses” in a future survey of 50 UI students. Prediction in statistical inference isn’t restricted to predicting future observations, however. It may refer to estimating values that have already occurred but were not measured. For example, we may want to use values of acid rain deposition measured from rain gauges at specific sites to predict acid rain deposition at other locations that have no rain gauges. Before investigating how the Bayesian uses the posterior distribution of a population parameter to make inference, we will review the approach usually undertaken by frequentists so that we are ready to make comparisons.

4.1 Estimation and Testing: Frequentist Approach

4.1.1 Maximum Likelihood Estimation

When trying to estimate the unknown value of a population parameter, the frequentist statistician, like the Bayesian, begins by specifying the distribution of the data, given the unknown parameter(s). In the case of our data consisting of independent yes/no responses to the survey questions, this will be the binomial probability mass function, first given in (3.1) and repeated here:

M.K. Cowles, Applied Bayesian Statistics: With R and OpenBUGS Examples, Springer Texts in Statistics 98, DOI 10.1007/978-1-4614-5696-4_4, © Springer Science+Business Media New York 2013

$$p(y \mid \pi) = \binom{n}{y}\, \pi^{y} (1 - \pi)^{n-y}, \qquad y = 0, 1, \ldots, n$$

Then the statistician switches perspective and views the same expression as a function of the unknown parameter, given known data values. The frequentist does not treat the parameter as if it were a random variable and does not specify a prior distribution to summarize other information not contained in the current dataset.

One goal of frequentist estimation is to obtain a point estimate of a population parameter. The point estimate may be thought of as the best single-number guess of the value of the population parameter, based solely on the current data. The most commonly used method of frequentist point estimation is maximum likelihood estimation: finding the value of the parameter that would give the largest possible evaluation of the likelihood. Intuitively, this is the value of the parameter that would have made the observed data the most likely. A maximum likelihood estimate is the numeric value calculated for a particular dataset. A maximum likelihood estimator is the formula for calculating maximum likelihood estimates for a given form of the likelihood. When the likelihood is