Basic Notions
This chapter gives a brief introduction to the book, illustrated by a few examples. It also defines some useful notions such as linear and general statistical models, risk, and efficiency.
The starting point of any statistical analysis is data, also called observations or a sample. A statistical model is used to explain the nature of the data. A standard approach assumes that the data are random and uses some probabilistic framework. In contrast to probability theory, the distribution of the data is not known precisely, and the goal of the analysis is to draw inference about this unknown distribution. The parametric approach assumes that the distribution of the data is known up to the value of a parameter $\theta$ from some subset $\Theta$ of a finite-dimensional space $\mathbb{R}^p$. In this case the statistical analysis naturally reduces to the estimation of the parameter $\theta$: as soon as $\theta$ is known, we know the whole distribution of the data. Before introducing the general notion of a statistical model, we discuss some popular examples.
1.1 Example of a Bernoulli Experiment
Let $Y = (Y_1, \ldots, Y_n)^\top$ be a sequence of binary digits, each zero or one. We distinguish between deterministic and random sequences. Deterministic sequences arise, e.g., from the binary representation of a real number or from digitally coded images. Random binary sequences arise, e.g., from coin tosses or games. In many situations incomplete information can be treated as random data: the classification of healthy and sick patients, individual vote results, the bankruptcy of a firm or a credit default, etc. The basic assumptions behind a Bernoulli experiment are:
• the observed data $Y_i$ are independent and identically distributed;
• each $Y_i$ assumes the value one with probability $\theta \in [0, 1]$.
The parameter $\theta$ completely identifies the distribution of the data $Y$. Indeed, for every $i \le n$ and $y \in \{0, 1\}$,
V. Spokoiny and T. Dickhaus, Basics of Modern Mathematical Statistics, Springer Texts in Statistics, DOI 10.1007/978-3-642-39909-1__1, © Springer-Verlag Berlin Heidelberg 2015
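$$P(Y_i = y) = \theta^{y} (1 - \theta)^{1 - y},$$
and the independence of the $Y_i$'s implies for every sequence $y = (y_1, \ldots, y_n)$ that
$$P(Y = y) = \prod_{i=1}^{n} \theta^{y_i} (1 - \theta)^{1 - y_i}. \tag{1.1}$$
To indicate this fact, we write $P_\theta$ in place of $P$. Equation (1.1) can be rewritten as
$$P_\theta(Y = y) = \theta^{s_n} (1 - \theta)^{n - s_n}, \quad \text{where} \quad s_n = \sum_{i=1}^{n} y_i.$$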
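As a small numerical illustration of the two equivalent forms of (1.1), the following Python sketch computes the probability of one binary sequence both as a product over the observations and through the count $s_n$. The function names, the sample sequence `y`, and the value `theta = 0.6` are assumptions chosen only for illustration; they do not come from the text.

```python
import math

def bernoulli_prob_product(y, theta):
    """P_theta(Y = y) as the product of theta^{y_i} (1 - theta)^{1 - y_i}, as in (1.1)."""
    p = 1.0
    for yi in y:
        # Each factor is theta when y_i = 1 and (1 - theta) when y_i = 0.
        p *= theta if yi == 1 else 1.0 - theta
    return p

def bernoulli_prob_counts(y, theta):
    """The same probability written through s_n = y_1 + ... + y_n."""
    n = len(y)
    s_n = sum(y)
    return theta ** s_n * (1.0 - theta) ** (n - s_n)

# Hypothetical data and parameter value, used only for this illustration.
y = [1, 0, 1, 1, 0]
theta = 0.6

p_product = bernoulli_prob_product(y, theta)
p_counts = bernoulli_prob_counts(y, theta)

# The two forms agree up to floating-point rounding.
print(p_product, p_counts, math.isclose(p_product, p_counts))
```

The second form makes it visible that the probability of a sequence depends on the data only through $s_n$ and $n$, not on the order of the zeros and ones.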
The value $s_n$ is often interpreted as the number of successes in the sequence $y$. Probability theory focuses on the probabilistic properties of the data $Y$ under the given measure $P_\theta$. The aim of the statistical analysis is to draw inference about the measure $P_\theta$ for an unknown $\theta$ based on the available data $Y$. Typical examples of statistical problems are:
1. Estimate the parameter $\theta$, i.e. build a function $\tilde{\theta}$ of the data $Y$ into $[0, 1]$ which approximates the unknown value $\theta$ as well as possible;
2. Build a confidence set for $\theta$, i.e. a random (data-based) set (usually an interval) containing $\theta$ with a prescribed probability;
3. Test a simple hypothesis that $\theta$ coincides with a prescribed value $\theta_0$, e.g. $\theta_0 = 1/2$;
4. Test a composite hypothesis that $\theta$ belongs to a prescribed subset $\Theta_0$ of the interval $[0, 1]$.
Usually any statistical method is based on a