Asymptotic Theory of Statistical Inference

7.1 Estimation

Let \(M = \{p(\boldsymbol{x}, \boldsymbol{\xi})\}\) be a statistical model specified by a parameter \(\boldsymbol{\xi}\), which is to be estimated. When we observe \(N\) independent data \(D = \{\boldsymbol{x}_1, \ldots, \boldsymbol{x}_N\}\) generated from \(p(\boldsymbol{x}, \boldsymbol{\xi})\), we want to know the underlying parameter \(\boldsymbol{\xi}\). This is the problem of estimation, and an estimator

\[
\hat{\boldsymbol{\xi}} = f(\boldsymbol{x}_1, \ldots, \boldsymbol{x}_N) \tag{7.1}
\]

is a function of \(D\). The estimation error is

\[
\boldsymbol{e} = \hat{\boldsymbol{\xi}} - \boldsymbol{\xi}, \tag{7.2}
\]

where \(\boldsymbol{\xi}\) is the true value. The bias of the estimator is defined by

\[
\boldsymbol{b}(\boldsymbol{\xi}) = E\bigl[\hat{\boldsymbol{\xi}}\bigr] - \boldsymbol{\xi}, \tag{7.3}
\]

where the expectation is taken with respect to \(p(\boldsymbol{x}, \boldsymbol{\xi})\). An estimator is unbiased when \(\boldsymbol{b}(\boldsymbol{\xi}) = \boldsymbol{0}\). Asymptotic theory studies the behavior of an estimator when \(N\) is large. When the bias satisfies

\[
\lim_{N \to \infty} \boldsymbol{b}(\boldsymbol{\xi}) = \boldsymbol{0}, \tag{7.4}
\]

the estimator is asymptotically unbiased. A good estimator is expected to converge to the true parameter as \(N\) tends to infinity,

\[
\lim_{N \to \infty} \hat{\boldsymbol{\xi}} = \boldsymbol{\xi}. \tag{7.5}
\]

When this holds, the estimator is consistent.
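As a concrete illustration of (7.3)–(7.5), the variance estimator \(\hat{\sigma}^2 = \frac{1}{N}\sum_i (x_i - \bar{x})^2\) of a Gaussian sample has bias \(-\sigma^2/N\): it is biased for every finite \(N\), yet asymptotically unbiased and consistent. The following Monte Carlo sketch (an illustration assuming Python with NumPy, not part of the original text) checks this numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 4.0  # true variance of the Gaussian model
trials = 20000

for N in (10, 100, 1000):
    # Draw `trials` independent samples of size N and compute the
    # estimator sigma2_hat = (1/N) * sum (x_i - x_bar)^2 for each.
    x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
    sigma2_hat = x.var(axis=1)  # NumPy's default ddof=0 divides by N
    bias = sigma2_hat.mean() - sigma2
    print(f"N={N:5d}  empirical bias={bias:+.4f}  exact bias={-sigma2/N:+.4f}")
```

The empirical bias shrinks like \(1/N\), matching the exact value \(-\sigma^2/N\), so this estimator satisfies both (7.4) and (7.5).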


The accuracy of an estimator is measured by the error covariance matrix \(\mathbf{V} = \bigl(V_{ij}\bigr)\),

\[
V_{ij} = E\Bigl[\bigl(\hat{\xi}_i - \xi_i\bigr)\bigl(\hat{\xi}_j - \xi_j\bigr)\Bigr]. \tag{7.6}
\]

It decreases in general in proportion to \(1/N\), so that the estimator \(\hat{\boldsymbol{\xi}}\) becomes sufficiently accurate as \(N\) increases. The well-known Cramér–Rao theorem gives a bound on the accuracy.

Theorem 7.1. For an asymptotically unbiased estimator \(\hat{\boldsymbol{\xi}}\), the following inequality holds:

\[
\mathbf{V} \ge \frac{1}{N}\,\mathbf{G}^{-1}, \tag{7.7}
\]
\[
E\Bigl[\bigl(\hat{\xi}_i - \xi_i\bigr)\bigl(\hat{\xi}_j - \xi_j\bigr)\Bigr] \ge \frac{1}{N}\,g^{ij}, \tag{7.8}
\]

where \(\mathbf{G} = \bigl(g_{ij}\bigr)\) is the Fisher information matrix, \(\mathbf{G}^{-1} = \bigl(g^{ij}\bigr)\) is its inverse, and the matrix inequality means that \(\mathbf{V} - \mathbf{G}^{-1}/N\) is positive semi-definite.

The maximum likelihood estimator (MLE) is the maximizer of the likelihood,

\[
\hat{\boldsymbol{\xi}}_{\mathrm{MLE}} = \arg\max_{\boldsymbol{\xi}} \prod_{i=1}^{N} p\bigl(\boldsymbol{x}_i, \boldsymbol{\xi}\bigr). \tag{7.9}
\]
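In practice, the maximizer in (7.9) is found by minimizing the negative log-likelihood. A minimal sketch (assuming Python with NumPy and SciPy; the Gaussian model and the helper `neg_log_likelihood` are introduced only for this illustration):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.5, size=500)  # data from the true model

def neg_log_likelihood(xi):
    # xi = (mu, log_sigma); optimizing log(sigma) keeps sigma > 0.
    mu, log_sigma = xi
    sigma = np.exp(log_sigma)
    # -log p(x_i; mu, sigma) summed over the sample, for a Gaussian model
    return np.sum(0.5 * ((x - mu) / sigma) ** 2 + log_sigma
                  + 0.5 * np.log(2.0 * np.pi))

res = minimize(neg_log_likelihood, x0=np.zeros(2), method="BFGS")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
# For the Gaussian, this agrees with the closed form: x.mean(), x.std(ddof=0)
print(mu_hat, sigma_hat)
```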

It is known that the MLE is asymptotically unbiased and that its error covariance satisfies

\[
\mathbf{V}_{\mathrm{MLE}} = \frac{1}{N}\,\mathbf{G}^{-1} + O\!\left(\frac{1}{N^2}\right), \tag{7.10}
\]

attaining the Cramér–Rao bound (7.7) asymptotically. Such an estimator is said to be Fisher efficient (first-order efficient).

Remark. We do not discuss Bayes estimators, in which a prior distribution over the parameters is used. However, when the prior distribution is uniform, the MLE coincides with the maximum a posteriori Bayes estimator. Moreover, the MLE has the same asymptotic properties under any regular Bayes prior. The information geometry of Bayesian statistics will be touched upon in a later chapter.
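Fisher efficiency can be checked numerically by repeating the estimation many times and comparing the empirical variance of the MLE with the bound \(g^{-1}/N\). A Monte Carlo sketch (again assuming Python with NumPy; the exponential model \(p(x, \lambda) = \lambda e^{-\lambda x}\), with Fisher information \(g(\lambda) = 1/\lambda^2\) and MLE \(\hat{\lambda} = 1/\bar{x}\), is chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
lam = 2.0     # true rate of the exponential model p(x) = lam * exp(-lam * x)
trials = 20000

for N in (20, 200, 2000):
    x = rng.exponential(scale=1.0 / lam, size=(trials, N))
    lam_hat = 1.0 / x.mean(axis=1)   # MLE of the rate parameter
    var_mle = lam_hat.var()
    cr_bound = lam**2 / N            # Cramer-Rao bound: 1 / (N * g(lam))
    print(f"N={N:5d}  Var(MLE)={var_mle:.5f}  CR bound={cr_bound:.5f}")
```

As \(N\) grows, \(\mathrm{Var}(\hat{\lambda})\) approaches \(\lambda^2/N\), with the \(O(1/N^2)\) excess predicted by (7.10).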

7.2 Estimation in Exponential Family

An exponential family is a model having excellent properties such as dual flatness. We begin with the exponential family


\[
p(\boldsymbol{x}, \boldsymbol{\theta}) = \exp\{\boldsymbol{\theta} \cdot \boldsymbol{x} - \psi(\boldsymbol{\theta})\} \tag{7.11}
\]

to study the statistical theory of estimation, because it is simple and transparent. Given data \(D\), their joint probability distribution is written as

\[
p(D, \boldsymbol{\theta}) = \exp\bigl[N\{\boldsymbol{\theta} \cdot \bar{\boldsymbol{x}} - \psi(\boldsymbol{\theta})\}\bigr], \tag{7.12}
\]

where \(\bar{\boldsymbol{x}}\) is the arithmetic mean of the observed examples,

\[
\bar{\boldsymbol{x}} = \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{x}_i. \tag{7.13}
\]
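By (7.12), the data enter the likelihood only through the sufficient statistic \(\bar{\boldsymbol{x}}\), so the MLE solves \(\nabla \psi(\hat{\boldsymbol{\theta}}) = \bar{\boldsymbol{x}}\). A minimal sketch (assuming Python with NumPy and using the Bernoulli family, for which \(\psi(\theta) = \log(1 + e^{\theta})\), purely as an example):

```python
import numpy as np

rng = np.random.default_rng(3)
theta_true = 0.8                       # natural parameter of a Bernoulli family
p = 1.0 / (1.0 + np.exp(-theta_true))  # mean parameter: eta = psi'(theta)
x = rng.binomial(1, p, size=1000)

x_bar = x.mean()  # sufficient statistic: the data enter only through x_bar
# psi'(theta) = sigmoid(theta) = x_bar gives the closed-form MLE
# theta_hat = logit(x_bar).
theta_hat = np.log(x_bar / (1.0 - x_bar))
print(theta_hat)  # close to theta_true for large N
```

In the dual (expectation) coordinates \(\boldsymbol{\eta} = \nabla \psi(\boldsymbol{\theta})\), the MLE is simply \(\hat{\boldsymbol{\eta}} = \bar{\boldsymbol{x}}\); this moment-matching form reflects the dual flatness mentioned above.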