Using a priori information in regression analysis
UDC 519.237.5
A. S. Korkhin
Abstract. The paper considers methods for estimating regression parameters under indefinite a priori information of two types: fuzzy and stochastic. Fuzzy a priori information is assumed to be formulated on the basis of fuzzy notions of the model designer. Stochastic a priori information takes the form of systems of equations that are linear in the regression parameters and whose right-hand sides are random variables. Regression parameters may be either constant or time-varying. A classification of the estimation methods that use indefinite a priori information is proposed and used to generalize well-known methods. An estimation method is developed that combines fuzzy and stochastic a priori information about regression parameters.

Keywords: stationary and nonstationary regressions, a priori information, fuzzy constraints, two-criteria estimation, mixed regression, combined methods of estimation.

1. INITIAL PROVISIONS

A priori information is an important tool for enhancing the accuracy of regression models. It is also needed to give a model the prescribed properties that developers formulate based on the purpose for which the model is built. Using a priori information is especially important when, for example, the initial data for constructing a regression form a small sample. Such a situation is common in econometric problems.

In the present paper, by a priori information we mean the information about model parameters that is available to the developer. Examples of such information are the signs of parameters, their possible limits, and functional relations between parameters.

All the reasoning in the paper is applied to a linear regression model. As its general form, let us consider a switching regression in which switching depends on time $t$:

$$y_t = \mathbf{x}_t' \mathbf{A}_t + \varepsilon_t, \quad \mathbf{A}_t = \mathbf{J}_i^0, \quad t \in I_i = [t_i, T_i], \quad i = 1, \dots, N, \tag{1}$$

where $y_t \in \mathbb{R}^1$ is the dependent variable; $\mathbf{x}_t \in \mathbb{R}^n$ is the regressor (independent variable); $\mathbf{J}_i^0 \in \mathbb{R}^n$ is the true value of the regression parameter on the interval $I_i$ (an unknown quantity); and $\varepsilon_t$ is a random variable. The quantities $y_t$ and $\mathbf{x}_t$, $t = 1, \dots, T$, are assumed known, and $T = T_N$ is the length of the observation interval. In (1) and in what follows, the prime denotes transposition, and vectors and matrices are set in bold.

The parameter of regression (1) is constant and equal to $\mathbf{J}_i^0$ on the time interval $I_i$, which contains $m_i$ observations. Suppose $t_1 = 1$ and $t_i = T_{i-1} + 1$, $i = 2, \dots, N$. Then $m_i = T_i - t_i + 1$, $i = 1, \dots, N$, and the length of the observation interval is

$$T = \sum_{i=1}^{N} m_i.$$
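The interval bookkeeping above can be sketched in a few lines of Python. This is only an illustration under assumed values: the switching points in `starts`, the length `T`, and the parameter vectors in `params` are hypothetical, and the function names are not taken from the paper. The random term is omitted, so `noise_free_response` returns only the systematic part of (1).

```python
from bisect import bisect_right


def build_intervals(starts, T):
    """Return [(t_i, T_i, m_i)] for intervals that begin at `starts` (1-based times).

    Uses t_1 = 1, T_i = t_{i+1} - 1 for i < N, T_N = T, and m_i = T_i - t_i + 1.
    """
    if not starts or starts[0] != 1:
        raise ValueError("the first interval must start at t_1 = 1")
    if any(b <= a for a, b in zip(starts, starts[1:])) or starts[-1] > T:
        raise ValueError("switching points must be strictly increasing and <= T")
    ends = [s - 1 for s in starts[1:]] + [T]
    return [(t_i, T_i, T_i - t_i + 1) for t_i, T_i in zip(starts, ends)]


def regime_index(t, starts):
    """0-based index of the interval I_i that contains time t (1-based)."""
    return bisect_right(starts, t) - 1


def noise_free_response(x_t, t, starts, params):
    """Inner product of x_t with the parameter vector active at time t.

    `params[i]` is the (hypothetical) parameter vector for the i-th interval;
    the random term of model (1) is deliberately left out.
    """
    active = params[regime_index(t, starts)]
    return sum(x * a for x, a in zip(x_t, active))


# Hypothetical example: N = 3 intervals over T = 10 observations.
starts = [1, 5, 8]
T = 10
intervals = build_intervals(starts, T)
params = [[1.0, 0.0], [0.5, 2.0], [0.0, -1.0]]  # assumed values, n = 2
y6 = noise_free_response([1.0, 2.0], 6, starts, params)
```

With these assumed inputs, the interval lengths add up to `T`, which is the identity stated above, and time `t = 6` falls in the second interval.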
Switching points $t_i$, $i = 2, \dots, N$, are assumed known: they can be specified by the model developer or found from some theory (determining them is a separate problem that is beyond the scope of our study).

National Mining University, Dnepropetrovsk, Ukraine, [email protected]. Translated from Kibernetika i Sistemnyi Analiz, No. 1, January–February, 2013, pp. 49–64. Original article submitted November 17, 2011.

1060-0396/13/4901-0041 © 2013 Springer Science+Business Media New York