Transform-Both-Sides Regression


16.1 Background

Fitting multiple regression models by the method of least squares is one of the most commonly used methods in statistics. There are a number of challenges to the use of least squares, even when it is used only for estimation and not for inference, including the following.

1. How should continuous predictors be transformed so as to get a good fit?
2. Is it better to transform the response variable? How does one find a good transformation that simplifies the right-hand side of the equation?
3. What if Y needs to be transformed non-monotonically (e.g., |Y − 100|) before it will have any correlation with X? (A small simulation of this point is sketched after these lists.)

When one is trying to draw an inference about population effects using confidence limits or hypothesis tests, the most common approach is to assume that the residuals have a normal distribution. This is equivalent to assuming that the conditional distribution of the response Y given the set of predictors X is normal, with mean depending on X and variance that is (one hopes) a constant independent of X. The need for a distributional assumption to enable us to draw inferences creates a number of other challenges, such as the following.

1. If, on the untransformed original scale of the response Y, the distribution of the residuals is not normal with constant spread, ordinary methods will not yield correct inferences (e.g., confidence intervals will not have the desired coverage probability, and the intervals will need to be asymmetric).
2. Quite often there is a transformation of Y that will yield well-behaved residuals. How do you find this transformation? Can you find a transformation for the Xs at the same time?


3. All classical statistical inferential methods assume that the full model was pre-specified, that is, the model was not modified after examining the data. How does one correct confidence limits, for example, for data-based model and transformation selection?
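To make the third challenge in the first list above concrete, the following small R simulation (made-up data and arbitrary constants, purely an illustrative sketch) generates a response that shows no association with X on its original scale but a strong association after the non-monotone transformation |Y − 100|.

set.seed(1)
n <- 200
x <- runif(n)
## Y falls symmetrically on either side of 100, at a distance that grows with x
y <- 100 + sample(c(-1, 1), n, replace = TRUE) * (5 * x + rnorm(n, sd = 0.5))

cor(x, y)                                  # near zero on the original scale
cor(x, abs(y - 100))                       # strong after the non-monotone transform
summary(lm(y ~ x))$r.squared               # least squares finds essentially nothing
summary(lm(abs(y - 100) ~ x))$r.squared    # a good fit once Y is transformed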

16.2 Generalized Additive Models

Hastie and Tibshirani [275] developed generalized additive models (GAMs) for a variety of distributions for Y. There are semiparametric GAMs, but most GAMs for continuous Y assume that the conditional distribution of Y comes from a specific distribution family. GAMs nicely estimate the transformation each continuous X requires so as to optimize a fitting criterion such as the sum of squared errors or the log likelihood, subject to the degrees of freedom the analyst wishes to spend on each predictor. However, GAMs assume that Y has already been transformed to fit the specified distribution family. Excellent software is available for fitting a wide variety of GAMs, such as the R packages gam, mgcv, and robustgam.
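As a brief illustration of this kind of fit, the following sketch uses the mgcv package on simulated data; the variable names, smoothing choices, and data-generating model are arbitrary assumptions, not taken from this chapter. Each continuous predictor receives a penalized smooth term whose shape is estimated from the data, while Y itself is modeled on its original scale under an assumed Gaussian family.

library(mgcv)

set.seed(2)
n  <- 300
x1 <- runif(n)
x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.3)
d  <- data.frame(y, x1, x2)

## s() requests a penalized regression spline; k limits the d.f. spent on each predictor
fit <- gam(y ~ s(x1, k = 10) + s(x2, k = 10), family = gaussian(), data = d)
summary(fit)          # effective degrees of freedom and approximate tests for each smooth
plot(fit, pages = 1)  # estimated transformation of each predictor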

16.3 Nonparametric Estimation of Y-Transformation

When the model's left-hand side also needs transformation, either to improve R² or to achieve constant variance of the residuals,
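As one hedged illustration of the general idea named in this section's title, a transformation of Y can be estimated nonparametrically, together with transformations of the Xs, using the ACE and AVAS algorithms in the R package acepack. The simulated data and variable names below are hypothetical, and this sketch is not necessarily the specific method developed in this chapter.

library(acepack)

set.seed(3)
n <- 300
x <- cbind(x1 = runif(n), x2 = runif(n))
## Y is generated so that log(Y) is additive in x1 and x2 with constant variance
y <- exp(1 + 2 * x[, "x1"] - x[, "x2"] + rnorm(n, sd = 0.3))

fit <- avas(x, y)                # AVAS: additivity and variance stabilization
plot(y, fit$ty)                  # estimated transformation of Y (roughly logarithmic here)
plot(x[, "x1"], fit$tx[, 1])     # estimated transformation of x1 (roughly linear here)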