Introduction to Survival Analysis

17.1 Background

Suppose that one wished to study the occurrence of some event in a population of subjects. If the time until the occurrence of the event were unimportant, the event could be analyzed as a binary outcome using the logistic regression model. For example, in analyzing mortality associated with open heart surgery, it may not matter whether a patient dies during the procedure or dies after being in a coma for two months. For other outcomes, especially those concerned with chronic conditions, the time until the event is important. In a study of emphysema, death at eight years after onset of symptoms is different from death at six months. An analysis that simply counted the number of deaths would be discarding valuable information and sacrificing statistical power.

Survival analysis is used to analyze data in which the time until the event is of interest. The response variable is the time until that event and is often called a failure time, survival time, or event time. Examples of responses of interest include the time until cardiovascular death, time until death or myocardial infarction, time until failure of a light bulb, time until pregnancy, or time until occurrence of an ECG abnormality during exercise. Bull and Spiegelhalter [83] have an excellent overview of survival analysis.

The response, event time, is usually continuous, but survival analysis allows the response to be incompletely determined for some subjects. For example, suppose that a patient is still alive at the end of a five-year follow-up study of survival after myocardial infarction. That patient's survival time is censored on the right at five years; that is, the survival time is known only to exceed five years. The response value to be used in the analysis is 5+. Censoring can also occur when a subject is lost to follow-up.
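As a minimal sketch of how right-censored responses might be recorded, the Python fragment below stores each subject's follow-up time together with an indicator of whether death was observed, and prints censored times in the "5+" notation. The `Observation` class, its field names, and the five data values are hypothetical and made up for illustration; they do not come from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One subject's follow-up record (hypothetical structure, for illustration)."""
    time: float  # years from start of follow-up to death or to last contact
    died: bool   # True if death was observed; False if right-censored

    def label(self) -> str:
        # A right-censored time is written with a trailing "+", e.g. "5+".
        return f"{self.time:g}" if self.died else f"{self.time:g}+"


# Made-up data for a five-year follow-up study.
cohort = [
    Observation(0.5, True),   # died at six months
    Observation(2.0, True),   # died at two years
    Observation(3.2, False),  # lost to follow-up at 3.2 years
    Observation(5.0, False),  # still alive when the study ended
    Observation(5.0, False),  # still alive when the study ended
]

labels = [obs.label() for obs in cohort]
n_deaths = sum(obs.died for obs in cohort)
n_censored = len(cohort) - n_deaths

# Averaging the recorded times treats each censored time as if it were a
# death time. Because a censored subject's true survival time is at least
# the recorded time, this average is only a lower bound on the mean of the
# true survival times in this sample.
naive_mean = sum(obs.time for obs in cohort) / len(cohort)

print("responses:", labels)
print("deaths:", n_deaths, "censored:", n_censored)
print("naive mean of recorded times:", round(naive_mean, 2))
```

Counting deaths alone would reduce this cohort to "2 of 5", treating the death at six months and the death at two years as the same outcome, and averaging the recorded times would understate survival. Keeping the time and the censoring indicator together preserves the information that survival methods are designed to use.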
F.E. Harrell, Jr., Regression Modeling Strategies, Springer Series in Statistics, DOI 10.1007/978-3-319-19425-7_17. © Springer International Publishing Switzerland 2015

If no responses are censored, standard regression models for continuous responses could be used to analyze the failure times by writing the expected failure time as a function of one or more predictors, assuming that the distribution of failure time is properly specified. However, there are still several reasons for studying failure time using the specialized methods of survival analysis.

1. Time to failure can have an unusual distribution. Failure time is restricted to be positive so it has a skewed distribution and will never be normally distributed.
2. The probability of surviving past a certain time is often more relevant than the expected survival time (and expected survival time may be difficult to estimate if the amount of censoring is large).
3. A function used in survival analysis, the hazard function, helps one to understand the mechanism of failure [308].

Survival analysis is used often in industrial life-testing experiments, and it is heavily used in clinical and epi