Cox Proportional Hazards Model Overview

The main characteristic that distinguishes survival analysis from other statistical or data-mining domains is that its methods are specifically designed to handle censored data.

A data point is considered censored if the end point of interest is not observed for a particular individual, for example because the study ended before the event occurred. Many modeling techniques, e.g., ordinary regression models that assume normally distributed errors, are inappropriate for this type of data.
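
As a minimal sketch of how censoring typically appears in a data set, the Python snippet below records each subject as a pair of observed time and event indicator. The event times, the follow-up limit of 10 time units, and the helper name observe are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration of right censoring; all values are made up.

def observe(true_event_time, study_end):
    """Return (observed_time, event_observed) for one subject.

    If the event would occur after the study ends, we only know that the
    subject survived at least until study_end, so the record is censored.
    """
    if true_event_time <= study_end:
        return true_event_time, True
    return study_end, False


STUDY_END = 10.0                     # assumed follow-up limit
true_times = [3.5, 12.0, 7.2, 25.0]  # assumed event times (unknown in practice)

records = [observe(t, STUDY_END) for t in true_times]

for observed_time, event_observed in records:
    status = "event" if event_observed else "censored"
    print(f"time={observed_time:5.1f}  status={status}")
```

The two subjects whose events fall after the follow-up limit contribute only the information that they survived up to that limit. Survival methods are built to use this partial information rather than discard it.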

The British statistician David Cox introduced the proportional hazards model in his 1972 paper "Regression Models and Life-Tables," Journal of the Royal Statistical Society, Series B 34 (2): 187-220. This statistical model, the Cox proportional hazards model, does not impose any specific form on the survivor function, which allows censored data to be modeled flexibly.

Specifically, Cox's proportional hazards model is a distribution-free model in which predictors act multiplicatively on the hazard of the lifetime.

The form of the Cox proportional hazards model is as follows:

h(t|x) = h0(t) exp(xβ)

where h0(t) is the baseline hazard, x is the row vector of predictors, and β = (β1, ..., βp)' is the vector of regression coefficients. The model does not impose any distributional assumption on the baseline hazard. It is referred to as proportional because the ratio of the hazard rates of two individuals with predictor vectors x and z, h(t|x) / h(t|z) = exp((x - z)β), is constant and does not depend on time: the baseline hazard h0(t) cancels.
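
The proportionality property described above can be checked numerically. The sketch below is illustrative only: the baseline hazard, the coefficient values, and the two predictor vectors are assumptions, since the Cox model itself leaves the baseline hazard unspecified.

```python
import math


def baseline_hazard(t):
    # Assumed baseline, for illustration only; the Cox model leaves h0(t) unspecified.
    return 1.5 * math.sqrt(t)


def hazard(t, x, beta):
    # h(t|x) = h0(t) * exp(x . beta)
    linear_predictor = sum(xi * bi for xi, bi in zip(x, beta))
    return baseline_hazard(t) * math.exp(linear_predictor)


beta = [0.7, -0.2]  # hypothetical regression coefficients
x_a = [1.0, 3.0]    # hypothetical predictors for individual A
x_b = [0.0, 1.0]    # hypothetical predictors for individual B

# Hazard ratio of A to B evaluated at several time points.
times = (0.5, 1.0, 5.0, 20.0)
ratios = [hazard(t, x_a, beta) / hazard(t, x_b, beta) for t in times]

# Closed form: exp((x_a - x_b) . beta); the baseline hazard cancels.
expected = math.exp(sum((a - b) * c for a, b, c in zip(x_a, x_b, beta)))

for t, r in zip(times, ratios):
    print(f"t={t:5.1f}  hazard ratio={r:.4f}")
print(f"closed form: {expected:.4f}")
```

Replacing baseline_hazard with any other positive function of t leaves the ratios unchanged, which is why the model can be fitted without choosing a baseline distribution.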

This model has become popular in domains where the dependent variable of interest is the time to a terminal event and the duration of the study is limited, so that the event is not observed for every individual.

Examples include:

  1. Medical applications
  2. Customer churn analysis
  3. Consumer credit risk
  4. Industrial applications, e.g., survival times of parts under stress