Cox's Proportional Hazard Model

The proportional hazard model is the most general of the regression models because it is not based on any assumptions concerning the nature or shape of the underlying survival distribution. The model assumes that the underlying hazard rate (rather than survival time) is a function of the independent variables (covariates), and it leaves the baseline hazard function unspecified. Because the baseline is unrestricted while the covariate effects enter through a fixed set of coefficients, Cox's regression model is usually described as semiparametric rather than fully nonparametric. The model can be written as:

h{(t), (z1, z2, ..., zm)} = h0(t)*exp(b1*z1 + ... + bm*zm)

where h(t,...) denotes the resultant hazard, given the values of the m covariates for the respective case (z1, z2, ..., zm) and the respective survival time (t).

The term h0(t) is called the baseline hazard; it is the hazard for the respective individual when all independent variable values are equal to zero.
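
As a minimal sketch of how the model equation is evaluated, the Python snippet below computes h0(t)*exp(b1*z1 + ... + bm*zm) for one case. The baseline function, the coefficient values, and the names baseline_hazard and cox_hazard are assumptions made only for illustration; the model itself does not prescribe any particular h0(t).

```python
import math

def baseline_hazard(t):
    # Hypothetical baseline h0(t), chosen only for illustration;
    # the Cox model leaves this function unspecified.
    return 0.1 * t

def cox_hazard(t, z, b):
    # h{(t), (z1, ..., zm)} = h0(t) * exp(b1*z1 + ... + bm*zm)
    linear_predictor = sum(bj * zj for bj, zj in zip(b, z))
    return baseline_hazard(t) * math.exp(linear_predictor)

# Illustrative coefficients, not estimates from any data set.
b = [0.5, -0.2]

# With all covariates equal to zero, the hazard reduces to the baseline.
print(cox_hazard(2.0, [0.0, 0.0], b))
print(baseline_hazard(2.0))

# With nonzero covariates, the baseline is scaled by exp(b1*z1 + b2*z2).
print(cox_hazard(2.0, [1.0, 3.0], b))
```

The first two printed values agree, which is the sense in which h0(t) is the hazard for a case whose covariates are all zero.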

You can linearize this model by dividing both sides of the equation by h0(t) and then taking the natural logarithm of both sides:

log[h{(t), (z...)}/h0(t)] = b1*z1 + ... + bm*zm

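The logarithm of the hazard ratio is now a linear function of the covariates. Because h0(t) has been divided out, the coefficients b1, ..., bm can be estimated without specifying the baseline hazard.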
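
The linearization can be checked numerically. The following sketch, which assumes an arbitrary hypothetical baseline and made-up coefficient and covariate values, computes log[h/h0(t)] at several time points and compares it with b1*z1 + ... + bm*zm.

```python
import math

def baseline_hazard(t):
    # Hypothetical baseline h0(t); any positive function would do here.
    return 0.05 + 0.02 * t ** 2

# Illustrative coefficients and covariate values (assumed, not estimated).
b = [0.8, -0.3]
z = [1.0, 2.0]
linear_predictor = sum(bj * zj for bj, zj in zip(b, z))

log_ratios = []
for t in [0.5, 1.0, 5.0]:
    h0 = baseline_hazard(t)
    h = h0 * math.exp(linear_predictor)   # hazard from the model equation
    log_ratios.append(math.log(h / h0))   # left-hand side after linearizing

print(linear_predictor)
print(log_ratios)
```

Up to floating-point rounding, every entry of log_ratios matches linear_predictor, regardless of the time point or the baseline chosen.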

Assumptions
While no assumptions are made about the shape of the underlying hazard function, the preceding model equations imply the following two assumptions:
  1. The equations specify a multiplicative relationship between the underlying hazard function and the log-linear function of the covariates. This assumption is also called the proportionality assumption. In practical terms, it is assumed that, given two observations with different values for the independent variables, the ratio of the hazard functions for those two observations does not depend on time.
  2. There is a log-linear relationship between the independent variables and the underlying hazard function.
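
The proportionality assumption can be illustrated with a short sketch. Under the model, the hazard ratio for two observations equals the exponential of the coefficient-weighted difference of their covariates, so the baseline cancels and time drops out. The baseline function, coefficients, and covariate values below are hypothetical and serve only to show the cancellation.

```python
import math

def baseline_hazard(t):
    # Hypothetical time-varying baseline h0(t), assumed for illustration.
    return 0.3 * math.sqrt(t)

def hazard(t, z, b):
    # Model equation: h0(t) * exp(b1*z1 + ... + bm*zm)
    return baseline_hazard(t) * math.exp(sum(bj * zj for bj, zj in zip(b, z)))

# Illustrative coefficients and two observations with different covariates.
b = [0.4, 1.1]
z_a = [1.0, 0.0]
z_b = [0.0, 1.0]

# Ratio of the two hazards at several time points.
times = [0.5, 1.0, 2.0, 10.0]
ratios = [hazard(t, z_a, b) / hazard(t, z_b, b) for t in times]

# Time-free value implied by the model: exp(sum of bj * (z_aj - z_bj)).
expected = math.exp(sum(bj * (za - zb) for bj, za, zb in zip(b, z_a, z_b)))

print(ratios)
print(expected)
```

Although each hazard changes with t, the ratio stays at the same value at every time point. If the ratio observed in real data drifts over time, the proportionality assumption is in doubt.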