Cox Regression using Solver

We now show how to calculate the Cox regression coefficients using Solver for the model:

image070

where X is a list of random variables x1, …, xr. This model is equivalent to

image071

whereimage072

Now for any subject s in the sample with values xs1, …, xsr for the r covariates, we define

image073x

whereimage074x

We assume that time is divided into the following intervals  t0 < t1 < ⋅⋅⋅< tm where t0 is the start time and tm is the time of the last observed death. The risk set Rj is the set of all subjects who survive to the time just before tj. Thus the risk set consists of all those who die at time tj or who are censored or die after time tj. Let Dj be the set of all subjects who die at time tj. nj = the number of survivors just before time tj (i.e. the number of elements in Rj) and dj = the number who die at time tj (i.e. the number of elements in Dj) and no one dies between times tj and tj+1.

We further assume that dj > 0 for each j (i.e. we won’t assign times tj for censored data unless a subject also dies at that time).

In fact, initially, we will assume that dj = 1 for each j. If we consider time as a continuum, then this is a reasonable assumption since two deaths are unlikely to occur at precisely the same time. If we think of time in discrete steps (as we have until now), then this assumption is too restrictive. For now, we will stick with this assumption, but we will return to this issue later.

If dj = 1 for each j, then the partial likelihood function is

image075x

where xji is defined to be xsi where s is the unique element in Dj. Similarly, gj(X) is defined to be gs(X) in this case.

The natural log of L is then

image076x

whereimage077x

As we have done for logistic regression, our goal is to find the coefficients b1, …, br which maximize LL. First, we show how this is done using Excel’s Solver and then we will use Newton’s Method, which will also provide additional information.

Example 1: Find the coefficients for Cox regression for the clinical trial for 18 patients shown in range B3:E21 of Figure 1. Here each patient is given an experimental drug for the treatment of lung cancer and the number of months of survival is recorded until the patient dies (died = 1) or leaves the trial or has not died when the clinical trial ends (died = 0), along with the patient’s age and the size of the tumor at the start of treatment.

We begin by sorting the data by the number of months of survival, as shown in range G3:J21 of Figure 1. This can be done by using Excel’s sort capability (Data > Sort&Filter|Sort) or by using the Real Statistics formula =QSORTRows(B4:E21). Since we assume that deaths precede censored data, we need to make sure that the patient in range G16:J16 comes before the patient in range G17:J17 (since both survive 62 months).

Cox regression initial state

Figure 1 – Cox Regression (initial state)

We now find the value of LL by placing the array formula =MMULT(I4:J4,$O$4:$O$5) in cell K4, the formula =EXP(K4) in cell L4 and the formula =IF(H4=1,K4-LN(SUM(L4:L$21)),0) in cell M4. We then highlight the range K4:M21 and press Ctrl-D. Finally, we place the formula =SUM(M4:M21) in cell M22.

We see from Figure 1 that when b1 = b2 = 0, LL = -15.2733. We next use Excel’s Solver capability by selecting Data > Analysis|Solver. We fill in the dialog box that appears as shown in Figure 2 and then press the OK button.

Cox regression Solver dialog

Figure 2 – Solver dialog box

Solver then adjusts the values of the coefficients (range O4:O5) to maximize the value of LL (cell M22). The results are shown in Figure 3.

As we can see, the coefficient corresponding to age is .202465 and the coefficient corresponding to size is .094457. We also see that the value of LL increased from -15.2733 to -11.0447.

Cox regression Solver output

Figure 3 – Cox Regression (Solver)

15 thoughts on “Cox Regression using Solver”

  1. Is the g sub j (X) in the numerator of the partial likelihood function the subject who dies at t sub j? If so, then it will help to mention that explicitly.

    Reply
    • Parikshit,
      It is mentioned explicitly. See the following three statements on the webpage:
      “Dj be the set of all subjects who die at time tj”
      “In fact, initially we will assume that dj = 1 for each j.” (i.e. one death in Dj)
      “where xji is defined to be xsi where s is the unique element in Dj. Similarly gj(X) is defined to be gs(X) in this case.” (this is where it is stated)
      Charles

      Reply
  2. Charles,

    Thanks for posting this excellent resource. It’s very helpful to see things step-by-step. In trying to replicate the above spreadsheet and related outputs I may have stumbled on a small inconsistency in the paragraph below Figure 1. Should ‘=EXP(J4)’ read as ‘=EXP(K4)’ instead? (Since column K contains B(x) whilst L is exponent of B(x)).

    Reply
    • Huey,
      Glad that you find the posts useful and thanks for finding this error. I have now corrected the referenced webpage.
      Your help in making the website easier to read is very much appreciated by me and other users.
      Charles

      Reply
  3. Hi Charles,

    Excellent blog. I don’t understand things well unless I solve them myself.

    I had a question regarding the data. How should I read data from row 6. at the end of month 25 no one died. then what does the age and size correspond to?

    Thank you
    Akshay

    Reply
  4. Dr. Buenas noches, espero se encuentre muy bien, agradezco mucho su trabajo por facilitarnos esta página realmente útil. Dr no entiendo por que no me funciona la regresión de Box Cox, si trabajo con una varaibel dependinte y una independiente, me informa que debo tener mas de 3 independientes, ubico 3 independientes y sale la expresión #¡VALOR!.

    Dr, good evening, I am waiting that you are ok, I am grateful by your page, realy its very util. Dr I do not understand why in Box Cox regression, when I work with one dependent variable and independent variable the Real Statistics inform that is not possible, and when i use more independent variables, Real Statistiscs inform to me #¡VALOR!.

    Reply

Leave a Comment