Log-Rank Test

We show how to use the Log-Rank Test (aka the Peto-Mantel-Haenszel Test) to determine whether two survival curves are statistically significantly different.

Example

Example 1: Clinical trials of two cancer drugs were undertaken, producing the data shown on the left side of Figure 1 (Trial A is the one described in Example 1 of Kaplan-Meier Overview).

As we did in Example 1 of Kaplan-Meier Overview, we can use the Kaplan-Meier method to calculate the empirical survival functions for each trial (using the combined values for the times t). This is shown in Figure 1.
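The Kaplan-Meier calculation on a combined time grid can be sketched in Python as follows. This is only an illustration: the function name, argument names, and counts are hypothetical (they are not the values from Figure 1), and the sketch assumes that subjects censored at a time point are still counted as at risk at that time point.

```python
def kaplan_meier(times, deaths, censored, n_start):
    """Return a list of (t, S(t)) pairs for one trial.

    times    -- sorted distinct time points (the combined grid for both trials)
    deaths   -- number of deaths in this trial at each time point
    censored -- number censored in this trial at each time point
    n_start  -- number of subjects at risk at the start of the trial
    """
    curve = []
    s = 1.0
    at_risk = n_start
    for t, d, c in zip(times, deaths, censored):
        if at_risk > 0:
            # Multiply by the conditional probability of surviving past t
            s *= 1 - d / at_risk
        curve.append((t, s))
        # Deaths and censored subjects leave the risk set after time t
        at_risk -= d + c
    return curve


# Hypothetical data, NOT the values from Figure 1
times = [1, 2, 3, 4]
deaths_a = [1, 0, 2, 1]
censored_a = [0, 1, 0, 0]
curve_a = kaplan_meier(times, deaths_a, censored_a, n_start=6)
for t, s in curve_a:
    print(t, round(s, 4))
```

A time point at which this trial has no deaths (here t = 2) leaves the survival estimate unchanged, which is why a shared grid of times can be used for both trials.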

Figure 1 – Two sample case

We now create survival charts for both trials, as shown in Figure 2.

Figure 2 – Survival curves for both trials

Hypothesis Test

The results of the trials look similar, but are they statistically equivalent? We use the log-rank test to determine this. First, we create the worksheet shown in Figure 3, based on the data in Figure 1.

The test resembles the chi-square test of independence. The observed values for the number of deaths are those given in columns AH and AK. We calculate expected values for the number of deaths for each time t for each trial (columns AJ and AM). The expected values $e_j^A$ and $e_j^B$ at time $t_j$ for trials A and B are given by the formulas

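$$e_j^A = \frac{n_j^A \, d_j}{n_j} \qquad e_j^B = \frac{n_j^B \, d_j}{n_j}$$

where $n_j^A$ and $n_j^B$ are the numbers of subjects at risk at time $t_j$ in trials A and B, $d_j^A$ and $d_j^B$ are the numbers of deaths at time $t_j$, and
$$d_j = d_j^A + d_j^B \qquad n_j = n_j^A + n_j^B$$

Note that
$$e_j^A + e_j^B = d_j$$

and so
$$\sum_j e_j^A + \sum_j e_j^B = \sum_j d_j^A + \sum_j d_j^B$$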
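The expected-value step can be sketched in Python, assuming the standard log-rank form in which the pooled deaths at each time point are split between the trials in proportion to the numbers at risk. The function name and the counts below are hypothetical and are not taken from Figure 3.

```python
def expected_deaths(n_a, n_b, d_a, d_b):
    """Per-time-point expected deaths (e_A, e_B) under the null hypothesis.

    n_a, n_b -- numbers at risk in trials A and B at each time point
    d_a, d_b -- observed deaths in trials A and B at each time point
    """
    e_a, e_b = [], []
    for na, nb, da, db in zip(n_a, n_b, d_a, d_b):
        n = na + nb  # pooled number at risk
        d = da + db  # pooled number of deaths
        if n == 0:
            # Nobody is at risk any more, so this row is not part of the analysis
            raise ValueError("time point with no subjects at risk")
        e_a.append(na * d / n)
        e_b.append(nb * d / n)
    return e_a, e_b


# Hypothetical counts, NOT the values from Figure 3
n_a = [6, 5, 4, 2]
n_b = [6, 6, 4, 3]
d_a = [1, 0, 2, 1]
d_b = [0, 0, 1, 1]
e_a, e_b = expected_deaths(n_a, n_b, d_a, d_b)

# Expected deaths sum to the observed deaths across both trials
print(sum(e_a) + sum(e_b), sum(d_a) + sum(d_b))
```

In this sketch a time point with no deaths in either trial (the second row) yields expected values of zero for both trials, so it adds nothing to either sum.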

Figure 3 – Log-Rank Test

The log-rank test statistic is then

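$$LR = \frac{(\text{Obs}_A - \text{Exp}_A)^2}{\text{Exp}_A} + \frac{(\text{Obs}_B - \text{Exp}_B)^2}{\text{Exp}_B}$$

where
$$\text{Obs}_A = \sum_j d_j^A \qquad \text{Exp}_A = \sum_j e_j^A$$
with $\text{Obs}_B$ and $\text{Exp}_B$ defined in the same way for trial B.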
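As a rough Python sketch of this step, assuming the chi-square-style form of the statistic (squared difference between observed and expected total deaths, divided by the expected total, summed over the two trials). The function name and the per-time-point values are hypothetical.

```python
def log_rank_statistic(d_a, e_a, d_b, e_b):
    """Log-rank statistic from per-time-point observed and expected deaths."""
    obs_a, exp_a = sum(d_a), sum(e_a)
    obs_b, exp_b = sum(d_b), sum(e_b)
    return (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b


# Hypothetical observed and expected deaths, NOT the values from Figure 3
d_a = [1, 0, 2, 1]
e_a = [0.5, 0.0, 1.5, 0.8]
d_b = [0, 0, 1, 1]
e_b = [0.5, 0.0, 1.5, 1.2]

lr = log_rank_statistic(d_a, e_a, d_b, e_b)
print(round(lr, 4))
```

Because the observed and expected totals over both trials are equal, the two differences in the numerators have the same magnitude and opposite signs.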

If the null hypothesis is true (that the two survival distributions are the same), then the log-rank test statistic has a chi-square distribution with one degree of freedom, i.e.

$$LR \sim \chi^2(1)$$

For Example 1, ObsA = SUM(AH7:AH19) = 12 and ExpA = SUM(AJ7:AJ19) = 9.828, and similarly for trial B. Thus the log-rank test statistic (cell AR6) is

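$$LR = \frac{(12 - 9.828)^2}{9.828} + \frac{(\text{Obs}_B - \text{Exp}_B)^2}{\text{Exp}_B}$$
with the values for trial B taken from Figure 3.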

We see from Figure 3 (cell AR8) that p-value = CHISQ.DIST.RT(AR6,AR7) = .331 > .05 = α, and so we cannot reject the null hypothesis that the survival distributions for the two drugs under trial are the same. Note that the p-value is the right-tail probability; the older function CHIDIST(AR6,AR7) gives the same result.
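The right-tail calculation can be sketched in Python using only the standard library. The sketch relies on the identity that, for one degree of freedom, the right-tail chi-square probability equals erfc(sqrt(x/2)); the function name and the statistic value are hypothetical and are not the value in cell AR6.

```python
import math


def chi2_right_tail_df1(x):
    """P(X > x) for X ~ chi-square with 1 degree of freedom.

    This is the same quantity as the right-tail probability
    computed by CHISQ.DIST.RT(x, 1) in Excel.
    """
    return math.erfc(math.sqrt(x / 2))


alpha = 0.05
stat = 1.5  # hypothetical log-rank statistic, NOT the value in cell AR6

p_right = chi2_right_tail_df1(stat)  # the p-value
p_left = 1 - p_right                 # left tail, which is not the p-value

print("p-value:", round(p_right, 4))
print("reject null hypothesis:", p_right < alpha)
```

Using the left tail by mistake gives one minus the p-value, so a larger test statistic would appear to produce a larger "p-value" instead of a smaller one.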

References

NCSS (2015) Kaplan-Meier curves (logrank tests)
https://www.ncss.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Kaplan-Meier_Curves-Logrank_Tests.pdf

Sullivan, L. (2016) Comparing survival curves
https://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Survival/BS704_Survival5.html

Tian, L., Olshen, R. (2017) Survival analysis: logrank test
https://web.stanford.edu/~lutian/coursepdf/survweek3.pdf

42 thoughts on “Log-Rank Test”

  1. Dr Zaiontz,
    Can I run the Log-Rank Test if my ‘years in trial’ are the same for every row and for both samples? I want to test for a difference in survivorship between two populations; both were studied for 19 years.
    Thanks,
    Gerard

    • Hello Gerard,
      I would think that this should work, although I have never tried it when the “years in trial” were the same. I suggest that you try it and see what happens.
      Charles

  2. Why is the time point of 9 years included in the tables in Figures 1 and 3, given that there are no death events in either group at this time point?

      • Thank you, Charles, for the quick response. Can we omit the data points for which there are no events in either group (like t=9)? In the example above, t=9 doesn’t contribute to the expected and the observed values (the sums). Is there a case when they do contribute and therefore must be included in the calculations?

        • Hi Kathrin,
          Sorry, but I don’t know the answer to your question. Once you work through the math it is probably obvious what the answer is, but I haven’t had the time to do this. Perhaps you can change your data in a few ways to test out some possible cases.
          Charles

  3. Dr Zaiontz,

    Hello and thank you for all your amazing and comprehensive tutorials. Using your data, I replicated the whole p value calculation step. I had successfully replicated all the values as shown in your tutorial. However, when I reached the final step of calculating p values, instead of 0.331324, I obtained 0.668676 (which is actually 1-0.331324). I checked my formula and didn’t see anything wrong. Would you happen to know what happened?

    Looking forward to your reply.

    Adam

    • Hello Adam,
      I am pleased that you are getting value from the tutorials.
      I used the CHISQ.DIST.RT (or CHIDIST) function (right tail) to calculate the p-value. You probably used the CHISQ.DIST function (left tail). You need to use the right tail version of the Chi-square distribution to get the p-value.
      Charles

  4. Hello, thank you for your brief yet clear explanations. Just a note: there is a missing question mark in the first sentence after Figure 2: The results of the trials look similar, but are they statistically equivalent.

    • Hello Jiří,
      Thanks for identifying the missing question mark. I have now added this.
      The chi-square test in Figure 3 shows that there isn’t evidence to disprove that they are equivalent.
      Charles

  5. Hi Charles,

    I found your article very useful, and helpful to begin to understand the principles of survival analysis and the Log-rank test. I do have a concern, though, which is that taking your raw data and running it through survival analysis in both GraphPad Prism and R with the survival package gives a different result. In both cases the chi-square test result is 1.017, with a p=0.313. Sure, the difference is very small, but nevertheless since both other methods agree with each other I fear that there is a problem somewhere with your methodology.
    Best wishes,
    David

  6. Prof,
    I think there’s a typo here: ExpA = SUM(AJ7:AJ19) = 9.428

    “9.428” should be “9.828092” (as shown in Figure 3), and subsequently, using the correct value to calculate the LR yields 1.069618. Using =CHIDIST() yields 0.301032. It still doesn’t change the fact that we cannot reject the null. Just a bit confusing when following along.

    Many thanks for the lecture!

    -Ray

    • Ray,
      I believe that the calculations shown in Figure 3 are correct, although, as you correctly point out, there is a typo in the text (9.428 should be 9.828). The values for LR and p-value still seem to be correct.
      Let me know whether you disagree and thanks for identifying the error in the text.
      Charles

  7. Hello,
    All the explanation is really clear, thank you!
    But I cannot understand how you calculated the value “df=1” in cell AQ7 in Figure 3.
    Can you explain this to me?
    Thank you very much,
    Federica

  8. I am looking for step-by-step (simple) instructions on how to use Excel for the log-rank test. Is there something like that on these pages (or elsewhere)?

    Thank you!

  9. Dear Dr Zaiontz,

    First, thank you very much; this website has been very useful.

    I was wondering why we keep the censored data when calculating the Log-Rank Test.

    Also, if in your example all patients in trial B were dead after 10 days, I assume you would still calculate “e” for trials A and B up to day 13. Am I right?

    Thank you for your time

  10. How would you calculate e if the study is complete in that all the patients have passed? Since you are dividing the n of one set by the total n, you end up dividing by 0. Is the last e value set to 0?

    • Kwan,
      Perhaps I don’t understand your question, but n is never set to zero. When the study is complete, n is not zero. If you take it one step later (i.e. past completion), then yes, n would be equal to zero, but you should not include that step in the analysis.
      Charles

  11. I think you should use a right-tailed chi-squared distribution for calculating the p-value:

    p-value = CHISQ.DIST.RT(AR6,AR7)

    otherwise the bigger the log-rank, the closer to 1 the p-value gets, instead of being smaller.

    • Ran,
      I believe that I used the older CHIDIST function, namely =CHIDIST(AR6,AR7), which is equivalent to =CHISQ.DIST.RT(AR6,AR7).
      Charles

      • Yes, I was unable to reproduce your results until I used CHISQ.DIST.RT(AR6,AR7). Thank you, this has been very helpful.