Real Statistics Capabilities for Kaplan-Meier

Real Statistics Function: The Real Statistics Resource Pack provides the following array function to calculate the log-rank test and other tests to determine whether two survival curves are statistically different.

LOGRANK(R1, R2, lab) – returns a 4 × 2 range containing the following statistics, each with its p-value (from a chi-square test with df = 1): Log-rank 1, Log-rank 2, Wilcoxon, and Tarone-Ware, when lab = FALSE (default). If lab = TRUE, then the output is a 5 × 3 range that includes labels.
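For readers who want to see the arithmetic behind such a test, here is a minimal Python sketch of the textbook two-sample log-rank statistic with the hypergeometric variance. It is not the Real Statistics implementation and may not match either Log-rank 1 or Log-rank 2 exactly. The function name `logrank_test` and its argument layout (event times plus 1/0 event indicators per group) are assumptions made for illustration.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Textbook two-sample log-rank test (hypothetical helper, not LOGRANK).

    events_* hold 1 for an observed event and 0 for a censored observation.
    Returns (chi-square statistic, p-value with df = 1).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk in A
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk in B
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # the variance term is undefined when one subject remains
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    chi2 = obs_minus_exp ** 2 / variance
    # For df = 1, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Calling `logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)` on this made-up data yields a small p-value, since every event in the first group precedes every event in the second.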

Referring to Example 3 of Log-Rank Test, the output from the array formula =LOGRANK(H8:I19,O8:P19,TRUE) is shown in Figure 1.


Figure 1 – Log-Rank and similar tests

Real Statistics Data Analysis Tool: The Real Statistics Resource Pack provides the Survival Analysis data analysis tool to perform Kaplan-Meier Survival Analysis.

For example, to perform the analysis for Example 1, press Ctrl-m and select the Survival Analysis option (selected from the Misc tab when using the Multipage user interface). Fill in the dialog box that appears as shown in Figure 2 and click on the OK button.


Figure 2 – Survival Analysis dialog box

The output for the one-sample analysis is shown in Figure 3.


Figure 3 – Kaplan-Meier Survival Analysis
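As a rough illustration of what a one-sample Kaplan-Meier table contains, the following Python sketch computes the standard product-limit estimate. It assumes the usual definitions: n is the number of subjects still at risk just before each event time, d is the number of events at that time, and the survival estimate is the running product of 1 − d/n. The function name `kaplan_meier` and the sample data are hypothetical; the layout of the tool's actual output may differ.

```python
def kaplan_meier(times, events):
    """Product-limit estimate (hypothetical helper for illustration).

    times  : observed times, one per subject
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (t, n, d, S) rows, one per distinct event time.
    """
    rows = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # n: subjects whose observed time is at least t (still at risk)
        n = sum(1 for u in times if u >= t)
        # d: events occurring exactly at time t
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - d / n
        rows.append((t, n, d, survival))
    return rows

# Made-up data: five subjects, two of them censored (event = 0).
for row in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(row)
```

Note that the table ends at the last observed event, and the final survival estimate stays above zero when subjects are censored after that time.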

The analysis for Example 3 is done similarly, this time by inserting A5:B23 in Input Range 1 and D5:E23 in Input Range 2 of the dialog box shown in Figure 2. The output is shown in Figures 4 and 5.


Figure 4 – Kaplan-Meier Survival Analysis Part 1


Figure 5 – Kaplan-Meier Survival Analysis Part 2

Note that you can also use a stacked version of the data in Figure 4 as input. Such data consists of three columns, where the third column contains a 1 for the elements in Trial A and a 2 for the elements in Trial B (in fact, any two distinct numbers will do). Figure 6 shows the first 10 and last 10 data elements for Example 3 in this format. If you insert range A3:C39 in Input Range 1 (and leave Input Range 2 blank) of the dialog box in Figure 2, then the output will be the same as that shown in Figures 4 and 5.
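The relationship between the two-range layout and the stacked layout can be sketched in Python as follows. This assumes each unstacked row holds two values (taken here to be a time and an event/censoring indicator, which is an assumption about the column meaning) and that the stacked format appends a group label as the third column. The helper names `stack_samples` and `unstack_samples` are hypothetical.

```python
def stack_samples(sample_a, sample_b, label_a=1, label_b=2):
    """Combine two lists of (time, indicator) rows into 3-column stacked rows."""
    stacked = [(t, c, label_a) for t, c in sample_a]
    stacked += [(t, c, label_b) for t, c in sample_b]
    return stacked

def unstack_samples(stacked):
    """Split stacked rows back into two samples using the third column."""
    labels = sorted({g for _, _, g in stacked})
    if len(labels) != 2:
        raise ValueError("expected exactly two distinct group labels")
    first, second = labels
    sample_a = [(t, c) for t, c, g in stacked if g == first]
    sample_b = [(t, c) for t, c, g in stacked if g == second]
    return sample_a, sample_b

# Made-up rows, not the data from Example 3.
trial_a = [(5, 1), (8, 0)]
trial_b = [(3, 1), (9, 1), (12, 0)]
print(stack_samples(trial_a, trial_b))
```

Because only the distinction between the two labels matters, any pair of distinct numbers can stand in for 1 and 2.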


Figure 6 – Data (middle 16 data elements not shown)

26 thoughts on “Real Statistics Capabilities for Kaplan-Meier”

  1. Hi Charles.

    Thank you for your amazing work here. I started looking into Kaplan-Meier survival analysis, and in statistical software like Stata it is possible to add “Numbers at risk” to the diagrams. Have you considered that option, or did I miss it?

  2. Dear Dr. Zaiontz.

    Thank you for your PERFECT site regarding Kaplan-Meier.
    A few days ago I asked you for help regarding repairable systems. This topic is solved. It was just a matter of sorting / filtering the population accordingly.
    Kind regards,
    Dr. Detlef Maier

  3. Hello Charles,
    Thank you for your awesome explanations. I have successfully applied the Kaplan-Meier procedure to render survival curves; however, I would like to use your program to automate the process, instead of copying and pasting every time.

    The Kaplan-Meier option works great but stops at the last “dead” patient and does not display further data, even though I have some censored patients who are still alive. How can I display those in the chart? Also, does their absence affect the p-value?

    Thanks!

    • Hi Filippo,
      I am pleased that you like the explanations on the website. I try my best to make the explanations easy to understand yet rigorous enough.
      The Kaplan-Meier process stops at the last death since it is unknown when the remaining patients will die. If you have some way of knowing or estimating this, then you can extend the table and chart with this information.
      The patients who are still alive are reflected in the chart, since the last point on the chart is not at y = 0 but at y = the # of patients that are still alive.
      To see whether the # of patients still alive influences the p-value, I suggest that you rerun one of the analyses, changing only the number of patients who are alive at the end (say from 2 to 1).
      Charles

  4. Hello Dr. Zaiontz.
    I have a question, and the answer is probably very obvious, but I haven’t found an explanation anywhere: what is the difference between log-rank 1 and 2?
    Thank you so much for your help – your website has been incredibly useful and I’m very grateful.

  5. Dr. Zaiontz, good morning. How can I compare more than two variables in the Kaplan-Meier survival curve using Real Statistics?

    • Hello Gerardo,
      I have not yet researched this issue. At this point, I can suggest that you perform multiple pairwise comparisons using an alpha correction such as Bonferroni’s correction.
      Charles
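The pairwise approach suggested above can be sketched in Python as follows: list every pair of groups and divide the overall alpha by the number of pairs, which is Bonferroni's correction. The helper name `bonferroni_pairs` and the group names are hypothetical, and each pair would still need its own two-sample test (for example, a log-rank test).

```python
from itertools import combinations

def bonferroni_pairs(group_names, alpha=0.05):
    """Return all pairs of groups and the Bonferroni-adjusted alpha per pair."""
    if len(group_names) < 2:
        raise ValueError("need at least two groups to compare")
    pairs = list(combinations(group_names, 2))
    return pairs, alpha / len(pairs)

pairs, adjusted_alpha = bonferroni_pairs(["A", "B", "C"])
print(pairs)           # the three pairwise comparisons to run
print(adjusted_alpha)  # each pairwise p-value is compared against this
```

With three groups there are three comparisons, so each pairwise p-value is judged against alpha / 3 rather than alpha.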

  6. Hi Charles,
    I get the same error as Soraya, when I tried Survival analysis in Excel 2013.
    “Compile error in hidden module: Survival. This error mainly occurs when code is incompatible with the version, platform, or architecture of this application”
    I checked and the ‘Solver’ Add-in is not checked.
    Can you please check that?

  7. Hello Charles,

    I’m looking at your tutorial on how to generate Kaplan-Meier step curves. I cannot for the life of me figure out how you generated your ‘n’ column data (column F) at https://real-statistics.com/survival-analysis/kaplan-meier-procedure/survival-curve/ or on this page https://real-statistics.com/survival-analysis/kaplan-meier-procedure/real-statistics-kaplan-meier/

    It seems like the final percentage mortality doesn’t match up with what would be expected. What is the equation you used to generate the ‘n’ columns used in the 1-d/n?

    Thanks for your help.

  8. Thanks for a great plug in, hugely helpful.

    Is there a means to plot more than 2 KM curves and to test for differences between each?

  9. Charles,
    great website…
    I am looking at oil well failure (due to integrity problems) data with my dataset including either –
    1. Failed wells (these wells may have been drilled at start-up or drilled later)
    2.a. No-Failure Wells
    – drilled pre-startup and still haven’t failed = right censored
    – entered service late (due to either infill drilling and/or wellbore repair) but haven’t failed = right censored
    2.b. No-Failure Wells
    as per 2.a. but well life curtailed due to sidetracking the well for production reasons (rather than integrity failure) = right censored
    I can use Kaplan-Meier (after following your example), but I wanted to compare to Weibull. This is straightforward for non-censored data using Excel’s regression (data analysis pack) for y = ln(ln(1/(1-Median Rank))) and x = ln(well life). Beta Shape factor = slope & Alpha Scale factor = exp(-Intercept/Beta Shape factor).
    My question is whether you have an Excel-based method to determine Beta and Alpha from a dataset that includes right-censored data, as in my example.
    Regards,
    Andrew
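The median-rank regression described in this comment, for uncensored data only, can be sketched in Python as follows. The comment does not say how the median ranks are computed, so the sketch assumes Benard's approximation (i − 0.3)/(n + 0.4). The function name `weibull_median_rank_fit` is hypothetical, and this is not the Real Statistics implementation.

```python
import math

def weibull_median_rank_fit(lifetimes):
    """Estimate Weibull shape (beta) and scale (alpha) by least squares.

    Regresses y = ln(ln(1 / (1 - F))) on x = ln(t), where F is the median
    rank of each ordered lifetime. Uncensored data only.
    """
    t = sorted(lifetimes)
    n = len(t)
    # Assumption: Benard's approximation for the median rank.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    x = [math.log(v) for v in t]
    y = [math.log(math.log(1 / (1 - f))) for f in ranks]

    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    beta = slope                          # shape = slope
    alpha = math.exp(-intercept / beta)   # scale = exp(-intercept / shape)
    return beta, alpha
```

The scale formula follows from the linearized Weibull CDF, y = beta·x − beta·ln(alpha), so the intercept equals −beta·ln(alpha). Handling right-censored observations would require adjusting the ranks or switching to maximum likelihood, which this sketch does not attempt.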

    • Andrew,
      I don’t provide an Excel method with censored data. In the latest release of the software (Rel 5.0, released today), I do provide a capability for automatically estimating the Weibull parameters using regression, method of moments and maximum likelihood.
      Charles

  10. Hi
    I’m trying to perform a Kaplan-Meier test on data using 32-bit Excel 2016 for Windows. Whenever I try to complete it, it comes up with the following message: “Compile error in hidden module: Survival. This error mainly occurs when code is incompatible with the version, platform, or architecture of this application”. I have downloaded the correct version for 2013/2016 and followed all the installation instructions, so I’m not sure what is going wrong here.

    • Soraya,
      Are you using the latest version of Real Statistics (Rel 4.14)? When you press Alt-TI, do you see both RealStats and Solver on the list of add-ins with check marks next to them?
      If so, one suggestion is to try the Excel 2007 version of the Real Statistics software to see whether this works better on your computer. By the way, what language of Excel are you using (English, French, etc.)?
      If none of this helps, send me an Excel file with your data and I will try to figure out whether there is a problem in the Real Statistics software.
      Charles

  11. Hi,
    I’m an environmental engineer and a master’s student in civil engineering at Sherbrooke University. I’m currently writing a scientific article about biological treatments. My database has about 40 sampling dates, and a large share of the data (30 to 50%) consists of non-detected values at the effluent of my bioreactors.
    My problem is that I need to compare influent vs. effluent and obtain a representative mean, median, and standard deviation.
    What is the best method to obtain these values?
    How can I use Kaplan-Meier with censored data?
    Thanks,
    Sebastian.

    • Juan,
      I would need a lot more information to be able to tell you which is the best test to use. Kaplan-Meier is generally used with censored data. The following webpage explains how to use this test: Kaplan-Meier.
      Charles
