Survival Analysis

Survival analysis is concerned with the time that elapses until a certain event occurs, especially when some of the observations are censored. The event could be the death (or relapse) of a patient with cancer or the graduation of a student from high school. Thus, the key event can be viewed as a success (graduating) or a failure (death), although the terminology generally used is better suited to the second type of situation.

Failure (i.e. the key event) can correspond to a component breaking in an engineering context (reliability analysis), an organism dying in a biological context (survival analysis), or the end of an economic downturn in an economic context (duration analysis).

In the context of a clinical trial, we are interested in answering the following types of questions:

  • What is the probability that the participant survives for 3 years?
  • Are there differences in survival rates between participants who take the new drug and those who take the old drug?
  • How do age, gender, family history, etc. affect a participant’s probability of survival?
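
As a rough illustration of the first question, the sketch below computes a Kaplan-Meier estimate of the survival function by hand and reads off the estimated probability of surviving 3 years. The data are made up, and the function names (kaplan_meier, survival_at) are hypothetical helpers introduced only for this example; this is a minimal sketch, not the method used by any particular package.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- follow-up time for each participant (here, in years)
    events -- 1 if the event (death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # each participant leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored participants at t are removed after t is processed
        at_risk -= leaving[t]
    return curve


def survival_at(curve, t):
    """Step-function lookup: S(t) is the value at the last event time <= t."""
    s = 1.0
    for event_time, value in curve:
        if event_time > t:
            break
        s = value
    return s


# Hypothetical data: 10 participants, follow-up time in years
times = [1, 2, 2, 3, 4, 4, 5, 5, 5, 5]
events = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

curve = kaplan_meier(times, events)
print(survival_at(curve, 3))
```

With these made-up data, the estimated 3-year survival probability is 0.9 × 8/9 × 6/7 = 24/35, or about 0.69. Censored participants still count as at risk up to the time they are censored, which is why the estimate differs from the naive fraction of participants known to be alive.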

Topics

For those with a calculus background, you can also see the proofs of some of the properties described on the above web pages at

References

Wikipedia (2015) Survival analysis
https://en.wikipedia.org/wiki/Survival_analysis

Clark et al. (2003) Survival analysis part I: basic concepts and first analyses
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2394262/

Sullivan, L. (2016) Survival analysis
https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_survival/BS704_Survival_print.html

13 thoughts on “Survival Analysis”

  1. Hello, Dr. Charles.
    The survival analysis section discusses the log-rank test for comparing two Kaplan-Meier curves. Unfortunately, it does not cover a chi-square test for comparing more than two curves, and I have just such a problem to solve. If possible, could you add a description of the test, with the necessary formulas and an example solution in an Excel worksheet? Thank you in advance and best regards.

  2. Hi Dr. Charles,
    Could you please tell me how my panel (longitudinal) data should be organized so that I can fit a Cox regression using the Excel package you developed? Suppose the same loan ID is observed over several time intervals. Basically, I'd like to know how to incorporate the “Start_Point” and the “End_Point” of each interval using your package.
    In R, the formula would be:
    fit <- coxph(Surv(start_time, end_time, status) ~ covariate_1 + covariate_2, data = sample)

  3. Dr. Charles, good morning. How likely is it that you will implement the C statistic in the survival analysis section of Real Statistics?

    Thank you

  4. Hello

    Do you have references (papers or books) for the formula used to calculate the hazard rate? I have seen it used in a couple of places without a reference, and I need one to cite in my research:

    Hazard Rate = -ln(1 - Events/N) / t

    Also, if the total number of events in the experimental arm is Eexp and the number lost to follow-up is Nexplft, is there a way to estimate the hazard RATE in the experimental arm?

    If not, and the number of events in the control arm is Ectl and the number lost to follow-up is Nctllft, is there a way to get the hazard RATIO?

    • Peru,

      There are many references for the formula for the hazard rate. E.g.
      Spotswood L. Spruance, Julia E. Reid, Michael Grace, and Matthew Samore
      Hazard Ratio in Clinical Trials
      Antimicrob Agents Chemother. 2004 Aug; 48(8): 2787–2792.
      doi: 10.1128/AAC.48.8.2787-2792.2004
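
      For context, here is a rough sketch of where a formula of this form comes from. It assumes a constant hazard (exponentially distributed survival times) and no censoring before time t. The symbols λ (hazard rate), S(t) (survival function), E (number of events by time t) and N (number of participants) are introduced here only for illustration.

      ```latex
      S(t) = e^{-\lambda t}
      \quad\Longrightarrow\quad
      \lambda = -\frac{\ln S(t)}{t}
      \approx -\frac{\ln\left(1 - E/N\right)}{t}
      ```

      Here 1 - E/N serves as the estimate of S(t). The leading minus sign matters: ln(1 - E/N) is negative, so the hazard rate comes out positive. For example, with E = 20 events among N = 100 participants over t = 2 years, λ ≈ -ln(0.8)/2 ≈ 0.112 per year.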

      Sorry, but I don’t understand your other questions. What are Eexp, Nexplft, Ectl, etc.?

      Charles
