Weibull & Multi-Censored Data | Real Statistics Using Excel

Basic Concepts

In Weibull with Censored Data, m+n components enter into service at time 0, n of them fail at various times before time t, and the remaining m components are still in service at time t.

We now consider the case where n components fail after x₁, …, x_n units of time, and m components stay in service for y₁, …, y_m units of time. In this scenario, the components don’t all need to go into service at the same time. Also, some of the components can be removed from service at various times without failing.

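The likelihood function now takes the form

$$L = \prod_{i=1}^{n} f(x_i) \prod_{j=1}^{m} \bigl(1 - F(y_j)\bigr)$$

where f(x) is the pdf of a distribution and F(x) is the cdf of that distribution. The log-likelihood function is

$$LL = \sum_{i=1}^{n} \ln f(x_i) + \sum_{j=1}^{m} \ln \bigl(1 - F(y_j)\bigr)$$

For the Weibull distribution with scale parameter α and shape parameter β

$$f(x) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}} \qquad F(x) = 1 - e^{-(x/\alpha)^{\beta}}$$

Thus

$$LL(\alpha,\beta) = n \ln \beta - n\beta \ln \alpha + (\beta-1)\sum_{i=1}^{n} \ln x_i - \alpha^{-\beta}\left(\sum_{i=1}^{n} x_i^{\beta} + \sum_{j=1}^{m} y_j^{\beta}\right)$$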
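As a minimal sketch, the log-likelihood for failure times plus right-censored service times can be evaluated directly in Python. The function name and the sample values are hypothetical (they are not the data from Figure 1), and the code assumes the parameterization F(x) = 1 - exp(-(x/α)^β), with α as the scale parameter and β as the shape parameter.

```python
import math

def weibull_censored_loglik(alpha, beta, failures, censored):
    """Log-likelihood for exact failure times plus right-censored service times.

    Assumes F(x) = 1 - exp(-(x/alpha)**beta): alpha is scale, beta is shape.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    ll = 0.0
    for x in failures:
        # ln f(x) = ln(beta) - ln(alpha) + (beta - 1) ln(x/alpha) - (x/alpha)^beta
        ll += (math.log(beta) - math.log(alpha)
               + (beta - 1) * math.log(x / alpha)
               - (x / alpha) ** beta)
    for y in censored:
        # ln(1 - F(y)) = -(y/alpha)^beta
        ll -= (y / alpha) ** beta
    return ll

# Made-up illustrative data, not the values used in Figure 1
failures = [120.0, 340.0, 560.0, 610.0]
censored = [900.0, 900.0, 400.0, 100.0, 100.0, 100.0]
print(weibull_censored_loglik(700.0, 1.5, failures, censored))
```

A function like this is what a general-purpose optimizer (such as Solver in Excel) would maximize over α and β.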

Optimization

We can use Solver to find the values of α and β that maximize LL(α,β). Alternatively, we can use Newton’s method, extending the iterative approach that is described in Fitting Weibull Parameters via MLE and Newton’s Method for the case where there is no censored data.

Step 0: make an initial guess β₀ for the value of β.

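Step k+1: Assuming that we have an estimate β_k, we define a new estimate β_{k+1}, which should be more accurate, as follows:

$$\beta_{k+1} = \beta_k - \frac{h(\beta_k)}{h'(\beta_k)}$$

where, writing z_1, …, z_{n+m} for the combined values x₁, …, x_n, y₁, …, y_m,

$$h(\beta) = \frac{1}{\beta} + \frac{1}{n}\sum_{i=1}^{n} \ln x_i - \frac{\sum_{l=1}^{n+m} z_l^{\beta} \ln z_l}{\sum_{l=1}^{n+m} z_l^{\beta}}$$

and

$$h'(\beta) = -\frac{1}{\beta^2} - \frac{\left(\sum_{l} z_l^{\beta} \ln^2 z_l\right)\left(\sum_{l} z_l^{\beta}\right) - \left(\sum_{l} z_l^{\beta} \ln z_l\right)^2}{\left(\sum_{l} z_l^{\beta}\right)^2}$$

This process is repeated until the value of β_k converges, at which point we calculate the alpha parameter by

$$\alpha = \left(\frac{1}{n}\sum_{l=1}^{n+m} z_l^{\beta}\right)^{1/\beta}$$

Convergence occurs when h(β_k) is close to zero or β_{k+1} ≈ β_k.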
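The iteration can be sketched in Python as follows. This is an illustration under stated assumptions, not the Real Statistics implementation: h(β) is taken to be the β-derivative of the log-likelihood (divided by n) after substituting α^β = (Σx_i^β + Σy_j^β)/n, the function names and sample data are hypothetical, and the halving safeguard for a non-positive step is an addition of this sketch.

```python
import math

def score_terms(beta, failures, censored):
    """Return h(beta) and h'(beta) for the profile score described above."""
    n = len(failures)
    data = list(failures) + list(censored)
    logs = [math.log(z) for z in data]
    w = [z ** beta for z in data]
    s = sum(w)                                            # sum of z^beta
    t = sum(wi * li for wi, li in zip(w, logs))           # sum of z^beta * ln z
    u = sum(wi * li * li for wi, li in zip(w, logs))      # sum of z^beta * (ln z)^2
    mean_log_fail = sum(math.log(x) for x in failures) / n
    h = 1.0 / beta + mean_log_fail - t / s
    h_prime = -1.0 / beta ** 2 - (u * s - t * t) / s ** 2
    return h, h_prime

def fit_weibull_censored(failures, censored, beta0=1.0, tol=1e-10, max_iter=100):
    """Newton iteration on beta, then the closed-form alpha."""
    beta = beta0
    for _ in range(max_iter):
        h, h_prime = score_terms(beta, failures, censored)
        new_beta = beta - h / h_prime
        if new_beta <= 0:          # safeguard (assumption of this sketch)
            new_beta = beta / 2
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    data = list(failures) + list(censored)
    alpha = (sum(z ** beta for z in data) / len(failures)) ** (1.0 / beta)
    return alpha, beta

# Made-up illustrative data, not the values used in Figure 1
failures = [120.0, 340.0, 560.0, 610.0]
censored = [900.0, 900.0, 400.0, 100.0, 100.0, 100.0]
alpha, beta = fit_weibull_censored(failures, censored)
print(alpha, beta)
```

Note that only the failure times contribute to the Σ ln x_i term and to the divisor n, while both failure and censored times contribute to the power sums.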

We won’t give an example of the use of these steps, although the approach is similar to that used for Example 1 of Weibull with Censored Data. Instead, we will give two examples using the following Real Statistics functions.

Worksheet Functions

Real Statistics Functions: The Real Statistics Resource Pack provides the following functions, where R1 is a column array containing the x₁, …, x_n values and R2 is a two-column array in the form of a frequency table containing the y₁, …, y_m values. lab, iter, and bguess are as for the WEIBULL_FIT function (see Weibull with Censored Data). viter is the number of iterations used to calculate, by simulation, the variance of the data (including censored values). viter defaults to zero (meaning that this variance is not calculated).

WEIBULL_MCFIT(R1, R2, lab, iter, bguess, viter): returns an array with the Weibull distribution parameter values alpha and beta, the actual and estimated mean and variance, and MLE based on the data in the column range R1 plus the estimated mean time to failure of the censored data elements

There are also the following non-array functions:

WEIBULL_MCMEAN(R1, R2, alpha, beta) = the mean of all the data in R1 plus the average life of the censored data in R2 based on a Weibull distribution with parameters alpha and beta

WEIBULL_MCVAR(R1, R2, alpha, beta, iter) = the variance of all the data in R1 as well as an estimate of the variance of the censored data in R2 using iter simulations (default 20,000) from a Weibull distribution with parameters alpha and beta
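How these functions carry out the simulation internally is not described here. One plausible approach, sketched below purely as an assumption, is to replace each censored value y by a simulated failure time drawn from the Weibull distribution conditional on survival past y, and then average the mean and sample variance over many such completed data sets. The function names and sample data are hypothetical.

```python
import math
import random

def sample_remaining_life(y, alpha, beta, rng):
    """Draw a failure time X from Weibull(alpha, beta) conditional on X > y.

    Uses P(X > x | X > y) = exp((y/alpha)^beta - (x/alpha)^beta), solved for x.
    """
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
    return alpha * ((y / alpha) ** beta - math.log(u)) ** (1.0 / beta)

def simulated_mean_and_variance(failures, censored, alpha, beta, iters=20000, seed=0):
    """Average the mean and sample variance over simulated completed data sets."""
    rng = random.Random(seed)
    mean_total = 0.0
    var_total = 0.0
    for _ in range(iters):
        completed = list(failures) + [
            sample_remaining_life(y, alpha, beta, rng) for y in censored
        ]
        k = len(completed)
        m = sum(completed) / k
        mean_total += m
        var_total += sum((v - m) ** 2 for v in completed) / (k - 1)
    return mean_total / iters, var_total / iters

# Made-up illustrative data and parameters, not the values used in Figure 1
failures = [120.0, 340.0, 560.0, 610.0]
censored = [900.0, 900.0, 400.0, 100.0, 100.0, 100.0]
print(simulated_mean_and_variance(failures, censored, 700.0, 1.5, iters=2000))
```

Because the result depends on random draws, repeated runs with different seeds give slightly different variance estimates.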

We can use the WEIBULL_MCFIT function to give the results for Example 1 of Weibull with Censored Data, as shown in columns F and G of Figure 1.

Figure 1 – Multi-censored fit

We insert the array formula

=WEIBULL_MCFIT(A4:A15,F4:G4,TRUE,,,20000)

in range F6:G12 to obtain results identical to those in range L15:M21 of Figure 1 of Weibull with Censored Data, except for a slightly different actual variance estimate, since this estimate is based on a simulation.

Example

Example 1: Repeat Example 1 of Weibull with Censored Data using the censored data shown in the frequency table in range I3:J6 of Figure 1.

Once again two components go into service at time t = 0 but have not failed at time t = 900. In addition, there is one component that stayed in service for 400 hours and another 3 that stayed in service for 100 hours.

It doesn’t matter whether the component that stayed in service for 400 hours came into service at time t = 0 and then was removed from service at time t = 400 without failing or came into service at time t = 500 and then hadn’t failed at time t = 900. In fact, this component could have come into service at any time between t = 0 and t = 500, as long as it only remained in service for 400 hours without failing.

The situation is similar for the 3 components that remained in service for 100 hours. In fact, one of these components could have been put into service at t = 0 and removed at t = 100, another could have been put in service at t = 200 and removed at t = 300 and the third could have been put into service at t = 800 and removed at t = 900.
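The point that only the time in service matters can be illustrated with a short Python sketch. The helper name is hypothetical, and the layout of the frequency table (duration in the first column, count in the second) is an assumption about range I4:J6.

```python
def expand_frequency_table(table):
    """Expand (duration, count) rows into one censored duration per component."""
    durations = []
    for duration, count in table:
        durations.extend([float(duration)] * count)
    return durations

# Censored data from Example 1: 2 components at 900, 1 at 400, 3 at 100 hours
censored_table = [(900, 2), (400, 1), (100, 3)]
print(expand_frequency_table(censored_table))
# [900.0, 900.0, 400.0, 100.0, 100.0, 100.0]

# The three 100-hour components may have entered service at different times;
# only the length of each interval enters the likelihood.
intervals = [(0, 100), (200, 300), (800, 900)]
print([end - start for start, end in intervals])
# [100, 100, 100]
```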

Using the array formula =WEIBULL_MCFIT(A4:A15,I4:J6,TRUE,,,20000) we get the results shown in range I8:J14.

Reference

EpiX Analytics (2018) Using optimization to maximize a likelihood calculation to obtain MLEs
https://modelassist.epixanalytics.com/display/EA/Using+optimization+to+maximize+a+likelihood+calculation+to+obtain+MLEs

7 thoughts on “Weibull with Multi-Censored Data”

Mertz

February 15, 2022 at 11:49 am

Hi Charles,

To close the topic, do you intend to implement a fitting method to compute a Weibull regression from exclusively (or not) censored data (means right and left censored data, and hypothetically uncensored data)?

Regards.
- Charles
  
  February 15, 2022 at 3:38 pm
  
  Hello,
  Sorry, but I haven’t yet explored the issue you raised in your previous comment. Sorry for the delay, but I will get to it soon.
  Charles
  - Mertz
    
    February 15, 2022 at 5:37 pm
    
    Hello,
    
    Thank-you!
    
    Regards
- Charles
  
  April 14, 2022 at 9:05 am
  
  Hi Mertz,
  Sorry for the late response.
  I don’t have any plans to implement more fitting methods for Weibull regression. I will shortly look at Survival Analysis, but can’t promise to add the capability that you are looking for.
  Charles
Mertz

January 26, 2022 at 11:29 am

Dear Charles,

In the previous example, if I understand well, you are considering a few right-censored data and on the other side uncensored data (since you use the f function and not F). Have you got a method (other than solver) that could apply to left-censored data and right-censored with no uncensored data? I suppose that we just have to replace f by F in the likelihood function?

PS: I work in the biomechanics domain where often we do not know the precise parameter (like force) value for which the injury occurs (left-censored) and sometimes it does not occur (right censored).

I use a solver software but it poses significant challenges for automation, for example to compute CIs.

Thank-you for your wonderful website.

Best Regards
- Charles
  
  February 1, 2022 at 7:53 pm
  
  Hello,
  From your comment, I understand that you are ok with the approach described on the webpage, but would like it to work even when there is no uncensored data. Is this correct?
  Does the approach described on the website work even when there is no uncensored data, but the Real Statistics function doesn’t work in this case?
  Charles
  - Mertz
    
    February 2, 2022 at 1:31 pm
    
    Hello,
    In fact I did not check yet since I did not upload Real Statistics yet. I wanted to check before if I am able to understand the maths and then if I can expect answers to my approach.
    In the example above for which the likelihood formula L is written, we consider 2 cases:
    * 1-F(yi), which corresponds to right-censored data (the “event” did not happen for yi), so we use the complement to 1 of the probability or likelihood of the event at yi (a time in general; for me, a force, acceleration, displacement or a more sophisticated parameter)
    * f(xi) which corresponds to the event happening exactly at xi (time, force for me)
    
    Often in biomechanics we are not able to measure the exact force (or other parameter) at which the event (the injury) happens, the force could carry on increasing after the injury occurs then the peak force value is left-censored.
    In this case we use a third factor in the likelihood calculation based on F(wi), wi being the left-censored value (the peak). As this third factor is not included in your L(alpha, beta) formula I cannot expect it to work (to answer your question).
    
    If we consider that we have no uncensored value (it needs a specific invasive instrumentation in biomechanics) the second factor is absent. Finally we just need to replace f(xi) by F(wi) but of course the following formulas are then not valid.
    
    Hope you can understand my English.
    
    Thank-you for your help.
    Regards.