Box-Cox Transformation

We now discuss one of the most commonly used families of transformations, the Box-Cox transformation. It depends on a single parameter λ and is defined by the function f(x) where

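\[
f(x) =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[6pt]
\ln x & \text{if } \lambda = 0
\end{cases}
\]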
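To make the definition concrete, here is a minimal Python sketch of f(x), namely (x^λ − 1)/λ for λ ≠ 0 and ln x for λ = 0. The function name box_cox is our own choice for illustration. The loop also shows why ln x is the natural choice at λ = 0: as λ approaches 0, the value of (x^λ − 1)/λ approaches ln x.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of one positive value x for parameter lam (illustrative name)."""
    if x <= 0:
        raise ValueError("x must be positive")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam

# The transformed value for shrinking lam, ending with the lam = 0 case ln(x)
for lam in (1, 0.5, 0.1, 0.001, 0):
    print(lam, box_cox(2.0, lam))
```

Note that λ = 1 merely shifts the data by 1, so it leaves the shape of the data unchanged.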

If we need to ensure that all values of x are positive (e.g. to avoid the situation where ln x is undefined when λ = 0 and x ≤ 0), then we first perform the transformation g(x) = x + a for some constant a chosen so that x + a > 0 for all the values of x under consideration (i.e. a is larger than −x for every x). The Box-Cox transformation then takes the form

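\[
f(x) =
\begin{cases}
\dfrac{(x + a)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[6pt]
\ln (x + a) & \text{if } \lambda = 0
\end{cases}
\]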
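The shifted version can be sketched in Python as follows. The function name shifted_box_cox and the default choice a = 1 − min(x), used when the sample contains non-positive values, are assumptions made for illustration. Any constant a that makes every x + a positive would do.

```python
import math

def shifted_box_cox(xs, lam, a=None):
    """Apply the Box-Cox transform to x + a for every x in xs.

    If a is omitted, use a = 1 - min(xs) when the sample contains
    non-positive values (an illustrative default), otherwise a = 0.
    """
    if a is None:
        smallest = min(xs)
        a = 1 - smallest if smallest <= 0 else 0
    shifted = [x + a for x in xs]
    if min(shifted) <= 0:
        raise ValueError("a must make every x + a positive")
    if lam == 0:
        return [math.log(v) for v in shifted]
    return [(v ** lam - 1) / lam for v in shifted]

# A sample with non-positive values: the default shift here is a = 4
print(shifted_box_cox([-3, 0, 2], 0))
```

With this default, the smallest value in the sample is mapped to x + a = 1, so its transformed value is 0 for every λ.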

Our goal is to determine the value of the λ parameter which gives the most useful transformation. We discuss two cases. The first is where our goal is to obtain linear data and the second is where we are looking for normally distributed data.
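For the second case, one common way to choose λ is to maximize the Box-Cox profile log-likelihood, which (up to an additive constant) is −(n/2) ln σ²(λ) + (λ − 1) Σ ln x, where σ²(λ) is the variance of the transformed data computed with divisor n. The sketch below assumes this criterion and a simple grid search. The function names, the search range [−2, 2] and the grid size are illustrative choices, not something prescribed by this page.

```python
import math

def box_cox_log_likelihood(xs, lam):
    """Profile log-likelihood of lam, up to an additive constant, for positive data xs."""
    n = len(xs)
    if abs(lam) < 1e-12:
        ys = [math.log(x) for x in xs]
    else:
        ys = [(x ** lam - 1) / lam for x in xs]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n  # maximum-likelihood variance (divisor n)
    return -n / 2 * math.log(var) + (lam - 1) * sum(math.log(x) for x in xs)

def best_lambda(xs, lo=-2.0, hi=2.0, steps=400):
    """Grid search for the lam in [lo, hi] with the largest profile log-likelihood."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda lam: box_cox_log_likelihood(xs, lam))

# Hypothetical sample whose logarithms are symmetric about 0,
# so a value of lam near 0 (the log transform) is expected to do well
sample = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
print(best_lambda(sample))
```

A grid search is used here only because it is easy to follow. A numerical optimizer would give a more precise value of λ.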

2 thoughts on “Box-Cox Transformation”

  1. Hi Charles.

    I’m just curious. What’s the use of the second function f(x) = ln(x) when lambda = 0, when it returns an undefined value anyway? Do you have any examples where the applicability of this second function will be realized further? Thanks!

    • John,
      f(x) = ln(x) only returns an undefined value for non-positive x. Since we aren’t using ln(lambda) there isn’t a problem. For lambda = 0 and negative values of x we use the transformation f(x) = ln(x-a+1) where a is the smallest non-positive value in the sample.
      Charles
