We now discuss one of the most commonly used transformations, namely the Box-Cox transformation. It is based on a parameter λ and is defined by the function f(x) where

f(x) = (x^λ − 1) / λ   if λ ≠ 0
f(x) = ln x            if λ = 0
If we need to ensure that all values of x are positive (e.g. to avoid the situation where ln x is undefined because x ≤ 0 when λ = 0), then we first perform the transformation g(x) = x + a for some constant a chosen so that x + a > 0 for every value of x under consideration (i.e. a is larger than −x for all of these values). The Box-Cox transformation then takes the form

f(x) = ((x + a)^λ − 1) / λ   if λ ≠ 0
f(x) = ln(x + a)             if λ = 0
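As a minimal sketch of the shifted transformation, the Python function below assumes the standard Box-Cox form, f(x) = ((x + a)^λ − 1)/λ for λ ≠ 0 and f(x) = ln(x + a) for λ = 0. The names box_cox and shift, the sample data and the choice a = 1 − min(x) are illustrative assumptions, not part of the original text.

```python
import math

def box_cox(x, lam, shift=0.0):
    """Box-Cox transform of a single value x with parameter lam.

    shift is the constant a in g(x) = x + a (a hypothetical parameter name);
    it must be large enough that x + shift is positive.
    """
    y = x + shift
    if y <= 0:
        raise ValueError("x + shift must be positive")
    if lam == 0:
        return math.log(y)          # limiting case of (y**lam - 1) / lam
    return (y ** lam - 1) / lam

# Illustrative data containing non-positive values
data = [-2.0, 0.5, 3.0, 10.0]

# One possible choice of a: shift so that the smallest value becomes 1
a = 1 - min(data)

transformed = [box_cox(x, 0.5, shift=a) for x in data]
```

The λ = 0 branch is the limit of the λ ≠ 0 branch as λ approaches 0, so the family varies continuously in λ.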
Our goal is to determine the value of the λ parameter which gives the most useful transformation. We discuss two cases. The first is where our goal is to obtain linear data and the second is where we are looking for normally distributed data.
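For the normality case, one common criterion is the maximum-likelihood approach of Box and Cox (1964): choose the λ that maximizes the profile log-likelihood of the transformed data. The sketch below assumes that criterion, which may not be the one used elsewhere on this site. It assumes positive data and uses a simple grid search. The function names, the grid range [−2, 2] and the sample data are all illustrative assumptions.

```python
import math

def box_cox_loglik(xs, lam):
    """Profile log-likelihood of lam for positive data xs (additive constants dropped)."""
    n = len(xs)
    ys = [math.log(x) if lam == 0 else (x ** lam - 1) / lam for x in xs]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n   # ML variance estimate (divides by n)
    return -0.5 * n * math.log(var) + (lam - 1) * sum(math.log(x) for x in xs)

def best_lambda(xs, lo=-2.0, hi=2.0, steps=401):
    """Return the grid value of lam in [lo, hi] with the largest log-likelihood."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda lam: box_cox_loglik(xs, lam))

# Illustrative data whose logarithms are symmetric about 0,
# so the log transform (lam near 0) should be preferred
sample = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
lam_hat = best_lambda(sample)
```

A grid search is crude but transparent. In practice a numerical optimizer or a statistics package would usually be used, and the fitted λ is often rounded to a convenient value such as 0, 0.5 or 1.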
References
Li, P. (2005). Box-Cox transformations: an overview.
https://www.ime.usp.br/~abe/lista/pdfm9cJKUmFZp.pdf
Box, G. E. P., Cox, D. R. (1964). An analysis of transformations (with discussion). Journal of the Royal Statistical Society. Series B (Methodological), 26(2), 211–252.
https://www.ime.usp.br/~abe/lista/pdfQWaCMboK68.pdf
Rossiter, D. G. (2019). Box-Cox transformation.
https://www.css.cornell.edu/faculty/dgr2/_static/files/R_html/Transformations.html
Hi Charles.
I’m just curious. What’s the use of the second function f(x) = ln(x) when lambda = 0, when it returns an undefined value anyway? Do you have any examples where the applicability of this second function will be realized further? Thanks!
John,
f(x) = ln(x) only returns an undefined value for non-positive x. Since we aren’t using ln(lambda) there isn’t a problem. For lambda = 0 and negative values of x we use the transformation f(x) = ln(x-a+1) where a is the smallest non-positive value in the sample.
Charles