Distribution Fitting

Given a collection of data that we believe fits a particular distribution, we would like to estimate the parameters that best fit the data. We focus on three such methods: Method of Moments, Maximum Likelihood Method, and Regression.

References

Wikipedia (2017) Maximum likelihood estimation
https://en.wikipedia.org/wiki/Maximum_likelihood_estimation

Wikipedia (2021) Method of moments (statistics)
https://en.wikipedia.org/wiki/Method_of_moments_(statistics)

Hastings, N., Peacock, B. (2011) Statistical distributions. 4th Ed, Wiley
https://www.wiley.com/en-us/Statistical+Distributions%2C+4th+Edition-p-9780470390634

20 thoughts on “Distribution Fitting”

  1. Hi Charles,
    How to know which distribution will fit my data? Is it through goodness of fit tests? I test the goodness of fit of different distributions and I compare the p-value?

    Reply
  2. Hi Charles,
    How to generate a set of random numbers to follow pareto (type 1) distribution using a giving mean/sd?

    Thanks.
    K

    Reply
    • The Pareto distribution has two parameters: a scale parameter m and a shape parameter alpha. The inverse function for the Pareto distribution is I(p) = m/(1-p)^(1/alpha). If you know the values of mn and alpha then a random value from the distribution can be calculated by the Excel formula = m/(1-RAND())^(1/alpha).
      Now if the mean and the standard deviation sd are known, then these can be used to calculate the m and alpha parameters by solving the equations:
      mean = m*alpha/(alpha-1)
      sd^2 = m^2*alpha/((alpha-1)^2*(alpha-2))
      Charles

      Reply
  3. Hi,

    Thank you for creating this great tool for Excel.

    I have a question regarding distribution fitting. This tool estimates the parameters for different distributions. Is it possible to compare which distribution is best fit for the data (Anderson-Darling statistic etc)?

    Best regards,
    Jukka

    Reply
    • Jukka,
      A commonly used approach is to choose the distribution with the smallest Akaike information criterion (AIC) value. AIC = 2k – 2LL where LL = the log-likelihood (LL) and k = the number of parameters being estimated. Essentially this means that you are choosing the distribution with the largest LL value with a penalty for extra parameters. See
      https://www.spcforexcel.com/knowledge/basic-statistics/deciding-which-distribution-fits-your-data-best
      Another commonly used approach is the Bayesian Information Criterion (BIC), namely BIC = k*LN(n) – 2LL where n = the number of elements in the sample and LN is the natural log. BIC is used in the same way as the AIC except that the penalty for additional parameters is calculated slightly differently.
      Charles

      Reply
  4. Hi Charles,

    I wanted to ask if you have a continuous variable which fits a certain distribution, how can or should you bin (categorize/discretize i don’t know the correct term) according to that distribution?

    I would like to tell you that the website is very well organized and filled with usefull information.

    Best regards,
    Leonardo

    Reply
  5. Hi Charles,

    If I want to estimate the distribution of the bus inter-arrival time, what is the best distribution to fit my data? Appreciate your advice on this. Thanks.

    Regards,
    Jessica

    Reply
  6. Dear Charles,
    Does GAMMA_FIT function supports curly parentheses, for instance {GAMMA_FIT(IF(R100:R138>0,R100:R138),,100)} does not create a result.
    Thanks in advance,

    Reply
  7. Hi Charles,

    I recently came across your website and found it very very useful. Thank you so much for your great efforts!

    I wanted to ask whether it would be possible to do distribution fitting via MLE (by using Real Statistics functions) for a Gumbel distribution?

    Thank you so much.

    With best regards,
    Wayne

    Reply
  8. Hi Charles,

    First, I loved you site! Helped me a lot.

    Second, how can you use the MLE with Newton Method and censored data to fit a three parameter weibull distribution?

    Best regards,
    Leandro Dutra.

    Reply

Leave a Comment