Multivariate Normal Distribution Basic Concepts

Univariate case

A random variable x has normal distribution if its probability density function (pdf) can be expressed as

Multivariate normal density function

Here e is the constant 2.7183…, and π is the constant 3.1415…

The normal distribution is completely determined by the parameters μ (mean) and σ (standard deviation). We use the abbreviation N(μ, σ) to refer to a normal distribution with mean μ and standard deviation σ, although for comparison with the multivariate case it would actually be better to use the abbreviation N(μσ2) where σ2 is the variance.

Multivariate case

Definition 1: A random vector X has a multivariate normal distribution with vector mean μ and covariance matrix Σ, written X ~ N(μ, Σ) if X has the following joint probability density function:

Multivariate normal density function

Here |Σ| is the determinant of the population covariance matrix Σ. The exponent of e consists of the product of the transpose of Xμ, the inverse of Σ and Xμ, which has dimension (1 × k) × (k × k) × (k × 1) = 1 × 1, i.e. a scalar. Thus f(X) yields a single value. The coefficient  (2π)k |Σ| can also be expressed as |2πΣ|.

Definition 2: The expression

Mahalanobis distance squared

which appears in the exponent of e, is called the squared Mahalanobis distance between X and μ. We can also define the squared Mahalanobis distance for a sample to be

Mahalanobis distance squared sample

Where S is the sample covariance matrix and  is the sample mean vector. In Multivariate Normality Functions, we show how to calculate the Mahalanobis distance using the Real Statistics MDistSq worksheet function.

Observation: If k = 1 then the above definition is equivalent to the univariate normal distribution. If k = 2 the result is a three-dimensional bell-shaped curve (as described in Figure 1).

Properties

Property 1: If X ~ N(μ, Σ) where all the xj in X are independent, then the population covariance matrix is a diagonal matrix [aij] with ajj= {\sigma}_j^2 for all j and aij = 0 for all ij, and so the joint probability function simplifies to

image9015

where each {f}_{\mu_j ,\sigma_j}(x_j) is the univariate normal pdf of xj with mean μj and standard deviation σj.

Property 2: If y = \sum_{j=1}^{k} c_j x_j = CTX, where C = the k × 1 vector [cj], and X ~ N(μ, Σ) then y has normal distribution with mean \sum_{j=1}^{k} c_j \mu_jCTμ and variance \sum_{i=1}^{k} \sum_{j=1}^{k} c_i c_j \sigma_{ij} = CTΣC; i.e. y ~ N(CTμ,CTΣC).

The unbiased estimates for population mean and population variance are given by the sample mean \sum_{j=1}^{k} c_j \bar{X}_j = CT and sample variance \sum_{i=1}^{k} \sum_{j=1}^{k} c_i c_j s_{ij} = CTSC, where sij = cov(xi, xj).

When k = 2, the joint pdf of X depends on the parameters μ1, μ2, σ1, σ2, and ρ. A plot of the distribution for different values of the correlation coefficient ρ is displayed in Figure 1.

bivariate-normal-7 bivariate-normal-0

Figure 1 – Bivariate normal density function

Observations about Mahalanobis Distance

Suppose X has a multivariate normal distribution. For any constant c, the set of points X which have a Mahalanobis distance from μ of c sketches out a k-dimensional ellipse. The value of the probability density function at all these points is the constant

image9023

Let’s take a look at the situation where k = 2. In this case, we have

image9024

Thus
image9025
and so
image9026
Hence
image9027
where
image9028

Finally, note that the equation
image9029

is an ellipse with foci at μ = (μ1, μ2).

Note that when ρ = 0, indicating that x1 and x2 are uncorrelated, then the ellipse takes the form z12z22 = c2 which is a circle. When ρ = ±1, indicating that x1 and x2 are completely correlated, then the ellipse becomes a straight line.

Using the above calculations, when k = 2 the multivariate normal pdf is

Bivariate normal density function

Distribution of Mahalanobis Distance

Property 3: If X ~ N(μ, Σ), then the squared Mahalanobis distance between X and μ has a chi-square distribution with k degrees of freedom.

This property is an extension of Corollary 1 of Chi-square Distribution. We can interpret the property as follows. Let c = the critical value of the chi-square distribution with k degrees of freedom for α = .05. Then the probability that X will fall within the ellipse defined by c, i.e. (X–μ)T Σ-1 (X–μ) = c2, is 1 – α = .95.

References

Genz, A. (2024) Package ‘mvtnorm’ Multivariate normal and t distributions
https://cran.r-project.org/web/packages/mvtnorm/mvtnorm.pdf

Genz, A. (2004) Numerical computation of rectangular bivariate and trivariate normal and t probabilities
https://sci-hub.st/10.1023/b:stco.0000035304.20635.31

19 thoughts on “Multivariate Normal Distribution Basic Concepts”

  1. A quick question.

    According to the equation of the squared Mahalanobis distance, “Z1^2 -(2*p*Z1*Z2)+Z2^2,”

    when ρ = 0, the ellipse takes the form (z1 + z2)2 = c2, instead of (z1 – z2)2 = c2?

    Reply
    • Hi John,
      Thanks for catching this error. I have just corrected this on the webpage.
      I appreciate your help in improving the accuracy of the website.
      Charles

      Reply
  2. One more question.

    When I was looking at the ‘Confidence Ellipse’ section. You calculate if a point is inside/outside of the ellipse. I think you’re using the equation x’S^-1x <= c^2 and then you substitute chi-square = c. Then you use c^2 = (chi-square)^2 — in your formula you use [… <= H8^2] which is the square of the chi-square value making the ellipse equation x'S^-1x = (chi-square)^2

    When I look at other references, (Tatsuoka, Multivariate Analysis, 2nd ed, 1988, page 73) they use x'S^-1x = chi-square. This would be [… <=H8] and not the squared value. Should it be x'S^-1 x = chi-square or x'S^-1x = (chi-square)^2?

    Thanks

    Reply
  3. Good site and program. Question about cumulative bivariate. Suppose sx, sy = 1, means =0 and correlation = 0. When I directly integrate over the limits x, y {-1, 1} using MathCAD, I get 0.466. However, when I do a numerical simulation and count the units within the ellipse, I get ~0.395. Also, I noted somewhere that the integral for a bivariate was ~0.395.

    What is the correct area under the range {+/-1} for the x,y for the bivariate? And why do I get ~0.466 and the reported value is ~0.395, and why are these values different? Any help is appreciated.

    Reply
      • I did a simulation with a standard ellipse. I obtained random x,y values using the usual algorithm for generating random values from a defined bivariate. I used MathCAD for the direct integral of the bivariate distribution equation.

        If the ellipse lies on either the x or y axes, then the integral over the bivariate with std-x, y = 1, I get the correct value of 0.395.

        Subsequently, I found that the probability of being within the ellipse for a bivariate (df = 2) follows the chi-square distribution. So, the probability of being within a standard ellipse (both x, y standard deviations = 1, df =2) should be 0.395.

        Your program performs the integral for the bivariate from -inf to a value or a value to +inf. So, the question becomes how to use your program to find the probability between +/-1 for both variables. Or ranges {x1, x2}, {y1, y2}.

        Hopefully I clarified. But any help in finding the bivariate distribution integral over arbitrary ranges will help.

        Ultimately I am trying to relate to eigenvalues for the covariance matrix for the bivariate distribution. Hence, the SQRT(eigenvalue * chi-square) should be the principal axis. I was trying to simulate the probability for some proportion of the population (i. e. ellipse that contains 90% of the population) bivariate distribution to fall within the an ellipse drawn using the principal axes derived from the eigenvalues. Or maybe there is a specific reference that covers this?

        Thanks for the response and any help or clarifications on my confusion.

        Reply
        • Ignore the last comment. I searched your website more thoroughly and have found the answers. Good references. Thanks!

          However, is there a method to calculate the cumulative bivariate between {x1, x2} and {y1, y2} with your program?

          Maybe I missed it somewhere.

          Reply
        • Daniel,
          Thanks for the additional information.
          I am not sure what you mean by the bivariate distribution integral. Are you looking for the area inside the ellipse or the cumulative distribution function (cdf) or the bivariate normal distribution or something else?
          Charles

          Reply
          • Sorry about the imprecise wording.
            When I use your function: =BNORMDIST(1,1,0,0,1,1,0,TRUE) I get 0.707860786678

            I think this performs the cdf from -inf to +1 for both x, y. This is close to the value when I do the direct integration in MathCAD (integraly(-inf, 1)intergralx(-inf, 1) [binomial distribution]dxdy = 0.70786098175

            In that function [bnormdist[]] I think you only integrate from -inf to x or y. However, how would I find the cdf for the RANGE for x, y of say {-1,1}, {-1,1}? I think the integral should be the chi-square (~0.395).

            I was not able to find a method in your program to find the cdf for th binomial distribution over a limited rage in x and y. Is there a method that I missed in the tutorials?

            Hopefully this is clearer.

            Thanks

          • Dan,
            BNORMDIST(x, y, m1, m2, s1, s2, r, TRUE) = the cdf of the standard bivariate normal distribution at (x,y) with means m1 and m2, standard deviations s1 and s2 and correlation coefficient r. This means that the function evaluates the cdf from (-inf, inf) to (x,y), which seems like what you want. Am I missing something?
            Charles

          • Thanks for the response. Sorry about not being clear. Let me try to simplify.

            I need to calculate the bivariate distribution between x1LOW and x1HIGH and y1LOW and y2HIGH.

            Your formula:
            =BNORMDIST(x1, x2, m1, m2, s1, s2, r, cum)
            Calculates the cdf from -inf to the corresponding x1, x2 values.

            However, what I am looking for is a formula to calculate across a range of x, y such as:
            =BNORMDIST(x1LOW, x1HIGH, x2LOW, x2HIGH, m1, m2, s1, s2, r, cum).

            In other words, calculate x1 over the RANGE of (-1, 1) and x2 across the range of (-1, 1).

            From theory, if m1, m2 = 0 and s1, s2 = 1, the result I believe will be the chi-square value ~0.395.

            How can I use your program to calculate the cdf for the bivariate distribution across a specific range for x1 and x2?

          • Dan,
            I have now implemented a newer, better version of the bivariate normal distribution. I have also included support for ranges as you have requested. This newer version will be included in the next release of the Real Statistics software. I expect this release to be issued in a few days.
            Charles

  4. Hi,

    Are you sure about BNORMDIST function ?
    Using covariance matrix or simplified equation (btw thanks) calculus I get the same result that i believe to be good. When using BNORMDIST i got a value 72.56433684 higher (value remains constant) than simplified equation …
    i think there is a glitch somewhere (and should be me) , just want to understand.
    BR

    Reply
    • Reza,
      This would have to be a three dimensional graph. In any case, you can use the Real Statistics function BNORMDIST to calculate the pdf values.
      Charles

      Reply
  5. Dr. buneos días, disculpe que significa pdf y cdf, podría ser que
    pdf es lo mismo que función de densidad de probabilidad?
    Pero cdf?
    Muchas gracias
    Dr Zaionts, good morning, excuse me what mean PDF and CDF values?
    Could be PDF= probability density function?
    But CDF?
    Thank you

    Reply

Leave a Comment