Weighted Variance, Standard Deviation, and Covariance

Introduction

For any set of weights W = {w1, w2, …, wn} where each wi ≥ 0 and wi > 0 for at least one i, the weighted mean of a sample S = {x1, x2, …, xn} is defined by

Weighted mean

where

standardizing weights

The biased weighted variance of the sample is

Biased weighted variance

When all the weights are the same, this reduces to the usual

Biased sample variance

which is the maximum likelihood estimate of the variance σ2. It turns out that if the wi are frequency weights, then the biased weighted variance is also the MLE of σW2.

We now seek an unbiased estimate of the weighted variance.

Unbiased weighted variance (frequency weights)

In this case, we consider the  to be non-negative integers that indicate the number of replications of the sample . We clearly want the unbiased sample variance to be the same as that of the unweighted sample with replications. This is the situation examined in Frequency Table Statistics. Here, the unbiased estimate of the variance is

Unbiased frequency weighted variance

The sample size in this case is viewed as w and not n.

Note that if the wi are percentages, then we still need to know the full sample size N. Replacing wi by Nwi will yield the desired result.

Unweighted Case

The unweighted variance is equivalent to the weighted variance where all the weights are equal to 1. In this case, w = n and the weighted mean is equal to the unweighted mean. Thus, the weighted unbiased variance is equivalent to the (unweighted) sample variance.

Unweighted case

Unbiased weighted variance (reliability weights)

We can show that

Unbiased weighted variance

where

Sum of squared weights  

Now define the following statistic

Unbiased reliability weighted variance

This is the unbiased weighted variance when reliability weights are employed.

Unbiased weighted variance

We can simplify sW2 as follows

Weighted s-squared

Weighted s-squared part 2

Weighted s-square part 3

Note that if the weights are normalized by replacing wi by wi/w, then

Simplification of correction term

and so

Weighted s-square standardized weights

Unweighted Case

If all the weights are the same, the normalized weights are all 1/n. Thus

v with equal normalized weights

and so

Correction factor

In this case, the weighted unbiased variance is equivalent to the usual sample variance.

Unbiased weighted standard deviation

The unbiased weighted standard deviation is equal to the square root of the unbiased weighted variance. This is true for both frequency and reliability weights.

Weighted covariance matrix

For an n × k matrix X, each of whose rows Xi = [xij] is assigned a weight wi, the weighted mean vector is

Worksheet Functions

Starting with Rel 9.2.1, the Real Statistics Resource Pack will provide the following worksheet functions.

WVAR(R1, Rw, rel) = the unbiased weighted variance for the data in column array R1 and weights in column array Rw. If rel (default TRUE), then it is assumed that reliability weights are used; otherwise frequency weights are used.

WSTDEV(R1, Rw, rel) = the unbiased weighted standard deviation for the data in column array R1 and weights in column array Rw = SQRT(WVAR(R1, Rw, rel)).

WCOV(R1, Rw, rel) = the unbiased weighted covariance matrix for the data in the n × k array R1 and weights in the n × 1 column array Rw.

Examples

Weighted variance (reliability weights)

Example 1: Find the weighted variance for the reliability weights in B2:B13 and data in A2:A13.

Reliability weighted variance

Figure 1 – Weighted variance (reliability weights)

We see from Figure 1 that the weighted mean is 79.5 (cell C16) and the (unbiased) weighted variance is 247.5551. We obtain the same values by using the formulas =MEAN(A2:A13, B2:B13) and =WVAR(A2:A13, B2:B13), respectively. The unbiassed weighted standard deviation is 15.73389, which can be obtained from the formula =SQRT(C21) or =WSTDEV(A2:A13, B2:B13).

Weighted Variance (frequency weights)

Example 2: Repeat Example 1 using frequency weights for a sample of size 30 where the weights in column B of Figure 1 represent the proportions of the sample.

Since the sum of the weights is 3 and the sample size is 30, we need to multiply the weights in column B by 10 to obtain the frequencies. The weighted variance is now calculated as shown in Figure 2.

Frequency weighted variance

Figure 2 – Weighted variance (frequency weights)

The biased weighted variance of 224.45 (cell C17) is calculated as in Figure 1. The unbiased weighted variance (cell C20) is calculated by =C17/(C18-1), the square root of this value is the unbiased weighted standard deviation (cell C22).

The variance and standard deviation can also be calculated, respectively, by the formulas WVAR(A2:A13, B2:B13, FALSE) and  =WSTDEV(A2:A13, B2:B13, FALSE).

Finally note that we can obtain the same values using the Frequency Table data analysis tool, whose output is shown in columns E and F. Cells F6 and F7 contain the desired results.

Weighted Covariance Matrix

Example 3: Find the weighted covariance matrix for the data in range A2:C13 using the reliability weights in D2:D13.

We first normalize the weights as shown in column E. E.g. cell E2 contains the formula =D2/D$30 where D30 contains the formula =SUM(D2:D13). Next, we calculate the sum of the squared normalized weights, as shown in cell E14 using the formula =SUMSQ(E2:E13).

Weighted covariance matrix

Figure 3 – Weighted covariance matrix

We next find the weighted means for the three data sets, as shown in range A14:C14. This is done by inserted the formula =SUMPRODUCT(A2:A13,$E2:$E13) in cell A14, highlighting range A14:C14, and pressing Ctrl-R.

Finally, we insert the following array formula in range G2:I4.

=MMULT(TRANSPOSE(A2:C13-A14:C14),E2:E13*(A2:C13-A14:C14))/(1-E14)

This is the weighted covariance matrix. We obtain the same result by using the Real Statistics formula =WCOV(A2:C13, D2:D13) in range G2:I4.

Note that the value in cell G2 is the same as the value of C21 of Figure 1. This is because the data in column A of Figures 1 and 3 are the same, and the normalized weights in both figures are the same.

To get the frequency version of the weighted covariance matrix, we use the following formula:

=D14*MMULT(TRANSPOSE(A2:C13-A14:C14),E2:E13*(A2:C13-A14:C14))/(D14-1)

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

References

Wikipedia (2024) Weighted arithmetic mean
https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_variance

Cross Validated (2013) Bias correction in weighted variance
https://stats.stackexchange.com/questions/47325/bias-correction-in-weighted-variance

Leave a Comment