Data Analysis Tool
Real Statistics Data Analysis Tool: WLS regression can be performed by using the Weighted Linear Regression data analysis tool as described in Example 3 of WLS Regression for Heteroskedasticity.
Press Ctrl-m and double click on the Regression option in the dialog box that appears (or click on the Reg tab if using the Multipage user interface). Next, select Weighted Linear Regression. Fill in the dialog box that appears as shown in Figure 1 (with references to Figure 1 of WLS Regression for Heteroskedasticity).
Figure 1 – Weighted Linear Regression dialog box
Upon clicking on the OK button, the output shown in Figure 11 of Regression for Heteroskedasticity appears.
Worksheet Functions
Real Statistics Functions: The following array function computes the coefficients and their standard errors for weighted linear regression. Here R1 is an n × k array containing the X sample data, R2 is an n × 1 array containing the Y sample data and R3 is an n × 1 array containing the weights.
WRegCoeff(R1, R2, R3) = k × 2 array whose first column contains the WLS regression coefficient vector and whose second column contains the corresponding standard errors of these coefficients,
WRSquare(R1, R2, R3) = the R-Square value for the weighted regression
WRSquare(R1, j, R3) = the R-Square value for the weighted regression using the X values in range R1 without column j, the Y values in column j of R1 and the weights in R3.
WVIF(R1, R3, j) = the VIF value for the weighted regression using the X values in range R1 without column j, the Y values in column j of R1 and the weights in R3.
For Example 3 of WLS Regression for Heteroskedasticity, referring to Figure 8 and 11 of WLS Regression for Heteroskedasticity, range X19:Y21 contains the array formula =WRegCoeff(N4:O18,P4:P18,U4:U18), cell AD22 contains the worksheet formula =WVIF(N4:O18,U4:U18,2) and cell X7 contains the formula =Y14/Y16, although it can also be calculated using the formula =WRSquare(N4:O18,P4:P18,U4:U18).
References
Wikipedia (2018) Weighted least squares
https://en.wikipedia.org/wiki/Weighted_least_squares
Eisenhauer, J. G. (2003) Regression through the origin
https://onlinelibrary.wiley.com/doi/10.1111/1467-9639.00136
Huang, Y. (2021) Weighted least squares
https://www.stat.uchicago.edu/~yibi/teaching/stat224/L14.pdf
Shalizi, C. (2015) Weighted and generalized least squares
https://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/24/lecture-24–25.pdf
Hi Charles, only one question about this topic. How do I calculate my weights? I ask this because of two reasons: first of all I have to run a multiple linear regression and I wanna know if I have to calculate the weights for all of my independent variables, and secondly because I didn’t understand how to use the “WRegCoeff” formula. In this formula I have to put the “X”, the “Y” and the coefficients, even though I use this formula to calculate the coefficients? So I ask you some clarifications, thank you in advance
Hello Giordano,
1. Which weights to use depends on the reason you have for using the weighted version of linear regression. The weights are completely determined by the user. If you are using weighted linear regression to deal with heteroskedasticity, then you should choose weights as described at https://www.real-statistics.com/multiple-regression/weighted-linear-regression/wls-regression-and-heteroskedasticity/
If you don’t need to use weights, then you should use the ordinary Multiple Linear Regression data analysis tool instead.
2. If you use weighted linear regression then you need to assign a weight for each row of data. The weights are not assigned to the coefficients.
3. In WRegCoeff, you don’t specify the coefficients. X is a range that contains the X data (i.e. the data for the independent variables). For example, suppose that you have 10 rows of data and 3 independent variables. This data could be stored in range A1:C10. This range is what you need to specify for X. For each row of data, you need a corresponding dependent variable value. These y-values could be stored in range D1:D10. This is the range that you need to specify for Y.
Charles