Bootstrapping Multivariate Regression Support

Objective

The Real Statistics Resource Pack provides a number of worksheet functions to support bootstrapping for multivariate regression in Excel. We describe these functions and give some examples of their use.

Worksheet Functions

The following functions support bootstrapping for a regression model with intercept using Approach 2 based on Rx and Ry where Rx is an n × k array of X data, Ry is an n × m array of Y data. iter (default 2000) bootstrap samples are generated. alpha is the significance level (default .05).

Resampling cases

MRegCovBoot(Rx, Ry, iter, dprec): returns the covariance matrix of the regression coefficients

MRegCIBoot(Rx, Ry, iter, alpha, dprec): returns an array with km+1 rows, one per regression coefficient. Each row contains the mean value of the regression coefficient, the standard error, and the lower and upper ends of the 1-alpha confidence interval

MRegPredBoot(Rx0, Rx, Ry, iter, alpha, dprec, pred): returns an array with rm rows where Rx0 is a k × r array. Each row in Rx0 specifies X values. Each row in the output consists of the predicted value for the data in the corresponding row of Rx0 based on the regression of Ry on Rx, its standard error, and the lower and upper ends of the 1-alpha prediction/confidence interval. If pred = TRUE (default), then the prediction interval is output: otherwise, the confidence interval is output.

When creating a bootstrap sample using resampling cases, if ZTZ is not invertible for one of the samples Z, then we don’t want to include this sample. In fact, if Abs(det ZTZ) < dprec then this sample is excluded. This is generally an issue only when the original sample is small, and so selecting rows with replacement can lead to too many duplicates.

Resampling residuals

The following worksheet functions are similar to the above functions except that bootstrap based on resampling residuals is employed. Note that the dprec argument is not needed.

MRegCovBootRes(Rx, Ry, iter): returns the covariance matrix of the regression coefficients

MRegCIBootRes(Rx, Ry, iter, alpha): returns the regression coefficients, the standard errors, and the 1-alpha confidence intervals

MRegPredBootRes(Rx0, Rx, Ry, iter, alpha, pred): returns the predictions, s.e., and 1-alpha prediction/confidence intervals

Examples

For the following examples, we use the cholesterol and glucose data in Figure 1.

Cholesterol and glucose data

Figure 1 – Cholesterol/glucose level data

Coefficients

We first calculate the regression coefficients using the array formula

=MRegCoeff(B1:D21,E1:F21,TRUE)

in range H1:J5 of Figure 2.

We next use bootstrapping to estimate the standard errors and confidence intervals for each of the regression coefficients, as shown in range of J8:M15 of Figure 2. 

Confidence intervals for coefficients

Figure 2 – Coefficient confidence intervals

We can use resampling residuals to obtain these confidence intervals as shown in Figure 3.

Coefficient CIs: resampling residuals

Figure 3 – Coefficient CIs using resampling residuals

We next estimate the coefficient covariance matrix in Figure 4 using bootstrapping by resampling cases. By placing the array formula =SQRT(DIAG(O8:V15)) in range W8:W15, we obtain the standard errors, which we can compare with those in column K of Figure 2.

We can also use resampling residuals via the formula

=MRegCovBootRes(B2:D21,E2:F21)

although these results are not displayed.

Bootstrapping covariance matrix

Figure 4 – Coefficient covariance matrix

Predictions

We now explore the four scenarios described in range H29:K33 of Figure 5. Using the regression described in Figure 2, we can predict the cholesterol and glucose levels for each scenario. These are shown in range L30:M33.

Next, we use bootstrapping to estimate the prediction intervals, as shown in the rest of Figure 5.

Bootstrapping prediction intervals

Figure 5 – Bootstrapping prediction intervals

We can also use bootstrapping to estimate confidence intervals, as shown in Figure 6.

Bootstrapping confidence intervals

Figure 6 – Bootstrapping confidence intervals

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

References

Eck, D. J. (2017) Bootstrapping for multivariate linear regression models
https://arxiv.org/abs/1704.07040

Stine, R. A. (1985) Bootstrap prediction intervals for regression
https://www.jstor.org/stable/2288570?seq=1

Fox, J. (2015) Bootstrapping regression models.
Applied Regression Analysis and Generalized Linear Models, 3rd ed. Sage Publishing
https://us.sagepub.com/sites/default/files/upm-binaries/21122_Chapter_21.pdf

Stack Exchange (2021) Bootstrap prediction interval
https://stats.stackexchange.com/questions/226565/bootstrap-prediction-interval

Roustant, O. (2017) Bootstrap & confidence/prediction intervals
https://olivier-roustant.fr/wp-content/uploads/2018/09/bootstrap_conf_and_pred_intervals.pdf

Kumar, A. (2020) Bootstrapping to re-estimate parameters in small samples
https://bookdown.org/anshul302/HE902-MGHIHP-Spring2020/Boot.html

Turgeon, M. (2021) Bootstrap and linear regression
https://www.maxturgeon.ca/f21-stat3150/slides/bootstrap-linreg.pdf

Leave a Comment