Objective
The Real Statistics Resource Pack provides a number of worksheet functions to support bootstrapping for multivariate regression in Excel. We describe these functions and give some examples of their use.
Worksheet Functions
The following functions support bootstrapping for a regression model with intercept using Approach 2 based on Rx and Ry where Rx is an n × k array of X data, Ry is an n × m array of Y data. iter (default 2000) bootstrap samples are generated. alpha is the significance level (default .05).
Resampling cases
MRegCovBoot(Rx, Ry, iter, dprec): returns the covariance matrix of the regression coefficients
MRegCIBoot(Rx, Ry, iter, alpha, dprec): returns an array with km+1 rows, one per regression coefficient. Each row contains the mean value of the regression coefficient, the standard error, and the lower and upper ends of the 1-alpha confidence interval
MRegPredBoot(Rx0, Rx, Ry, iter, alpha, dprec, pred): returns an array with rm rows where Rx0 is a k × r array. Each row in Rx0 specifies X values. Each row in the output consists of the predicted value for the data in the corresponding row of Rx0 based on the regression of Ry on Rx, its standard error, and the lower and upper ends of the 1-alpha prediction/confidence interval. If pred = TRUE (default), then the prediction interval is output: otherwise, the confidence interval is output.
When creating a bootstrap sample using resampling cases, if ZTZ is not invertible for one of the samples Z, then we don’t want to include this sample. In fact, if Abs(det ZTZ) < dprec then this sample is excluded. This is generally an issue only when the original sample is small, and so selecting rows with replacement can lead to too many duplicates.
Resampling residuals
The following worksheet functions are similar to the above functions except that bootstrap based on resampling residuals is employed. Note that the dprec argument is not needed.
MRegCovBootRes(Rx, Ry, iter): returns the covariance matrix of the regression coefficients
MRegCIBootRes(Rx, Ry, iter, alpha): returns the regression coefficients, the standard errors, and the 1-alpha confidence intervals
MRegPredBootRes(Rx0, Rx, Ry, iter, alpha, pred): returns the predictions, s.e., and 1-alpha prediction/confidence intervals
Examples
For the following examples, we use the cholesterol and glucose data in Figure 1.
Figure 1 – Cholesterol/glucose level data
Coefficients
We first calculate the regression coefficients using the array formula
=MRegCoeff(B1:D21,E1:F21,TRUE)
in range H1:J5 of Figure 2.
We next use bootstrapping to estimate the standard errors and confidence intervals for each of the regression coefficients, as shown in range of J8:M15 of Figure 2.
Figure 2 – Coefficient confidence intervals
We can use resampling residuals to obtain these confidence intervals as shown in Figure 3.
Figure 3 – Coefficient CIs using resampling residuals
We next estimate the coefficient covariance matrix in Figure 4 using bootstrapping by resampling cases. By placing the array formula =SQRT(DIAG(O8:V15)) in range W8:W15, we obtain the standard errors, which we can compare with those in column K of Figure 2.
We can also use resampling residuals via the formula
=MRegCovBootRes(B2:D21,E2:F21)
although these results are not displayed.
Figure 4 – Coefficient covariance matrix
Predictions
We now explore the four scenarios described in range H29:K33 of Figure 5. Using the regression described in Figure 2, we can predict the cholesterol and glucose levels for each scenario. These are shown in range L30:M33.
Next, we use bootstrapping to estimate the prediction intervals, as shown in the rest of Figure 5.
Figure 5 – Bootstrapping prediction intervals
We can also use bootstrapping to estimate confidence intervals, as shown in Figure 6.
Figure 6 – Bootstrapping confidence intervals
Examples Workbook
Click here to download the Excel workbook with the examples described on this webpage.
References
Eck, D. J. (2017) Bootstrapping for multivariate linear regression models
https://arxiv.org/abs/1704.07040
Stine, R. A. (1985) Bootstrap prediction intervals for regression
https://www.jstor.org/stable/2288570?seq=1
Fox, J. (2015) Bootstrapping regression models.
Applied Regression Analysis and Generalized Linear Models, 3rd ed. Sage Publishing
https://us.sagepub.com/sites/default/files/upm-binaries/21122_Chapter_21.pdf
Stack Exchange (2021) Bootstrap prediction interval
https://stats.stackexchange.com/questions/226565/bootstrap-prediction-interval
Roustant, O. (2017) Bootstrap & confidence/prediction intervals
https://olivier-roustant.fr/wp-content/uploads/2018/09/bootstrap_conf_and_pred_intervals.pdf
Kumar, A. (2020) Bootstrapping to re-estimate parameters in small samples
https://bookdown.org/anshul302/HE902-MGHIHP-Spring2020/Boot.html
Turgeon, M. (2021) Bootstrap and linear regression
https://www.maxturgeon.ca/f21-stat3150/slides/bootstrap-linreg.pdf





