Objective
Partial Least Squares (PLS) Regression is a form of regression that is especially useful when there are a large number of explanatory (i.e. independent) variables (especially when there are more such variables than observations) or when there is multicollinearity (i.e. correlation) among the independent variables.
Essentially, PLS Regression maps the independent variables into a smaller number of latent variables and then uses ordinary multiple regression or multivariate regression to create a regression model.
Introduction
We start with an n × k matrix of X data and an n × m matrix of Y data, and our goal is to construct a k × m regression coefficient matrix based on h latent variables, where h ≤ k (preferably with h << k).
We use the NIPALS (Nonlinear Iterative Partial Least Squares) algorithm to construct the model.
Our goal is to find weight vectors w and c and factor score vectors t and u with t = Xw, u = Yc, w′w = 1, and t′t = 1, such that t′u is maximized.
NIPALS Algorithm
We will use lower case Latin letters to represent column vectors, and for any such vector v, N(v) = v/||v|| where ||v|| is the length of v, which can be calculated in Excel by =SQRT(SUMSQ(v)). N(v) can also be calculated by the Real Statistics formula =NORM(v).
To standardize a vector v, we subtract the mean of the elements in the vector from each element and then divide by the standard deviation of the elements in the vector. We can accomplish this in Excel via the array formula
=STANDARDIZE(v,AVERAGE(v),STDEV.S(v))
To standardize an array A, we simply standardize each column of the array. This can be done using the above formulas or via the Real Statistics formula =XSTANDARDIZE(A).
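To make these operations concrete, here is a minimal sketch of the two helpers in Python with NumPy (the names norm_vec and standardize are ours, chosen for illustration; they mirror =NORM and =XSTANDARDIZE):

import numpy as np

def norm_vec(v):
    # N(v): rescale the column vector v to unit length
    return v / np.linalg.norm(v)

def standardize(A):
    # XSTANDARDIZE(A): standardize each column of A (sample standard deviation)
    return (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)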
In the following we will use A′ to represent the transpose of A.
Initialization
Set E = the standardized version of X and F = the standardized version of Y.
Initially set u = the first column of F.
Set h = 1, where h is the number of latent vectors used in the model.
Inner Iteration
Next, repeatedly perform the following four steps until the vector t converges (i.e. its value doesn’t change by more than a preset amount) or until a maximum number of iterations is reached:
w = N(E′u)
t = N(Ew)
c = N(F′t)
u = Fc
Here, w is k × 1, t is n × 1, c is m × 1, and u is n × 1. t contains X factor scores and u contains Y factor scores. w contains X weights and c contains Y weights.
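A minimal NumPy sketch of this inner loop, reusing the norm_vec helper from above (the function name, tolerance, and iteration cap are illustrative choices, not prescribed values):

def inner_iteration(E, F, tol=1e-10, max_iter=500):
    # Iterate the four steps until t converges
    u = F[:, [0]]                       # initialize u to the first column of F
    t_old = np.zeros((E.shape[0], 1))
    for _ in range(max_iter):
        w = norm_vec(E.T @ u)           # w = N(E'u), k x 1
        t = norm_vec(E @ w)             # t = N(Ew), n x 1
        c = norm_vec(F.T @ t)           # c = N(F't), m x 1
        u = F @ c                       # u = Fc, n x 1
        if np.linalg.norm(t - t_old) < tol:
            break
        t_old = t
    return w, t, c, u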
Outer Iteration
Once the iteration converges, perform the following steps, where d is a scalar (used to predict Y from t) and p is a k × 1 vector (the factor loadings for X):
d = t′u
p = E′t
E = E – tp′
F = F – dtc′
We thus obtain reduced (deflated) versions of the matrices E and F.
Provided E is not the null matrix and h < k, we can increment h (h = h + 1) and return to the Inner Iteration step of the algorithm. We can also terminate the algorithm early if we want to limit the number of latent vectors to some value less than k.
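Continuing the sketch, the deflation steps might look like this (outer_step is our own name):

def outer_step(E, F, t, c, u):
    # Deflate E and F once the inner loop has converged
    d = float(t.T @ u)      # d = t'u, a scalar
    p = E.T @ t             # p = E't, the k x 1 factor loadings for X
    E = E - t @ p.T         # E = E - tp'
    F = F - d * (t @ c.T)   # F = F - dtc'
    return d, p, E, F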
Termination
Upon termination, we set the n × h T matrix to be the concatenation of the t vectors (in order).
Similarly, we define the k × h W, m × h C, n × h U, and k × h P matrices from the w, c, u and p vectors.
Finally, we set D = the diagonal matrix whose main diagonal consists of the d values.
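Putting the pieces together, here is a sketch of a driver that assembles these matrices (nipals_pls is our own name; inner_iteration and outer_step are from the sketches above):

def nipals_pls(X, Y, h_max):
    # Run NIPALS for h_max latent vectors and collect the results
    E, F = standardize(X), standardize(Y)
    ws, ts, cs, us, ps, ds = [], [], [], [], [], []
    for _ in range(h_max):
        w, t, c, u = inner_iteration(E, F)
        d, p, E, F = outer_step(E, F, t, c, u)
        ws.append(w); ts.append(t); cs.append(c); us.append(u); ps.append(p)
        ds.append(d)
    W = np.hstack(ws); T = np.hstack(ts); C = np.hstack(cs)
    U = np.hstack(us); P = np.hstack(ps)
    D = np.diag(ds)         # diagonal matrix of the d values
    return W, T, C, U, P, D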
Standardized Regression Coefficients
We observe that the standardized version of X can be expressed (exactly when E has been reduced to the null matrix, and approximately otherwise) by
X = TP′
and so
T = X(P′)+
where (P′)+ is the pseudo-inverse of P′ as described in Pseudo-Inverse. It now follows that the predicted value of (the standardized) Y, Ŷ, can be expressed as
Ŷ = TDC′ = X(P′)+DC′
We now define the k × m PLS regression coefficient matrix as follows:
B = (P′)+DC′
Thus
Ŷ = XB
We can use this relationship for predictions even when X is not part of the original data.
Note too that we don’t actually need the T, U or W matrices to calculate B.
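As a sketch of this computation in NumPy, with np.linalg.pinv supplying the pseudo-inverse (pls_coefficients is our own name, continuing the earlier sketches):

def pls_coefficients(P, D, C):
    # B = (P')+ D C', the k x m standardized PLS regression coefficients
    return np.linalg.pinv(P.T) @ D @ C.T

# usage: B = pls_coefficients(P, D, C); Y_hat = standardize(X) @ B gives Ŷ = XB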
Unstandardized Regression Coefficients
Finally, we can obtain the (unstandardized) regression coefficients B* as follows:
If B = [bij], sxi = the standard deviation of the ith column of X, and syj = the standard deviation of the jth column of Y, then B* = [b*ij] where
b*ij = bij · syj/sxi
Also, B* contains m intercept coefficients, defined by
b*0j = ȳj − Σi b*ij x̄i
where x̄i is the mean of the ith column of X and ȳj is the mean of the jth column of Y.
This is the same transformation described in Standardized Regression Coefficients.
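A sketch of this back-transformation, assuming the same NumPy setup as above (unstandardize_coefficients is our own illustrative name):

def unstandardize_coefficients(B, X, Y):
    # Convert standardized coefficients B into B* plus the m intercepts
    sx = X.std(axis=0, ddof=1)                 # s_xi for each column of X
    sy = Y.std(axis=0, ddof=1)                 # s_yj for each column of Y
    B_star = B * (sy[None, :] / sx[:, None])   # b*_ij = b_ij * s_yj / s_xi
    intercepts = Y.mean(axis=0) - X.mean(axis=0) @ B_star  # b*_0j
    return B_star, intercepts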
Example
Click here for an example of how these steps are implemented in Excel.
When Y is a column vector
Note that if m = 1, then u is initialized to F. Also, c is 1 × 1, i.e. a scalar. But because c is normalized, c = 1, and so u = Fc = F. Thus, the value of u doesn’t change. This means that the inner iteration terminates after one iteration, and the last two of the four steps can be dropped.
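For this m = 1 case, the inner step therefore collapses to a single pass, e.g. (pls1_inner is our own illustrative name, reusing the helpers above):

def pls1_inner(E, f):
    # When Y has a single column, u = f is fixed and c = 1
    w = norm_vec(E.T @ f)   # w = N(E'f)
    t = norm_vec(E @ w)     # t = N(Ew)
    return w, t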