Multivariate Regression Proofs

Objective

We now provide proofs of properties presented in Multivariate Regression Basic Concepts.

Proofs (part 1)

Property 1

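$$B = (X^T X)^{-1} X^T Y$$

Proof: By univariate regression properties

$$B = [B_1 \; B_2 \; \cdots \; B_m]$$

$$= [(X^T X)^{-1} X^T Y_1 \;\; (X^T X)^{-1} X^T Y_2 \;\; \cdots \;\; (X^T X)^{-1} X^T Y_m]$$

$$= (X^T X)^{-1} X^T [Y_1 \; Y_2 \; \cdots \; Y_m] = (X^T X)^{-1} X^T Y$$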
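Property 1 can be checked numerically. The following is a minimal sketch using NumPy on made-up data; the sizes `n`, `k`, `m`, the random seed and the variable names are illustrative assumptions, not part of the original text. It compares the matrix formula with m separate univariate least-squares fits.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 30, 3, 2                       # hypothetical sizes: n cases, k predictors, m responses

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = rng.normal(size=(n, m))              # arbitrary response matrix

# Matrix form: B = (X^T X)^{-1} X^T Y, computed by solving the normal equations
B = np.linalg.solve(X.T @ X, X.T @ Y)

# Column-by-column univariate least-squares fits, stacked side by side
B_cols = np.column_stack(
    [np.linalg.lstsq(X, Y[:, p], rcond=None)[0] for p in range(m)]
)

print(np.allclose(B, B_cols))
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly, which is usually the numerically safer choice.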

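Property 2: B minimizes the trace

$$\mathrm{Tr}((Y - XB)^T (Y - XB))$$

Proof: The m × m SSCP matrix S

$$S = (Y - XB)^T (Y - XB)$$

has diagonal terms which are non-negative scalars of the form

$$s_{pp} = (Y_p - XB_p)^T (Y_p - XB_p)$$

Now

$$\mathrm{Tr}(S) = \sum_{p=1}^{m} s_{pp} = \sum_{p=1}^{m} (Y_p - XB_p)^T (Y_p - XB_p)$$

Since the values $b_{jp}$ (the entries of $B_p$) minimize each term in the above sum, and the pth term depends only on $B_p$, they also minimize the sum.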
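As an illustration of Property 2, the sketch below evaluates the trace at the least-squares B and at randomly perturbed coefficient matrices. The helper name `sscp_trace`, the data and the perturbation scale are assumptions made for this example only.

```python
import numpy as np

def sscp_trace(X, Y, B):
    """Return Tr((Y - XB)^T (Y - XB)), the sum of all squared residuals."""
    R = Y - X @ B
    return np.trace(R.T @ R)

rng = np.random.default_rng(1)
n, k, m = 40, 2, 3                       # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = rng.normal(size=(n, m))

B = np.linalg.solve(X.T @ X, X.T @ Y)    # least-squares coefficients
best = sscp_trace(X, Y, B)

# Every perturbed coefficient matrix should give a trace at least as large
perturbed = [
    sscp_trace(X, Y, B + 0.1 * rng.normal(size=B.shape)) for _ in range(100)
]
print(best <= min(perturbed))
```

A random search like this cannot prove minimality; it only fails to find a counterexample, which is consistent with the proof.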

Property 3:

$$E[\varepsilon] = 0$$

Proof: This is a consequence of the fact that $E[\varepsilon_p] = 0$ for all p, where $\varepsilon_p$ is the pth column of $\varepsilon$.

Property 4: B is an unbiased estimator of β; i.e. E[B] = β

Proof: By Property 1

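$$B = (X^T X)^{-1} X^T Y$$

But

$$Y = X\beta + \varepsilon$$

Thus

$$B = (X^T X)^{-1} X^T Y = (X^T X)^{-1} X^T (X\beta + \varepsilon)$$

$$= (X^T X)^{-1} X^T X \beta + (X^T X)^{-1} X^T \varepsilon$$

$$= (X^T X)^{-1} (X^T X) \beta + (X^T X)^{-1} X^T \varepsilon$$

$$= \beta + (X^T X)^{-1} X^T \varepsilon$$

Thus

$$E[B] = E[\beta + (X^T X)^{-1} X^T \varepsilon] = E[\beta] + E[(X^T X)^{-1} X^T \varepsilon]$$

$$= \beta + (X^T X)^{-1} X^T E[\varepsilon] = \beta + 0 = \beta$$

since $E[\varepsilon] = 0$ by Property 3.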
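The two steps of this proof can be illustrated with simulated data. In the sketch below, the "true" coefficient matrix `beta`, the standard normal errors and the number of replications are all assumptions chosen for the example. It first checks the decomposition of B for one data set and then averages B over many data sets.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 25, 2, 2                       # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = rng.normal(size=(k + 1, m))       # "true" coefficients, chosen arbitrarily
A = np.linalg.solve(X.T @ X, X.T)        # A = (X^T X)^{-1} X^T, shape (k+1, n)

# One data set: check the decomposition B = beta + (X^T X)^{-1} X^T eps
eps = rng.normal(size=(n, m))
Y = X @ beta + eps
B = A @ Y
decomposition_holds = np.allclose(B, beta + A @ eps)

# Many data sets: the average of B should be close to beta
reps = 5000
eps_all = rng.normal(size=(reps, n, m))
B_all = A @ (X @ beta + eps_all)         # matmul broadcasts over the first axis
max_bias = np.abs(B_all.mean(axis=0) - beta).max()

print(decomposition_holds, max_bias)
```

The first check is an exact algebraic identity (up to rounding), while the second is a Monte Carlo approximation whose error shrinks as `reps` grows.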

Property 5:

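$$\mathrm{cov}(B_p, B_q) = \sigma_{pq} (X^T X)^{-1}$$

Proof: Using univariate regression properties

$$B_p = (X^T X)^{-1} X^T Y_p = (X^T X)^{-1} X^T (X\beta_p + \varepsilon_p)$$

$$= (X^T X)^{-1} X^T X \beta_p + (X^T X)^{-1} X^T \varepsilon_p = \beta_p + (X^T X)^{-1} X^T \varepsilon_p$$

Thus

$$B_p = \beta_p + (X^T X)^{-1} X^T \varepsilon_p$$

and so

$$B_p - E[B_p] = \beta_p + (X^T X)^{-1} X^T \varepsilon_p - \beta_p = (X^T X)^{-1} X^T \varepsilon_p$$

Similarly

$$B_q - E[B_q] = (X^T X)^{-1} X^T \varepsilon_q$$

Hence

$$\mathrm{cov}(B_p, B_q) = E[((X^T X)^{-1} X^T \varepsilon_p)((X^T X)^{-1} X^T \varepsilon_q)^T]$$

$$= E[(X^T X)^{-1} X^T \varepsilon_p \varepsilon_q^T X (X^T X)^{-1}] = (X^T X)^{-1} X^T E[\varepsilon_p \varepsilon_q^T] X (X^T X)^{-1}$$

$$= (X^T X)^{-1} X^T (\sigma_{pq} I) X (X^T X)^{-1}$$

The last equality is a result of the fact that $E[\varepsilon_p \varepsilon_q^T] = E[(\varepsilon_p - E[\varepsilon_p])(\varepsilon_q - E[\varepsilon_q])^T] = \mathrm{cov}(\varepsilon_p, \varepsilon_q) = \sigma_{pq} I$, since $E[\varepsilon_p] = E[\varepsilon_q] = 0$. Finally,

$$\mathrm{cov}(B_p, B_q) = (X^T X)^{-1} X^T (\sigma_{pq} I) X (X^T X)^{-1}$$

$$= \sigma_{pq} (X^T X)^{-1} (X^T X)(X^T X)^{-1} = \sigma_{pq} (X^T X)^{-1}$$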
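A Monte Carlo experiment gives a rough numerical picture of Property 5. The sketch below assumes two dependent variables with normally distributed errors and an arbitrarily chosen error covariance matrix `Sigma`; none of these choices come from the original text. It estimates the covariance between the two columns of B and compares it with $\sigma_{12}(X^T X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, reps = 50, 2, 20000                # hypothetical sizes and replication count
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = rng.normal(size=(k + 1, 2))
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])           # assumed error covariance; sigma_12 = 0.5

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                        # (X^T X)^{-1} X^T

# Rows of eps are independent draws from N(0, Sigma),
# so cov(eps_p, eps_q) = sigma_pq * I as used in the proof
eps = rng.multivariate_normal(np.zeros(2), Sigma, size=(reps, n))
B = A @ (X @ beta + eps)                 # shape (reps, k+1, 2)

D = B - beta                             # deviations from E[B] = beta
cov_12 = D[:, :, 0].T @ D[:, :, 1] / reps    # Monte Carlo estimate of cov(B_1, B_2)

max_err = np.abs(cov_12 - Sigma[0, 1] * XtX_inv).max()
print(max_err)
```

The printed discrepancy should be small relative to the entries of $(X^T X)^{-1}$ and should shrink further as `reps` increases.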

Property 6:

$$E[E_p] = 0$$

Proof: Here $E_p$ is the pth column of $E = [e_{ip}]$.

$$E[E_p] = E[Y_p - XB_p] = E[Y_p] - X E[B_p] = E[Y_p] - X\beta_p$$

The last equality results from the fact that $B_p$ is an unbiased estimator of $\beta_p$.

But

$$Y_p = X\beta_p + \varepsilon_p$$

and so

$$E[Y_p] = E[X\beta_p + \varepsilon_p] = E[X\beta_p] + E[\varepsilon_p] = X\beta_p + 0 = X\beta_p$$

Putting everything together, we have

$$E[E_p] = E[Y_p] - X\beta_p = X\beta_p - X\beta_p = 0$$

Hat matrix properties

We now present some properties of the hat matrix

$$H = X (X^T X)^{-1} X^T$$

Property A: H is symmetric

Proof:

$$H^T = (X (X^T X)^{-1} X^T)^T = X ((X^T X)^{-1})^T X^T = X ((X^T X)^T)^{-1} X^T = X (X^T X)^{-1} X^T = H$$

Property B: H is idempotent

Proof:

$$H^2 = (X (X^T X)^{-1} X^T)^2 = (X (X^T X)^{-1} X^T)(X (X^T X)^{-1} X^T)$$

$$= X (X^T X)^{-1} (X^T X)(X^T X)^{-1} X^T = X (X^T X)^{-1} X^T = H$$

Property C: $I - H$ is symmetric and idempotent

Proof: The result follows from Properties A and B since

$$(I - H)^T = I^T - H^T = I - H$$

$$(I - H)^2 = (I - H)(I - H) = I - 2H + H^2 = I - 2H + H = I - H$$

Property D: From Property C, it follows that

$$(I - H)^T (I - H) = I - H$$
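Properties A through D are easy to confirm numerically for any full-rank design matrix. The sketch below uses a randomly generated X with an intercept column; the sizes, seed and dictionary of checks are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 20, 3                             # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])

H = X @ np.linalg.solve(X.T @ X, X.T)    # H = X (X^T X)^{-1} X^T
M = np.eye(n) - H                        # M = I - H

checks = {
    "A: H symmetric": np.allclose(H, H.T),
    "B: H idempotent": np.allclose(H @ H, H),
    "C: I - H symmetric": np.allclose(M, M.T),
    "C: I - H idempotent": np.allclose(M @ M, M),
    "D: (I - H)^T (I - H) = I - H": np.allclose(M.T @ M, M),
}
for name, ok in checks.items():
    print(name, ok)
```

The comparisons use `np.allclose` because the identities hold only up to floating-point rounding.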

Proofs (part 2)

Property 7:

$$E[E_p^T E_q] = \sigma_{pq} \, df_{Res}$$

Proof: Here $df_{Res} = n - k - 1$. First we note that

$$E_p^T E_q = (Y_p - XB_p)^T (Y_q - XB_q) = (Y_p - HY_p)^T (Y_q - HY_q)$$

$$= ((I - H) Y_p)^T ((I - H) Y_q) = Y_p^T (I - H)^T (I - H) Y_q = Y_p^T (I - H) Y_q$$

The last equality follows from Property D. Thus

$$E_p^T E_q = Y_p^T (I - H) Y_q = Y_p^T Y_q - Y_p^T H Y_q$$

Under construction
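One standard way to finish the argument is sketched here; it is offered as a possible completion, not as the author's intended proof. It assumes that X is an n × (k+1) matrix of full column rank, so that $(I - H)X = 0$, and it reuses the fact from the proof of Property 5 that $E[\varepsilon_q \varepsilon_p^T] = \sigma_{pq} I$. The key device is that a scalar equals its own trace and that the trace is invariant under cyclic permutation.

```latex
\begin{aligned}
E_p^T E_q &= (X\beta_p + \varepsilon_p)^T (I - H)(X\beta_q + \varepsilon_q)
           = \varepsilon_p^T (I - H)\,\varepsilon_q
           && \text{since } (I - H)X = 0 \\
          &= \mathrm{Tr}\big(\varepsilon_p^T (I - H)\,\varepsilon_q\big)
           = \mathrm{Tr}\big((I - H)\,\varepsilon_q \varepsilon_p^T\big) \\
E[E_p^T E_q] &= \mathrm{Tr}\big((I - H)\,E[\varepsilon_q \varepsilon_p^T]\big)
           = \sigma_{pq}\,\mathrm{Tr}(I - H)
           = \sigma_{pq}\,\big(n - \mathrm{Tr}(H)\big) \\
\mathrm{Tr}(H) &= \mathrm{Tr}\big(X (X^T X)^{-1} X^T\big)
           = \mathrm{Tr}\big((X^T X)^{-1} X^T X\big)
           = \mathrm{Tr}(I_{k+1}) = k + 1 \\
E[E_p^T E_q] &= \sigma_{pq}\,(n - k - 1) = \sigma_{pq}\, df_{Res}
\end{aligned}
```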

Property 8: $SSE/df_{Res}$ is an unbiased estimate for $\Sigma$; i.e. $E[SSE] = E[E^T E] = df_{Res}\,\Sigma$

Proof: This is a consequence of Property 7.
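Property 8 can also be examined by simulation. The sketch below assumes normally distributed errors with an arbitrarily chosen covariance matrix `Sigma` and two dependent variables; these are illustrative assumptions only. It averages $E^T E / df_{Res}$ over many simulated data sets and compares the result with `Sigma`.

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, reps = 50, 2, 10000                # hypothetical sizes and replication count
df_res = n - k - 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = rng.normal(size=(k + 1, 2))
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])           # assumed error covariance matrix

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # I - H

eps = rng.multivariate_normal(np.zeros(2), Sigma, size=(reps, n))
Y = X @ beta + eps                       # shape (reps, n, 2)
E = M @ Y                                # residuals E = (I - H) Y for every data set
SSE = np.swapaxes(E, 1, 2) @ E           # E^T E for every data set, shape (reps, 2, 2)

Sigma_hat = SSE.mean(axis=0) / df_res    # average of SSE / df_res over all data sets
print(np.round(Sigma_hat, 2))
```

Dividing by `n` instead of `df_res` would bias the estimate downward by the factor (n − k − 1)/n, which is the reason the residual degrees of freedom appear in the property.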

Property 9:

$$\mathrm{cov}(B_p, E_q) = 0 \qquad \mathrm{cov}(B, E) = 0$$

Proof: The proof of the first assertion is similar to that for Property 7. The second assertion follows from the first.

