Symmetric Matrices

Roots of a polynomial

By the fundamental theorem of algebra, any nth degree polynomial p(x) has exactly n roots, counting multiplicity, i.e. n values for which p(x) = 0. In fact, if r1, …, rn are the n roots and a_n is the coefficient of x^n, then the polynomial can be expressed as p(x) = a_n \prod_{i=1}^n (x – r_i).

For example, the 2nd degree polynomial x^2 – 1 has two roots, namely 1 and -1, and in fact this polynomial is equivalent to (x – 1)(x + 1). The 2nd degree polynomial x^2 + 2x + 1 also has two roots, namely -1 and -1, and so can be expressed as (x + 1)(x + 1) or (x + 1)^2.
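As a quick numerical check (a minimal sketch in Python with NumPy, which is our choice here and not part of the original discussion), numpy.roots recovers these roots from the polynomial's coefficients:

```python
import numpy as np

# numpy.roots takes the coefficients in descending order of degree
print(np.roots([1, 0, -1]))   # roots of x^2 - 1: approximately [ 1. -1.]
print(np.roots([1, 2, 1]))    # roots of x^2 + 2x + 1: approximately [-1. -1.]

# Spot-check the factored form (x - 1)(x + 1) against x^2 - 1
x = 3.0
print(np.isclose((x - 1) * (x + 1), np.polyval([1, 0, -1], x)))  # True
```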

But what about the polynomial x^2 + 1? It too has two roots, namely \sqrt{-1} and –\sqrt{-1}. Unfortunately, as we know, the square root of a negative number is not one of the numbers we ordinarily deal with, namely the real numbers.

In order to capture the roots of all polynomials, even those with only real coefficients, such as x^2 + 1, we have to deal with what are called complex numbers. These are numbers of the form a + bi where a and b are real numbers and i is an abbreviation for \sqrt{-1}. These include the real numbers (i.e. those where b = 0), but also include non-real numbers, called imaginary numbers (namely those where b ≠ 0).
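In Python, i is written 1j, and NumPy's root finder returns complex roots when no real ones exist (again a sketch, assuming NumPy):

```python
import numpy as np

# i is written 1j in Python; note that i * i = -1
print(1j * 1j)              # (-1+0j)

# x^2 + 1 has no real roots; numpy returns the complex pair +i and -i
print(np.roots([1, 0, 1]))  # approximately [0.+1.j 0.-1.j]
```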

Eigenvalues

This discussion is relevant to eigenvalues. By definition, λ is an eigenvalue of the n × n matrix A if det(A – λI) = 0. Since det(A – λI) can be expressed as an nth degree polynomial in λ, as described in Eigenvalues and Eigenvectors, it has n roots (counting multiplicity), i.e. n values of λ for which det(A – λI) = 0. This means that every n × n matrix has n eigenvalues, but it is not necessarily the case that all of these eigenvalues are real numbers.
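For example (an illustrative sketch, assuming NumPy), the 90-degree rotation matrix is real, yet its characteristic polynomial is λ^2 + 1, and so its eigenvalues are ±i:

```python
import numpy as np

# A real matrix with no real eigenvalues: a 90-degree rotation.
# det(A - lambda*I) = lambda^2 + 1, whose roots are +i and -i.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.linalg.eigvals(A))   # [0.+1.j 0.-1.j]: the eigenvalues are not real
```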

Fortunately, in the cases we are most interested in, namely symmetric matrices (i.e. those where A = A^T), it turns out that all the eigenvalues are real. We look at this case next.

Property 1: All the eigenvalues of a symmetric (real) matrix are real.

Note that on this webpage, we only consider matrices all of whose elements are real numbers.
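Property 1 is easy to illustrate numerically (a sketch assuming NumPy; symmetrizing a random matrix is just a convenient way to obtain a symmetric example):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                # symmetrize an arbitrary matrix: A = A^T

print(np.linalg.eigvals(A))      # all real, as Property 1 asserts
print(np.linalg.eigvalsh(A))     # NumPy's routine specialized to symmetric matrices
```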

Properties of Complex Numbers

Before we proceed with the proof of Property 1, we briefly state a few properties of complex numbers.

If z = a + bi is a complex number, then z′ = a – bi is called the conjugate of z and |z| = \sqrt{a^2 + b^2} is called the absolute value of z. Note that zz′ = a^2 + b^2 = |z|^2. Clearly, when z is a real number (i.e. b = 0) its conjugate is simply z, but when z is an imaginary number, then its conjugate is distinct from z. Addition, subtraction and multiplication are performed as for real numbers (noting that i times i = -1). If we consider vectors that contain complex numbers as elements, then the conjugate X′ of vector X is a vector with the same shape as X consisting of the conjugates of the elements in X.

First note that if X = [x_i], then X′ ∙ X = \sum_{i=1}^n x_i′ x_i = \sum_{i=1}^n |x_i|^2 ≥ 0, i.e. the dot product of a vector's conjugate with the vector itself is a non-negative real number. If X ≠ 0, then clearly the dot product is a positive real number.
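These facts can be checked in Python, where the conjugate of z is z.conjugate() and |z| is abs(z) (a sketch; np.vdot conjugates its first argument, matching the X′ ∙ X above):

```python
import numpy as np

z = 3 + 4j
print(z.conjugate())      # (3-4j), the conjugate z'
print(abs(z))             # 5.0 = sqrt(3^2 + 4^2), the absolute value |z|
print(z * z.conjugate())  # (25+0j) = |z|^2

# For a complex vector X, X' . X is a non-negative real number
X = np.array([1 + 2j, 3 - 1j])
print(np.vdot(X, X))      # (15+0j) = |1+2i|^2 + |3-i|^2 = 5 + 10
```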

In general, if z is a root of a polynomial p(x) with real coefficients, then the conjugate z′ is also a root of p(x). Thus if λ is an eigenvalue of an n × n (real) matrix A, then the conjugate λ′ is also an eigenvalue. If X is an eigenvector that corresponds to λ (i.e. AX = λX), then it is easy to show that X′ is an eigenvector which corresponds to λ′ (i.e. AX′ = λ′X′): simply take the conjugate of both sides of AX = λX, noting that the conjugate of A is A itself since A is real.
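A quick numerical illustration (a sketch; the matrix below is an arbitrary real, non-symmetric matrix chosen to have complex eigenvalues):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  1.0]])       # real but not symmetric
evals, evecs = np.linalg.eig(A)
lam, X = evals[0], evecs[:, 0]    # one (complex) eigenpair

# The conjugates form another eigenpair: A X' = lam' X'
print(np.allclose(A @ np.conj(X), np.conj(lam) * np.conj(X)))  # True
```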

Proof of Property 1

Proof (Property 1): Let λ be an eigenvalue of the n × n matrix A. By definition, this means that det(A – λI) = 0. Thus there are infinitely many non-trivial solutions to the equation (A – λI)X = 0, which can be found using Gaussian elimination (extended to complex numbers if necessary). Now let X be one such solution, i.e. X is an eigenvector corresponding to λ. Using the observations made above and the fact that A^T = A since A is symmetric, we have

λ(X′ ∙ X) = X′ ∙ (λX) = X′ ∙ (AX) = (A^T X′) ∙ X = (AX′) ∙ X

(AX′) ∙ X = (λ′X′) ∙ X = λ′(X′ ∙ X)

Thus, λ(X′ ∙ X) = λ′(X′ ∙ X), but since X′ ∙ X is a positive real number we conclude that λ = λ′, which can only happen if λ is a real number.

Property 2: If A is a symmetric matrix and X and Y are eigenvectors associated with distinct eigenvalues of A, then X and Y are orthogonal.

Proof: Let c be the eigenvalue associated with X and d be the eigenvalue associated with Y, with c ≠ d.

Using the above observations and the fact that A^T = A, we have

c(X ∙ Y) = (cX) ∙ Y = (AX) ∙ Y = X ∙ (A^T Y) = X ∙ (AY) = X ∙ (dY) = d(X ∙ Y)

Thus (c – d)(X ∙ Y) = 0, and since c ≠ d, it follows that X ∙ Y = 0.
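Property 2 can be verified numerically (a sketch using np.linalg.eigh, NumPy's eigenvalue routine for symmetric matrices, whose second return value holds unit eigenvectors in its columns):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric with distinct eigenvalues
evals, V = np.linalg.eigh(A)             # columns of V are unit eigenvectors

print(evals)                             # approximately [1. 2. 4.]: distinct
print(np.allclose(V.T @ V, np.eye(3)))   # True: the eigenvectors are orthonormal
```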

Orthogonal Diagonalizability

Definition 1: A square matrix A is orthogonally diagonalizable if there exist an orthogonal matrix P and a diagonal matrix D such that A = PDP^{-1}. Note that since P is an orthogonal matrix, by Property 7 of Orthogonal Vectors and Matrices, P^{-1} = P^T, and so equivalently A = PDP^T.

If A = PDP^T is an n × n matrix where D is the diagonal matrix whose main diagonal consists of the n eigenvalues of A and P is the n × n matrix whose columns are the n unit eigenvectors corresponding to these eigenvalues, then we call PDP^T a spectral decomposition of A.
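In NumPy, np.linalg.eigh produces exactly these ingredients for a symmetric matrix (a sketch; D is assembled from the returned eigenvalues and P from the returned orthonormal eigenvectors):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # symmetric
evals, P = np.linalg.eigh(A)             # eigenvalues and unit eigenvectors
D = np.diag(evals)

print(np.allclose(P.T @ P, np.eye(2)))   # True: P is orthogonal, so P^-1 = P^T
print(np.allclose(A, P @ D @ P.T))       # True: A = P D P^T
```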

Property 3: If A is orthogonally diagonalizable, then A is symmetric.

Proof: Suppose that A = PDP^T. It follows that

A^T = (PDP^T)^T = (P^T)^T D^T P^T = PD^T P^T = PDP^T = A

since diagonal matrices are symmetric and so D^T = D. This proves that A^T = A, and so A is symmetric.
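A numerical illustration of Property 3 (a sketch; generating the orthogonal matrix via a QR factorization of a random matrix is an assumption of this example, not something from the text above):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # a random orthogonal matrix
D = np.diag([2.0, -1.0, 0.5])                     # any diagonal matrix

A = Q @ D @ Q.T                 # orthogonally diagonalizable by construction
print(np.allclose(A, A.T))      # True: A is symmetric, as Property 3 asserts
```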

Converse of Property 3

We next show the converse of Property 3. In fact we show that any symmetric matrix has a spectral decomposition. First we show that this is true for a special case.

Property 4: Let A be an n × n symmetric matrix for which all the eigenvalues are distinct, let C be the n × n matrix whose columns are the n unit eigenvectors C1, …, Cn corresponding to the n eigenvalues λ1, …, λn of A, and let D be the n × n diagonal matrix whose main diagonal consists of λ1, …, λn. Then A has a spectral decomposition A = CDC^T.

Proof: By Property 2, C1, …, Cn are mutually orthogonal. By Property 3 of Orthogonal Vectors and Matrices, C is an orthogonal matrix, and so CC^T = I. Now the jth column of AC is ACj = λjCj, which is also the jth column of CD. Thus AC = CD, and so A = AI = A(CC^T) = (AC)C^T = CDC^T.
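The steps of this proof can be traced numerically (a sketch reusing np.linalg.eigh, whose eigenvectors are already unit vectors):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric with distinct eigenvalues
evals, C = np.linalg.eigh(A)             # C's columns are unit eigenvectors
D = np.diag(evals)

print(np.allclose(C @ C.T, np.eye(3)))   # C C^T = I (C is orthogonal)
print(np.allclose(A @ C, C @ D))         # AC = CD, column by column
print(np.allclose(A, C @ D @ C.T))       # A = C D C^T
```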

Observation: Note that the proof of Property 4 relies on Property 2, which is why A is required to be symmetric. Combining Properties 3 and 4, we see that a matrix with distinct eigenvalues is orthogonally diagonalizable if and only if it is symmetric.

Unfortunately, not all symmetric matrices have distinct eigenvalues, as can be seen from the diagonal matrix with 1, 1, 2 on the main diagonal. As we know from Property 1 of Determinants and Linear Equations, the eigenvalues of this matrix are the values on the main diagonal, namely 1, 1 and 2, which are clearly not distinct. Since all diagonal matrices are symmetric, this provides a simple counterexample.

It still turns out that any symmetric matrix has a spectral decomposition (with orthonormal eigenvectors, even when the eigenvalues are not distinct), as can be seen from the Spectral Decomposition Theorem (see Spectral Decomposition).
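Even with the repeated eigenvalue in the counterexample above, NumPy still returns an orthonormal set of eigenvectors (a sketch illustrating, not proving, the theorem):

```python
import numpy as np

A = np.diag([1.0, 1.0, 2.0])             # symmetric, with repeated eigenvalue 1
evals, P = np.linalg.eigh(A)

print(evals)                                      # [1. 1. 2.]: not distinct
print(np.allclose(P.T @ P, np.eye(3)))            # True: still orthonormal
print(np.allclose(A, P @ np.diag(evals) @ P.T))   # spectral decomposition holds
```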

