Lecture 8 Econ 2001

2015 August 19

Lecture 8 Outline

1 Eigenvectors and Eigenvalues
2 Diagonalization
3 Quadratic Forms
4 Definiteness of Quadratic Forms
5 Unique Representation of Vectors

Eigenvectors and Eigenvalues

Definition
An eigenvalue of the square matrix A is a number λ such that A − λI is singular. If λ is an eigenvalue of A, then any x ≠ 0 such that (A − λI)x = 0 is called an eigenvector of A associated with the eigenvalue λ.

Therefore:
1 Eigenvalues solve the equation det(A − λI) = 0.
2 Eigenvectors are nontrivial solutions to the equation Ax = λx.

Why do we care about eigenvalues and their corresponding eigenvectors?
They enable one to relate complicated matrices to simple ones.
They play a role in the study of stability of difference and differential equations.
They make certain computations easy.
They let us define a way for matrices to be positive or negative, and that matters for calculus and optimization.
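As a concrete illustration (not from the original slides), here is a minimal numpy sketch that computes eigenvalues and eigenvectors and checks the defining property Ax = λx; the example matrix is an arbitrary choice of mine.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])          # arbitrary example matrix

    eigenvalues, eigenvectors = np.linalg.eig(A)   # columns of `eigenvectors` are eigenvectors

    for k in range(A.shape[0]):
        lam = eigenvalues[k]
        x = eigenvectors[:, k]
        assert np.allclose(A @ x, lam * x)         # (A - lam I)x = 0, i.e. Ax = lam x

    print(eigenvalues)                  # for this A: 3 and 1 (in some order)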

Characteristic Equation

Definition
If A is an n × n matrix, the characteristic equation is defined as
  f(λ) = det(A − λI) = 0
or
  f(λ) = det | a_11 − λ   a_12       ···   a_1n     |
             | a_21       a_22 − λ   ···   a_2n     |
             |   ⋮           ⋮                ⋮       |
             | a_n1       a_n2       ···   a_nn − λ | = 0

This is a polynomial equation in λ.

Example
For a 2 × 2 matrix:
  A − λI = | a_11 − λ   a_12     |
           | a_21       a_22 − λ |
Hence the characteristic equation is
  det(A − λI) = (a_11 − λ)(a_22 − λ) − a_12 a_21 = 0
  ⟹ λ² − (a_11 + a_22)λ + a_11 a_22 − a_12 a_21 = 0,
which typically has two solutions.
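A quick numerical sanity check of the 2 × 2 formula (illustrative only; the entries below are made up): the roots of λ² − (a_11 + a_22)λ + (a_11 a_22 − a_12 a_21) coincide with the eigenvalues.

    import numpy as np

    a11, a12, a21, a22 = 4.0, 2.0, 1.0, 3.0          # arbitrary example entries
    A = np.array([[a11, a12], [a21, a22]])

    # roots of lam^2 - (a11 + a22) lam + (a11 a22 - a12 a21)
    roots = np.roots([1.0, -(a11 + a22), a11 * a22 - a12 * a21])

    assert np.allclose(np.sort(roots), np.sort(np.linalg.eigvals(A)))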

Characteristic Polynomial

The characteristic equation f(λ) = det(A − λI) = 0 is a polynomial of degree n.

By the Fundamental Theorem of Algebra, it has n roots (not necessarily distinct and not necessarily real). That is,
  f(λ) = (λ − c_1)(λ − c_2) ··· (λ − c_n)
where c_1, ..., c_n ∈ C (the set of complex numbers) and the c_i's are not necessarily distinct.
Notice that f(λ) = 0 if and only if λ ∈ {c_1, ..., c_n}, so the roots are all the solutions of the equation f(λ) = 0.
If λ = c_i ∈ R, there is a corresponding eigenvector in R^n.
If λ = c_i ∉ R, the corresponding eigenvectors are in C^n \ R^n.

Another way to write the characteristic polynomial is
  P(λ) = (λ − r_1)^{m_1} ··· (λ − r_k)^{m_k},
where r_1, r_2, ..., r_k are distinct roots (r_i ≠ r_j when i ≠ j) and the m_i are positive integers summing to n. m_i is called the multiplicity of the root r_i.

Distinct Eigenvectors

FACT
The eigenvectors corresponding to distinct eigenvalues are linearly independent.

Proof.
By contradiction. Let λ_1, ..., λ_k be distinct eigenvalues and x_1, ..., x_k the associated eigenvectors. Suppose these vectors are linearly dependent (why is this a contradiction?).
WLOG, let the first k − 1 vectors be linearly independent, while x_k is a linear combination of the others.
Thus, there exist α_i, i = 1, ..., k − 1, not all zero, such that
  ∑_{i=1}^{k−1} α_i x_i = x_k
Multiply both sides by A and use the eigenvalue property (Ax = λx):
  ∑_{i=1}^{k−1} α_i λ_i x_i = λ_k x_k
Multiply the first equation by λ_k and subtract it from the second:
  ∑_{i=1}^{k−1} α_i (λ_i − λ_k) x_i = 0
Since the eigenvalues are distinct, λ_i ≠ λ_k for all i; hence we have a nontrivial linear combination of the first k − 1 eigenvectors equal to 0, contradicting their linear independence.
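A numerical illustration of this fact (the matrix below is an arbitrary choice of mine): stack the eigenvectors as columns and check that they form a full-rank, hence linearly independent, set.

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [0.0, 3.0, 1.0],
                  [0.0, 0.0, 5.0]])       # upper triangular, so the eigenvalues 2, 3, 5 are distinct

    eigenvalues, P = np.linalg.eig(A)      # columns of P are eigenvectors
    assert np.linalg.matrix_rank(P) == 3   # full rank: the eigenvectors are linearly independent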

Diagonalization

Definition
We say B is diagonalizable if we can find matrices P and D, with D diagonal, such that
  P^{−1} B P = D
If a matrix is diagonalizable,
  P P^{−1} B P P^{−1} = P D P^{−1},  or  B = P D P^{−1},
where D is a diagonal matrix.

Why do we care about this? We can use simple (i.e. diagonal) matrices to 'represent' more complicated ones. This property is handy in many applications. An example follows: linear difference equations.
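Here is a minimal numpy sketch of what diagonalizability gives us (the matrix B is my own example, not from the lecture): with D the diagonal matrix of eigenvalues and P the matrix of eigenvectors, B = P D P^{−1}.

    import numpy as np

    B = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                        # arbitrary diagonalizable example

    eigenvalues, P = np.linalg.eig(B)                 # columns of P are eigenvectors
    D = np.diag(eigenvalues)                          # diagonal matrix of eigenvalues

    assert np.allclose(np.linalg.inv(P) @ B @ P, D)   # P^{-1} B P = D
    assert np.allclose(P @ D @ np.linalg.inv(P), B)   # B = P D P^{-1}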

Difference Equations Detour

A difference equation is an equation in which "discrete time" is one of the independent variables. For example, the value of x today depends linearly on its value yesterday:
  x_t = a x_{t−1}   for all t = 1, 2, 3, ...
This is a fairly common relationship in time series data and macro.

Given some initial condition x_0, this equation is fairly easy to solve using 'recursion':
  x_1 = a x_0
  x_2 = a x_1 = a² x_0
  ⋮
  x_{t−1} = a x_{t−2} = a^{t−1} x_0
  x_t = a x_{t−1} = a^t x_0
Hence:
  x_t = a^t x_0   for all t = 0, 1, 2, 3, ...
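A few lines of Python confirm the recursion against the closed form (a, x_0 and T are arbitrary illustrative values of mine):

    a, x0 = 0.9, 2.0                    # arbitrary coefficient and initial condition
    T = 10

    x = x0
    for t in range(1, T + 1):           # recursion x_t = a * x_{t-1}
        x = a * x

    assert abs(x - a**T * x0) < 1e-12   # closed form x_T = a^T x_0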

Difference Equations Detour, Continued

Consider now a two-dimensional linear difference equation:
  ( c_{t+1} )   ( b_11  b_12 ) ( c_t )
  ( k_{t+1} ) = ( b_21  b_22 ) ( k_t )   for all t = 0, 1, 2, 3, ...
where b_ij ∈ R for each i, j, given some initial condition (c_0, k_0).

Set y_t = (c_t, k_t)^t for all t and B = the matrix with entries b_ij, and rewrite this more compactly as
  y_{t+1} = B y_t   for all t = 0, 1, 2, 3, ...

We want to find a solution y_t, t = 1, 2, 3, ..., given the initial condition y_0. Such a dynamical system can arise as a characterization of the solution to a standard optimal growth problem (you will see this in macro).

This is hard to solve since the two variables interact with each other as time goes on. Things would be much easier if there were no interactions (b_12 = 0 = b_21), because in that case the two equations would evolve independently.

Difference Equations Detour: The End

We want to solve
  y_{t+1} = B y_t   for all t = 0, 1, 2, 3, ...
If B is diagonalizable, there exist an invertible 2 × 2 matrix P and a diagonal 2 × 2 matrix D such that
  P^{−1} B P = D = ( d_1  0
                     0    d_2 )
Then
  y_{t+1} = B y_t                             for all t
  ⟺  P^{−1} y_{t+1} = P^{−1} B y_t            for all t
  ⟺  P^{−1} y_{t+1} = P^{−1} B P P^{−1} y_t   for all t
  ⟺  ŷ_{t+1} = D ŷ_t                          for all t
where we defined ŷ_t = P^{−1} y_t for all t (this is just a change of variable).

Since D is diagonal, after this change of variable to ŷ_t we now have to solve two independent linear univariate difference equations
  ŷ_{i,t} = d_i^t ŷ_{i,0}   for all t, for i = 1, 2,
which is easy because we can use recursion.
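The whole detour can be summarized in a few lines of numpy (B, y_0 and T are made-up values of mine): diagonalize B, iterate the two decoupled equations, map back, and compare with direct iteration of y_{t+1} = B y_t.

    import numpy as np

    B = np.array([[0.9, 0.1],
                  [0.2, 0.7]])          # arbitrary example transition matrix
    y0 = np.array([1.0, 2.0])           # initial condition
    T = 20

    d, P = np.linalg.eig(B)             # B = P D P^{-1}, D = diag(d)
    yhat0 = np.linalg.inv(P) @ y0       # change of variable: yhat = P^{-1} y
    yhatT = d**T * yhat0                # two independent univariate recursions
    yT_diag = P @ yhatT                 # change back: y = P yhat

    yT_direct = y0.copy()
    for _ in range(T):                  # brute-force iteration of y_{t+1} = B y_t
        yT_direct = B @ yT_direct

    assert np.allclose(yT_diag, yT_direct)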

Diagonalization Theorem

Theorem
If A is an n × n matrix that either has n distinct eigenvalues or is symmetric, then there exist an invertible n × n matrix P and a diagonal matrix D such that
  A = P D P^{−1}
Moreover, the diagonal entries of D are the eigenvalues of A, and the columns of P are the corresponding eigenvectors.

Note
Premultiplying by P^{−1} and postmultiplying by P, the theorem says:
  P^{−1} A P = D

Definition
Two square matrices A and B are similar if A = P^{−1} B P for some invertible matrix P.

The theorem says that some square matrices are similar to diagonal matrices that have their eigenvalues on the diagonal.

Diagonalization Theorem: Idea of Proof

We want to show that for a given A there exist a matrix P and a diagonal matrix D such that A = P D P^{−1}, where the diagonal entries of D are the eigenvalues of A and the columns of P are the corresponding eigenvectors.

Idea of Proof (a real proof is way too difficult for me)
Suppose λ is an eigenvalue of A and x is an eigenvector. Thus Ax = λx.
If P is a matrix whose jth column is the eigenvector associated with λ_j, it follows that AP = PD.
The result would then follow if one could guarantee that P is invertible.
The proof works by showing that when A is symmetric, A has only real eigenvalues, and one can find n linearly independent eigenvectors even if the eigenvalues are not distinct (these results use properties of complex numbers). See a book for details.

A Few Computational Facts For You To Prove

Facts
  det AB = det BA = det A det B.
  If D is a diagonal matrix, then det D is equal to the product of its diagonal elements.
  det A is equal to the product of the eigenvalues of A.

Definition
The trace of a square matrix A is given by the sum of its diagonal elements. That is, tr(A) = ∑_{i=1}^n a_ii.

Fact
  tr(A) = ∑_{i=1}^n λ_i,
where λ_i is the ith eigenvalue of A (eigenvalues counted with multiplicity).
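These facts are easy to check numerically before proving them (the matrix below is an arbitrary example of mine):

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 3.0, 4.0],
                  [5.0, 0.0, 6.0]])                          # arbitrary example

    lam = np.linalg.eigvals(A)

    assert np.isclose(np.linalg.det(A), np.prod(lam).real)   # det A = product of eigenvalues
    assert np.isclose(np.trace(A), np.sum(lam).real)         # tr A  = sum of eigenvalues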

Unitary Matrices

Remember
A^t is the transpose of A: the (i, j)th entry of A^t is the (j, i)th entry of A.

Definition
An n × n matrix A is unitary if A^t = A^{−1}.

REMARK
By definition, every unitary matrix is invertible.

Unitary Matrices

Notation
A basis V = {v_1, ..., v_n} of R^n is orthonormal if
1 each basis element has unit length (v_i · v_i = 1 for all i), and
2 distinct basis elements are orthogonal (v_i · v_j = 0 for i ≠ j).
Compactly, this can be written as
  v_i · v_j = δ_ij = { 1 if i = j
                       0 if i ≠ j

Theorem
An n × n matrix A is unitary if and only if the columns of A are orthonormal.

Proof.
Let v_j denote the jth column of A. Then
  A^t = A^{−1}  ⟺  A^t A = I  ⟺  v_i · v_j = δ_ij  ⟺  {v_1, ..., v_n} is orthonormal.

Symmetric Matrices Have Orthonormal Eigenvectors

Remember
A is symmetric if a_ij = a_ji for all i, j, where a_ij is the (i, j)th entry of A.

Theorem
If A is symmetric, then the eigenvalues of A are all real and there is an orthonormal basis V = {v_1, ..., v_n} of R^n consisting of eigenvectors of A. In this case, the P in the diagonalization theorem is unitary, and therefore:
  A = P D P^t
The proof is also beyond my ability (it uses the linear algebra of complex vector spaces).
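For symmetric matrices, numpy's eigh routine returns exactly this kind of decomposition: real eigenvalues and a matrix of orthonormal eigenvectors (the example matrix is my own choice).

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])                          # arbitrary symmetric example

    eigenvalues, P = np.linalg.eigh(A)                       # eigh is specialized to symmetric matrices

    assert np.allclose(P.T @ P, np.eye(3))                   # columns of P are orthonormal: P^t = P^{-1}
    assert np.allclose(P @ np.diag(eigenvalues) @ P.T, A)    # A = P D P^t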

Quadratic Forms

Think of a second-degree polynomial that has no constant term:
  f(x_1, ..., x_n) = ∑_{i=1}^n α_ii x_i² + ∑_{i<j} α_ij x_i x_j
Let
  a_ii = α_ii,  a_ij = α_ij / 2 if i < j,  and  a_ij = α_ji / 2 if i > j.
Then f(x_1, ..., x_n) = x^t A x, where A is the symmetric matrix with entries a_ij; a function of this form is called a quadratic form.

Definiteness of Quadratic Forms

Definition
The quadratic form Q(x) = x^t A x is
1 positive definite if Q(x) > 0 for all x ≠ 0.
2 positive semidefinite if Q(x) ≥ 0 for all x.
3 negative definite if Q(x) < 0 for all x ≠ 0.
4 negative semidefinite if Q(x) ≤ 0 for all x.
5 indefinite if there exist x and y such that Q(x) > 0 > Q(y).

In most cases, quadratic forms are indefinite. What does all this mean? Hard to tell, but we can try to look at special cases to get some intuition.

Positive and Negative Definiteness

Idea
Think of positive and negative definiteness as a way to apply to matrices the idea of "positive" and "negative". In the one-variable case, Q(x) = ax² and definiteness follows the sign of a. Obviously, there are lots of indefinite matrices when n > 1.

Diagonal matrices also help with intuition. When A is diagonal:
  Q(x) = x^t A x = ∑_{i=1}^n a_ii x_i².
Therefore the quadratic form is:
  positive definite if and only if a_ii > 0 for all i,
  positive semidefinite if and only if a_ii ≥ 0 for all i,
  negative definite if and only if a_ii < 0 for all i,
  negative semidefinite if and only if a_ii ≤ 0 for all i, and
  indefinite if A has both negative and positive diagonal entries.

Quadratic Forms and Diagonalization

For symmetric matrices, definiteness relates to the diagonalization theorem. Assume A is symmetric. By the diagonalization theorem:
  A = R^t D R,
where D is a diagonal matrix with the (real) eigenvalues on the diagonal and R is an orthogonal matrix.
For any quadratic form Q(x) = x^t A x, by definition, A is symmetric. Then we have
  Q(x) = x^t A x = x^t R^t D R x = (Rx)^t D (Rx).
The definiteness of A is thus equivalent to the definiteness of its diagonal matrix of eigenvalues, D. Think about why.

Quadratic Forms and Diagonalization: Analysis

A quadratic form is a function Q(x) = x^t A x where A is symmetric. Since A is symmetric, its eigenvalues λ_1, ..., λ_n are all real. Let V = {v_1, ..., v_n} be an orthonormal basis of eigenvectors of A with corresponding eigenvalues λ_1, ..., λ_n.
By an earlier theorem, the P in the diagonalization theorem is unitary. Then A = U^t D U, where
  D = diag(λ_1, ..., λ_n)
and U is unitary (its rows are the eigenvectors v_1, ..., v_n, so U v_i = e_i).
We know that any x ∈ R^n can be written as x = ∑_{i=1}^n γ_i v_i. Then one can rewrite a quadratic form as follows:
  Q(x) = Q(∑_i γ_i v_i) = (∑_i γ_i v_i)^t A (∑_i γ_i v_i) = (∑_i γ_i v_i)^t U^t D U (∑_i γ_i v_i)
       = (∑_i γ_i U v_i)^t D (∑_i γ_i U v_i) = (γ_1, ..., γ_n) D (γ_1, ..., γ_n)^t = ∑_i λ_i γ_i²
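A numerical check of the last identity (the symmetric matrix and the vector x are made-up examples of mine): expand x in the orthonormal eigenvector basis and compare x^t A x with ∑_i λ_i γ_i².

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])                             # arbitrary symmetric example
    x = np.array([1.0, -2.0])                              # arbitrary vector

    lam, V = np.linalg.eigh(A)                             # columns of V: orthonormal eigenvectors v_i
    gamma = V.T @ x                                        # coordinates of x in that basis: gamma_i = x . v_i

    assert np.isclose(x @ A @ x, np.sum(lam * gamma**2))   # Q(x) = sum_i lambda_i gamma_i^2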

Quadratic Forms and Diagonalization

The algebra on the previous slide yields the following result.

Theorem
The quadratic form Q(x) = x^t A x is
1 positive definite if λ_i > 0 for all i.
2 positive semidefinite if λ_i ≥ 0 for all i.
3 negative definite if λ_i < 0 for all i.
4 negative semidefinite if λ_i ≤ 0 for all i.
5 indefinite if there exist j and k such that λ_j > 0 > λ_k.

REMARK
We can check the definiteness of a quadratic form using the eigenvalues of A.
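A hypothetical helper that implements this eigenvalue test (the function name and tolerance are my own choices, not the lecture's notation):

    import numpy as np

    def definiteness(A, tol=1e-12):
        """Classify a symmetric matrix by the signs of its eigenvalues."""
        lam = np.linalg.eigvalsh(A)            # real eigenvalues of a symmetric matrix
        if np.all(lam > tol):
            return "positive definite"
        if np.all(lam < -tol):
            return "negative definite"
        if np.all(lam >= -tol):
            return "positive semidefinite"
        if np.all(lam <= tol):
            return "negative semidefinite"
        return "indefinite"

    print(definiteness(np.array([[2.0, 1.0], [1.0, 2.0]])))    # positive definite
    print(definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))   # indefinite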

Principal Minors

Definition
A principal submatrix of a square matrix A is the matrix obtained by deleting any k rows and the corresponding k columns.

Definition
The determinant of a principal submatrix is called a principal minor of A.

Definition
The leading principal submatrix of order k of an n × n matrix is obtained by deleting the last n − k rows and columns of the matrix.

Definition
The determinant of a leading principal submatrix is called a leading principal minor of A.

Principal minors can be used in definiteness tests.

Another Definiteness Test

Theorem
A matrix is
1 positive definite if and only if all its leading principal minors are positive.
2 negative definite if and only if its odd-order leading principal minors are negative and its even-order leading principal minors are positive.
3 indefinite if one of its kth-order leading principal minors is negative for an even k, or if there are two odd-order leading principal minors that have different signs.

This classifies the definiteness of quadratic forms without finding the eigenvalues of the corresponding matrices. Think about these conditions when applied to diagonal matrices and see if they make sense in that case.
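A sketch of the leading-principal-minor test for strict definiteness (the helper functions below are my own illustration, not part of the lecture; the semidefinite cases are deliberately left out since they need more than leading minors):

    import numpy as np

    def leading_principal_minors(A):
        """Determinants of the upper-left k x k submatrices, k = 1, ..., n."""
        n = A.shape[0]
        return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

    def classify_by_minors(A):
        minors = leading_principal_minors(A)
        if all(m > 0 for m in minors):
            return "positive definite"
        if all((m < 0) if k % 2 == 1 else (m > 0) for k, m in enumerate(minors, start=1)):
            return "negative definite"
        return "not classified by this test (possibly semidefinite or indefinite)"

    A = np.array([[2.0, -1.0],
                  [-1.0, 2.0]])           # arbitrary symmetric example
    print(classify_by_minors(A))          # leading minors are 2 and 3, so positive definite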

Back to Linear Algebra

Definitions
All vectors below are elements of X (a vector space) and all scalars are real numbers.

The linear combination of x_1, ..., x_n with coefficients α_1, ..., α_n:
  y = ∑_{i=1}^n α_i x_i

The set of all linear combinations of elements of V = {v_1, ..., v_k}:
  span V = { x ∈ X : x = ∑_{i=1}^k α_i v_i with v_i ∈ V }

A set V ⊆ X spans X if span V = X.

A set V ⊆ X is linearly dependent if there exist v_1, ..., v_n ∈ V and α_1, ..., α_n not all zero such that
  ∑_{i=1}^n α_i v_i = 0

A set V ⊆ X is linearly independent if it is not linearly dependent. Thus, V ⊆ X is linearly independent if and only if:
  ∑_{i=1}^n α_i v_i = 0 with each v_i ∈ V  ⟹  α_i = 0 for all i.

A basis of X is a linearly independent set of vectors in X that spans X.

Vectors and Basis

Any vector can be uniquely written as a finite linear combination of the elements of some basis of the vector space to which it belongs.

Theorem
Let V be a basis for a vector space X over R. Every vector x ∈ X has a unique representation as a linear combination of a finite number of elements of V (with all coefficients nonzero).

Haven't we proved this yet? Not at this level of generality.
The unique representation of 0 is the empty sum 0 = ∑_{i ∈ ∅} α_i v_i.
Any vector has a unique representation as a linear combination of finitely many elements of a basis.

Proof.
Since V spans X, any x ∈ X can be written as a linear combination of elements of V. We need to show this linear combination is unique.
Let
  x = ∑_{s ∈ S_1} α_s v_s   and   x = ∑_{s ∈ S_2} β_s v_s,
where S_1 is finite, α_s ∈ R, α_s ≠ 0, and v_s ∈ V for each s ∈ S_1, and
where S_2 is finite, β_s ∈ R, β_s ≠ 0, and v_s ∈ V for each s ∈ S_2.
Define S = S_1 ∪ S_2, α_s = 0 for s ∈ S_2 \ S_1, and β_s = 0 for s ∈ S_1 \ S_2.
Then
  0 = x − x = ∑_{s ∈ S_1} α_s v_s − ∑_{s ∈ S_2} β_s v_s = ∑_{s ∈ S} α_s v_s − ∑_{s ∈ S} β_s v_s = ∑_{s ∈ S} (α_s − β_s) v_s.
Since V is linearly independent, we must have α_s − β_s = 0, so α_s = β_s for all s ∈ S. Hence
  s ∈ S_1 ⟺ α_s ≠ 0 ⟺ β_s ≠ 0 ⟺ s ∈ S_2.
So S_1 = S_2 and α_s = β_s for s ∈ S_1 = S_2, and the representation is unique.

A Basis Always Exists

Theorem
Every vector space has a (Hamel) basis.

This follows from the axiom of choice (did we talk about this?). An equivalent result says that if a linearly independent set is not a basis, one can always "add" to it to get a basis.

Theorem
If X is a vector space and V ⊆ X is a linearly independent set, then V can be extended to a basis for X. That is, there exists a linearly independent set W ⊆ X such that V ⊆ W and span W = X.

There can be many bases for the same vector space, but they all have the same number of elements.

Theorem
Any two Hamel bases of a vector space X have the same cardinality (they are numerically equivalent).

Standard Basis

Definition
The standard basis for R^n consists of the n vectors e_i, i = 1, ..., n, where e_i is the vector with a 1 in the ith position and zeros in all other positions:
  e_1 = (1, 0, 0, ..., 0)^t,  e_2 = (0, 1, 0, ..., 0)^t,  ...,  e_{n−1} = (0, ..., 0, 1, 0)^t,  e_n = (0, ..., 0, 0, 1)^t.

1 The standard basis is a linearly independent set that spans R^n.
2 Elements of the standard basis are mutually orthogonal. When this happens, we say that the basis is orthogonal.
3 Each basis element has unit length. When this also happens, we say that the basis is orthonormal.
Verify all these.

Orthonormal Bases

We know an orthonormal basis exists for R^n (the standard basis).

Fact
One can always find an orthonormal basis for a vector space.

Fact
If {v_1, ..., v_k} is an orthonormal basis for V, then for all x ∈ V,
  x = ∑_{i=1}^k α_i v_i = ∑_{i=1}^k (x · v_i) v_i.
This follows from the properties on the previous slide (check it).
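A short numpy check of this expansion (the orthonormal basis here comes from a QR decomposition of a random matrix, which is my own choice for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # columns of Q: an orthonormal basis of R^3

    x = np.array([1.0, 2.0, 3.0])                      # arbitrary vector

    # x = sum_i (x . v_i) v_i, where the v_i are the columns of Q
    reconstruction = sum((x @ Q[:, i]) * Q[:, i] for i in range(3))
    assert np.allclose(reconstruction, x)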

Dimension and Basis

Definition
The dimension of a vector space X, denoted dim X, is the cardinality of any basis of X.

Notation Reminder
For V ⊆ X, |V| denotes the cardinality of the set V.

Fact
M_{m×n}, the set of all m × n real-valued matrices, is a vector space over R. A basis is given by {E_ij : 1 ≤ i ≤ m, 1 ≤ j ≤ n}, where
  (E_ij)_{kℓ} = { 1 if k = i and ℓ = j
                  0 otherwise.
The dimension of the vector space of m × n matrices is mn. Proving this is an exercise.

Dimension and Dependence

Theorem
Suppose dim X = n ∈ N. If A ⊆ X and |A| > n, then A is linearly dependent.

Proof.
If not, A is linearly independent and can be extended to a basis V of X:
  A ⊆ V ⟹ |V| ≥ |A| > n,
a contradiction.

Intuitively, if A has more elements than the dimension of X, there must be some linearly dependent elements in it.

Theorem
Suppose dim X = n ∈ N, V ⊆ X, and |V| = n.
  If V is linearly independent, then V spans X, so V is a basis.
  If V spans X, then V is linearly independent, so V is a basis.
Prove this as part of Problem Set 8.

Tomorrow

We illustrate the formal connection between linear functions and matrices. Then we move to some useful geometry.
1 Linear Functions
2 Linear Functions and Matrices
3 Analytic Geometry in R^n