Consumption and the Permanent Income Hypothesis
Craig Burnside
Duke University
Fall 2009

1 A Simple Model of Household Consumption

Imagine a household with the instantaneous utility function u(C_t) that wishes to maximize its lifetime utility

    U = Σ_{t=0}^{T−1} β^t u(C_t),    (1)

where 0 < β < 1 is the household's discount factor, which it is convenient to write as β = 1/(1+ρ), with ρ the household's subjective discount rate. Under certainty the household's Euler equation is u'(C_t) = β(1+r)u'(C_{t+1}).

Now suppose that ρ > r, so that the household tends to discount future consumption more than a household with ρ = r does. Notice that this means

    u'(C_{t+1})/u'(C_t) = (1+ρ)/(1+r) > 1   ⟹   C_{t+1} < C_t.
The household will have a declining consumption profile. But notice that the path of consumption will still be smooth, since the MRS is constant for all t. In other words, consumption will be declining, but along a very smooth path.

In the opposite case, where ρ < r, the household, not surprisingly, has an increasing consumption profile, since ρ < r implies u'(C_{t+1}) < u'(C_t).
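For concreteness, here is a worked special case (my own illustration, not part of the original notes): assuming CRRA utility, the constant-MRS condition pins down a constant consumption growth rate.

```latex
% Assumption: u(C) = C^{1-\sigma}/(1-\sigma) with \sigma > 0 (CRRA), so u'(C) = C^{-\sigma}.
% The certainty Euler equation u'(C_t) = \beta(1+r)u'(C_{t+1}), with \beta = 1/(1+\rho), gives
%   (C_{t+1}/C_t)^{\sigma} = \beta(1+r) = (1+r)/(1+\rho),
% so consumption growth is constant:
\[
  \frac{C_{t+1}}{C_t} = \left(\frac{1+r}{1+\rho}\right)^{1/\sigma},
\]
% which is below 1 when \rho > r (a smoothly declining path) and above 1 when \rho < r.
```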

1.3 Empirical Implications and the Keynesian Model

In traditional Keynesian models of aggregate fluctuations, aggregate consumption is a simple linear function of aggregate income: C = a + bY, with 0 < b < 1 being the marginal propensity to consume out of income. This relationship followed from Keynes' belief that, at the individual level, the higher one's income, the higher one's saving rate would be. When we consider data, one thing we can do is take a cross-sectional sample of households and plot household consumption against household income. When this is done with US data the diagram looks something like Figure 7.1a in Romer. This, indeed, looks consistent with the Keynesian notion that the saving rate would increase with income, since when C = a + bY, S/Y = 1 − b − a/Y.

If, instead of plotting household consumption against household income at one point in time, we plotted aggregate consumption against aggregate income over time, the picture would look something like Figure 7.1b in Romer, for US data. This suggests, in contrast to the simple versions of the Keynesian theory in our textbooks, that there is a stable ratio of C/Y over time. As you will see in the homework, explaining this requires thinking about how growth over time affects the analysis.

Finally, Romer's Figure 7.1c shows a graph, like Figure 7.1a, where the data are divided into two different socio-economic groups, say whites and blacks. Again, consumption appears to obey the relationship C = a + bY, but a is smaller for the group I have labelled "black" while b seems to be the same across groups.

We will now discuss whether these observations are consistent with our theory of consumption being determined by permanent income. Suppose you have some cross-sectional data on household consumption, C_i, and household income, Y_i, for N households, so that i = 1, 2, ..., N. We will define transitory income, Y_i^T, implicitly using the following identity:

    Y_i = Y_i^P + Y_i^T,    (17)

where Y_i^P, permanent income, is defined by the right-hand side of (16). Our theory implies that

    C_i = Y_i^P.    (18)

We want to see if our theory is consistent with, say, Figure 7.1a, where household consumption is plotted against household income, and a regression is used to fit a line through the scatter plot of C_i on Y_i.

If we were to run a regression of C_i on Y_i, to obtain C_i = â + b̂Y_i, we know, from simple econometric theory, that

    b̂ = Cov(C_i, Y_i)/Var(Y_i)
    â = C̄ − b̂Ȳ.

Notice that we can use (17) and (18) to rewrite b̂ as

    b̂ = Cov(C_i, Y_i)/Var(Y_i) = Cov(Y_i^P, Y_i^P + Y_i^T)/Var(Y_i^P + Y_i^T)
       = [Var(Y_i^P) + Cov(Y_i^P, Y_i^T)] / [Var(Y_i^P) + Var(Y_i^T) + 2Cov(Y_i^P, Y_i^T)].

Romer argues that we should expect Cov(Y_i^P, Y_i^T) to be small. This makes a certain amount of sense. Within our sample of household data we might expect some individuals to be having a relatively low income year, so that their transitory income would be negative, and we might expect some to be having a relatively high income year, so that their transitory income would be positive. However, there would be no particular reason to believe that high income individuals would mostly be having a good year, while low income individuals were mostly having a bad year, or vice versa.[1] In this case we'd expect Cov(Y_i^P, Y_i^T) ≈ 0, so that

    b̂ ≈ Var(Y_i^P) / [Var(Y_i^P) + Var(Y_i^T)],

which implies that 0 < b̂ < 1. Also, again using (17) and (18), we have

    â = C̄ − b̂Ȳ = Ȳ^P − b̂(Ȳ^P + Ȳ^T) = (1 − b̂)Ȳ^P − b̂Ȳ^T.

In a typical year, we would expect Ȳ^T ≈ 0, and even in a very good year, we would expect the scale of Ȳ^T to be small relative to Ȳ^P, hence it is reasonable to believe that â > 0 in most samples. So our theory is consistent with Figure 7.1a.

When we consider Figure 7.1c, the explanation for the slopes being similar for blacks and whites is that the relative variance, Var(Y_i^P)/Var(Y_i^T), is similar within the two groups. On the other hand, consider a year in which Ȳ^T is approximately zero for both groups. Notice that in this case, â ≈ (1 − b̂)Ȳ^P. Since average income levels are higher for whites than for blacks, this is also true of permanent income levels, and this explains the higher intercept for the white group.

To show that our theory is consistent with Figure 7.1b requires that we think about the impact of income growth on the time series relationship between consumption and income. We will examine this in the homework.

[1] Most people might be having one or the other, say because we were looking at a boom or recession year.
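The argument above is easy to check numerically. Below is a minimal simulation sketch (my own illustration; the sample size, distributions, and variance numbers are arbitrary assumptions, not taken from Romer or these notes): it draws permanent and transitory income independently, sets C_i = Y_i^P as in (18), and runs the regression of C_i on Y_i.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5_000                              # number of households (assumed)
Yp = rng.normal(50_000, 10_000, N)     # permanent income
Yt = rng.normal(0, 5_000, N)           # transitory income
Y = Yp + Yt                            # measured income, eq. (17)
C = Yp                                 # the PIH prediction, eq. (18)

cov = np.cov(C, Y)                     # 2x2 sample covariance matrix
b_hat = cov[0, 1] / cov[1, 1]          # OLS slope = Cov(C,Y)/Var(Y)
a_hat = C.mean() - b_hat * Y.mean()    # OLS intercept

print(b_hat)   # near Var(Y^P)/(Var(Y^P)+Var(Y^T)) = 0.8
print(a_hat)   # positive, roughly (1 - b_hat) times mean permanent income
```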

2 A Model with Uncertainty

Now imagine that our household is uncertain about its future endowment income. In all other respects we leave the model unchanged. At time 0, the household maximizes

    E_0 Σ_{t=0}^{T−1} β^t u(C_t),

subject to A_0 given, A_T ≥ 0, and the sequence of budget constraints

    A_{t+1} = A_t(1+r) + Y_t − C_t,   0 ≤ t ≤ T−1.

As when we optimized under certainty, we can formulate the problem as a Lagrangean, or use dynamic programming. The important thing to keep in mind is that the household cannot choose the entire path of consumption at time 0 because of uncertainty. It can only choose C_0 and contingency plans for C_t, t ≥ 1, where the plans are contingent on the realizations of Y_t.

Since we haven't changed anything in the model, other than to make the income stream uncertain, we end up with the same first-order conditions, but any intertemporal conditions hold in expectation as opposed to holding with certainty. So we end up with the following conditions:

    β^t u'(C_t) = λ_t    (19)
    λ_t = (1+r) E_t λ_{t+1}.    (20)

These equations imply that

    u'(C_t) = β(1+r) E_t u'(C_{t+1}).

Notice that this means the MRS is not constant; however, the one-step-ahead forecast of the MRS is constant:

    E_t m_{t,t+1} ≡ β E_t u'(C_{t+1}) / u'(C_t) = β {u'(C_t)/[β(1+r)]} / u'(C_t) = 1/(1+r).

Deviations of the MRS from its conditional mean, 1/(1+r), are, therefore, white noise. I.e., we have

    m_{t,t+1} = (1+r)^{−1} + η_{t+1}.

2.1 A Special Case, Again

Notice that if r = ρ, we have u'(C_t) = E_t u'(C_{t+1}), so that the marginal utility of consumption is a random walk (or martingale). I.e., it is a time series process, x_t, such that E_t x_{t+1} = x_t. In other words, the one-step-ahead forecast of x_{t+1} is just x_t.

2.2 The Random Walk Hypothesis

Suppose the utility function is quadratic: i.e., u(C) = C − (a/2)C², implying that u'(C) = 1 − aC. As a result, if we consider the case where r = ρ, we have 1 − aC_t = 1 − aE_t C_{t+1}, implying that

    E_t C_{t+1} = C_t.

I.e., consumption is a random walk. Thus we can write C_t = C_{t−1} + e_t, where e_t is unforecastable at time t−1. This result is not too surprising. In the deterministic case we had the result that C_t = C for all t. Here, the household chooses to have a consumption path which has no predictable changes. When uncertainty is resolved, the household will adjust its consumption, but it always does it in a way that implies that any future changes are unpredictable.

In a more general setting, of course, we have

    u'(C_t) = β(1+r) E_t u'(C_{t+1}).    (21)

Does a version of the random-walk hypothesis hold in this case? It does, as long as we are willing to work with first-order approximations to the first-order conditions. Notice, for example, that we could approximate the right-hand side of (21) in the neighborhood of C_t. In particular we could write

    u'(C_t) ≈ β(1+r) u'(C_t) + β(1+r) u''(C_t) E_t(C_{t+1} − C_t).

This would imply that

    E_t C_{t+1} − C_t ≈ [(ρ − r)/(1+r)] · u'(C_t)/u''(C_t).

This means that if ρ = r, the random walk hypothesis will approximately hold for more general utility functions.
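A quick numerical check of this approximation (my own illustration, assuming CRRA utility and no uncertainty, so that the conditional expectation is trivial; parameter values are arbitrary):

```python
# Compare the exact Euler-equation change in consumption with the approximation
# (rho - r)/(1 + r) * u'(C)/u''(C), for u(C) = C**(1-sigma)/(1-sigma).
sigma, r, rho, C_t = 2.0, 0.02, 0.04, 1.0

# Exact: u'(C_t) = beta*(1+r)*u'(C_{t+1}) with beta = 1/(1+rho)
# => C_{t+1} = C_t * ((1+r)/(1+rho))**(1/sigma)
C_next_exact = C_t * ((1 + r) / (1 + rho)) ** (1 / sigma)

# Approximation: u'(C) = C**(-sigma), u''(C) = -sigma*C**(-sigma-1),
# so u'(C)/u''(C) = -C/sigma
approx_change = (rho - r) / (1 + r) * (-C_t / sigma)

print(C_next_exact - C_t)   # about -0.0097
print(approx_change)        # about -0.0098: close, and negative since rho > r
```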

3 Testing the Permanent Income Hypothesis

3.1 Hall's Test

Early tests of the PIH used an infinite horizon version of our model with uncertainty. To extend the problem to the infinite horizon we must impose a condition that is the analog of A_T ≥ 0. This condition is

    lim_{t→∞} (1+r)^{−t} A_t = 0,

which is discussed further in the homework. Our household's problem will be to maximize

    E_0 Σ_{t=0}^{∞} β^t u(C_t),

subject to A_0 given, lim_{t→∞}(1+r)^{−t} A_t = 0, and the sequence of budget constraints

    A_{t+1} = A_t(1+r) + Y_t − C_t,   t ≥ 0.    (22)

We end up with the same first-order conditions as above, (19) and (20), and therefore it continues to be the case that (21) holds. Furthermore, if we assume quadratic utility and ρ = r we get

    C_t = E_t C_{t+1}.    (23)

Notice that (23) means that consumption changes between t and t+1 are unforecastable at time t. Hall's tests of the model essentially involved seeing whether that was true. He measured C_{t+1} − C_t and regressed it on variables dated t and earlier. According to our theory, none of these variables should have power in predicting C_{t+1} − C_t and should show up as insignificant in the regressions. Hall found that this was true for a large number of variables, but he also found that some variables did help to predict consumption changes, notably stock market data. This led him to reject the model.
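A stylized version of Hall's regression is sketched below (my own illustration only; Hall used aggregate US data and richer sets of regressors, not the simulated series here). Under the PIH null, variables dated t should have no power to predict C_{t+1} − C_t.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
T = 400
dC = rng.normal(0, 1, T)          # consumption changes: white noise under the PIH null
Y = 100 + rng.normal(0, 1, T)     # a candidate predictor dated t (here just noise)

# Regress C_{t+1} - C_t on a constant and variables dated t
X = sm.add_constant(np.column_stack([dC[:-1], Y[:-1]]))
res = sm.OLS(dC[1:], X).fit()
print(res.params)    # slope estimates near zero
print(res.pvalues)   # insignificant, as the theory predicts
```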

3.2 Flavin's Test

When we solved the model with perfect foresight we had the condition C_t = C for all t. We combined this condition with the lifetime budget constraint to get an explicit solution for C in terms of A_0 and the present value of the future income stream. We will do the same thing here. Notice that if we iterate on the budget constraint, (22), starting at time t, and we impose lim_{t→∞}(1+r)^{−t} A_t = 0, we get

    Σ_{j=0}^{∞} (1+r)^{−(j+1)} C_{t+j} = A_t + Σ_{j=0}^{∞} (1+r)^{−(j+1)} Y_{t+j}.    (24)

If we take the conditional expectation of both sides at time t we have

    Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_t C_{t+j} = A_t + Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_t Y_{t+j}.    (25)

Using (23), and the fact that Σ_{j=0}^{∞} (1+r)^{−(j+1)} = 1/r, we can write this as

    C_t = r [ A_t + Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_t Y_{t+j} ].    (26)
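As an aside, here is a minimal numerical sketch of (26) (my own illustration), assuming for concreteness that income is an AR(1) around a mean, so that E_t Y_{t+j} = Ȳ + α^j(Y_t − Ȳ):

```python
# Consumption from eq. (26) when expected income follows E_t Y_{t+j} = Ybar + alpha**j*(Y_t - Ybar)
r, alpha, Ybar = 0.02, 0.8, 1.0
A_t, Y_t = 2.0, 1.3

pv_income = sum((1 + r) ** -(j + 1) * (Ybar + alpha**j * (Y_t - Ybar))
                for j in range(2000))          # truncate the infinite sum
C_t = r * (A_t + pv_income)

# Closed form of the same present value: Ybar/r + (Y_t - Ybar)/(1 + r - alpha)
print(C_t)
print(r * (A_t + Ybar / r + (Y_t - Ybar) / (1 + r - alpha)))   # matches
```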

This equation allows us to solve for the change in consumption over time, C_t − C_{t−1}. Notice that if I substitute the fact that A_t = A_{t−1}(1+r) + Y_{t−1} − C_{t−1} into (26) I get

    C_t = r(1+r)A_{t−1} + rY_{t−1} − rC_{t−1} + r Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_t Y_{t+j}.    (27)

I can also get an expression for (1+r)C_{t−1} using (26) dated back one period, and multiplied through by (1+r):

    (1+r)C_{t−1} = r(1+r)A_{t−1} + r(1+r) Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_{t−1} Y_{t−1+j}.    (28)

Subtracting both sides of (28) from the equivalent sides of (27) we have

    C_t − (1+r)C_{t−1} = r(1+r)A_{t−1} + rY_{t−1} − rC_{t−1} + r Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_t Y_{t+j}
                          − [ r(1+r)A_{t−1} + r(1+r) Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_{t−1} Y_{t−1+j} ].

Notice that the two r(1+r)A_{t−1} terms on the right-hand side cancel with one another, and that the −rC_{t−1} on the right-hand side cancels with part of the −(1+r)C_{t−1} on the left-hand side, so that we have

    C_t − C_{t−1} = rY_{t−1} + r Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_t Y_{t+j} − r(1+r) Σ_{j=0}^{∞} (1+r)^{−(j+1)} E_{t−1} Y_{t−1+j}.

Notice, also, that rY_{t−1} cancels with the first term in the second infinite sum, which is rE_{t−1}Y_{t−1} = rY_{t−1}. With some careful rewriting, we can then combine the two sums and write

    C_t − C_{t−1} = r Σ_{j=0}^{∞} (1+r)^{−(j+1)} (E_t Y_{t+j} − E_{t−1} Y_{t+j}).    (29)

Since we know that the change in consumption is white noise, i.e., C_t − C_{t−1} = v_t, where v_t is some white noise process, we now have an explicit expression for v_t. It is r times the revision, between dates t−1 and t, in the expected present value of income from date t forward.
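To see what (29) delivers in a concrete case (my own illustration, anticipating the moving-average notation introduced next): if income is a zero-mean AR(1), Y_t = αY_{t−1} + ε_t, then E_t Y_{t+j} − E_{t−1}Y_{t+j} = α^j ε_t, and the sum in (29) has a closed form.

```python
# For AR(1) income, eq. (29) gives
#   C_t - C_{t-1} = r * sum_j (1+r)**-(j+1) * alpha**j * eps_t = [r/(1+r-alpha)] * eps_t,
# so a one-unit income innovation moves consumption by r/(1+r-alpha) < 1.
r, alpha = 0.02, 0.8

kappa_sum = r * sum((1 + r) ** -(j + 1) * alpha**j for j in range(2000))
kappa_closed = r / (1 + r - alpha)

print(kappa_sum, kappa_closed)   # both about 0.091: consumption responds much less
                                 # than one-for-one because the innovation is transitory
```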

Flavin's tests are based on estimating a joint model of consumption and income that nests the PIH. The null hypothesis that the PIH is true can then be tested. She assumes that

    α(L) Y_t = α_0 + ε_t,

where α(L) = 1 − α_1 L − α_2 L² − ... − α_p L^p. In other words, she assumes that Y_t is a p-th order autoregressive process. She then writes

    Y_t = α(L)^{−1}(α_0 + ε_t) = ψ̃_0 + ψ(L)ε_t = ψ̃_0 + ψ_0 ε_t + ψ_1 ε_{t−1} + ψ_2 ε_{t−2} + ...,

where ψ̃_0 = α_0/α(1), and ψ(L) = α(L)^{−1} = ψ_0 + ψ_1 L + ψ_2 L² + .... Notice that this means

    Y_t − E_{t−1} Y_t = ψ_0 ε_t
    E_t Y_{t+1} − E_{t−1} Y_{t+1} = ψ_1 ε_t

and, by extension, E_t Y_{t+j} − E_{t−1} Y_{t+j} = ψ_j ε_t. Hence, from (29) we have

    C_t − C_{t−1} = r [ Σ_{j=0}^{∞} (1+r)^{−(j+1)} ψ_j ] ε_t.    (30)

To simplify notation we will define κ = r Σ_{j=0}^{∞} (1+r)^{−(j+1)} ψ_j and write C_t − C_{t−1} = κ ε_t.

Flavin's test of the PIH essentially involves writing down the following model of consumption and income:

    ΔC_t = κ ε_t + β_0 + β_1 ΔY_t + β_2 ΔY_{t−1} + u_t    (31)
    Y_t = α_0 + α_1 Y_{t−1} + α_2 Y_{t−2} + ε_t.    (32)

In the equation for output, (32), the order of the autoregression, p, has been set equal to 2. In the consumption equation, (31), several terms have been added relative to what the theory predicts, which is ΔC_t = κ ε_t. The error term u_t is added to allow for the fact that consumption may not be measured perfectly in the data, so that even if the PIH is correct, observed ΔC_t may differ from κ ε_t. The term β_0 is added to allow for the possibility that consumption grows over time. Under the null hypothesis that the PIH holds, the coefficients on the ΔY_t and ΔY_{t−1} terms should be zero; i.e., we can test the PIH by testing whether β_1 = β_2 = 0.

To estimate the model one has to use instrumental variables, since the error term in the first equation, κε_t + u_t, is correlated with ΔY_t. Of course, lagged changes in income would be natural candidate instruments, because as long as ΔY_t is predictable, lagged values of ΔY_t should be correlated with it (therefore they will be relevant instruments) and, also, lagged values of ΔY_t should be uncorrelated with ε_t (therefore they will be valid instruments).

If β_1 ≠ 0 or β_2 ≠ 0, as Flavin found, we usually describe this as a situation where consumption is excessively sensitive to changes in income. But notice one important thing. Consumption, in this case, is excessively sensitive to changes in income that were predictable at time t−1, since the unpredictable components of income are included in the error term in the regression.
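A schematic version of this IV regression on simulated data is sketched below (my own illustration; Flavin used aggregate US time series and the full specification (31), whereas this sketch drops ΔY_{t−1} and estimates only the coefficient on ΔY_t by two-stage least squares with lagged income changes as instruments). The data are generated under the PIH null, so the estimated coefficient should be near zero.

```python
import numpy as np

rng = np.random.default_rng(2)
T, r, a0, a1, a2 = 5_000, 0.01, 0.2, 1.3, -0.4     # AR(2) income parameters (assumed)

# Simulate income from (32) and consumption changes under the null: dC_t = kappa*eps_t + u_t
eps = rng.normal(0, 1, T)
Y = np.zeros(T)
for t in range(2, T):
    Y[t] = a0 + a1 * Y[t - 1] + a2 * Y[t - 2] + eps[t]

psi = np.zeros(200)                                 # MA weights of the level, psi(L) = alpha(L)^{-1}
psi[0] = 1.0
for j in range(1, 200):
    psi[j] = a1 * psi[j - 1] + (a2 * psi[j - 2] if j >= 2 else 0.0)
kappa = r * np.sum((1 + r) ** -(np.arange(200) + 1) * psi)
dC = kappa * eps + rng.normal(0, 0.1, T)            # measurement error u_t

dY = np.diff(Y)                                     # dY[k] = Y[k+1] - Y[k]
y = dC[4:]                                          # dC_t
x = dY[3:]                                          # dY_t
Z = np.column_stack([np.ones_like(x), dY[2:-1], dY[1:-2]])   # 1, dY_{t-1}, dY_{t-2}
X = np.column_stack([np.ones_like(x), x])

X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]    # first stage: project X on instruments
b = np.linalg.lstsq(X_hat, y, rcond=None)[0]        # second stage
print(b)                                            # coefficient on dY_t close to 0 under the PIH
```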

3.3 Campbell and Mankiw's Test

Campbell and Mankiw proposed a different test of the PIH, where they had in mind a specific alternative model to the PIH. In particular, they assumed that some set of consumers, representing a fraction 1 − λ of aggregate consumption, do behave according to the PIH. So for these consumers C_t − C_{t−1} = v_t, a white noise process.

On the other hand, they supposed that there is another group of consumers, representing a fraction λ of aggregate consumption, that does not behave according to the PIH. This group of consumers simply absorbs any change in its income through an equal change in its consumption: C_t − C_{t−1} = Y_t − Y_{t−1}. Notice that this means that for this group of consumers S_t − S_{t−1} = 0 (i.e., there is no change in this group's savings). One motivation for this hypothesis is that some consumers are liquidity constrained, i.e., they have hit some constraint on their borrowing that does not allow them to dissave in order to smooth consumption. This is not, of course, the only motivation, but it is a standard one. In this case, we have, at the aggregate level:

    C_t − C_{t−1} = λ(Y_t − Y_{t−1}) + (1 − λ)v_t.

Again, we cannot directly estimate this equation by least squares, since we would expect Y_t − Y_{t−1} to be correlated with v_t. However, we could use instrumental variables as we described for the Flavin test. Campbell and Mankiw found that using lagged ΔY_t as an instrument does not work very well because it does not seem to have much correlation with ΔY_t (i.e., it is a valid but not very relevant instrument). They used lagged consumption changes instead, and found that λ̂ was positive and significantly different from zero. You can see that their test is very similar in spirit to Flavin's, but it provides a direct interpretation of the coefficient λ on ΔY_t as the fraction of consumers who are not behaving according to the PIH.

There is an interesting discussion in Romer of Shea's work. He studies household data in order to try to find a good interpretation of what λ̂ > 0 means. In particular, he tries to see whether it makes sense to interpret λ̂ > 0 as the fraction of households that are liquidity constrained. You should read this part of Romer.
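A schematic version of the Campbell and Mankiw regression on simulated data (my own illustration; their work uses aggregate US time series). Here half of aggregate consumption growth tracks income growth one-for-one, income growth is assumed to be AR(1) so that the lagged consumption change is a relevant instrument, and λ is recovered by two-stage least squares.

```python
import numpy as np

rng = np.random.default_rng(3)
T, lam, phi = 5_000, 0.5, 0.4           # true lambda and AR(1) coefficient of dY (assumed)

eta = rng.normal(0, 1, T)
dY = np.zeros(T)
for t in range(1, T):
    dY[t] = phi * dY[t - 1] + eta[t]     # predictable income growth
v = rng.normal(0, 1, T)                  # PIH consumers' innovation
dC = lam * dY + (1 - lam) * v            # aggregate consumption growth

# 2SLS of dC_t on dY_t, instrumenting dY_t with dC_{t-1}
y, x, z = dC[2:], dY[2:], dC[1:-1]
Z = np.column_stack([np.ones_like(z), z])
X = np.column_stack([np.ones_like(x), x])
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
lam_hat = np.linalg.lstsq(X_hat, y, rcond=None)[0][1]
print(lam_hat)                           # close to the true value 0.5
```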

4 Consumption's Smoothness

Modeling the Income Process

When we described Flavin's test we assumed that income was stationary. Of course, this is likely to be counterfactual, since real income tends to rise over time for most individuals and in the aggregate. However, the assumption of stationarity can be generalized to that of trend-stationarity, where

    Y_t = τ(t) + ψ(L)ε_t    (33)

and τ(t) is a deterministic function of time. Notice that this does not change the fact that E_t Y_{t+j} − E_{t−1} Y_{t+j} = ψ_j ε_t. Hence, allowing for deterministic trends in income would make no difference as far as the model's predictions about ΔC_t are concerned. Deaton (on p. 105) argues in favor of a difference-stationary representation of income, arguing that when future income is forecasted, intuitively one should expect the forecast error variance to rise with the forecast horizon. Trend-stationary models are not consistent with this, whereas difference-stationary models are. So he argues for the representation

    ΔY_t = μ + v_t = μ + Σ_{j=0}^{∞} γ_j ε_{t−j}.    (34)

Here, by having μ > 0 we can allow for upward drift in income levels, since μ is the mean of the change in income over time. The polynomial γ(L) is assumed to be consistent with v_t being a stationary process.

To understand Deaton's point consider a simple example of a trend-stationary process: a first-order autoregressive model (an AR(1) model) in which (1 − φL)(Y_t − a − bt) = ε_t, with |φ| < 1. Notice that this means Y_t = a + bt + ψ(L)ε_t = a + bt + ε_t + φε_{t−1} + φ²ε_{t−2} + ..., so that ψ(L) = 1 + φL + φ²L² + .... We can construct a simple example of a difference-stationary process by letting φ = 1 in my AR(1) example: i.e., we can let (1 − L)(Y_t − a − bt) = ε_t. Notice that this means ΔY_t − b = ε_t. So in this example, γ(L) = 1.

In our trend-stationary example, if I forecast output j periods ahead my forecast error is

    ε^f_{t,t+j} = Y_{t+j} − E_t Y_{t+j} = ε_{t+j} + φε_{t+j−1} + φ²ε_{t+j−2} + ... + φ^{j−1}ε_{t+1}.

Notice that the variance of this forecast error is

    Var(ε^f_{t,t+j}) = [1 + φ² + φ⁴ + ... + φ^{2(j−1)}] σ² = [(1 − φ^{2j})/(1 − φ²)] σ²,

which converges to a constant, σ²/(1 − φ²), as j → ∞. On the other hand, for the difference-stationary example, if I forecast output j periods ahead my forecast error is

    ε^f_{t,t+j} = Y_{t+j} − E_t Y_{t+j} = ε_{t+j} + ε_{t+j−1} + ... + ε_{t+1}.

Notice that the variance of this forecast error is Var(ε^f_{t,t+j}) = jσ², which limits to ∞ as j → ∞.

which limits to 1 as

Why the Model of Income Matters

The model (34) implies that E_t Y_{t+j} − E_{t−1} Y_{t+j} is given by

    (E_t − E_{t−1})Y_{t+j} = (E_t − E_{t−1})(Y_t + ΔY_{t+1} + ... + ΔY_{t+j}) = (γ_0 + γ_1 + ... + γ_j)ε_t.

Hence,

    ΔC_t = r Σ_{j=0}^{∞} (1+r)^{−(j+1)} (E_t − E_{t−1})Y_{t+j} = r Σ_{j=0}^{∞} (1+r)^{−(j+1)} (γ_0 + ... + γ_j) ε_t,

or

    ΔC_t = [ (r/(1+r)) ( γ_0 Σ_{j=0}^{∞} (1+r)^{−j} + γ_1 Σ_{j=1}^{∞} (1+r)^{−j} + γ_2 Σ_{j=2}^{∞} (1+r)^{−j} + ... ) ] ε_t
         = [ Σ_{j=0}^{∞} (1+r)^{−j} γ_j ] ε_t.    (35)

We can define a polynomial b(L) implicitly using b(L)(1 − L) = γ(L). Notice that

    b_0 + b_1 L + b_2 L² + ... − b_0 L − b_1 L² − ... = γ_0 + γ_1 L + γ_2 L² + ...,

so that γ_0 = b_0, but γ_j = b_j − b_{j−1} for j ≥ 1. So we can write (35) as

    ΔC_t = [ b_0 + Σ_{j=1}^{∞} (1+r)^{−j} (b_j − b_{j−1}) ] ε_t = [ (r/(1+r)) Σ_{j=0}^{∞} (1+r)^{−j} b_j ] ε_t.    (36)

This result is equivalent to (30), which should not be surprising, since, with an abuse of notation, b(L) = (1 − L)^{−1}γ(L) is the moving average polynomial appropriate for the level of Y_t.

Deaton illustrates an interesting puzzle relating to the choice of how to pick the time series representation of Y_t. He uses aggregate labor income data to estimate autoregressive special cases of (33) and (34):

    α(L)[Y_t − τ(t)] = ε_t    (37)
    φ(L)(ΔY_t − μ) = ε_t.    (38)

In estimating (37) he uses an AR(2) representation and finds α(L) = 1 − 1.42L + 0.45L². In estimating (38) he uses an AR(1) representation for output's growth rate and finds φ(L) = 1 − 0.44L. Notice that these two estimated models imply quite similar looking representations for the level of Y_t, since the estimated versions of the two models imply:

    (1 − 1.42L + 0.45L²)Y_t = deterministic part + ε_t
    (1 − 0.44L)(1 − L)Y_t = (1 − 1.44L + 0.44L²)Y_t = 0.56μ + ε_t.

Using a real interest rate of r = 0.01, Deaton shows that the trend-stationary model delivers the solution ΔC_t = 0.28ε_t, while the difference-stationary model delivers the solution ΔC_t = 1.77ε_t. Why are the models so different, despite having similar autoregressive representations for the level of Y_t? One way to see why is to ignore the deterministic trend and consider a second-order autoregressive representation for Y_t, α(L)Y_t = ε_t, factored as (1 − φL)(1 − λL)Y_t = ε_t. Notice that with this representation, Y_t = ψ(L)ε_t = (1 − φL)^{−1}(1 − λL)^{−1}ε_t, with

    ψ(L) = (1 + φL + φ²L² + ...)(1 + λL + λ²L² + ...) = 1 + (φ + λ)L + (φ² + φλ + λ²)L² + ...,

so that

    ψ_j = Σ_{i=0}^{j} φ^{j−i} λ^i = (φ^{j+1} − λ^{j+1})/(φ − λ).

Using the solution (30) this implies

    ΔC_t = κ ε_t = [ r(1+r) / ((1+r−φ)(1+r−λ)) ] ε_t.

Notice that

    lim_{λ→1} κ = (1+r)/(1+r−φ),

so that when there is a unit root, the coefficient on ε_t will be greater than 1 for φ > 0. So if you impose λ = 1 (as in the difference-stationary model) and find φ > 0 (i.e., there is positive serial correlation in the growth rate of output), you will necessarily find that κ > 1 (as Deaton does in his example). But this means that in period t, C_t should rise by more than ε_t, which is the innovation in the level of income. This would mean that consumption, which is the same as permanent income, should be more conditionally volatile than income: Var_{t−1}(ΔC_t) = κ²σ² > Var_{t−1}(ΔY_t) = σ². Subject to a restriction on φ, it is also possible to find Var(ΔC_t) > Var(ΔY_t) in the case where λ = 1.

Since growth rates of income are positively serially correlated in U.S. data, one will tend to find κ > 1. On the other hand, one also finds changes in consumption that seem to be less volatile than changes in income. This suggests that consumption in the data is excessively smooth.

Why does the trend-stationary model have such different implications for κ? Although λ = 1 implies κ > 1, it turns out that λ does not need to be much less than 1 for κ to be quite small. Notice that our difference-stationary model implies that φ = 0.44, λ = 1 and r = 0.01, so κ = 1.772. On the other hand, the trend-stationary model, with the AR polynomial 1 − 1.42L + 0.45L², can be factored as (1 − 0.48L)(1 − 0.94L). Thus, φ = 0.48, λ = 0.94, implying κ = 0.272. More generally, notice that

    κ = [ r/(1+r−λ) ] · [ (1+r)/(1+r−φ) ] = [ 1 − (1−λ)/(1+r−λ) ] · [ (1+r)/(1+r−φ) ].

When λ is significantly bigger than 0, but significantly smaller than 1 (as in our case), and r is small, (1+r)/(1+r−φ) will be bigger than 1, whereas (1−λ)/(1+r−λ) = 1 − r/(1+r−λ) will be very close to 1, so that the factor r/(1+r−λ), and hence κ, will be small.
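The κ comparison is easy to reproduce numerically (a minimal sketch; the parameter values are Deaton's point estimates as quoted above):

```python
# kappa = r(1+r) / ((1+r-phi)(1+r-lambda)) from the factored AR(2) for the level of income
def kappa(r, phi, lam):
    return r * (1 + r) / ((1 + r - phi) * (1 + r - lam))

r = 0.01
print(kappa(r, phi=0.44, lam=1.00))   # difference-stationary case: about 1.77
print(kappa(r, phi=0.48, lam=0.94))   # trend-stationary case:      about 0.27
```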