Default when Current House Prices are Uncertain∗

Morris A. Davis†
Rutgers University

Erwan Quintin‡
University of Wisconsin

November 1, 2014

Abstract

We specify a new model of homeowner mortgage default. In our model, homeowners do not know the value of their home until they sell; rather, they maintain an unbiased guess of the sales price of their house and each period optimally update this guess as new information about house prices, such as sales prices of similar homes, is observed. Uncertainty about the current sales price adds significant option value to staying in a home even when the current house price is likely substantially less than the remaining mortgage balance. The additional option value reduces default probabilities compared to the predictions of a model where homeowners know the current sales price with certainty. We estimate model parameters using panel data on self-assessed house prices and house-price indexes and (separate) panel data on mortgage defaults. At our estimated parameters, we find uncertainty about the current level of house prices reduced defaults for a cohort of prime mortgages issued in 2006 by 25 percent in 2010 and 2011.



We thank Carlos Garriga and Norm Miller and seminar participants at a number of conferences and institutions for comments and suggestions. † Department of Finance and Economics, Rutgers Business School, Rutgers University, 1 Washington Park #1092, Newark, NJ 07102; [email protected]; http://morris.marginalq.com/. ‡ Department of Real Estate and Urban Land Economics, University of Wisconsin-Madison, 5257 Grainger Hall, Madison, WI 53706; [email protected]; http://erwan.marginalq.com/.

1 Introduction

One of the many ways in which housing is different from most financial assets is that the

sale price of any house is not precisely known until a sale occurs, and the range of possible sale prices is quite large. There may be many reasons for this phenomenon, but two come to mind immediately. The sale of a home is subject to sometimes important search and matching frictions, implying under the right circumstances an ex-post distribution of sales prices for ex-ante identical homes.1 In addition, no two homes are exactly identical, since at a minimum each home occupies a different location. Homeowners and appraisers can observe sales prices of nearby similar homes (“comps”), but differences in location and other attributes suggest these comps provide imperfect signals for the current sales price of any other home.2 Our goal in this paper is to explore how homeowner uncertainty about the current sales price of their homes affects default decisions. To this end, we study the implications of an optimal model of default in which homeowners never observe the current value of their home unless they sell. We abstract from search frictions and assume homeowners do not know the exact price of their house because no otherwise identical housing has sold nearby. In each period of the model the unobserved sales price of the house is subject to random shocks; and, homeowners observe a noisy but unbiased signal of the sales price of their house (based on nearby “comps,” if you like). Homeowners in our model maintain a guess of the mean and variance of the current sales price of their home and optimally update this guess using a Kalman Filter. Every period, homeowners decide whether to stay in their house or sell, and if homeowners sell and the resulting sales price is less than the pre-determined mortgage balance the sale results in a default. Our model can be viewed as an extension of a classic

For example, Piazzesi, Schneider, and Stroebel (2014) estimate a search and matching model in which buyer heterogeneity leads to a distribution of prices. See Han and Strange (2014) for a review of the search literature as applied to housing. 2 According to their detailed analysis of pricing characteristics of 59 metropolitan areas, Malpezzi, Ozanne, and Thibodeau (1980) find hedonic regressions of house prices on housing attributes typically yield R2 values in the range of 0.5 to 0.75. Variation in house prices after controlling for observed characteristics is still a prominent feature of house price data, even with more localized data on sales prices. The web site Zillow, for example, lists confidence intervals for its ability to predict the sales price of any home given the sales price of nearby homes. These intervals can be large: See the table under the heading “Data Coverage and Zestimate Accuracy,” at the web site http://www.zillow.com/zestimate/#what.


model of optimal default such as Leland (1994), modified to accommodate the case in which decision makers, in this case homeowners, are uncertain as to the current value of the equity. We believe the model environment we describe is not controversial, since anyone who owns a house knows they cannot state the current market value with certainty. Thus, we use the model to answer a quantitative question: By how much did this uncertainty raise or lower default rates during the housing bust? To answer this question, we perform a “counterfactual experiment.” That is, we estimate the parameters of our model; at the estimated parameters, we simulate default rates given the current sales price of housing is uncertain; and then re-simulate default rates after assuming the sales price of housing is perfectly known, but holding all other parameters unchanged. We estimate default rates would have been 25 percent higher in 2010 and 2011 for the mortgages in our sample if homeowners were always able to perfectly observe the current sale price. We estimate the parameters of our model using two different panel data sets covering the experiences of 20 metropolitan areas over the 2005-2011 period. Our first data set merges data on house price indexes with data on self-assessed house prices by homeowners. We use this data set to estimate the variance of shocks to house prices and the noisiness of the signals to homeowners of their true house price. These variances identify the “Kalman Gain,” which measures the rate at which homeowners incorporate new signals when they update their guess of the current sales price of their house. In addition, they determine homeowners’ self-assessed variance of the current sales price of the home. The model predictions fit the data closely and a compelling case can be made that homeowners Kalman Filter the available sales data. We estimate a steady state Kalman Gain of about 0.55, and show that homeowners would report a 95 percent confidence interval of ±8-1/4 percent around the sale price of their homes. Our second data set covers the same geography and time period, and merges self-assessed house price data with the experiences of a large number of prime mortgages issued in 2006 with 20 percent equity at origination. These data pin down parameters specific to our default model: the value of the outside option after defaulting, any costs of default, and the variance of a periodic shock that affects the net benefit of remaining in the current home.3

As is common in discrete choice models, this shock is included to ensure the model always predicts probabilities over choices.


The model does not exactly fit observed patterns of default rates, for explainable reasons. The model does not have unemployment as an explicit “factor” for defaults, and it cannot fit the surge in defaults that occurs in 2009 accompanying the onset of the Great Recession: See Gerardi, Herkenhoff, Ohanian, and Willen (2014) and Schelkle (2014) for a discussion of unemployment and default.4 In addition, the model cannot fit some of the default rates observed in 2007 and 2008 as the sales prices of most houses in our sample should have been greater than the par value of mortgages, even after accounting for uncertainty about the sales price.5 The model, however, can fit the size and variation of default rates after 2009: A regression of predicted default rates on actual default rates yields an intercept of zero, a coefficient of 0.992 and an R2 of 0.41. As we mentioned earlier, we use the model to ask what default rates would have been in 2010 and 2011 if homeowners had a more precise guess about the sales price of their home. In this counterfactual experiment, we simulate our model under the assumption that homeowners know the sales price of their house with certainty but keep all other parameters unchanged. In the experiment, default rates rise by 25 percent, from 6 percent on average across our 20 metropolitan areas to 7.5 percent. Default rates increase for two reasons. First, when people sell in the counterfactual experiment they know with certainty if their house is worth less than their mortgage, and this eliminates a few occurrences when people expect to default and receive a positive equity “surprise.” More importantly, uncertainty about the current value of the home adds option value to not selling the home. In most models of default, remaining in the home has option value because the future value of the house may rise. In our model, this option value is larger; since current home values are not known, high future values are more likely to occur. Our paper contributes to three distinct literatures. A new and growing literature has emerged on the accuracy of homeowner perceptions of home value, and how potential biases

4 It has been suggested to us that we allow the mean and variance of taste shocks to be a function of local unemployment rates. This would indirectly introduce the effects of unemployment into the model. 5 This is suggestive to us that many homes in our sample of Freddie Mac securitized loans had less than 20% equity at origination in 2006, despite what lenders recorded. Piskorski, Seru, and Witkin (2013) present evidence suggesting 7% of all mortgages in private-label RMBS pools had a second lien when the borrower represented no such lien was present.


in the perception of home value affects decisions: Piazzesi and Schneider (2009) and Ehrlich (2014) discuss how perceptions affect prices in search and matching models and Corradin, Fillat, and Vergara-Alert (2014) study how uncertainty about current home value affects optimal portfolio decisions.6 Our research is related to the observation that appraisals tend to be backwards looking and smoother than transactions prices (Geltner, 1991). Our results imply that this so-called “appraisal bias” results from optimal filtering of noisy sales data. Finally, our paper is related to the literature on household defaults. [Erwan can you write this paragraph. I will insert all references into the .bib file if you can send me the IDEAS reference links.]

2 Self-Reported House Prices and House Price Indexes

We start by showing features of available housing data that indicate to us that households

are uncertain of the current value of their home and update a guess of their home value by filtering available data. Specifically, we compare inflation-adjusted changes to Case-Shiller-Weiss (CSW) house price indexes and changes to self-reported house prices from the Decennial Census of Housing and the American Community Survey for 20 Metropolitan Statistical Areas (MSAs) over the 2000-2011 period, a period of rapidly rising (2000-2006) and declining (2006-2011) house prices. Many details are in the appendix, but in short, the CSW indexes and the self-reports cover roughly the same geography and set of homes in each of the 20 MSAs we study. We formally analyze these data later in the paper. In this section, we describe patterns in the data that appear to be consistent with homeowner uncertainty. Table 1 reports the real ($2005) average self-reported value of housing. The table shows there is meaningful variation in the self-reports across MSAs and over time. The median of

Burnside, Eichenbaum, and Rebelo (2014) study house price dynamics when agents with different house price perceptions interact. Also relevant are papers that ask if homeowners maintain unbiased estimates about the value of their house: See Follain and Malpezzi (1981) and Kiel and Zabel (1999). Bucks and Pence (2008), Kuzmenko and Timmins (2011) and Genesove and Han (2013) are recent papers in this literature in which the authors check the accuracy of homeowner assessments by comparing the growth rate of self-assessed home prices to the growth rates of commonly available house price indexes, related to what we do.


the MSA-average values increases from $209 to $345 thousand during the housing boom and then falls to $246 thousand by 2011. In 14 of the 20 MSAs, the average of the self-reported values peaks at the same time or one year after the CSW index peaks.7 However, the CSW indexes and the self-reported data do not move in lock-step throughout the sample, the focus of this paper. Figure 1 compares the CSW index (solid line, “CSW”) to the average of the self-reports (dashed line, “SR”) for the Los Angeles MSA; the other metro areas in our sample have similar patterns, as we show later. The SR line is linearly interpolated between 2000 and 2005, as no self-report data exist in that period, and the average of the self-reports and the CSW indexes are each normalized to 100 in the year in which the CSW peaks. Figure 1 shows that after the peak, the CSW price index declines more sharply than the self-reports. In many MSAs, the difference in the percentage decline of the CSW index and the self-reports is quite large. Table 2 reports the percentage decline in the average self-report data and the CSW indexes, measured from the MSA-specific peak date of the CSW to 2011. At the median MSA, the CSW declines by 14 percentage points more than the self-reports. In the extreme case of the San Francisco MSA, the CSW index declines by 44 percent whereas the self-reports only decline by 19 percent. It is not obvious from Figure 1 and Table 2, but the magnitude of the difference between the CSW and self-reported values is strongly correlated with the change in the CSW index. This is true for the boom period, the year 2000 to the CSW peak date, as well as the bust period. The top panel of figure 2 shows results from the boom period. For each MSA, cumulative real growth in the CSW index, the x-axis, is plotted against cumulative real growth in the self-reports. Figure 2 shows that growth in the CSW and self-reports are highly correlated but growth in self-reports does not reflect growth in the CSW index on a percent-for-percent basis. Rather, when cumulative growth in the self-reported data over this period is regressed on cumulative growth in the CSW index, the coefficient is 0.74 with a standard error of 0.04 and the intercept is 15.9 with a standard error of 3.0.8 The bottom panel of figure 2 shows results from the bust period, the CSW peak date

In the case of Dallas, we adjust the real peak date of the CSW from 2002, a year lacking self-report data, to 2006. This reduces the peak value of the CSW by 1.3 percent. 8 The R2 of the regression is 0.95.


through 2011. As with the boom period, cumulative growth of the two series is highly correlated, but they do not move percent-for-percent. A regression of cumulative growth in the self-reported data over this period on cumulative growth in the CSW index yields a coefficient of 0.86 with a standard error of 0.09; an intercept of 9.4 with a standard error of 3.5; and an R2 value of 0.85. During the bust period, the regression estimates are sensitive to the ending year of the sample. When the sample is specified to end in 2009, the same regression yields a coefficient of 0.58 with a standard error of 0.08 and an R2 of 0.74; with an end date of 2010, the coefficient is 0.74 with a standard error of 0.08 and an R2 of 0.82. We view the increase in the coefficient from 0.58 when the sample ends in 2009 to 0.86 when the sample ends in 2011 as additional evidence supporting our model – homeowners continually revise down their appraised home value with each passing year the CSW remains depressed. In sum, during both the boom period and bust period, although the CSW and self-report data are highly correlated, the averaged self-report data do not increase or decrease percent-for-percent with changes to the CSW indexes. We show that these results are consistent with an environment in which homeowners receive noisy signals (such as the CSW) about the value of their home and optimally filter these signals when determining their home's current value. We consider the implications of this result in a model of optimal homeowner default, detailed next.

3 Default Model

We consider the behavior of an infinitely-lived mortgage-holder with linear preferences

and geometric discount rate β ∈ (0, 1). For simplicity we assume that the mortgage is an interest-only consol so that the mortgage balance, which we denote by d > 0, is constant over time. We assume the homeowner knows that she does not know the exact value of her home if it were to sell in the current period, the focus of this paper. The homeowner's priors about the price of her home in period t are log-normally distributed with unbiased mean hbt and dispersion σbt – where the subscript b stands for “beliefs.” The economic environment we describe below is consistent with this as an optimally formed prior distribution.

We assume that the unobserved log of the sales price of the home, denoted h∗t , follows an AR(1) process with autoregressive parameter ρ ≤ 1 and innovation et at date t. In the event ρ < 1, as a convenient normalization we assume that the process has zero long-term mean: h∗t = ρh∗t−1 + et .

(1)

When we say “unobserved,” we mean that the homeowner does not know h∗t when she decides whether or not to sell her home; however, if the homeowner decides to sell, h∗t is revealed. Obviously, we do not know if equation (1) is a reasonable characterization of beliefs about house prices during the housing boom and bust, which is the primary focus of our study. The data suggest ρ is close to 1. Each period, the owner observes a noisy but unbiased signal hst of the log of the true price that satisfies : hst = h∗t + νt .

(2)

The subscript s refers to “signal.” The shock processes et and νt are orthogonal to one another, independent across time and normally distributed with mean 0 and standard deviations σe and σν , respectively. The homeowner understands that her current log house price is, from her perspective, a random variable with a mean and a variance. Given the processes for true house prices and what the homeowner observes, the homeowner optimally updates her guess of the mean and variance of her current log house price using a Kalman Filter, implying that log-normality of the distribution of the current house price is maintained over time. Standard arguments show that the mean and variance are updated with a simple algorithm involving the Kalman gain discussed later. We assume the homeowner has held her mortgage for a few periods already, implying for given values of ρ, σe and σν that the Kalman gain has essentially converged to its steady-state value; we discuss this later as well.
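For concreteness, the following is a minimal Python sketch of this environment and of the steady-state form of the updating rule; the parameter values are illustrative placeholders rather than the estimates reported in Section 4, and the code is meant only to fix ideas.

```python
import numpy as np

# Minimal sketch of equations (1)-(3): the log sale price h* follows an AR(1),
# the owner observes a noisy signal each period, and beliefs are updated with
# a (steady-state) Kalman gain kappa. Parameter values are illustrative only.
rho, sigma_e, sigma_nu = 0.99, 0.05, 0.06
rng = np.random.default_rng(0)

# Steady-state Kalman gain: iterate the standard variance recursion.
V = 0.0
for _ in range(200):
    Vp = rho**2 * V + sigma_e**2        # prior variance before the signal
    kappa = Vp / (Vp + sigma_nu**2)     # Kalman gain
    V = (1.0 - kappa) * Vp              # posterior variance after the signal

T = 20
h_true = np.zeros(T)      # unobserved log sale price, h*_t
h_signal = np.zeros(T)    # noisy signal, h^s_t
h_belief = np.zeros(T)    # owner's belief (posterior mean), h^b_t
for t in range(1, T):
    h_true[t] = rho * h_true[t - 1] + sigma_e * rng.standard_normal()
    h_signal[t] = h_true[t] + sigma_nu * rng.standard_normal()
    # Equation (3): beliefs are a weighted average of mean-reverted prior
    # beliefs and the newly observed signal.
    h_belief[t] = (1.0 - kappa) * rho * h_belief[t - 1] + kappa * h_signal[t]

print(f"steady-state gain = {kappa:.3f}, belief std dev = {V**0.5:.3f}")
```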


Given the Kalman gain has converged, the homeowner’s estimate of the variance of log true house prices is fixed; call this variance σb2 . Given last period’s mean for house prices, which we call her “beliefs,” and the signal in the current period, the homeowner optimally updates her beliefs according to the formula hbt = (1 − κ) ρhbt−1 + κhst

(3)

where κ is the converged value of the Kalman gain. Equation (3) shows that optimally held beliefs in the current period are a weighted average of beliefs held last period, adjusted for the rate at which house prices mean-revert, and the signal observed at the start of the current period. While this looks like appraisal smoothing (Geltner, 1991), it is also consistent with the optimal updating of beliefs given noisy signals of truth. The household can sell (“terminate”) her house in any period. The value of this decision given beliefs hb and debt amount d is given by: V T (hb ; d) = E [max (h∗ − d, −c) | hb ; d] + V o

(4)

where E is the standard expectation operator and V o is the value of the best outside option after a sale which, for simplicity, we take to be a parameter in this exercise. Note that we have suppressed time subscripts for convenience since our environment is stationary. This formulation embodies key assumptions. First, the homeowner has the option to default on her mortgage, meaning that she can put the house back to the lender as full remediation for the loan. The total cost of defaulting is the parameter c ≥ 0; c captures the net present value of all utility and financial costs associated with default. Second, selling carries no transaction costs although nothing of importance would change if we introduced them in the problem. Finally and more subtly, we assume that the sale decision is made prior to discovering the true market value of the home and that this decision is irreversible.9 Given this formulation, we model a default as a sale with negative equity, h∗ < d. Of course, a homeowner who sells 9

Allowing for the possibility that a homeowner can put her house on the market with the (costly) option to not sell if she does not like the observed price would complicate the analysis but would not alter the basic economics of the model.


with beliefs that she is underwater, i.e., hbt < d, does so expecting to default, but she may also be positively surprised since the sale price is not known. Although hbt represents the homeowner's best guess as to the current home value, the homeowner understands that it is a guess; she knows the standard deviation of possible home price realizations given this guess. The expectation is taken with respect to the distribution of possible home prices. Instead of selling, the homeowner can choose to remain in her home for at least one more period. We denote the net period utility she receives from living in the home in period t by ǫt and model it as a draw from a normal distribution with mean zero and standard deviation σǫ . We assume that the net draw of ǫ is independent of the mortgage balance and i.i.d. across time. Whether and how the owner's mortgage balance should affect service flows net of user cost – which ǫ is meant to capture – depends on the difference between borrowing costs and the return on the funds she can invest outside of the home. Here we simply assume that the net benefit of staying in the home is independent of the level of d.10 Homeowners choose whether or not to sell after observing ǫt . The value of staying, then, conditional on a net-benefits realization ǫ, beliefs hb and a debt level d is:

V C (hb , ǫ; d) = ǫ + βE [W (h′b , d) | hb ]

(5)

where W is the value of being in the home given prior beliefs at the start of the next period before the value of the taste shock is realized. While homeowners do not know precisely what their priors will be next period, they understand all processes involved in the updating of that object. In performing our value function iteration, we compute the expectation over future values of W involved in functional equation (5) by, first, taking many draws from the homeowner’s prior, second and for each of these draws, drawing a signal from the distribution implied by this assumed realization of truth given expected mean-reversion in one period and, finally, averaging over the resulting realizations of W. 10 When we estimate the model, we include only data with loan-to-value ratios at origination of 80%. In this sense, the assumption of independence of d with ǫ should not be a problem in estimation as d is essentially fixed in our data.
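The same Monte Carlo logic applies to the termination value in equation (4). The small Python sketch below illustrates it; it reads the max operator as comparing the sale price in levels to the balance d, and every number in the example call is hypothetical rather than taken from our estimation.

```python
import numpy as np

def termination_value(h_b, sigma_b, d, c, V_o, n_draws=100_000, seed=0):
    """Monte Carlo version of equation (4): beliefs say the log sale price is
    normal with mean h_b and standard deviation sigma_b; selling nets
    max(price - d, -c) plus the outside option V_o."""
    rng = np.random.default_rng(seed)
    log_price = rng.normal(h_b, sigma_b, n_draws)
    return np.maximum(np.exp(log_price) - d, -c).mean() + V_o

# Hypothetical example: beliefs centered 10 percent below the mortgage balance.
print(termination_value(h_b=np.log(0.9), sigma_b=0.04, d=1.0, c=0.0, V_o=0.0))
```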


This, of course, presumes that the function W is known. In the algorithm we employ, the expected value of W is computed recursively as follows. Given a candidate for W , equation (5) makes it obvious that the optimal selling rule given hb and d is characterized by a threshold ǫ∗ (hb , d) above which the homeowner chooses to stay in the home for an additional period. Recursively then,

W (hb ; d) = P [ǫ < ǫ∗ (hb , d)] V T (hb ; d) + P [ǫ ≥ ǫ∗ (hb , d)] · ( ∫_{ǫ∗(hb ,d)}^{+∞} V C (hb , ǫ; d) dǫ ) / ( ∫_{ǫ∗(hb ,d)}^{+∞} dǫ )   (6)

Given this functional equation, the algorithm we employ in our computations is standard:

1. Compute V T , using (4);

2. Starting from the guess that W = V T (a natural starting point since it would be exactly correct under the assumption that agents must sell after a fixed number of periods), iterate on (5) and (6) to get a new guess for W ;

3. Iterate until W becomes approximately invariant.

This simple environment illustrates how uncertainty about the true state of home prices matters for homeowners' propensity to default. As equation (4) makes clear, termination (sale) is a one-sided put option. Terminations should be high when an accumulation of bad signals has caused hbt < d . These terminations by pessimistic agents should be associated in many cases with negative equity, hence default. The propensity to default is mitigated by the option value of staying in the home. There are three sources of optionality. First, house prices may experience positive shocks, eventually leading to a case where hbt > d. Second, homeowners have optionality with respect to the utility shocks, ǫ, even when these shocks are iid with a zero mean.11 Finally, when the signal is imprecise, homeowners have less

Imagine a game where the player draws a random variable ǫ from a mean-0 distribution, but has the option of ending the game after every draw with a 0 payoff. The value of staying in the game is V = max[ǫ + βEV, 0], which can be rewritten as V = max[ǫ, −βEV ] + βEV . The only solution has EV > 0, which means the player should stay in the game after some negative draws (any ǫ > −βEV), and thus the termination probability is less than 50 percent. To see this, suppose that EV = 0. This implies V = max[ǫ, 0], whose expectation is strictly positive, a contradiction. Then suppose EV < 0. Then V = max[ǫ + βEV, 0] ≥ 0 always, so EV ≥ 0, a contradiction.


confidence in their pessimistic beliefs. This is like saying the volatility of the value of the underlying asset (and hence the option value of not defaulting) is higher than in a model where current home value is known with certainty. When current home prices are uncertain, all else equal homeowners should be more reluctant to terminate and default even when their unbiased estimate of home equity is quite negative. In the next section of the paper, we estimate the parameters of the model and illustrate the quantitative relevance of this uncertainty for default behavior.
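The sketch below illustrates, in Python, the three-step iteration described above on a grid of beliefs. It is a schematic reading of equations (4)-(6), not the code behind our results: all parameter values are illustrative, the sale price is compared to d in levels, and the conditional mean of the taste shock above the threshold is computed with the usual normal (inverse Mills ratio) formula.

```python
import numpy as np
from scipy.stats import norm

# Schematic value function iteration for Section 3 (steps 1-3). Illustrative
# parameters only; h_b is the belief (posterior mean) of the log sale price,
# V the steady-state posterior variance, d the mortgage balance in levels.
beta, rho, kappa = 0.95, 0.99, 0.55
sigma_e, sigma_nu, sigma_eps = 0.05, 0.06, 0.2
V = (1 - kappa) * sigma_e**2 / (1 - (1 - kappa) * rho**2)  # steady-state variance
d, c, V_o = 1.0, 0.0, 0.0

grid = np.linspace(np.log(0.5), np.log(2.0), 201)          # grid for beliefs h_b
z = np.linspace(-4.0, 4.0, 41)                             # standardized draws
w = norm.pdf(z)
w /= w.sum()                                               # crude quadrature weights

# Step 1: termination value V^T(h_b) from equation (4).
draws = np.exp(grid[:, None] + np.sqrt(V) * z[None, :]) - d
VT = (w[None, :] * np.maximum(draws, -c)).sum(axis=1) + V_o

# From the owner's viewpoint, next period's belief is normal around rho*h_b.
sd_next = kappa * np.sqrt(rho**2 * V + sigma_e**2 + sigma_nu**2)

W = VT.copy()                                              # Step 2: start at W = V^T
for _ in range(1000):
    hb_next = rho * grid[:, None] + sd_next * z[None, :]
    EW = (w[None, :] * np.interp(hb_next, grid, W)).sum(axis=1)
    eps_star = VT - beta * EW                              # stay iff eps >= eps*
    p_stay = 1.0 - norm.cdf(eps_star / sigma_eps)
    e_stay = sigma_eps * norm.pdf(eps_star / sigma_eps) / np.maximum(p_stay, 1e-12)
    W_new = (1.0 - p_stay) * VT + p_stay * (e_stay + beta * EW)
    if np.max(np.abs(W_new - W)) < 1e-9:                   # Step 3: stop when invariant
        W = W_new
        break
    W = W_new

sale_prob = 1.0 - p_stay                                   # probability of a sale
print(sale_prob[np.searchsorted(grid, np.log(0.85))])      # beliefs 15% below d
```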

4 Estimation

Our estimation of the parameters of the model proceeds in two parts. First, we use the

CSW and self-report data to estimate the parameters in the model related to homeowner uncertainty of the current level of home prices. Then, given these parameters, we use panel data from Freddie Mac on mortgage originations and defaults to estimate the remaining parameters of the model, those related to the decision to sell the home or continue in it. At each step, we highlight assumptions we make to map the data to the model.

4.1 Estimation of Uncertainty Parameters

We start with the process for true but unobserved log real house prices for a homeowner i in period t in a given metro area. We assume the true log price of the home, denoted h∗it , follows a first-order autoregressive process with autoregressive parameter ρ and shock eit . Unlike the default model, we allow the process to have a non-zero mean that can vary across homeowners of h̄i / (1 − ρ):

h∗it = h̄i + ρh∗it−1 + eit .   (7)

h∗it is not directly observable but homeowner i observes a noisy signal of the true log price, denoted hsit hsit = h∗it + νit .

(8)

We assume eit and νit are independently Normally distributed with mean 0 and variances σe2 and σν2 . eit and νit are independent of each other, and independently drawn over time, but for the time being we allow eit and ejt to be correlated and νit and νjt to be correlated for any two homeowners i and j in the same metro area at the same time. Denote homeowner i’s self-assessed value of her house as of date t − 1 as hbit−1 . Given the model structure and assumptions, homeowner i optimally updates her assessment of the value of her home in period t using a Kalman Filter,

hbit = (1 − κit) (h̄i + ρhbit−1) + κit hsit .   (9)

κit is the Kalman gain, which is updated each period using the following recursion (Hamilton, 1994):

Vitp = ρ2 Vit + σe2
κit = Vitp / (Vitp + σν2)
Vit+1 = (1 − κit) Vitp

For any 0 < ρ ≤ 1, κit , Vitp and Vit converge monotonically to fixed values. The steady state value

V = (1 − κ) σe2 / [1 − (1 − κ) ρ2]

is the variance of the homeowner's prior for the decision model of the previous section. V is the variance around the current state after the current signal has been revealed.12

V p is the variance around the current state before the current signal has been revealed. It is equal to V/ (1 − κ).


Importantly, we assume that κit has converged to its steady-state value κ for each homeowner in our sample. We evaluate the plausibility of this assumption later. Given this, equation (9) can be written as

hbit = (1 − κ) (h̄i + ρhbit−1) + κhsit .   (10)

Since (10) holds for any homeowner i, it holds for the average of all homeowners i = 1, . . . , N. Denote the cross-sectional average of a variable in a given metro area at time t using a capital letter and a subscript m, for example Hbmt = (1/N) Σ_{i=1}^{N} hbit for all the i = 1, . . . , N homeowners in metro area m. After taking averages and substituting notation, equation (10) can be written as an expression in MSA-level cross-sectional averages

Hbmt = (1 − κ) (H̄m + ρHbmt−1) + κHsmt   (11)

We do not directly observe the cross-sectional average signal, Hsmt , but instead observe the log of the CSW index. We assume the log CSW, denoted Hmt , is an unbiased estimate of the average signal Hsmt up to a metro-specific additive scale factor αm and Normally distributed error umt

Hmt = Hsmt − αm + umt .

(12)

Insert (12) into (11) to get

Hbmt = am + (1 − κ) ρHbmt−1 + κHmt − κumt   (13)

where am = (1 − κ) H̄m − καm .   (14)

It is useful to rearrange (13) as

umt = Hmt − κ−1 [Hbmt − am − (1 − κ) ρHbmt−1] .   (15)

Denote the standard deviation of the measurement error on the signal as σu . We estimate all 23 parameters κ, ρ, σu and am for each of the m = 1, . . . , 20 metro areas using simulated full information maximum likelihood over our sample period. Being slightly formal, denote θ as the vector of parameters and ℓ (θ)mt as the log likelihood of the data for metro area m at year t. This log likelihood is the log of the density of umt from equation (15). Our estimate of θ maximizes

L (θ) = (1/N) Σ_{n=1}^{N} [ Σ_{m=1}^{20} ℓ˜ (θ)m,t=2005 ] + Σ_{m=1}^{20} Σ_{t=2006}^{2011} ℓ (θ)mt   (16)

where N is the total number of simulation draws and ℓ˜ denotes a simulated log likelihood. Equation (16) shows we simulate the log likelihood for the year 2005. The values of umt for 2006 − 2011 are directly observable with data on hand from equation (15) and do not depend at all on the simulations, explaining the rightmost term of equation (16). We use simulations because there is a gap in observed values of Hbmt : We first observe Hbmt in 2000 and then annually from 2005 − 2011. Our simulation procedure fills in this gap. We draw umt from its distribution for each of t = 2001 − 2004. Given this draw, and given a value of Hbmt for the year 2000 and values for the CSW (Hmt ) for 2001 − 2004, we sequentially apply equation (13) to generate simulated values of Hbmt . With a simulated value of Hbmt in hand for the year 2004, we use equation (15) to determine umt for 2005 and compute the log likelihood at this draw, ℓ˜mt=2005 . We repeat this process N = 25, 000 times and compute the average value of the simulated likelihoods. We report our maximum likelihood parameter estimates and standard errors in the first three rows of table 3.13 We estimate ρ = 0.992 implying house prices are almost a unit-root process; κ = 0.554, meaning people only incorporate a little more than half of a new reading of a house price signal when updating their assessment of their home value; and σu = 0.0453 implying the standard deviation of the measurement error associated with the level of the CSW index is about 4.5%. To get a sense for model fit, in figure 3 we plot the CSW data Hmt against the model-implied signal Hsmt for the years in which the signal is computable

We compute standard errors as the square root of the diagonals of the inverse of the outer product of scores. For reference, the maximized log likelihood is 232.04.


without simulation, 2006 − 2011. To ensure both series are appropriately scaled, we add an estimate of αm to the CSW series and subtract an estimate of H̄/ (1 − ρ) from both series.14 The graphs show that the gaps between the model-implied signal at our parameter

estimates and the CSW indexes are relatively small given the large decline over time in the CSW indexes. A regression of the de-meaned CSW indexes on the de-meaned values of Hsmt for all the MSAs and years shown in figure 3 yields an R2 value of 0.95 with an intercept of nearly exactly 0 and a slope coefficient of exactly 1.0. Another way we get a sense of model fit is to ask how well the model could have predicted the sequence of average self-reports from 2005-2011 given (a) data on self-reports for only the year 2000 and (b) assuming the Case-Shiller-Weiss values for 2001-2011 were exactly equal to household signals. Given our estimates of am , a sequence of Hmt , and a starting value for Hbmt−1 (the year-2000 value, but no other values), we generate a model-predicted sequence of values for Hbmt for the years 2001-2011. To be crystal clear, data on Hbmt−1 from only the year 2000 is used to generate predicted values. No other data on self-reports are used in this exercise. The possibility for error late in the sample is large as there is nothing inherent in this exercise that pulls out-of-sample predictions of Hbmt towards the data. Figure 4 shows a scatter diagram of the self-reported home value data (y-axis) against the predicted self-reported values (x-axis) for all 20 metro areas over the years 2005-2011. We subtract our estimate of H̄m / (1 − ρ) from the data such that goodness of fit abstracts from

across-MSA differences in the average level of self-reported values. The R2 of the pictured

regression line of the self-report data on predicted values is 0.97 and the intercept of the regression is nearly exactly zero. The point estimate of the slope coefficient is 0.96 with a standard error of 0.014, implying that the actual self-reports vary a bit less than the out-of-sample predictions when the CSW is considered to be the true signal, measured without error. As a final step, we estimate the variance parameters σν2 and σe2 . Additional assumptions on the nature of the correlation of shocks across households in an MSA are required to

14 We set H̄/ (1 − ρ) as the average value of Hbmt over the 2005-2011 period, and given our maximum likelihood estimates of am we set αm such that equation (14) holds.


estimate these parameters. To make progress, we assume that homeowners in each metro area experience identical values of e and ν – that is, eimt = ejmt and νimt = νjmt for all homeowners i and j in metro area m in period t – but allow values of e and ν to vary across metro areas. The assumption that all agents in a given MSA receive the same-sized shocks is not innocuous; in fact, one of us has written a paper documenting that the magnitude of house price declines during the housing bust varied quite a bit within some metro areas. On the other hand, a representative agent in each MSA is often assumed in models of Urban Economics. Given this assumption, it can be shown that

var(Hsmt − ρHsmt−1 − H̄m) = σe2 + (1 + ρ2) σν2 .   (17)

Additionally, it is possible to show that once the Kalman gain converges to its steady-state value it satisfies

σν2 ρ2 κ2 + [σe2 + σν2 (1 − ρ2)] κ − σe2 = 0 .   (18)

Although equation (18) is quadratic in κ, it is linear in σν2 and σe2 . Given estimates of ρ, κ, and var(Hsmt − ρHsmt−1 − H̄m), equations (17) and (18) uniquely determine σν2 and σe2 , with the closed-form expression for σν2 as

σν2 = (1 − κ) var(Hsmt − ρHsmt−1 − H̄m) / [1 + ρ2 (1 − κ)2]   (19)

and the expression for σe2 naturally following from (17) and (18). We estimate var(Hsmt − ρHsmt−1 − H̄m) using data across all metro areas and years (2007-2011)15 to be 0.008. At our estimated values for ρ and κ, and given the computed values of Hsmt and H̄ in each period and metro area, we compute σν = 0.0556 and σe = 0.0464, shown in rows 4 and 5 of table 3. The interpretation of these findings is that the standard deviation of shocks to home prices is 4.64 percent per year; and homeowners understand the

The first year we can estimate umt and thus Hsmt is 2006.


standard deviation of the gap between the signal they receive on the value of their home and the true value of their home is about 5-1/2 percent. These estimates imply a value of √V, the standard deviation of homeowners' uncertainty about the value of their home, of 4.14 percent. In other words, we estimate that homeowners have a two standard error confidence interval around the current guess of their house price of ±8.28 percent. One view of our research is that we determine the importance of this confidence interval on optimal default decisions. Given estimates of σν2 and σe2 , we can compute the sequence of optimal Kalman gains for a homeowner starting the year she knows her true log house price with certainty.16 In year 1, the Kalman gain is 0.410; in year 2 it is 0.524; in year 3 it is 0.548; in year 4 it is 0.552; and so forth. The convergence of the Kalman gain to near its steady state value by year 1 or 2 supports the assumption that the Kalman gain has converged for the homeowners in the sample. In conclusion, by comparing the average of homeowner self-reports with Case-Shiller-Weiss house price indices, and imposing that the Kalman gain has converged for everyone in our sample and that the shocks and measurement error hitting house prices and the signal are common to all people in a given metro area, we estimate ρ = 0.992, κ = 0.554, σν = 0.0556 and σe = 0.0464. Additionally, we uncover the mean of true but unobserved house prices in each MSA, H̄m / (1 − ρ). In table 4 we report values of Hbmt − H̄m / (1 − ρ) for each metro area in our sample. These values are appropriate for our default model in which we have assumed that the average value of the log true house price process is zero.
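The short Python check below iterates the gain recursion from earlier in this section at the rounded estimates ρ = 0.992, σe = 0.0464 and σν = 0.0556, and also solves the quadratic in equation (18); small discrepancies with the year-by-year gains quoted above simply reflect rounding of the inputs.

```python
import numpy as np

# Check of the Kalman-gain recursion at (rounded) estimates from this section,
# starting from a homeowner who knows her true log house price with certainty
# (posterior variance 0).
rho, sigma_e, sigma_nu = 0.992, 0.0464, 0.0556

V = 0.0
for year in range(1, 5):
    Vp = rho**2 * V + sigma_e**2            # prior variance before the signal
    kappa = Vp / (Vp + sigma_nu**2)         # Kalman gain
    V = (1.0 - kappa) * Vp                  # posterior variance after the signal
    print(f"year {year}: kappa = {kappa:.3f}")   # close to 0.410, 0.524, 0.548, 0.552

# Steady state: kappa solves the quadratic in equation (18); the posterior
# variance then follows from V = (1 - kappa) sigma_e^2 / (1 - (1 - kappa) rho^2).
a = sigma_nu**2 * rho**2
b = sigma_e**2 + sigma_nu**2 * (1.0 - rho**2)
kappa_ss = (-b + np.sqrt(b**2 + 4.0 * a * sigma_e**2)) / (2.0 * a)
V_ss = (1.0 - kappa_ss) * sigma_e**2 / (1.0 - (1.0 - kappa_ss) * rho**2)
print(f"steady state: kappa = {kappa_ss:.3f}, sqrt(V) = {np.sqrt(V_ss):.4f}")
# approximately 0.554 and 0.0414, the values reported above
```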

4.2 Estimation of Default Parameters

In this section, we take as given the parameters estimated in the previous section, set the annual discount factor β = 0.95, and use data on the default rates of mortgages to determine the remaining parameters of the model: the net present value of the utility and financial cost of default c, the value of the outside option in the event of a sale or default V o , and the

For convenience, think of this as the year that she bought her home.


standard deviation of taste shocks, σǫ . The data we use on mortgage defaults are from Freddie Mac and are publicly available on the web.17 The Freddie Mac data track the history of each mortgage purchased by Freddie Mac from an originator, and cover the experiences of the mortgage from origination to either default or payment in full from prepayment. The data set keeps track of mortgages and not borrowers. We cannot link mortgages originated to the same buyer at two different times, as would occur if a homeowner were to sell one house and purchase a different one. We use the experiences of mortgages in each of our 20 MSAs, all of which we restrict to be issued in one of the first three quarters of 2006. We analyze mortgages with combined loan-to-value (CLTV) at origination of 80%. We keep only 30-year mortgages backing single-family owner-occupied properties with debt-to-income ratio recorded at origination. We track the performance of the mortgages originated in 2006 from 2007-2011. We focus on mortgages originated in 2006 because that was the year house prices peaked, minimizing the possibility households refinance an existing mortgage in later years in order to extract equity. The mortgages in the Freddie Mac data are tracked monthly; we aggregate the monthly data to annual. There are two complications to the mortgage data. First, many people miss a mortgage payment or two, but ultimately do not default on their loan. We ignore spells of missed payments that are quickly cured. We define the specific date of a default as the initial month of a spell of continuously missed months of payments that ultimately leads to a recording in the final month of the data set for the mortgage of “short sale”, “deed-in-lieu of foreclosure,” “REO acquisition,” or “180 days delinquent,” the longest duration of delinquency recorded. We define annual default rates in each MSA as the total number of defaults during the year – that is, the total number of defaults where each spell of continuously missed payments ending in default begins during the calendar year – divided by the total number of loans in the sample as of January of that year. This computation assumes that households that refinance their mortgage (and thus exit the sample of mortgages with a termination code of “Prepaid or Matured”) do not immediately default on their new mortgage.

See http://www.freddiemac.com/news/finance/sf_loanlevel_dataset.html
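A schematic Python sketch of the default-date and default-rate bookkeeping just described is below. The field names, record layout and code strings are placeholders for illustration only; the public Freddie Mac file uses its own variable names and delinquency codes.

```python
DEFAULT_CODES = {"short sale", "deed-in-lieu of foreclosure",
                 "REO acquisition", "180 days delinquent"}

def default_year(months, missed, final_code):
    """Year in which a loan's final spell of missed payments begins, or None.

    months:     chronological list of (year, month) tuples for the loan
    missed:     parallel list of booleans, True if that month's payment was missed
    final_code: termination/zero-balance code recorded for the loan
    (All names and codes here are schematic placeholders.)"""
    if final_code not in DEFAULT_CODES:
        return None                          # prepaid or matured: censored, not a default
    start = None
    for i in range(len(missed) - 1, -1, -1): # walk back through the final unbroken spell
        if not missed[i]:
            break
        start = i
    return None if start is None else months[start][0]

def annual_default_rate(loans, year):
    """Defaults beginning in `year` divided by loans in the sample that January."""
    active = [ln for ln in loans if (year, 1) in ln["months"]]
    hits = sum(default_year(ln["months"], ln["missed"], ln["final_code"]) == year
               for ln in active)
    return hits / len(active) if active else float("nan")
```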


Table 5 shows unconditional statistics from these data by year. Columns (1) and (2) show the average and standard deviation of the sample size and columns (3) and (4) show the average and standard deviation of the default rate. The table clearly shows attrition over time from the sample, as households prepay their mortgage or default; and, it shows a hump-shaped pattern to default, with default rates rising until 2009 and then declining afterwards. As the third column of table 5 shows, the average default rate in our data in 2009 was 8.36 percent with a standard deviation of 6.65 percent. To give the reader some idea of the range: Annual defaults varied in 2009 from 1.4 percent in Dallas to more than 15 percent in Los Angeles, Miami, Phoenix and San Francisco. The largest default rate in our data is 21 percent in Las Vegas in 2009. Generally speaking, we estimate the parameters of our model by matching model-predicted default rates to actual defaults. We compute model-predicted default rates as follows: For each MSA, we take the 2006 average level of home prices hbmt listed in table 4 and compute the starting mortgage balance d. Then, given d and the sequence of hbmt from 2007-2011 in table 4 we use the model to compute the probability of a sale in each year. Conditional on a sale, we use the homeowners' subjective probability distribution over home values in each year, hbmt and σb , to compute the probability the sale price h∗ is less than d, constituting a default. The model-predicted probability of default in a given year is the probability of a sale times the probability the sale price is less than the mortgage balance in that year, all conditional on hbmt and d. Speaking precisely, we use maximum likelihood to estimate model parameters. One wrinkle is that exits can occur from our data for one of two reasons, default or prepayment. We treat a prepayment as a censored spell. For example, if a loan prepays in 2008, then the likelihood for that loan takes into account there was no default in either 2007 or 2008; if a loan prepays in 2009, then the likelihood accounts for the fact that there was no default in 2007, 2008 or 2009; and so forth. An example from our data might help make ideas clear. Our Boston sample includes 1,788 loans originated in 2006. In the table below we show how we account for each of the loans in our likelihood computations. φt denotes model-predicted default rates given the

state variables at the current set of parameter estimates, where t = 1 denotes 2007, t = 2 denotes 2008, and so forth:

Year   Outcome       Frequency   Log Likelihood Contribution from Each Obs
2007   Default           21      ln φ1
2007   Prepay           203      ln (1 − φ1)
2008   Default           50      ln [ ∏_{t=1}^{1} (1 − φt) φ2 ]
2008   Prepay           210      ln [ ∏_{t=1}^{1} (1 − φt) (1 − φ2) ]
2009   Default           57      ln [ ∏_{t=1}^{2} (1 − φt) φ3 ]
2009   Prepay           335      ln [ ∏_{t=1}^{2} (1 − φt) (1 − φ3) ]
2010   Default           27      ln [ ∏_{t=1}^{3} (1 − φt) φ4 ]
2010   Prepay           191      ln [ ∏_{t=1}^{3} (1 − φt) (1 − φ4) ]
2011   Default           22      ln [ ∏_{t=1}^{4} (1 − φt) φ5 ]
2011   Sample End       672      ln [ ∏_{t=1}^{4} (1 − φt) (1 − φ5) ]
Total                   1788

More formally, denote Dτ as the number of people that default during year τ and Eτ as the number that exit (prepay) during year τ . τ = 1 and t = 1 correspond to the year 2007, as before. The log likelihood of the sample from MSA m is

ℓm = Σ_{τ=1}^{5} { Dτ ln [ ∏_{t=1}^{τ−1} (1 − φt) φτ ] + Eτ ln [ ∏_{t=1}^{τ−1} (1 − φt) (1 − φτ) ] }   (20)

where the product ∏_{t=1}^{τ−1} (1 − φt) is defined to equal 1 when τ = 1. The log likelihood of the entire sample is the sum of (20), summed across all 20 MSAs, i.e.

L = Σ_m ℓm   (21)
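To make the bookkeeping concrete, the Python sketch below evaluates equation (20) for a single MSA using the Boston counts from the table above; the φ values passed in are arbitrary illustrative numbers, not model output.

```python
import numpy as np

def msa_log_likelihood(phi, defaults, prepays):
    """Equation (20) for one MSA.

    phi:      model-predicted default probabilities phi_1..phi_5 (2007-2011)
    defaults: defaults in each year, D_tau
    prepays:  censored exits in each year, E_tau"""
    phi = np.asarray(phi, dtype=float)
    surv = np.concatenate(([1.0], np.cumprod(1.0 - phi)[:-1]))  # prod_{t<tau}(1 - phi_t)
    ll = 0.0
    for tau in range(len(phi)):
        ll += defaults[tau] * np.log(surv[tau] * phi[tau])
        ll += prepays[tau] * np.log(surv[tau] * (1.0 - phi[tau]))
    return ll

# Boston counts from the table above; the 672 loans still active at the end of
# 2011 receive the same (censored) contribution as a 2011 prepayment.
defaults = [21, 50, 57, 27, 22]
prepays = [203, 210, 335, 191, 672]
phi = [0.01, 0.02, 0.04, 0.05, 0.06]   # illustrative predicted default rates
print(msa_log_likelihood(phi, defaults, prepays))
```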

The main benefit to using maximum likelihood to estimate model parameters is that it appropriately weighs experiences of each MSA and year by observation counts. This is

important in our sample, since as we show in table 5, sample sizes vary widely across MSAs and sample sizes in each MSA fall over time due to prepayments and default. After we fix β = 0.95, shown in row 6 of table 3, rows 7-9 of that table report our maximum likelihood parameter estimates.18 We find c = Vo = 0, implying we could have rewritten the model in terms of one parameter, the standard deviation of taste shocks. Column 5 of table 5 shows model-predicted default rates by year averaged over MSAs in each year,19 and column 6 shows the difference of the data (column 3) and the predictions of the model in column 5. The model cannot match the data in three obvious dimensions: The model cannot predict any defaults at all in 2007 or 2008; model predicted default rates rise continuously over time whereas defaults peak in 2009 in the data; and, in 2009, the model underpredicts default rates by 5 percentage points on average. The model misses the data along these dimensions for a few very explainable reasons. First, the model cannot explain any defaults when households have positive equity. Second, the model cannot reconcile the fact that default rates were higher in 2009 than in later years, since house prices continued to fall after 2009 implying increasing loan-to-value (LTV) ratios and higher propensity to default. The top panel of Figure 5 shows that the model cannot predict any defaults until LTV ratios are at least 95 percent, and between 95 and 100 percent the model systematically underpredicts default rates. This panel demonstrates why the model underpredicts defaults in 2007 and 2008: LTVs were all 80 percent at origination in 2006 and self-reports declined by less than 20 percent in most MSAs between 2006 and 2008.20 Thus, in most MSAs, in expectation households had positive equity and should have been very unlikely to default.21 The bottom panel of Figure 5 shows results from 2009 at every LTV. The model underpredicts default everywhere, including for LTV larger than 100, i.e. homeowners with negative equity. The reason the model cannot fit the data from 2009 for homeowners with negative equity is that the model can fit the data from 2010 and 2011. Figure 6 compares data to 18

Standard errors are forthcoming. Recall that maximum likelihood will underweight MSAs with relatively low sample sizes. 20 This can be seen by comparing the “2006” and “2008” columns in table 4. 21 This analysis suggests to us that some homeowners in our sample had less than 20% equity at origination, confirming results in Piskorski, Seru, and Witkin (2013). 19


model prediction for 2010 and 2011 for LTV ratios larger than 100. The fit is not perfect, but it seems close to us. For this sample, if we regress the data on the model predictions, we estimate an intercept of 0 (standard error 2.12), a coefficient on predicted of 0.992 (standard error 0.240) and an R2 value of 41 percent. Recall again from table 4 that home values continue to decline from 2009 through 2011. We cannot fit default rates in 2009 because we can fit default rates in 2010 and 2011, and LTV ratios are higher in 2010 and 2011 than in 2009 because house prices are lower. Without explicitly accounting for unemployment, the model simply cannot reconcile higher default rates in 2009 than in 2010 and 2011 with the lower LTV ratios observed in the data.

5 The impact of uncertainty on defaults

We now estimate the importance of uncertainty on default rates. Holding all other

parameters fixed, we set σν = 0, thus turning off the signal noise entirely, and recompute model predictions. In this experiment, owners have full certainty about their home value, and expect their assessments to be less volatile in the future since they will not be subject to noise. Table 6 shows the results by MSA for the average of 2010 and 2011. We focus on 2010 and 2011 because these are the years in which the model best fits the data. The table is sorted by average loan-to-value in those years, the first column. The second column shows the average default rate in the data and the third shows the predicted default rate at our estimated value of the standard deviation of the signal noise, σν = 0.056. Obviously, defaults are increasing in LTV: The correlation of LTV and default-rate data is 0.87 and the correlation of LTV and simulated default rates is 0.91. The fourth column shows predicted default rates if house prices are (counterfactually) observed with certainty, σν = 0.0, and all other model parameters are held constant. The fifth column reports the projected increase in default rates under the assumption homeowners know their home value with certainty, equal to the fourth column less the third.


This table demonstrates that uncertainty about the current value of housing reduced default rates in many metro areas, especially in areas with LTVs between 100 and 120. Uncertainty about the current value of housing did not lower defaults in low LTV places like Charlotte and Dallas because default rates were already low. Uncertainty does not significantly reduce default rates in high LTV areas like Phoenix and Las Vegas because, even with significant uncertainty about the current value of their home, homeowners in those areas assume they are significantly under water. In those high-LTV areas, people do not default more frequently because they enjoy living in their home more than pursuing the outside option. In conclusion, averaging over all the MSAs in this table, uncertainty about current home prices reduced default rates by 1.54 percentage points in 2010 and 2011. Restated, approximately 25 percent more households would have defaulted on their homes in 2010 and 2011 if they had full certainty over the level of house prices.

6 Conclusion

Given no two houses are exactly alike, it seems reasonable to assume that homeowners

cannot learn the exact price at which their house will sell based on nearby sales of similar homes. Rather, homeowners maintain a guess of the sales price of their home. Available data suggest this guess arises from Kalman Filtering of publicly available sales data. When homeowners are uncertain about the sales price of their home, “optionality” from staying in the home rises and homeowners are less likely to default. Our analysis suggests default rates of a sample of prime mortgages issued in 2006 with 20 percent equity at origination would have been 25 percent higher if homeowners were certain about the current value of their house.


References

Bucks, B., and K. Pence (2008): “Do Borrowers Know their Mortgage Terms?,” Journal of Urban Economics, 64(2), 218–233.

Burnside, C., M. Eichenbaum, and S. Rebelo (2014): “Understanding Booms and Busts in Housing Markets,” Working Paper, Duke University.

Corradin, S., J. L. Fillat, and C. Vergara-Alert (2014): “Portfolio Choice with House Value Misperception,” Working Paper, IESE.

Ehrlich, G. (2014): “Price and Time to Sale Dynamics in the Housing Market: the Role of Incomplete Information,” Working Paper, Congressional Budget Office.

Follain, J. R. J., and S. Malpezzi (1981): “Are Occupants Accurate Appraisers,” Review of Public Data Use, 9, 47–55.

Geltner, D. (1991): “Smoothing in Appraisal-Based Returns,” Journal of Real Estate Finance and Economics, 4(3), 327–345.

Genesove, D., and L. Han (2013): “A Spatial Look at Housing Boom and Bust Cycles,” in Housing Markets and the Financial Crisis, ed. by E. Glaeser, and T. Sinai. University of Chicago Press.

Gerardi, K., K. F. Herkenhoff, L. Ohanian, and P. S. Willen (2014): “Unemployment, Negative Equity, and Strategic Default,” Working Paper, University of Minnesota.

Hamilton, J. D. (1994): Time Series Analysis. Princeton University Press.

Han, L., and W. C. Strange (2014): “The Microstructure of Housing Markets: Search, Bargaining, and Brokerage,” in Handbook of Regional and Urban Economics, ed. by G. Duranton, V. Henderson, and W. C. Strange, vol. 5. Elsevier.

Kiel, K. A., and J. E. Zabel (1999): “The Accuracy of Owner-Provided House Values: The 1978-1991 American Housing Survey,” Real Estate Economics, 27(2), 263–298.

Kuzmenko, T., and C. Timmins (2011): “Persistence in Housing Wealth Perceptions: Evidence from the Census Data,” Working Paper, Duke University.


Leland, H. E. (1994): “Corporate Debt Value, Bond Covenants, and Optimal Capital Structure,” The Journal of Finance, 49(4), 1213–1252.

Malpezzi, S., L. Ozanne, and T. Thibodeau (1980): “Characteristic Prices of Housing in Fifty-Nine Metropolitan Areas,” The Urban Institute Working Paper 1367-1.

Piazzesi, M., and M. Schneider (2009): “Momentum Traders in the Housing Market: Survey Evidence and a Search Model,” American Economic Review, 99(2), 406–411.

Piazzesi, M., M. Schneider, and J. Stroebel (2014): “Segmented Housing Search,” Working Paper, Stanford University.

Piskorski, T., A. Seru, and J. Witkin (2013): “Asset Quality Misrepresentation by Financial Intermediaries: Evidence from RMBS Market,” Journal of Finance, forthcoming.

Schelkle, T. (2014): “Mortgage Default during the U.S. Mortgage Crisis,” University of Cologne Working Paper Series in Economics 72.

Data Appendix

The CSW house price indexes are derived from repeated sales of single-family housing units, which in principle delivers a constant-quality price index. We average the monthly nominal CSW index values in each year and convert the nominal annual index to real by deflating using the personal consumption price index from National Income and Product Accounts (NIPA), line 2 of NIPA table 1.1.4. The CSW discards transactions that occur within 6 months; regression weights used to compute the index are reduced the longer the time between sales; and, price anomalies are down-weighted. For further discussion, see http://www.macromarkets.com/csi_housing/documents/tech_discussion.pdf. The data on self-assessed home values are from the 5% sample of the 2000 Census and the annual 2005 through 2011 American Community Surveys (ACS).22 2005 is the first year that the ACS includes metropolitan-area data and 2011 is the last year of publicly available ACS data. The geographic boundaries defining metropolitan areas in the 2000 Census and the 2005-2011 ACS are consistent with the 2000 definition and are consistent with the MSA

These data are available from the IPUMS web site, see http://usa.ipums.org/usa/. See Ruggles et. al. (2010).


boundaries in the CSW data.23 From the Census and ACS data, we include only nonfarm, single-family detached or attached, owner-occupied housing units. The large majority of these units are detached. We keep any housing unit where the reported house value is specified in a flag variable as “unaltered.”24 The Census and the ACS continually sample throughout the survey year. The total number of housing units in the Census and ACS sample that meet the criteria listed above is reported in Table 7. Unlike other commonly used surveys of self-reported house prices, such as the American Housing Survey and the Survey of Consumer Finances, the sample sizes in these data are large. The Census and ACS data include information on whether or not the units are detached or attached; the age of the units; and, the number of bedrooms in each unit. We assign every unit in the sample to a bin based on these observable characteristics.25 Attached housing units account for a small percentage of the overall sample and we bin them together by MSA. For the detached units, the bins are based on bedrooms (1 or 2, 3, or 4 or more) and age of structure: Built before 1940; decade-by-decade from 1940-1949 through 1990-1999; 2000-2004; and 2005-2011. We compute sampling weights for each bin prior to discarding any missing or imputed observations on house prices. The sum of the sampling weights across bins is 1.0 in each metro area in every year. We calculate the average value of housing as the sampling weight for each bin multiplied by the average of the non-missing house values in that bin, summed across bins. We then adjust for inflation using the NIPA price index described earlier. We compute the average value of the log of

With the exception of New York and Chicago, the CSW indexes cover the full set of counties in each of the 20 metro areas. The Chicago index does not cover properties sold in the Kenosha, WI and the Gary, IN metropolitan divisions. The CSW index for New York samples all 23 counties included as part of the New York Metropolitan Statistical Area and 6 others: Fairfield and New Haven counties in Connecticut, Mercer and Warren counties in New Jersey, and Dutchess and Orange counties in New York. In 200, the excluded divisions in Chicago account for 16 percent of the MSA population and the additional 6 counties in the New York metro area increase the population by 15 percent. 24 Of all metro areas in all years, there are 150 total instances where $0 is reported as the house value. These are treated as missing. In addition, we discard from the sample in the 2005 ACS any house reported as built in 2005. Finally, note that housing values in the Census and ACS data are top-coded. The top code is $1,000,000 in 2000 and 2005-2007 and varies by state after 2008. The bottom row of Table 7 shows the percentage of the sample (after accounting for sampling weights) with a top-coded observation for house value. We do not adjust any top-coded responses. 25 Bins are chosen such that each bin in every metro area in every year contains at least one house price observation.

26

real house prices using an analogous procedure.
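The averaging procedure just described can be summarized with the following minimal sketch. It assumes a pandas DataFrame of Census/ACS units with hypothetical column names (msa, year, bin, value) and a hypothetical NIPA deflator series indexed by year; these names, and the normalization of the deflator to 2005 = 1, are illustrative assumptions rather than details taken from the paper. The average of the log of real house prices can be computed the same way after replacing value with its logarithm.

```python
import pandas as pd

def average_self_reported_values(units: pd.DataFrame, pce_deflator: pd.Series) -> pd.Series:
    """Bin-weighted average of self-reported house values by MSA and year.

    Assumed (hypothetical) inputs:
      units        -- one row per housing unit, columns: msa, year, bin, value,
                      where value is the nominal self report (NaN if missing/imputed).
      pce_deflator -- NIPA personal consumption price index by year, rebased so 2005 = 1.
    """
    # Sampling weights: share of units in each (msa, year, bin) cell, computed before
    # discarding missing house values; weights sum to 1.0 within each MSA and year.
    cell_counts = units.groupby(["msa", "year", "bin"]).size()
    weights = cell_counts / cell_counts.groupby(level=["msa", "year"]).transform("sum")

    # Average of the non-missing house values within each bin.
    bin_means = (units.dropna(subset=["value"])
                      .groupby(["msa", "year", "bin"])["value"].mean())

    # Weighted average across bins, then deflate the nominal average to 2005 dollars.
    nominal_avg = (weights * bin_means).groupby(level=["msa", "year"]).sum()
    deflator = pce_deflator.reindex(nominal_avg.index.get_level_values("year")).to_numpy()
    return nominal_avg / deflator
```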


Table 1: Average of Real Self Reported (SR) House Values, Thousands of 2005 Dollars
2000 5% Census, 2005-2011 American Community Survey

MSA              2000  2005  2006  2007  2008  2009  2010  2011   SR Peak  CSW Peak  SR-CSW
Atlanta           193   238   245   249   248   230   207   189     2007     2006       1
Boston            310   480   475   464   455   432   423   401     2005     2005       0
Charlotte         175   205   211   217   232   219   211   204     2008     2007       1
Chicago           224   301   314   315   312   289   270   250     2007     2006       1
Cleveland         162   183   181   180   174   160   158   147     2005     2005       0
Dallas            152   178   183   187   190   188   184   178     2008     2002       6
Denver            242   308   306   304   309   299   283   277     2008     2006       2
Detroit           183   217   213   206   190   159   142   132     2005     2005       0
Las Vegas         184   355   374   369   317   236   198   176     2006     2006       0
Los Angeles       328   585   626   618   637   560   539   513     2008     2006       2
Miami             184   338   378   381   377   306   272   241     2007     2006       1
Minneapolis       189   299   297   297   285   263   242   226     2005     2006      -1
New York          296   486   508   502   510   483   457   437     2008     2006       2
Phoenix           184   292   349   341   321   261   233   200     2006     2006       0
Portland          232   293   330   348   341   316   288   274     2007     2007       0
San Diego         314   610   615   590   572   500   472   454     2006     2005       1
San Francisco     434   663   682   663   716   618   593   556     2008     2006       2
Seattle           305   398   438   466   474   438   410   373     2008     2007       1
Tampa             137   233   275   265   251   210   196   173     2006     2006       0
Washington, DC    258   471   507   500   480   439   418   399     2006     2006       0
Median            209   305   340   345   319   294   271   246     2007     2006       1
Standard Dev       76   147   151   146   154   137   134   130        1        1       1

Note: The last three columns report the peak year of the SR series, the peak year of the CSW index, and the difference (in years) between the two peak dates.

Table 2: Real Percent Change in CSW and Self Reports (SR), CSW Peak Date to 2011

                    Percent Chg.
MSA                  SR      CSW    Difference
Atlanta            -22.7   -29.7       7.0
Boston             -16.5   -25.1       8.6
Charlotte           -6.1   -22.9      16.8
Chicago            -20.4   -37.5      17.1
Cleveland          -19.5   -27.7       8.3
Dallas              -2.6   -15.9      13.4
Denver              -9.7   -18.5       8.8
Detroit            -39.2   -51.4      12.2
Las Vegas          -52.8   -63.1      10.2
Los Angeles        -18.0   -43.9      25.9
Miami              -36.3   -54.7      18.5
Minneapolis        -23.8   -40.7      16.8
New York           -14.1   -30.3      16.2
Phoenix            -42.7   -60.0      17.4
Portland           -21.4   -32.9      11.5
San Diego          -25.6   -44.5      18.9
San Francisco      -18.5   -44.2      25.7
Seattle            -20.0   -33.6      13.6
Tampa              -37.0   -51.2      14.1
Washington, DC     -21.3   -34.0      12.7
Median                                13.9

Table 3: Maximum Likelihood Estimates of Parameters

       Parameter   Estimate
(1)    ρ            0.9918
(2)    κ            0.5537
(3)    σu           0.0453
(4)    σν           0.0556
(5)    σe           0.0464
(6)    β            0.95
(7)    c            0.0
(8)    Vo          -0.001
(9)    σǫ           0.0711

Standard errors: 0.0352, 0.0121, 0.0030; NA otherwise.

Table 4: Estimates of demeaned values of Hbmt

MSA                2006     2007     2008     2009     2010     2011
Atlanta           0.104    0.117    0.062   -0.018   -0.111   -0.235
Boston            0.111    0.082   -0.013   -0.057   -0.089   -0.150
Charlotte         0.017    0.048    0.039    0.003   -0.019   -0.074
Chicago           0.109    0.118    0.047   -0.031   -0.106   -0.198
Cleveland         0.097    0.092    0.009   -0.056   -0.090   -0.167
Dallas            0.032    0.046    0.005   -0.006   -0.017   -0.061
Denver            0.081    0.062   -0.007   -0.028   -0.083   -0.113
Detroit           0.249    0.209    0.069   -0.139   -0.273   -0.370
Las Vegas         0.362    0.346    0.109   -0.215   -0.376   -0.522
Los Angeles       0.166    0.157    0.059   -0.114   -0.143   -0.201
Miami             0.239    0.254    0.139   -0.091   -0.241   -0.395
Minneapolis       0.126    0.124    0.014   -0.057   -0.123   -0.211
New York          0.102    0.093    0.024   -0.036   -0.084   -0.141
Phoenix           0.286    0.272    0.118   -0.104   -0.248   -0.407
Portland          0.074    0.131    0.084    0.008   -0.087   -0.160
San Diego         0.206    0.159    0.012   -0.143   -0.190   -0.228
San Francisco     0.167    0.134    0.077   -0.114   -0.156   -0.229
Seattle           0.064    0.137    0.088    0.001   -0.070   -0.174
Tampa Bay         0.243    0.211    0.080   -0.076   -0.200   -0.322
Washington, DC    0.153    0.148    0.032   -0.070   -0.131   -0.188

Table 5: Default Data and Model Predictions, 2007-2011

                  Freddie Mac Data                           Model
         Sample Size           Default Rate              Default Rate    Error
Year     Average  Std. Dev.    Average  Std. Dev.
           (1)       (2)         (3)       (4)                (5)          (6)
2007      2322      1610         0.90      0.49               0.00         0.90
2008      2162      1490         3.63      2.88               0.39         3.25
2009      1885      1273         8.36      6.65               3.41         4.96
2010      1385       926         6.93      5.10               4.93         2.00
2011      1040       682         6.26      4.26               6.96        -0.70

Note: Error (6) equals the Freddie Mac default rate (3) minus the model default rate (5).

Table 6: Impact of Uncertainty About House Prices on Default, 2009

                              Data          Simulated Default Rates
                    LTV   Default Rate    σν = 0.0556     σν = 0.0     Increase
MSA                 (1)       (2)             (3)            (4)          (5)
Charlotte            85       2.64            0.00           0.01         0.00
Dallas               86       1.18            0.01           0.01         0.00
Denver               96       2.47            1.10           1.77         0.67
Seattle              96       4.71            1.86           3.03         1.16
Portland             97       4.76            1.91           3.14         1.23
New York             99       4.06            2.45           4.09         1.64
Cleveland           100       3.14            3.02           4.90         1.89
Boston              101       3.07            3.11           5.15         2.03
Chicago             104       4.90            4.54           6.98         2.44
Atlanta             106       3.74            5.14           7.47         2.32
Minneapolis         107       4.39            5.85           8.70         2.85
Washington, DC      109       4.01            6.72           9.89         3.17
Los Angeles         112      10.34            7.64          10.69         3.05
San Francisco       115      11.73            8.26          11.07         2.81
San Diego           121       6.47            9.73          11.82         2.09
Tampa               133       9.19           10.90          12.10         1.20
Miami               140      11.74           11.32          12.17         0.85
Detroit             142       5.74           11.52          12.20         0.68
Phoenix             148      15.39           11.67          12.22         0.55
Las Vegas           180      18.24           12.15          12.25         0.10
Average                       6.60            5.95           7.48         1.54

Note: Increase (5) equals column (4) minus column (3).

Figure 1: Comparison of CSW to Average of Self Reports (SR)

[Figure: Los Angeles, 2000-2010; series shown: CSW and SR.]

Figure 2: Comparison of Real Growth, CSW and Average of Self Reports (SR)

[Figure: scatter plots of SR (vertical axis) against CSW (horizontal axis) percent changes across metro areas. First panel: percent change, 2000 to CSW peak date; fitted line Const = 15.9, Slope = 0.74. Second panel: percent change, CSW peak date to 2011; fitted line Const = 9.4, Slope = 0.86.]
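The fitted line reported in the second panel of Figure 2 (Const = 9.4, Slope = 0.86) appears to correspond to an ordinary least squares regression of the SR percent changes on the CSW percent changes across the 20 metro areas. The sketch below simply reuses the Table 2 columns; it is an illustrative check rather than the authors' code, and it recovers the reported constant and slope up to rounding.

```python
import numpy as np

# Real percent changes from Table 2 (CSW peak date to 2011), MSAs in the order listed there.
sr = np.array([-22.7, -16.5, -6.1, -20.4, -19.5, -2.6, -9.7, -39.2, -52.8, -18.0,
               -36.3, -23.8, -14.1, -42.7, -21.4, -25.6, -18.5, -20.0, -37.0, -21.3])
csw = np.array([-29.7, -25.1, -22.9, -37.5, -27.7, -15.9, -18.5, -51.4, -63.1, -43.9,
                -54.7, -40.7, -30.3, -60.0, -32.9, -44.5, -44.2, -33.6, -51.2, -34.0])

# OLS of the self-report change on the CSW change, as annotated in the second panel of Figure 2.
slope, intercept = np.polyfit(csw, sr, 1)
print(f"Const = {intercept:.1f}, Slope = {slope:.2f}")  # approximately 9.4 and 0.86
```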

Figure 3: Comparison of Hst (square) and estimate of Hsmt (triangle)

[Figure: one panel per metro area (Atlanta, Boston, Charlotte, Chicago, Cleveland, Dallas, Denver, Detroit, Las Vegas, Los Angeles, Miami, Minneapolis, New York, Phoenix, Portland, San Diego, San Francisco, Seattle, Tampa, Washington, DC), with years 2006-2011 on the horizontal axis.]

Figure 4: Comparison of data on Hbt to predicted values, 2005-2011

[Figure: scatter of the data (vertical axis) against model-predicted values (horizontal axis). Fitted line: Data = 0.001 + 0.959 × Predicted, with standard errors of 0.002 and 0.014.]

Figure 5: Model Fit

[Figure: default start percentage (vertical axis) against LTV (horizontal axis). First panel: LTV < 100, all years. Second panel: 2009, all LTVs. Blue = data, red = model predicted.]

Figure 6: LTV > 100, 2010 and 2011

[Figure: default start percentage (vertical axis) against LTV from 100 to 200 (horizontal axis). Blue = data, red = model predicted.]

Table 7: Sample Sizes for Self-Reported House Values, 1-Family Owner-Occupied Units
2000 5% Census, 2005-2011 American Community Survey

                  5% Census                   American Community Survey
MSA                   2000     2005     2006     2007     2008     2009     2010     2011
Atlanta             34,865   10,986   11,180   11,462   11,130   11,398   10,579    8,637
Boston              30,588    7,952    7,875    7,873    7,546    7,706    7,596    7,572
Charlotte           13,125    4,147    4,268    4,501    4,479    4,486    4,357    3,870
Chicago             66,584   16,969   16,875   17,108   16,406   16,665   15,843   14,636
Cleveland           24,888    6,021    6,051    6,028    5,774    5,825    5,778    5,910
Dallas              42,525   12,850   12,908   13,306   13,258   13,315   12,975   12,158
Denver              20,847    5,958    5,957    6,060    5,851    5,856    5,813    5,688
Detroit             39,689    9,937    9,817    9,742    9,093    9,176    8,707    8,742
Las Vegas           11,319    3,767    3,880    3,951    3,853    3,832    3,680    3,417
Los Angeles         82,064   20,354   20,040   20,218   18,989   19,247   19,095   20,048
Miami               14,148    3,475    3,590    3,637    3,409    3,379    3,409    3,410
Minneapolis         24,213    5,883    5,943    5,895    5,806    5,730    5,489    5,161
New York           100,842   24,307   23,735   23,826   23,184   23,231   23,013   23,371
Phoenix             29,324    8,987    8,958    8,958    8,551    8,503    8,093    7,761
Portland            16,173    4,407    4,460    4,546    4,427    4,412    4,379    4,360
San Diego           20,526    5,603    5,529    5,387    5,159    5,194    5,065    5,143
San Francisco       38,864    9,583    9,411    9,371    8,780    9,052    8,870    8,774
Seattle             20,524    5,498    5,462    5,661    5,510    5,569    5,599    5,567
Tampa               24,491    7,219    7,189    7,283    6,860    6,828    6,735    7,066
Washington, DC      46,335   12,175   12,131   12,369   11,894   12,052   11,985   11,505
Top-code percent       1.2      4.3      5.3      5.6      0.6      0.6      1.0      1.0