DYNAMIC SPECIFICATION TESTS FOR DYNAMIC FACTOR MODELS


Gabriele Fiorentini and Enrique Sentana

CEMFI Working Paper No. 1306

June 2013

CEMFI Casado del Alisal 5; 28014 Madrid Tel. (34) 914 290 551 Fax (34) 914 291 056 Internet: www.cemfi.es

This paper is a substantially revised version of Fiorentini and Sentana (2009), which dealt with static factor models only. We are grateful to Dante Amengual and Andrew Harvey, as well as to seminar audiences at Bologna, Columbia, ECARES/ULB, Georgetown, Penn, Princeton, Salento, Toulouse, the Oxford-Man Institute Time Series Econometrics Conference in honour of Andrew Harvey, the fifth ICEEE Congress (Genova), the first Barcelona GSE Summer Forum and the RCEA Conference (Toronto) for helpful comments, discussions and suggestions. We are also grateful to Máximo Camacho and Gabriel Pérez-Quirós for allowing us to use their data and assisting us in reproducing their results. The remarks of an associate editor and two anonymous referees also led to a radical overhaul of the paper. Of course, the usual caveat applies. Financial support from MIUR through the project "Multivariate statistical models for risk assessment" (Fiorentini) and the Spanish Ministry of Science and Innovation through grant ECO 2011-26342 (Sentana) is gratefully acknowledged.



Abstract

We derive computationally simple and intuitive expressions for score tests of neglected serial correlation in common and idiosyncratic factors in dynamic factor models using frequency domain techniques. The implied time domain orthogonality conditions are analogous to the conditions obtained by treating the smoothed estimators of the innovations in the latent factors as if they were observed, but they account for their final estimation errors. Monte Carlo exercises confirm the finite sample reliability and power of our proposed tests. Finally, we illustrate their empirical usefulness in an application that constructs a monthly coincident indicator for the US from four macro series.

JEL Codes: C32, C38, C52, C12, C13.
Keywords: Kalman filter, LM tests, Spectral maximum likelihood, Wiener-Kolmogorov filter.

Gabriele Fiorentini Università di Firenze [email protected]

Enrique Sentana CEMFI [email protected]

1 Introduction

Dynamic factor models have been extensively used in macroeconomics and finance since their introduction by Sargent and Sims (1977) and Geweke (1977) as a way of capturing the cross-sectional and dynamic correlations between multiple series in a parsimonious way. A far from comprehensive list of early and more recent applications includes not only business cycle analysis (see Litterman and Sargent (1979), Stock and Watson (1989, 1991, 1993), Diebold and Rudebusch (1996) or Gregory, Head and Raynauld (1997)) and bond yields (Singleton (1981), Jegadeesh and Pennacchi (1996), Dungey, Martin and Pagan (2000) or Diebold, Rudebusch and Aruoba (2006)), but also wages (Engle and Watson (1981)), employment (Quah and Sargent (1993)), commodity prices (Peña and Box (1987)) and financial contagion (Mody and Taylor (2007)).

The model parameters are typically estimated by maximising the likelihood function of the observed data, which can be readily obtained either as a by-product of the Kalman filter prediction equations or from Whittle's (1962) frequency domain asymptotic approximation.1 Once the parameters have been estimated, filtered values of the latent factors can be extracted by means of the Kalman smoother or its Wiener-Kolmogorov counterpart. These estimation and filtering issues are well understood (see e.g. Harvey (1989)), and the same can be said of their efficient numerical implementation (see Jungbacker and Koopman (2008)). However, several important modelling issues arise in practice, such as the right number of factors or the identification of their effects. Another non-trivial empirical issue is the specification of the dynamics of common and idiosyncratic factors. When the cross-sectional dimension, N, is very large, one might expect to accurately recover the latent factors using simpler procedures (see Bai and Ng (2008) and the references therein). But in models in which N is small, the filtered estimates of the state variables are likely to be heavily influenced by the dynamic specification of the model, which thus becomes a first order issue.

The objective of our paper is precisely to provide diagnostics for neglected serial correlation in those state variables. For that reason, we focus on Lagrange Multiplier (LM) tests, which only require estimation of the model under the null. As is well known, Likelihood ratio (LR), Wald and LM tests are equivalent under the null and sequences of local alternatives as the number of observations increases for a fixed cross-sectional dimension, and therefore they share their optimality properties.2 In addition to computational considerations,

1 Watson and Engle (1983) and Quah and Sargent (1993) discuss the application of the EM algorithm of Dempster, Laird and Rubin (1977) in this context, which avoids the computation of the likelihood function. As is well known, though, this algorithm slows down considerably near the optimum, so it is best used as a procedure for obtaining good initial values.

2 Extensions to situations in which both data dimensions simultaneously grow are left for further research.


which are particularly relevant when one is concerned about several alternatives, an important advantage of LM tests expressed as score tests is that they often coincide with tests of easy to interpret moment conditions (see Newey (1985) and Tauchen (1985)), which will continue to have non-trivial power even in situations for which they are not optimal. As we shall see, our proposed tests are no exception in that regard.

Earlier work on specification testing in dynamic factor models includes Engle and Watson (1980), who explained how to apply the LM testing principle in the time domain for models with static factor loadings, Geweke and Singleton (1981), who studied LR and Wald tests in the frequency domain, and Fernández (1990), who applied the LM principle in the frequency domain to a multivariate "structural time series model" (see Harvey (1989) for a comparison of time domain and frequency domain testing methods in that context). Aside from considering a general class of models, our main contribution is that our proposed tests are very simple to implement, and even simpler to interpret. Once a model has been specified and estimated, score tests focusing on several departures from the null can be routinely computed from simple statistics of the estimated state variables. And even though our theoretical derivations make extensive use of spectral methods for time series, we provide both time domain and frequency domain interpretations of the relevant scores, so researchers who strongly prefer one method over the other could apply them without abandoning their favourite estimation techniques.

The rest of the paper is organised as follows. In section 2, we review the properties of dynamic factor models, their estimators and filters. Then, we derive our tests in section 3, and present a Monte Carlo evaluation of their finite sample behaviour in section 4. This is followed in section 5 by an empirical illustration that revisits the dynamic factor model used by Camacho, Pérez-Quirós and Poncela (2012) to construct a coincident indicator for the US. Finally, our conclusions, together with several interesting extensions, can be found in section 6. Auxiliary results are gathered in appendices.

2 Theoretical background

2.1 Dynamic factor models

To keep the notation to a minimum, we focus on single factor models, which suffice to illustrate our main results. A parametric version of a dynamic exact factor model for a finite dimensional vector of N observed series, y_t, can be defined in the time domain by the system of equations

y_t = π + c(L) x_t + u_t,
α_x(L) x_t = β_x(L) f_t,
α_{u_i}(L) u_{i,t} = β_{u_i}(L) v_{i,t},   i = 1, …, N,
(f_t, v_{1,t}, …, v_{N,t}) | I_{t−1}; π, θ ~ N[0, diag(1, γ_1, …, γ_N)],

where x_t is the common factor, u_t the N specific factors, c(L) = Σ_{ℓ=−F}^{M} c_ℓ L^ℓ a vector of possibly two-sided polynomials in the lag operator, α_x(L) and α_{u_i}(L) are one-sided polynomials of orders p_x and p_{u_i}, respectively, while β_x(L) and β_{u_i}(L) are one-sided (coprime) polynomials of orders q_x and q_{u_i}, I_{t−1} is an information set that contains the values of y_t and f_t up to, and including, time t − 1, π is the mean vector and θ refers to all the remaining model parameters.3

A specific example would be

y_{i,t} = π_i + c_{i,0} x_t + c_{i,1} x_{t−1} + u_{i,t},   i = 1, …, N,   (1)
x_t = α_{x1} x_{t−1} + f_t,
u_{i,t} = α_{u_i 1} u_{i,t−1} + v_{i,t},   i = 1, …, N.

Note that the dynamic nature of the model is the result of three different characteristics:

1. The serial correlation of the common factor

2. The serial correlation of the idiosyncratic factors

3. The dynamic impact of the common factor on the observed variables.

Thus, we would need to shut down all three sources to go back to a traditional static factor model (see Lawley and Maxwell (1971)). Cancelling only one or two of those channels still results in a dynamic factor model. For example, Engle and Watson (1981) considered models with static factor loadings, while Peña and Box (1987) further assumed that the specific factors were white noise. To some extent, characteristics 1 and 3 overlap, as one could always write any dynamic factor model in terms of white noise common factors. In this regard, the assumption of Arma(p_x, q_x) dynamics for the common factor can be regarded as a parsimonious way of modelling an infinite distributed lag.4

3 We could relax the assumption of cross-sectional orthogonality in the idiosyncratic terms, but in general we would still need to impose some parametric restrictions for identification purposes given that we maintain the assumption of fixed N.

4 Some dynamic factor models can be written as static factor models with a larger number of factors. For example, in model (1) we could define f_t and x_{t−1} as two "orthogonal" static factors, with factor loadings c_{i,0} and c_{i,1} + α_{x1} c_{i,0} respectively. Our tests, though, apply to all factor models, including those without a static factor representation.
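The dynamics of example (1) are easy to see in simulation. The following sketch generates data from model (1) under illustrative parameter values chosen by us (all function and variable names are ours, not the paper's):

```python
import numpy as np

def simulate_model_1(T, pi, c0, c1, alpha_x1, alpha_u1, gamma, seed=0):
    """Simulate example (1): y_t = pi + c0*x_t + c1*x_{t-1} + u_t, where
    x_t = alpha_x1*x_{t-1} + f_t and u_{i,t} = alpha_u1[i]*u_{i,t-1} + v_{i,t},
    with V(f_t) = 1 and V(v_{i,t}) = gamma[i]."""
    rng = np.random.default_rng(seed)
    N = len(c0)
    burn = 200                                    # burn-in to approximate stationarity
    f = rng.standard_normal(T + burn)             # common innovations
    v = rng.standard_normal((T + burn, N)) * np.sqrt(gamma)   # idiosyncratic innovations
    x = np.zeros(T + burn)
    u = np.zeros((T + burn, N))
    for t in range(1, T + burn):
        x[t] = alpha_x1 * x[t - 1] + f[t]         # AR(1) common factor
        u[t] = alpha_u1 * u[t - 1] + v[t]         # AR(1) specific factors
    y = pi + np.outer(x[burn:], c0) + np.outer(x[burn - 1:-1], c1) + u[burn:]
    return y, x[burn:], f[burn:]
```

Shutting down characteristic 1 (alpha_x1 = 0), 2 (alpha_u1 = 0) or 3 (c1 = 0) in this sketch reproduces the intermediate cases discussed above.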


In this paper we are interested in hypothesis tests for p_x = d_x vs p_x = d_x + k_x or p_{u_i} = d_{u_i} vs p_{u_i} = d_{u_i} + k_{u_i}, or the analogous hypotheses for q_x and q_{u_i}. To avoid dealing with nonsensical situations, we maintain the assumption that the model which has been estimated under the null is identified (see Geweke (1977) and Geweke and Singleton (1981) for a general discussion of identification in dynamic factor models, and Heaton and Solo (2004) for more specific results for the parametric models that we consider in this paper).

2.2 Tests of white noise vs. AR(1) in the common factors

Let us start by quickly reviewing the first order serial correlation tests obtained by Fiorentini and Sentana (2012). The baseline model in that paper is the static factor model

y_t = π + c x_t + u_t,   x_t = f_t,   u_t = v_t,
(f_t, v_t′)′ | I_{t−1}; π, θ ~ N[0, diag(1, Γ)],

which remains rather popular in finance (except in term structure applications) (see Connor, Goldberg and Korajczik (2010) and the references therein). The Kalman smoother yields the same factor estimates as the Kalman filter updating equations, which have simple closed form expressions:

f_{t|t} = f_{t|T} = c′ Σ^{−1} (y_t − π) = (1 + c′Γ^{−1}c)^{−1} c′ Γ^{−1} (y_t − π),
v_{t|t} = v_{t|T} = (y_t − π) − c x_{t|t},

where Σ = cc′ + Γ. A potentially interesting alternative would be:

y_t = π + c x_t + u_t,   x_t = ψ x_{t−1} + f_t,   u_t = v_t.

This alternative reduces to the static specification under the null H0 : ψ = 0. Otherwise, it has the autocorrelation structure of a Varma(1,1). Fiorentini and Sentana (2012) show that testing the null of multivariate white noise against such a complex Varma(1,1) specification is extremely easy. Specifically, they show that the average score with respect to ψ under H0 is

s̄_T = (1/T) Σ_{t=2}^{T} f_{t|T} f_{t−1|T},

which is entirely analogous to the score that one would use to test for first order serial correlation in f_t if the latent factors were observed (see Breusch and Pagan (1980) or Godfrey (1989)). The main difference is that the asymptotic variance of this score is [c′Σ^{−1}c]² < 1. Fiorentini and Sentana (2012) interpret c′Σ^{−1}c as the R² in the theoretical least squares projection of f_t on a constant and y_t. Therefore, the higher the degree of observability of the common factor, the closer the asymptotic variance of the average score will be to 1, which is the asymptotic variance of the first sample autocorrelation of f_t. Intuitively, this convergence result simply reflects the fact that the common factor becomes observable as the "signal to noise" ratio c′Γ^{−1}c approaches ∞. Before the limit, though, the test takes into account the unobservability of f_t. Given that c′Σ^{−1}c = (c′Γ^{−1}c)/[1 + (c′Γ^{−1}c)] under the assumption that Γ has full rank, the aforementioned R² will typically be close to 1 for N large due to the pervasive nature of the common factor (see e.g. Sentana (2004)).

When we move to testing say Ar(1) vs Ar(2) in the unobservable factors, the model is already dynamic under the null and the Kalman filter and smoother equations no longer coincide. More importantly, those equations are recursive and therefore difficult to characterise without solving a multivariate algebraic Riccati equation. Although a Lagrange Multiplier test of the new null hypothesis in the time domain is conceptually straightforward, the algebra is incredibly tedious and the recursive scores difficult to interpret (see Appendix A). An alternative way to characterise a dynamic factor model is in the frequency domain. As we shall see, the (non-recursive) frequency domain scores remain remarkably simple, since they closely resemble the scores of a static factor model.
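The average score s̄_T and its scaling are trivial to compute once the model under the null is parameterised. A minimal sketch under the static null, with c and Γ treated as known (function and variable names are ours):

```python
import numpy as np

def static_wn_score_test(y, c, gamma):
    """Score test of H0: psi = 0 in the static factor model y_t = pi + c x_t + u_t.
    Returns the average score s_T = (1/T) sum_t f_{t|T} f_{t-1|T}, an LM-type
    statistic T*s_T^2/(c'Sigma^{-1}c)^2 (asymptotically chi-squared(1) under H0),
    and the "observability" R^2 = c'Sigma^{-1}c discussed in the text."""
    T = y.shape[0]
    Sigma = np.outer(c, c) + np.diag(gamma)       # Sigma = c c' + Gamma
    w = np.linalg.solve(Sigma, c)                 # Sigma^{-1} c
    f_sm = (y - y.mean(axis=0)) @ w               # f_{t|T} = c' Sigma^{-1} (y_t - pi)
    r2 = float(c @ w)                             # c' Sigma^{-1} c
    s_T = float(np.mean(f_sm[1:] * f_sm[:-1]))    # average score under H0
    return s_T, T * s_T**2 / r2**2, r2
```

Note that r2 = (c′Γ^{−1}c)/(1 + c′Γ^{−1}c) < 1, so the statistic inflates the raw autocovariance to account for the unobservability of f_t, exactly as described above.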

2.3 Maximum likelihood estimation in the frequency domain

In what follows, we maintain the assumption that y_t is a covariance stationary process, possibly after suitable transformations as in section 5. Under stationarity, the spectral density matrix of the observed variables is proportional to

G_yy(λ) = c(e^{−iλ}) G_xx(λ) c′(e^{iλ}) + G_uu(λ),
G_xx(λ) = β_x(e^{−iλ}) β_x(e^{iλ}) / [α_x(e^{−iλ}) α_x(e^{iλ})],
G_uu(λ) = diag[G_{u_1u_1}(λ), …, G_{u_Nu_N}(λ)],
G_{u_iu_i}(λ) = γ_i β_{u_i}(e^{−iλ}) β_{u_i}(e^{iλ}) / [α_{u_i}(e^{−iλ}) α_{u_i}(e^{iλ})],

which inherits the exact single factor structure of the unconditional covariance matrix of a static factor model. Let

I_yy(λ) = (1/2πT) Σ_{t=1}^{T} Σ_{s=1}^{T} (y_t − π)(y_s − π)′ e^{−i(t−s)λ}   (2)

denote the periodogram matrix and λ_j = 2πj/T (j = 0, …, T − 1) the usual Fourier frequencies. If we assume that G_yy(λ) is not singular at all frequencies,5 the so-called Whittle (discrete spectral) approximation to the log-likelihood function is6

−(NT/2) ln(2π) − (1/2) Σ_{j=0}^{T−1} ln |G_yy(λ_j)| − (1/2) Σ_{j=0}^{T−1} tr{G_yy^{−1}(λ_j)[2π I_yy(λ_j)]}.   (3)

5 Otherwise, there would be a linear combination of the components of the y_t's at frequency λ that would be identically 0.

If we further assume that G_xx(λ) > 0 and G_{u_iu_i}(λ) > 0 for all i, computations can be considerably speeded up by exploiting that

G_yy^{−1}(λ) = G_uu^{−1}(λ) − ω(λ) G_uu^{−1}(λ) c(e^{−iλ}) c′(e^{iλ}) G_uu^{−1}(λ),
ω(λ) = [G_xx^{−1}(λ) + c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1}.

The MLE of π, which only enters through I_yy(λ), is the sample mean, so in what follows we focus on demeaned variables. In turn, the score with respect to all the remaining parameters is

d(θ) = (1/2) Σ_{j=0}^{T−1} {∂vec′[G_yy(λ_j)]/∂θ} M(λ_j) m(λ_j),
m(λ) = vec[2π I′_yy(λ) − G′_yy(λ)],
M(λ) = G_yy^{−1}(λ) ⊗ G_yy^{−1′}(λ).

We provide numerically reliable and fast to compute expressions for all the required derivatives in Appendix B. The information matrix is

Q = (1/4π) ∫_{−π}^{π} {∂vec′[G_yy(λ)]/∂θ} M(λ) {∂vec′[G_yy(λ)]/∂θ}* dλ,

where * denotes the conjugate transpose of a matrix. A consistent estimator will be provided either by the outer product of the score or by

(1/2T) Σ_{j=0}^{T−1} {∂vec′[G_yy(λ_j)]/∂θ} M(λ_j) {∂vec′[G_yy(λ_j)]/∂θ}*.

Formal results showing the strong consistency and asymptotic normality of the resulting ML estimators under suitable regularity conditions have been provided by Dunsmuir and Hannan (1976) and Dunsmuir (1979), who also show their asymptotic equivalence to the time domain ML estimators.7

6 There is also a continuous version which replaces sums by integrals (see Dunsmuir and Hannan (1976)).

7 This equivalence is not surprising in view of the contiguity of the Whittle measure in the Gaussian case (see Choudhuri, Ghosal and Roy (2004)).
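Since G_yy(λ) above is only defined up to proportionality, any implementation of (3) must fix a normalisation. The sketch below (names ours) takes G_yy(λ) to be 2π times the usual spectral density, in which case (3) reproduces the exact Gaussian log-likelihood in the white noise case:

```python
import numpy as np

def whittle_loglik(y, G):
    """Discrete Whittle approximation (3). y: T x N demeaned data; G(lam)
    returns the N x N spectral matrix G_yy(lam), here normalised as 2*pi times
    the usual spectral density, so Var(y_t) = (1/2pi) * integral of G."""
    T, N = y.shape
    dft = np.fft.fft(y, axis=0) / np.sqrt(T)      # dft_j dft_j* = 2*pi*I_yy(lam_j)
    ll = -0.5 * N * T * np.log(2.0 * np.pi)
    for j in range(T):
        lam = 2.0 * np.pi * j / T                 # Fourier frequency lam_j
        Gj = np.atleast_2d(G(lam))
        twopiI = np.outer(dft[j], dft[j].conj())  # 2*pi * periodogram matrix
        sign, logdet = np.linalg.slogdet(Gj)
        ll -= 0.5 * logdet
        ll -= 0.5 * np.real(np.trace(np.linalg.solve(Gj, twopiI)))
    return ll
```

For a dynamic factor model, G(lam) would assemble c(e^{−iλ}) G_xx(λ) c′(e^{iλ}) + G_uu(λ) from the ARMA polynomials of the factors.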


2.4 The (Kalman-)Wiener-Kolmogorov filter

By working in the frequency domain we can easily obtain smoothed estimators of the latent variables too. Specifically, let

y_t − π = ∫_{−π}^{π} e^{iλt} dZ_y(λ),   V[dZ_y(λ)] = G_yy(λ) dλ

denote Cramér's spectral decomposition of the observed process, which is the frequency domain analogue to Wold's decomposition. The Wiener-Kolmogorov two-sided filter for the common factor x_t at each frequency is given by

G_xx(λ) c′(e^{iλ}) G_yy^{−1}(λ) dZ_y(λ),

so that the spectral density of the smoother x^K_{t|T} as T → ∞ 8 will be

G_{x^Kx^K}(λ) = G_xx(λ) c′(e^{iλ}) G_yy^{−1}(λ) c(e^{−iλ}) G_xx(λ) = ω(λ) c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ}) G_xx(λ).   (4)

Hence, the spectral density of the final estimation error x_t − x^K_{t|∞} will be given by

G_xx(λ) − G_xx(λ) c′(e^{iλ}) G_yy^{−1}(λ) c(e^{−iλ}) G_xx(λ) = ω(λ).

Having obtained these, we can easily obtain the smoother for f_t, f^K_{t|∞}, by applying to x^K_{t|∞} the one-sided filter α_x(e^{−iλ})/β_x(e^{−iλ}). Likewise, we can derive its spectral density, as well as the spectral density of its final estimation error f_t − f^K_{t|∞}. Finally, we can obtain the autocovariances of x^K_{t|∞}, f^K_{t|∞} and their final estimation errors by applying the usual inverse Fourier transformation

cov(z_t, z_{t−k}) = ∫_{−π}^{π} e^{iλk} G_zz(λ) dλ.
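For the recurring AR(1)-common-factor, white-noise-idiosyncratic special case, (4) and the inverse Fourier transformation can be evaluated directly on a frequency grid. A sketch under those assumptions (names ours; the 1/2π factor reflects the normalisation of G adopted in our earlier sketch, where G is 2π times the spectral density):

```python
import numpy as np

def wk_factor_spectra(lam, c, gamma, alpha_x):
    """Spectral quantities for y_t = c x_t + u_t with AR(1) common factor and
    white-noise specific factors: returns (G_xx, G_xKxK, omega) at lam
    (scalar or array), using eq. (4)."""
    z = np.exp(-1j * lam)
    Gxx = 1.0 / np.abs(1.0 - alpha_x * z)**2      # |1 - alpha_x e^{-i lam}|^{-2}
    S = float(np.sum(c**2 / gamma))               # c' G_uu^{-1} c (static here)
    omega = 1.0 / (1.0 / Gxx + S)                 # final estimation error spectrum
    return Gxx, omega * S * Gxx, omega            # eq. (4) for G_xKxK

def acov_from_spectrum(Gvals, lams, K):
    """Autocovariances by the inverse Fourier transformation, approximated on a
    uniform grid over [0, 2pi); the 1/(2pi) matches our normalisation of G."""
    dlam = lams[1] - lams[0]
    return np.array([np.real(np.sum(np.exp(1j * lams * k) * Gvals)) * dlam / (2.0 * np.pi)
                     for k in range(K + 1)])
```

The first autocovariances of x^K_{t|∞} computed this way can then be compared directly with their sample counterparts.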

2.5 The minimal sufficient statistics for {x_t}

In any given realisation of the vector process {y_t}, the values of {x_t} could be regarded as a set of T parameters. With this interpretation in mind, we can define x^G_{t|∞} as the spectral GLS estimator of x_t through the transformation

[c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} c′(e^{iλ}) G_uu^{−1}(λ) dZ_y(λ).

8 The main difference between the Wiener-Kolmogorov filtered values, x^K_{t|∞}, and the Kalman filter smoothed values, x^K_{t|T}, results from the dependence of the former on a doubly infinite sequence of observations. As shown by Fiorentini (1995) and Gómez (1999), though, they can be made numerically identical by replacing both pre- and post-sample observations by their least squares projections onto the linear span of the sample observations.


Similarly, we can define u^G_{t|∞} through

{I_N − c(e^{−iλ}) [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} c′(e^{iλ}) G_uu^{−1}(λ)} dZ_y(λ).

It is then easy to see that the joint spectral density of x^G_{t|∞} and u^G_{t|∞} will be block-diagonal, with the (1,1) element being

G_xx(λ) + [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1}

and the (2,2) block

G_yy(λ) − c(e^{−iλ}) [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} c′(e^{iλ}),

whose rank is N − 1. This orthogonalisation may be regarded as the frequency domain version of the endogenous factorial representation in Gourieroux, Monfort and Renault (1991). As such, it allows us to factorise the spectral log-likelihood function of y_t as the sum of the log-likelihood function of x^G_{t|∞}, which is univariate, and the log-likelihood function of u^G_{t|∞}.9 Importantly, the parameters characterising G_xx(λ) only enter through the first component. In contrast, the remaining parameters affect both components. Moreover, we can easily show that

1. x^G_{t|∞} = x_t + ζ^G_{t|∞}, with x_t and ζ^G_{t|∞} orthogonal at all leads and lags10

2. The smoothed estimator of x_t obtained by applying the Wiener-Kolmogorov filter to x^G_{t|∞} coincides with x^K_{t|∞}.

This confirms that x^G_{t|∞} constitute minimal sufficient statistics for x_t, thereby generalising earlier results by Fiorentini, Sentana and Shephard (2004), who looked at the related class of factor models with time-varying volatility, and Jungbacker and Koopman (2008), who considered models in which c(e^{−iλ}) = c for all λ.11

9 The Jacobian of the transformation is 1, as the stacked filter

( [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} c′(e^{iλ}) G_uu^{−1}(λ) ; I_N − c(e^{−iλ}) [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} c′(e^{iλ}) G_uu^{−1}(λ) )

can be written as

diag{[c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1/2}, G_uu^{1/2}(λ)} × ( [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1/2} c′(e^{iλ}) G_uu^{−1/2}(λ) ; I_N − G_uu^{−1/2}(λ) c(e^{−iλ}) [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} c′(e^{iλ}) G_uu^{−1/2}(λ) ) × G_uu^{−1/2}(λ),

where the matrix in the centre is orthogonal.

10 This implies that E(x^G_{t|∞}|x_t) = x_t, which confirms that while x^K_{t|∞} can be understood as a Bayesian cross-sectional GLS estimator of x_t that uses the prior x_t ~ N[0, G_xx(λ)], x^G_{t|∞} relies on a diffuse prior instead.

11 It is also possible to relate x^G_{t|∞} to the first spectral principal component extracted from G_uu^{−1/2}(λ) G_yy(λ) G_uu^{−1/2}(λ) along the lines of Appendix 2 in Sentana (2004).


2.6 Autocorrelation structure of the factor estimators

As discussed in Maravall (1999), the serial dependence structure of the estimators of the latent variables can be a useful tool for model diagnostics. Large discrepancies between theoretical and empirical autocovariance functions of the factor estimators can be interpreted as an indication of model misspecification. As we shall see below, our LM tests carry out this comparison in a very precise statistical sense.

Smoothed factors, though, are the result of optimal symmetric two-sided filters. As a consequence, their serial correlation structure is generally different from that of the unobserved state variables. Specifically, the frequency by frequency orthogonality of predictor and prediction error implies that G_{x^Kx^K}(λ) ≤ G_xx(λ) for all λ, so that the smoothed estimates are smoother than the latent factors. In addition, the degree of unobservability of x_t depends exclusively on the size of [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1} relative to G_xx(λ), which is generally different at different frequencies. This can be visualised by representing over [−π, π] either G_{x^Kx^K}(λ), ω(λ) and their sum G_xx(λ), or the R²-type, signal to noise measure G_{x^Kx^K}(λ)/G_xx(λ).

In our general multivariate setting, the time domain structure of the smoothed components is complicated and difficult to interpret. There are special cases, however, in which the resulting models for the unobserved factors are rather simple. Consider first the case where β_x(L) = β_{u_1}(L) = ⋯ = β_{u_N}(L) = 1, so that all state variables follow purely autoregressive processes.

Moreover, assume static loadings to simplify the exposition. The Fourier transform of G_{x^Kx^K}(λ) yields the autocovariance generating function (Acgf) of x^K, which is given by

ACGF_{x^K}(L) = [Σ_{i=1}^{N} (c_i²/γ_i) α_{u_i}(L) α_{u_i}(L^{−1})] / {[Σ_{i=1}^{N} (c_i²/γ_i) α_{u_i}(L) α_{u_i}(L^{−1}) + α_x(L) α_x(L^{−1})] α_x(L) α_x(L^{−1})}
= σ²_{x^K} β_{x^K}(L) β_{x^K}(L^{−1}) / [α_{x^K}(L) α_{x^K}(L^{−1})],

where σ²_{x^K} denotes the variance of the univariate Wold innovations in x^K_{t|∞}.

Let p_u = max_i(p_{u_i}) and p = max(p_u, p_x). Then, it is easy to prove that β_{x^K}(L) and α_{x^K}(L) are polynomials of orders p_u and p + p_x respectively. Hence, the factor estimators will display the Acgf of an Arma(p + p_x, p_u). For example, when both common and specific factors follow AR(1) processes the factor estimators will display the autocorrelation of an Arma(2,1). It is also interesting to consider the special case in which the autoregressive polynomials

and

ui (L)

share some or even all their roots. In this latter case

x (L)

=

ui (L)

= (L),

and ACGFxK (L) =

(L) (L (L) (L

1)

1)

PN

2 i=1 ci = i PN 2 i=1 ci = i +

1 9

1 (L) (L

PN

2 1 i=1 ci = i = P N 1) 2 (L) (L i=1 ci = i + 1

1)

:

In this particular case the model for the common factor estimators is exactly the same as the model for the unobserved factor re-scaled by the static signal to noise ratio PN

c2i = i : PN i=12 c = + 1 i i=1 i

K , will be white noise. Moreover, the smoother of the innovations in the common factor, ftj1

Interestingly,

PN

c2 = GxK xK ( ) = PN i=1 i i Gxx ( ); 2 i=1 ci = i + 1

so that the ratio between the smoothed and the unobservable factor spectra is constant at all frequencies. Intuitively, this is due to the fact that in this special case the observable variables follow a Var(1) with a diagonal companion matrix whose innovations covariance matrix retains the static factor model properties. As a consequence, the quasi-di¤erenced data will have the usual static factor structure. Another way of obtaining the same result is by noticing that the dynamic GLS factor estimators, xG tj1 , will be a static transformation of the observed series in this case. More generally, when the common and speci…c factors follow Arma processes we have that

ACGF_{x^K}(L) = S(L) G_xx(L) / [1 + S(L) G_xx(L)] · G_xx(L),

where

S(L) = Σ_{i=1}^{N} (c_i²/γ_i) α_{u_i}(L) α_{u_i}(L^{−1}) / [β_{u_i}(L) β_{u_i}(L^{−1})] = θ_u(L) θ_u(L^{−1}) / [β_u(L) β_u(L^{−1})]

and G_xx(L) = β_x(L) β_x(L^{−1}) / [α_x(L) α_x(L^{−1})], so that

ACGF_{x^K}(L) = θ_u(L) θ_u(L^{−1}) β_x(L) β_x(L^{−1}) β_x(L) β_x(L^{−1}) / {α_x(L) α_x(L^{−1}) [β_u(L) β_u(L^{−1}) α_x(L) α_x(L^{−1}) + θ_u(L) θ_u(L^{−1}) β_x(L) β_x(L^{−1})]}.

If we further assume that the moving average polynomials of the specific factor processes are coprime, then

θ_u(L) θ_u(L^{−1}) = Σ_{i=1}^{N} (c_i²/γ_i) α_{u_i}(L) α_{u_i}(L^{−1}) β_{u,−i}(L) β_{u,−i}(L^{−1}),

with β_{u,−i}(L) = Π_{j≠i} β_{u_j}(L) and β_u(L) = Π_{i=1}^{N} β_{u_i}(L). For example, if all the factors are Arma(1,1) then the order of both θ_u(L) and β_u(L) will

be N, so the Acgf of the factor estimators is that of an Arma(N + 2, N + 2).

Let us now turn to the specific factors. The spectral matrix of the idiosyncratic smoother is

G_{u^Ku^K}(λ) = G_uu(λ) G_yy^{−1}(λ) G_uu(λ) = G_uu(λ) − [c′ G_uu^{−1}(λ) c + G_xx^{−1}(λ)]^{−1} cc′.

Similarly, the relation between the smoother and the unobserved factor spectral matrix is

G_{u^Ku^K}(λ) = {I − [c′ G_uu^{−1}(λ) c + G_xx^{−1}(λ)]^{−1} cc′ G_uu^{−1}(λ)} G_uu(λ).

The Fourier transform of G_{u^Ku^K}(λ) yields the Acgf of the Varma process for u^K_t, which is generally rather complicated. In the case of purely autoregressive unobserved factors, the generic i-th element of vecd[G_{u^Ku^K}(λ)] has the form

vecd[G_{u^Ku^K}(λ)]_i = γ_i / [α_{u_i}(e^{−iλ}) α_{u_i}(e^{iλ})] − c_i² / [α_x(e^{−iλ}) α_x(e^{iλ}) + Σ_{j=1}^{N} (c_j²/γ_j) α_{u_j}(e^{−iλ}) α_{u_j}(e^{iλ})]
= [γ_i α_x(e^{−iλ}) α_x(e^{iλ}) + Σ_{j≠i} (γ_i c_j²/γ_j) α_{u_j}(e^{−iλ}) α_{u_j}(e^{iλ})] / {α_{u_i}(e^{−iλ}) α_{u_i}(e^{iλ}) [α_x(e^{−iλ}) α_x(e^{iλ}) + Σ_{j=1}^{N} (c_j²/γ_j) α_{u_j}(e^{−iλ}) α_{u_j}(e^{iλ})]}.

Hence, if we call p_{u,−i} = max_{j≠i}(p_{u_j}, p_x), it is easy to see that u^K_{i,t} displays the autocorrelation structure of an Arma(p + p_{u_i}, p_{u,−i}).
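The shared-roots result above is easy to verify numerically: with AR(1) factors and static loadings, the ratio G_{x^Kx^K}(λ)/G_xx(λ) is flat across frequencies only when the autoregressive roots coincide, in which case it equals the static signal to noise ratio. A sketch under those assumptions (names ours):

```python
import numpy as np

def smoother_to_factor_ratio(lams, c, gamma, alpha_x, alpha_u):
    """Ratio G_{x^K x^K}(lam)/G_xx(lam) = omega(lam) * c'G_uu^{-1}(lam)c, by
    eq. (4), for AR(1) common and specific factors with static loadings."""
    z = np.exp(-1j * lams)[:, None]
    Guu_inv_diag = np.abs(1.0 - alpha_u[None, :] * z)**2 / gamma[None, :]  # 1/G_{u_i u_i}
    S = (c[None, :]**2 * Guu_inv_diag).sum(axis=1)                          # c' G_uu^{-1} c
    Gxx = 1.0 / np.abs(1.0 - alpha_x * z[:, 0])**2
    omega = 1.0 / (1.0 / Gxx + S)
    return omega * S
```

With shared roots the ratio collapses to (Σ c_i²/γ_i)/(Σ c_i²/γ_i + 1) at every frequency; with distinct roots it varies with λ, which is what makes dynamic misspecification visible in the smoothers.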

2.7 Testing AR(1) vs AR(2) for observable x_t

Although all previous spectral calculations are straightforward, they might seem daunting unless one is familiar with frequency domain methods. Fortunately, they have remarkably simple time domain counterparts. For pedagogical purposes, let us initially assume that x_t is observable. The model under the alternative is

(1 − α_{x1}L)(1 − ψ_{x1}L) x_t = f_t.

Therefore, the null is H0 : ψ_{x1} = 0.12 Under the alternative, the spectral density of x_t is

σ_f² / [(1 − α_{x1}e^{−iλ})(1 − α_{x1}e^{iλ})(1 − ψ_{x1}e^{−iλ})(1 − ψ_{x1}e^{iλ})].

12 This is a multiplicative alternative. Instead, we could test H0 : α_{x2} = 0 in the additive alternative

(1 − α_{x1}L − α_{x2}L²) x_t = f_t.

In that case, it would be more convenient to reparametrise the model in terms of partial autocorrelations as α_{x1} = φ_1(1 − φ_2), α_{x2} = φ_2. We stick to multiplicative alternatives, which cover MA terms too.


The derivative of G_xx(λ) with respect to ψ_{x1} under the null is

∂G_xx(λ)/∂ψ_{x1} = (e^{−iλ} + e^{iλ}) σ_f² / [(1 − α_{x1}e^{−iλ})(1 − α_{x1}e^{iλ})] = 2 cos λ · G_xx(λ).

Hence the spectral version of the score with respect to ψ_{x1} under H0 is

Σ_{j=0}^{T−1} cos λ_j G_xx^{−1}(λ_j) [2π I_xx(λ_j) − G_xx(λ_j)] = Σ_{j=0}^{T−1} cos λ_j [2π I_ff(λ_j)],

where we have exploited the fact that

Σ_{j=0}^{T−1} {∂G_xx(λ_j)/∂ψ_{x1}} G_xx^{−1}(λ_j) = 2 Σ_{j=0}^{T−1} cos λ_j = 0.

Given that

2π I_ff(λ_j) = γ̂_ff(0) + 2 Σ_{k=1}^{T−1} γ̂_ff(k) cos(k λ_j),

the spectral version of the score becomes

Σ_{j=0}^{T−1} cos λ_j [2π I_ff(λ_j)] = T [γ̂_ff(1) + γ̂_ff(T−1)].

In turn, the time domain version of the score will be

Σ_t (x_t − α_{x1}x_{t−1})(x_{t−1} − α_{x1}x_{t−2}) = Σ_t f_t f_{t−1},

which is essentially identical because γ̂_ff(T−1) = T^{−1} f_T f_1 = o_p(1). Therefore, the LM spectral test is simply checking that the first sample (circulant) autocovariance of f_t coincides with its theoretical value under H0, exactly like the usual Breusch-Godfrey serial correlation LM test.

3 Neglected serial correlation tests in dynamic factor models

3.1 Testing ARMA(p,q) vs ARMA(p+d,q) (or ARMA(p,q+d)) in the common factor

We can combine our previous results to test the same null hypothesis when x_t is not directly observed. As we saw before, the spectral density of the dynamic GLS estimator of the common factor is

G_{x^Gx^G}(λ) = G_xx(λ) + [c′(e^{iλ}) G_uu^{−1}(λ) c(e^{−iλ})]^{−1}.

As a result,

∂G_{x^Gx^G}(λ)/∂ψ_{x1} = ∂G_xx(λ)/∂ψ_{x1}.

Hence, the score with respect to ψ_{x1} will be given by

(1/2) Σ_{j=0}^{T−1} {∂G_xx(λ_j)/∂ψ_{x1}} G_{x^Gx^G}^{−2}(λ_j) [2π I_{x^Gx^G}(λ_j) − G_{x^Gx^G}(λ_j)].

After some straightforward algebraic manipulations, we can show that under the null of H0 : ψ_{x1} = 0 this score can be written as

Σ_{j=0}^{T−1} cos λ_j G_xx^{−1}(λ_j) [2π I_{x^Kx^K}(λ_j) − G_{x^Kx^K}(λ_j)] = Σ_{j=0}^{T−1} cos λ_j [2π I_{f^Kf^K}(λ_j) − G_{f^Kf^K}(λ_j)].

Once again, the time domain counterpart to the spectral score with respect to ψ_{x1} is (asymptotically) proportional to the difference between the first sample autocovariance of f^K_t and its theoretical counterpart under H0. Therefore, the only difference with the observable case is that the autocovariance of f^K_t, which is a forward filter of the Wold innovations of y_t, is no longer 0 when ψ_{x1} = 0, although it approaches 0 as the signal to noise ratio increases. In that case, our proposed tests would converge to the usual Breusch-Godfrey LM tests for neglected serial correlation discussed in section 2.7.

Let us illustrate our test by means of a simple example. Imagine that the model under the alternative is:

y_t = π + c x_t + u_t,   u_t = v_t,
(1 − α_{x1}L)(1 − ψ_{x1}L) x_t = f_t,
(f_t, v_t′)′ | I_{t−1}; π, θ ~ N[0, diag(1, Γ)].

The results in section 2.6 imply that x^K_{t|∞} will have the autocorrelation structure of an Ar(2) when ψ_{x1} = 0, while f^K_{t|∞} will follow an Ar(1) with first order autocovariance ρ σ²_{f^K}/(1 − ρ²), where ρ, its autoregressive coefficient, and σ²_{f^K}, the variance of its Wold innovations, are given by

ρ = {1 + α²_{x1} + (c′Γ^{−1}c) − √([(1 + α_{x1})² + (c′Γ^{−1}c)][(1 − α_{x1})² + (c′Γ^{−1}c)])} / (2α_{x1}),
σ²_{f^K} = (c′Γ^{−1}c) ρ / α_{x1}.

The larger (c′Γ^{−1}c) is, the closer this autocovariance will be to 0.

The LM test of H0 : ψ_{x1} = 0 will simply compare the first sample autocovariance of f^K_{t|∞}

with its theoretical value above. The advantage of our frequency domain approach is that we obtain those autocovariances without explicitly solving the Riccati equation. Unfortunately, the approach that we have used to obtain a residual correlation test for the common factor cannot be generally applied to the speci…c factors since the parameters in Guu ( ) a¤ect both components of the orthogonalised spectral log-likelihood function. Nevertheless, we can start from …rst principles by exploiting the fact that @vec[Gyy ( )] = [c(e @ x1 We saw before that under the null of H0 :

x1

i

)

c(e

i

)]

=0

@Gxx ( ) = 2 cos Gxx ( ): @ x1 13

@Gxx ( ) @ x1

Not surprisingly, if we introduce these derivatives in the formula for the spectral score with respect to

x1 ,

we end up with exactly the same frequency-domain and time-domain expressions.

Empirical researchers often assume that the common factors are white noise for identification purposes, so that G_xx(λ) = 1 under the null. Since we make no assumptions on p and q, our tests trivially apply in that situation too. Similarly, generalisations to test Arma(p,q) vs Arma(p+k,q) in the common factor are straightforward, since they only involve higher order autocovariances of f^K_{t|∞}. Similarly, it is easy to show that Arma(p+k,q) and Arma(p,q+k) multiplicative alternatives are locally asymptotically equivalent, as in the case of univariate tests for serial correlation in observable time series (see e.g. Godfrey (1988)).13 Finally, we could also consider (multiplicative) seasonal alternatives.

3.2

Testing ARMA(p,q) vs ARMA(p+d,q) (or ARMA(p,q+d)) in specific factors

Let ψ'_{u1} = (ψ_{u_1 1}, ..., ψ_{u_N 1}). In this case we have that

∂vec[G_yy(λ)]/∂ψ'_{u1} = E_N ∂vecd[G_uu(λ)]/∂ψ'_{u1},

where E_N is the "diagonalisation" matrix that maps vecd into vec (see Magnus (1988)). Straightforward algebraic manipulations allow us to write the score with respect to ψ_{u_i 1} under the null of H0: ψ_{u1} = 0 as

Σ_{j=0}^{T−1} 2 cos λ_j G^{-1}_{u_i u_i}(λ_j)[I_{u_i^K u_i^K}(λ_j) − G_{u_i^K u_i^K}(λ_j)]
= Σ_{j=0}^{T−1} 2 cos λ_j [I_{v_i^K v_i^K}(λ_j) − G_{v_i^K v_i^K}(λ_j)].

Thus, the time domain counterpart to the spectral score with respect to ψ_{u_i 1} will be proportional to the difference between the first sample autocovariance of v^K_{it} and its theoretical value under H0, as expected. Joint tests that look at several idiosyncratic terms together, as well as the common factor, can be easily obtained by combining the scores involved. As we shall see in sections 4.2 and 5, the component tests are rather good at identifying the source of the rejection.

3.3

Parameter uncertainty

So far we have implicitly assumed known model parameters under the null. In practice, some of them will have to be estimated. Maximum likelihood estimation of the dynamic factor model parameters can be done either in the time domain using the Kalman filter or in the frequency domain.

13 It would also be possible to develop tests of Arma(p,q) against Arma(p+k,q+k) along the lines of Andrews and Ploberger (1996). We leave those tests, which will also depend on the differences between sample and population autocovariances of f^K_{t|∞}, for future research.

The sampling uncertainty surrounding the sample mean is asymptotically inconsequential because the information matrix is block diagonal. The sampling uncertainty surrounding the other parameters is not necessarily so. In fact, block diagonality between the components of the information matrix corresponding to the parameters that define the alternative hypothesis, ψ, and the parameters that define the null, θ, is only obtained in some special cases. One such case arises when c(e^{−iλ}) = c and both common and idiosyncratic factors follow Ar(1) processes with a common autoregressive coefficient. An important example are the static factor models considered by Fiorentini and Sentana (2012). In that situation, all final prediction errors are white noise, and one can safely ignore the estimation error in θ. More generally, the solution is the standard one: replace the inverse of the (ψ, ψ) block of the information matrix by the (ψ, ψ) block of the inverse information matrix in the quadratic form that defines the LM test. For this reason, we provide computationally efficient expressions for the scores required to compute the information matrix in Appendix B. Importantly, the dual nature of our proposed tests implies that they can be applied regardless of whether we have estimated the model using a time domain or frequency domain log-likelihood.
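This standard correction can be sketched numerically (hypothetical numbers; `info` stands for the joint information matrix of the alternative parameters ψ and the estimated null parameters θ): the (ψ, ψ) block of the inverse information matrix is the inverse of a Schur complement, and using it never understates the variability of the score.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
info = A @ A.T + 5 * np.eye(5)       # hypothetical positive definite information matrix
k = 2                                # dim(psi), the parameters defining the alternative
I_pp, I_pt, I_tt = info[:k, :k], info[:k, k:], info[k:, k:]

# Schur complement of the theta block
schur = I_pp - I_pt @ np.linalg.solve(I_tt, I_pt.T)
# its inverse equals the (psi, psi) block of the inverse information matrix
assert np.allclose(np.linalg.inv(schur), np.linalg.inv(info)[:k, :k])

s = rng.standard_normal(k)           # hypothetical (scaled) average score for psi
lm_known_theta = s @ np.linalg.solve(I_pp, s)    # valid only with known theta
lm_estimated_theta = s @ np.linalg.solve(schur, s)
assert lm_estimated_theta >= lm_known_theta      # estimation error inflates the quadratic form
```

The comparison of the two quadratic forms makes explicit why ignoring the estimation error in θ is only innocuous in the special block-diagonal cases mentioned above.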

4

Monte Carlo simulation

4.1

Size experiment

To evaluate possible finite sample size distortions, we generate 10,000 samples of length 500, plus 50 for initialization (roughly 4 decades of monthly data). The exact model that we simulate and estimate under the null is

[y_{1,t}; y_{2,t}; y_{3,t}] = [.1; .1; .1] + [0.7; 0.5; 0.4] x_t + [u_{1,t}; u_{2,t}; u_{3,t}],
(1 − .4L − .2L²) x_t = f_t,
(1 + .4L) u_{1,t} = v_{1,t},   (1 − .6L) u_{2,t} = v_{2,t},   (1 − .2L) u_{3,t} = v_{3,t},
V(f_t) = 1,   vecd'[V(v_t)] = (0.4, 0.3, 0.8).

We compute LM tests against:

1. First order residual autocorrelation in the common factor (χ²₁)

2. First and second order residual autocorrelation in the common factor (χ²₂)

3. First order residual autocorrelation in all the specific factors (χ²₃)

4. First order residual autocorrelation in common and specific factors (χ²₄)

Importantly, all our tests are numerically invariant to whether in estimating the model we normalise the variance of the common factor x_t or its innovation f_t to 1 because of the way we compute the information matrix (see Dufour and Dagenais (1991)). The p-value discrepancy plots (see Davidson and MacKinnon (1998)) for the four tests are displayed in Figure 1. As can be seen, all tests have virtually no size distortions, with the joint tests showing even smaller distortions than the tests that focus on the common factor only.
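A p-value discrepancy plot graphs the empirical CDF of the Monte Carlo p-values minus the 45-degree line. The computation behind a figure of this kind can be sketched as follows (with uniform draws standing in for the p-values of a correctly sized test):

```python
import numpy as np

rng = np.random.default_rng(42)
R = 10_000
# hypothetical Monte Carlo p-values; under a correctly sized test they are U(0,1)
pvals = rng.uniform(size=R)

grid = np.linspace(0.001, 0.999, 999)            # nominal sizes on the horizontal axis
ecdf = np.searchsorted(np.sort(pvals), grid, side="right") / R
discrepancy = ecdf - grid                        # vertical axis of the discrepancy plot

# with uniform p-values the discrepancy stays within Monte Carlo noise
assert float(np.max(np.abs(discrepancy))) < 4 / np.sqrt(R)
```

A test with size distortions would show a discrepancy curve that wanders systematically away from zero.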

4.2

Power experiments

We first simulate and estimate 2,000 samples of length 500, plus 50 for initialization, in which the DGP for the common factor has the additional neglected dynamics

ψ_x(L) = (1 + .5L + .25L² + .125L³ + ...) = (1 − .5L)^{-1}

but the same first and second-order autocorrelation as under the null, so that

x_t = 0.874 x_{t−1} − 0.037 x_{t−2} + f_t − 0.5 f_{t−1}.    (5)
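As a check on (5) (with the signs of the coefficients reconstructed here under the assumption that the first two autocorrelations must match the null), one can verify numerically that the Arma(2,1) above shares ρ₁ = .5 and ρ₂ = .4 with the Ar(2) null (1 − .4L − .2L²)x_t = f_t:

```python
import numpy as np

# null AR(2): (1 - .4L - .2L^2) x_t = f_t
phi1, phi2 = 0.4, 0.2
rho1_null = phi1 / (1 - phi2)            # Yule-Walker: 0.5
rho2_null = phi1 * rho1_null + phi2      # 0.4

# alternative ARMA(2,1): x_t = .874 x_{t-1} - .037 x_{t-2} + f_t - .5 f_{t-1}
a1, a2, th = 0.874, -0.037, -0.5
psi = np.zeros(500)
psi[0], psi[1] = 1.0, a1 + th            # MA(inf) weights of the ARMA(2,1)
for k in range(2, 500):
    psi[k] = a1 * psi[k - 1] + a2 * psi[k - 2]
gamma = [np.sum(psi[: -h or None] * psi[h:]) for h in range(3)]
rho1_alt, rho2_alt = gamma[1] / gamma[0], gamma[2] / gamma[0]

assert abs(rho1_alt - rho1_null) < 1e-3
assert abs(rho2_alt - rho2_null) < 1e-3
```

This is exactly the design feature that forces the tests to detect the neglected dynamics through higher-order autocovariances rather than through the first two.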

We also re-scale the loadings so as to maintain the same unconditional signal to noise ratio as under the null in an attempt to isolate our power results from changes in the degree of observability of the factors. Everything else is left unchanged. The results are reported in Figure 2. As expected, the test that focuses on the correct alternative hypothesis has the largest power, followed by the test that also focuses on second order residual correlation in the common factor, which wastes one degree of freedom. Not surprisingly, the least powerful test is the one that only looks at the specific factors, which nevertheless retains some small power because their estimators are affected by the neglected serial correlation in the common factor.

We also simulate and estimate 2,000 samples of the same length as above in which the DGP for the specific factors has the additional root ψ_{ui}(L) = (1 + .2L), for i = 1, 2, 3, but the same first-order autocorrelation as under the null, so that

u_{1,t} = −0.418 u_{1,t−1} − 0.044 u_{1,t−2} + v_{1,t},
u_{2,t} = 0.514 u_{2,t−1} + 0.143 u_{2,t−2} + v_{2,t},    (6)
u_{3,t} = 0.185 u_{3,t−1} + 0.077 u_{3,t−2} + v_{3,t}.

Again, we re-scale V(v_t) in order to match the same unconditional signal to noise ratio as under the null, leaving everything else unchanged. The results are reported in Figure 3. Not surprisingly, the test that focuses on the correct alternative hypothesis has the largest power, followed by the joint test. This time, the tests that focus on the common factor have power essentially equal to nominal size. However, in the experiment reported in Figure 4 we find that when the neglected serial correlation in the specific factors is larger (ψ_{ui1} = .6) the tests that focus on the common factor regain some non-trivial power because their estimators under the null

5

Empirical illustration

We initially replicate the results in Camacho, Pérez-Quirós and Poncela (2012), who construct a monthly US coincident index by combining the indicators of economic activity previously analysed by Stock and Watson (1991), Chauvet (1998) and Chauvet and Piger (2008). Specifically, they use the industrial production index (IPI), nonfarm payroll employment (EMP), personal income less transfer payments (INC) and real manufacturing and trade sales (SAL). The sample covers the period January 1967 to November 2010 for a total effective sample length of 526 observations. As usual, the seasonally adjusted series are log-transformed and differenced to achieve stationarity. Their basic single factor model specification is

[IPI_t; EMP_t; INC_t; SAL_t] = [b_1; b_2; b_3; b_4] x_t + [u_{1,t}; u_{2,t}; u_{3,t}; u_{4,t}],
x_t = α_{x,1} x_{t−1} + α_{x,2} x_{t−2} + f_t,
u_{i,t} = α_{i,1} u_{i,t−1} + α_{i,2} u_{i,t−2} + v_{i,t},   i = 1, ..., 4.

Each variable is individually standardised, the first two observations are discarded and the scale indeterminacy is eliminated by setting Var(f_t) = 1. We report the spectral maximum likelihood estimates of the parameters in Table 1.

Table 1: Spectral maximum likelihood estimates

              x      IPI     EMP     INC     SAL
  b_i         -      0.68    0.50    0.28    0.45
  α_1         0.43  -0.25    0.24   -0.20   -0.36
  α_2         0.22  -0.21    0.52   -0.05   -0.16
  variance    1      0.27    0.25    0.85    0.59

These estimates are very close to the estimates obtained on the basis of the usual time domain log-likelihood. Our spectral LM test against first order neglected residual serial correlation in the common factor takes the value of 4.28 with a p-value of 3.9%. The same specification test for all the idiosyncratic factors is equal to 34.01, whose p-value is essentially zero. Not surprisingly, the joint test (36.3) rejects the null of correct specification at all conventional levels of significance. As we saw in section 4.2, though, the massive rejection of the null in the case of the idiosyncratic factors might partly explain the mild rejection observed for the common factor. For that reason, we re-estimate the model with the same Ar(2) specification for the common factor, but allowing for Arma(2,1) idiosyncratic terms. This time we no longer reject when we look at either the common factor or the idiosyncratic terms.

Camacho, Pérez-Quirós and Poncela (2012) argue that many features of the business cycle are better represented by a Markov switching model than by a linear model. In this regard, we prove in Appendix C that a simple two-regime Markov model for the mean of the common factor would generate the autocorrelation structure of an Arma(1,1) process. Therefore, the Ar(2) specification for x_t should have been rejected. Nevertheless, it is conceptually possible that the implied Arma(1,1) process could be such that an Ar(2) still provides a reasonable linear approximation. In any case, our results suggest that their Markov switching model should allow for more flexible dynamics in the idiosyncratic terms.
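The reported p-values can be reproduced from the χ² distribution. A minimal sketch, computing the survival function from the standard recurrence (the degrees of freedom for the idiosyncratic test, one per series tested, are an assumption here):

```python
import math

def chi2_sf(x, df):
    """P(X > x) for X ~ chi-square with integer df, via the recurrence
    Q(x; k+2) = Q(x; k) + (x/2)**(k/2) * exp(-x/2) / Gamma(k/2 + 1)."""
    if df % 2 == 0:
        q, k = math.exp(-x / 2.0), 2
    else:
        q, k = math.erfc(math.sqrt(x / 2.0)), 1
    while k < df:
        q += (x / 2.0) ** (k / 2.0) * math.exp(-x / 2.0) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

assert abs(chi2_sf(2.0, 2) - math.exp(-1.0)) < 1e-12   # sanity check of the recurrence
assert abs(chi2_sf(4.28, 1) - 0.039) < 0.001           # common factor test: 3.9%
assert chi2_sf(34.01, 4) < 1e-5                        # idiosyncratic test: essentially zero
```

The recurrence keeps the computation in the standard library; any statistics package would of course give the same numbers.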

6

Conclusions and extensions

We derive computationally simple and intuitive expressions for score tests of neglected serial correlation in common and idiosyncratic factors in dynamic factor models using frequency domain methods. Our tests can focus on all state variables, the common factors only, the specific factors only, or indeed some of their elements. The implicit orthogonality conditions are analogous to the conditions obtained by treating the Wiener-Kolmogorov-Kalman smoothed estimators of the innovations in common and idiosyncratic factors as if they were observed, but they account for their final estimation errors.

We conduct Monte Carlo exercises to study the finite sample reliability and power of our proposed tests. Our simulation results suggest that they have rather accurate sizes in finite samples. They also confirm that our tests have power to detect neglected serial correlation in common or specific factors, and that they are also systematically able to correctly identify the source of the rejection.

Finally, we evaluate the empirical usefulness of our tests by assessing the dynamic factor model used by Camacho, Pérez-Quirós and Poncela (2012) to construct a coincident indicator for the US.

The testing procedures developed in the previous sections can be extended in several interesting directions. One obvious possibility would be to consider models with multiple common factors. Although this would be intensive in notation, such an extension would be otherwise straightforward after dealing with the usual identification issues before estimating the model under the null. It should also be possible to extend our procedures to multivariate regressions whose residuals follow a dynamic factor model. In that regard, Fiorentini and Sentana (2012) provide a thorough comparison of the LM tests that we have considered with serial correlation tests based on reduced form residuals, showing that there are clear power gains from exploiting the cross-sectional dependence structure implicit in factor models.

Another worthwhile extension would cover restrictions on the dynamic factor loadings. Examples of interesting null hypotheses of this sort would be the equality of the impulse responses of a common factor for two or more observed series, or a finite lag limit for those responses. Given that we show in Appendix B that the scores of the dynamic loadings can be related to the normal equations in a distributed lag regression of y_t on x^K_{t|∞}, it should be fairly straightforward to derive those tests. Relatedly, we could also study the asymptotic power properties of our proposed tests against such alternatives, or indeed alternatives in which there are missing dynamic factors under the null.

Throughout the paper we have maintained the assumption of normality. To understand its implications, let μ_t and Σ_t denote the conditional mean vector and covariance matrix of y_t given its past alone, which can be obtained from the prediction equations of the Kalman filter. Given that the serial correlation parameters ψ effectively enter through μ_t only, the information matrix equality should continue to hold for their scores. In any case, Dunsmuir and Hannan (1976) and Dunsmuir (1979) already provided sandwich formulas for the asymptotic variances of estimators obtained by maximising the spectral log-likelihood function (3). Similarly, it would be straightforward to exploit the asymptotic orthogonality of the frequency components of the Whittle likelihood to devise suitable bootstrap procedures (see Dahlhaus and Janas (1996) or Kirch and Politis (2011)).

Although we have only considered state variables with rational spectral densities, in principle our methods could be applied to long memory processes too. In this regard, it would be worth exploring the long memory alternatives considered by Robinson (1991). More generally, it would also be interesting to consider non-parametric alternatives such as the ones studied by Hong (1996), in which the lag length is implicitly determined by the choice of bandwidth parameter in a kernel-based estimator of a spectral density matrix.

Another potential extension would directly deal with non-stationary factor models, such as the common stochastic trends models in Peña and Poncela (2006), without transforming the observed variables to achieve stationarity. In this regard, we would expect our proposed tests to remain valid in those circumstances too because they focus on the stationary components of dynamic factor models.

Given their ubiquitousness in the recent empirical literature (see e.g. Bai and Ng (2008) and the references therein), the extension of our methods to approximate factor models in which (i) the cross-sectional dimension is non-negligible relative to the time series dimension; and (ii) there is some mild contemporaneous and dynamic correlation between the idiosyncratic terms would constitute a very valuable addition. Although Doz, Giannone and Reichlin (2012) have recently proved the consistency of the Gaussian pseudo ML estimators that we have used in such contexts, the extension of our tests would require asymptotic distributions under suitable rates, which are as yet unknown.

Finally, it is worth mentioning that although we have exploited some specificities of dynamic factor models, our procedures can be easily extended to most unobserved components time series processes, and in particular to Ucarima models (see Maravall (1999)) and the state-space models underlying the recent nowcasting literature (see Banbura, Giannone and Reichlin (2010) and the references therein). We are currently pursuing some of these research avenues.


References

Andrews, D.W.K. and Ploberger, W. (1996): "Testing for serial correlation against an Arma(1,1) process", Journal of the American Statistical Association 91, 1331-1342.

Bai, J. and Ng, S. (2008): "Large dimensional factor analysis", Foundations and Trends in Econometrics 3, 89-163.

Banbura, M., Giannone, D., and Reichlin, L. (2010): "Nowcasting", forthcoming in M.P. Clements and D.F. Hendry (eds.), Oxford Handbook on Economic Forecasting.

Breusch, T.S. and Pagan, A.R. (1980): "The Lagrange Multiplier test and its applications to model specification in econometrics", Review of Economic Studies 47, 239-253.

Camacho, M., Pérez-Quirós, G. and Poncela, P. (2012): "Extracting nonlinear signals from several economic indicators", CEPR Discussion Paper 8865.

Chauvet, M. (1998): "An econometric characterization of business cycle dynamics with factor structure and regime switches", International Economic Review 39, 969-996.

Chauvet, M. and Piger, J. (2008): "A comparison of the real-time performance of business cycle dating methods", Journal of Business and Economic Statistics 26, 42-49.

Choudhuri, N., Ghosal, S. and Roy, A. (2004): "Contiguity of the Whittle measure for a Gaussian time series", Biometrika 91, 211-218.

Connor, G., Goldberg, L.R. and Korajczyk, R.A. (2010): Portfolio risk analysis, Princeton University Press.

Dahlhaus, R. and Janas, D. (1996): "A frequency domain bootstrap for ratio statistics in time series analysis", Annals of Statistics 24, 1934-1963.

Davidson, R. and MacKinnon, J.G. (1998): "Graphical methods for investigating the size and power of test statistics", The Manchester School 66, 1-26.

Dempster, A., Laird, N. and Rubin, D. (1977): "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society B 39, 1-38.

Diebold, F.X. and Rudebusch, G.D. (1996): "Measuring business cycles: a modern perspective", Review of Economics and Statistics 78, 67-77.

Diebold, F.X., Rudebusch, G.D. and Aruoba, B. (2006): "The macroeconomy and the yield curve: a dynamic latent factor approach", Journal of Econometrics 131, 309-338.

Doz, C., Giannone, D. and Reichlin, L. (2012): "A quasi maximum likelihood approach for large approximate dynamic factor models", Review of Economics and Statistics 94, 1014-1024.

Dufour, J.M. and Dagenais, M.G. (1991): "Invariance, nonlinear models and asymptotic tests", Econometrica 59, 1601-1615.

Dungey, M., Martin, V.L. and Pagan, A.R. (2000): "A multivariate latent factor decomposition of international bond yield spreads", Journal of Applied Econometrics 15, 697-715.

Dunsmuir, W. (1979): "A central limit theorem for parameter estimation in stationary vector time series and its application to models for a signal observed with noise", Annals of Statistics 7, 490-506.

Dunsmuir, W. and Hannan, E.J. (1976): "Vector linear time series models", Advances in Applied Probability 8, 339-364.

Engle, R.F. and Watson, M.W. (1980): "Formulation générale et estimation de modèles multidimensionnels temporels à facteurs explicatifs non observables", Cahiers du Séminaire d'Économétrie 22, 109-125.

Engle, R.F. and Watson, M.W. (1981): "A one-factor multivariate time series model of metropolitan wage rates", Journal of the American Statistical Association 76, 774-781.

Fernández, F.J. (1990): "Estimation and testing of a multivariate exponential smoothing model", Journal of Time Series Analysis 11, 89-105.

Fiorentini, G. (1995): Conditional heteroskedasticity: some results on estimation, inference and signal extraction, with an application to seasonal adjustment, unpublished Doctoral Dissertation, European University Institute.

Fiorentini, G. and Sentana, E. (2009): "Dynamic specification tests for static factor models", CEMFI Working Paper 0912.

Fiorentini, G. and Sentana, E. (2012): "Tests for serial dependence in static, non-Gaussian factor models", CEMFI Working Paper 1211.

Fiorentini, G., Sentana, E. and Shephard, N. (2004): "Likelihood estimation of latent generalised Arch structures", Econometrica 72, 1481-1517.

Geweke, J.F. (1977): "The dynamic factor analysis of economic time series models", in D. Aigner and A. Goldberger (eds.), Latent Variables in Socioeconomic Models, 365-383, North-Holland.

Geweke, J.F. and Singleton, K.J. (1981): "Maximum likelihood 'confirmatory' factor analysis of economic time series", International Economic Review 22, 37-54.

Godfrey, L.G. (1988): Misspecification tests in econometrics: the Lagrange multiplier principle and other approaches, Econometric Society Monographs.

Gómez, V. (1999): "Three equivalent methods for filtering finite nonstationary time series", Journal of Business and Economic Statistics 17, 109-116.

Gourieroux, C., Monfort, A. and Renault, E. (1991): "A general framework for factor models", mimeo, INSEE.

Gregory, A.W., Head, A.C. and Raynauld, J. (1997): "Measuring world business cycles", International Economic Review 38, 677-701.

Harvey, A.C. (1989): Forecasting, structural time series models and the Kalman filter, Cambridge University Press, Cambridge.

Heaton, C. and Solo, V. (2004): "Identification of causal factor models of stationary time series", Econometrics Journal 7, 618-627.

Hong, Y. (1996): "Consistent testing for serial correlation of unknown form", Econometrica 64, 837-864.

Jegadeesh, N. and Pennacchi, G.G. (1996): "The behavior of interest rates implied by the term structure of eurodollar futures", Journal of Money, Credit and Banking 28, 426-446.

Jungbacker, B. and Koopman, S.J. (2008): "Likelihood-based analysis for dynamic factor models", Tinbergen Institute Discussion Paper 2008-0007.

Kirch, C. and Politis, D.N. (2011): "TFT-Bootstrap: resampling time series in the frequency domain to obtain replicates in the time domain", Annals of Statistics 39, 1427-1470.

Landefeld, J.S., Seskin, E.P. and Fraumeni, B.M. (2008): "Taking the pulse of the economy: measuring GDP", Journal of Economic Perspectives 22, 193-216.

Lawley, D.N. and Maxwell, A.E. (1971): Factor analysis as a statistical method, 2nd ed., Butterworths, London.

Litterman, R. and Sargent, T.J. (1979): "Detecting neutral price level changes and the effects of aggregate demand with index models", Research Department Working Paper 125, Federal Reserve Bank of Minneapolis.

Magnus, J.R. (1988): Linear structures, Oxford University Press, New York.

Magnus, J.R. and Neudecker, H. (1988): Matrix differential calculus with applications in Statistics and Econometrics, Wiley, Chichester.

Maravall, A. (1999): "Unobserved components in economic time series", in M.H. Pesaran and M.R. Wickens (eds.), Handbook of Applied Econometrics. Volume I: Macroeconomics, Blackwell.

Mody, A. and Taylor, M.P. (2007): "Regional vulnerability: the case of East Asia", Journal of International Money and Finance 26, 1292-1310.

Newey, W.K. (1985): "Maximum likelihood specification testing and conditional moment tests", Econometrica 53, 1047-1070.

Peña, D. and Box, G.E.P. (1987): "Identifying a simplifying structure in time series", Journal of the American Statistical Association 82, 836-843.

Peña, D. and Poncela, P. (2006): "Nonstationary dynamic factor analysis", Journal of Statistical Planning and Inference 136, 1237-1257.

Quah, D. and Sargent, T. (1993): "A dynamic index model for large cross sections", in J.H. Stock and M.W. Watson (eds.), Business cycles, indicators and forecasting, 285-310, University of Chicago Press.

Robinson, P.M. (1991): "Testing for strong serial correlation and dynamic conditional heteroskedasticity in multiple regression", Journal of Econometrics 47, 67-84.

Sentana, E. (2000): "The likelihood function of conditionally heteroskedastic factor models", Annales d'Economie et de Statistique 58, 1-19.

Sentana, E. (2004): "Factor representing portfolios in large asset markets", Journal of Econometrics 119, 257-289.

Singleton, K.J. (1981): "A latent time series model of the cyclical behavior of interest rates", International Economic Review 21, 559-575.

Stock, J.H. and Watson, M.W. (1989): "New indexes of coincident and leading economic indicators", NBER Macroeconomics Annual 4, 351-394.

Stock, J.H. and Watson, M.W. (1991): "A probability model of the coincident economic indicators", in K. Lahiri and G. Moore (eds.), Leading economic indicators: new approaches and forecasting records, Cambridge University Press.

Stock, J.H. and Watson, M.W. (1993): "A procedure for predicting recessions with leading indicators: econometric issues and recent experience", in J.H. Stock and M.W. Watson (eds.), Business cycles, indicators, and forecasting, 95-153, University of Chicago Press.

Tauchen, G. (1985): "Diagnostic testing and evaluation of maximum likelihood models", Journal of Econometrics 30, 415-443.

Watson, M.W. and Engle, R.F. (1983): "Alternative algorithms for estimation of dynamic MIMIC, factor, and time varying coefficient regression models", Journal of Econometrics 23, 385-400.

Watson, M.W. and Kraft, D.F. (1984): "Testing the interpretation of indices in a macroeconomic index model", Journal of Monetary Economics 13, 165-182.

Whittle, P. (1962): "Gaussian estimation in stationary time series", Bulletin of the International Statistical Institute 39, 105-129.

Appendix A

Time domain tests

The simplest state space representation of a dynamic factor model with an Ar(2) common factor is:

1. Measurement equation:

y_t = (c|0) (x_t, x_{t-1})' + u_t;

2. Transition equation:

(x_t, x_{t-1})' = [ψ_1, ψ_2; 1, 0] (x_{t-1}, x_{t-2})' + (1, 0)' f_t.

Therefore, the Kalman filter prediction equations will be:

(x_{t|t-1}, x_{t-1|t-1})' = [ψ_1, ψ_2; 1, 0] (x_{t-1|t-1}, x_{t-2|t-1})',   (A1)

Ω_{t|t-1} = [ω_{11,t|t-1}, ω_{21,t|t-1}; ω_{21,t|t-1}, ω_{22,t|t-1}] with

ω_{11,t|t-1} = ψ_1² ω_{11,t-1|t-1} + 2 ψ_1 ψ_2 ω_{21,t-1|t-1} + ψ_2² ω_{22,t-1|t-1} + 1,
ω_{21,t|t-1} = ψ_1 ω_{11,t-1|t-1} + ψ_2 ω_{21,t-1|t-1},
ω_{22,t|t-1} = ω_{11,t-1|t-1},   (A2)

together with

y_{t|t-1} = (c|0)(x_{t|t-1}, x_{t-1|t-1})' = c x_{t|t-1}

and

Σ_{t|t-1}(θ) = (c|0) Ω_{t|t-1} (c|0)' + Γ = c ω_{11,t|t-1} c' + Γ.

Similarly, the updating equations will be:

(x_{t|t}, x_{t-1|t})' = (x_{t|t-1}, x_{t-1|t-1})' + Ω_{t|t-1}(c|0)' Σ_{t|t-1}^{-1}(θ)(y_t − c x_{t|t-1})
= (x_{t|t-1} + ω_{11,t|t-1} c' Σ_{t|t-1}^{-1}(θ)(y_t − c x_{t|t-1}), x_{t-1|t-1} + ω_{21,t|t-1} c' Σ_{t|t-1}^{-1}(θ)(y_t − c x_{t|t-1}))'   (A3)

and

Ω_{t|t} = Ω_{t|t-1} − Ω_{t|t-1}(c|0)' Σ_{t|t-1}^{-1}(θ)(c|0) Ω_{t|t-1},

whose distinct elements are

ω_{11,t|t} = ω_{11,t|t-1} − ω²_{11,t|t-1} c' Σ_{t|t-1}^{-1}(θ) c,
ω_{21,t|t} = ω_{21,t|t-1} − ω_{21,t|t-1} ω_{11,t|t-1} c' Σ_{t|t-1}^{-1}(θ) c,
ω_{22,t|t} = ω_{22,t|t-1} − ω²_{21,t|t-1} c' Σ_{t|t-1}^{-1}(θ) c.   (A4)

If we call

f_{t|t} = c' Σ_{t|t-1}^{-1}(θ)(y_t − c x_{t|t-1})   (A5)

and

ϖ_{t|t} = 1 − ω_{11,t|t-1} c' Σ_{t|t-1}^{-1}(θ) c,   (A6)

where we can interpret f_{t|t} as the conditional expectation of f_t given y_t, y_{t-1}, ..., and ϖ_{t|t} as the covariance between x_t and f_t conditional on y_t, y_{t-1}, ..., then we can write the previous expressions as

(x_{t|t}, x_{t-1|t})' = (x_{t|t-1} + ω_{11,t|t-1} f_{t|t}, x_{t-1|t-1} + ω_{21,t|t-1} f_{t|t})'

and

ω_{11,t|t} = ω_{11,t|t-1} ϖ_{t|t},
ω_{21,t|t} = ω_{21,t|t-1} ϖ_{t|t},
ω_{22,t|t} = (ω_{22,t|t-1} − ω²_{21,t|t-1} ω^{-1}_{11,t|t-1}) + ω²_{21,t|t-1} ω^{-1}_{11,t|t-1} ϖ_{t|t}.

These expressions simplify considerably further when |Γ| > 0, in which case the Woodbury formula yields

Σ_{t|t-1}^{-1}(θ) = Γ^{-1} − Γ^{-1} c (ω^{-1}_{11,t|t-1} + c'Γ^{-1}c)^{-1} c' Γ^{-1}.

Specifically,

ϖ_{t|t} = ω^{-1}_{11,t|t-1} / (ω^{-1}_{11,t|t-1} + c'Γ^{-1}c)

and

f_{t|t} = ϖ_{t|t} c' Γ^{-1} (y_t − c x_{t|t-1}),

where we have used the fact that

c' Σ_{t|t-1}^{-1}(θ) = ϖ_{t|t} c' Γ^{-1}

and

c' Σ_{t|t-1}^{-1}(θ) c = c'Γ^{-1}c / (1 + ω_{11,t|t-1} c'Γ^{-1}c).
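As a numerical check on these recursions, the following sketch (with hypothetical parameter values) runs the matrix prediction and updating equations for a trivariate system with an Ar(2) common factor and verifies the scalar shortcuts based on f_{t|t} and ϖ_{t|t}:

```python
import numpy as np

# hypothetical parameter values for a trivariate system with an AR(2) factor
c = np.array([0.7, 0.5, 0.4])            # loadings
gamma = np.diag([0.4, 0.3, 0.8])         # idiosyncratic variances (Gamma)
psi1, psi2 = 0.4, 0.2                    # AR(2) coefficients of x_t
T_mat = np.array([[psi1, psi2], [1.0, 0.0]])
B = np.array([1.0, 0.0])
C = np.column_stack([c, np.zeros(3)])    # measurement matrix (c | 0)

rng = np.random.default_rng(0)
x_f, Om_f = np.zeros(2), np.eye(2)       # filtered state mean and variance
for _ in range(200):
    # prediction equations (A1)-(A2)
    x_p = T_mat @ x_f
    Om_p = T_mat @ Om_f @ T_mat.T + np.outer(B, B)
    y = rng.standard_normal(3)           # any data point will do for the algebra check
    Sigma = C @ Om_p @ C.T + gamma       # = c w11 c' + Gamma
    nu = y - C @ x_p                     # innovation y_t - c x_{t|t-1}
    K = Om_p @ C.T @ np.linalg.inv(Sigma)
    x_f = x_p + K @ nu                   # updating (A3)
    Om_f = Om_p - K @ C @ Om_p           # updating (A4)

    # scalar shortcuts via f_{t|t} (A5) and varpi_{t|t} (A6)
    f_tt = c @ np.linalg.inv(Sigma) @ nu
    varpi = 1.0 - Om_p[0, 0] * (c @ np.linalg.inv(Sigma) @ c)
    assert np.isclose(x_f[0], x_p[0] + Om_p[0, 0] * f_tt)
    assert np.isclose(Om_f[0, 0], Om_p[0, 0] * varpi)
```

Only the (1,1) elements of the state recursion need to be tracked for the score with respect to ψ_2, which is what the expressions below exploit.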

In order to …nd the log-likelihood score, it is convenient to write d d

t(

tjt 1 (

) = dc xtjt

1

) = dc ! 11tjt

+ c dxtjt 1c

0

1;

+ c d! 11tjt

1

c0 + c! 11tjt

1

dc0 + d ;

whence @

t( 0

)

@ @vec[ t ( )] @ 0 xtjt 1 @ xt 1jt 1 @ 0

= (xtjt = (IN 2

1

IN )

@xtjt 1 @c ; 0 +c @ @ 0

KN N )(c! 11tjt

1

IN )

@c + (c @ 0

@vec = [( xt

1jt 1

xt

2jt 1

)

I2 ] 26

@

c)

1

2

1

0 0

@! 11tjt @ 0

1

+ EN

@ ; @ 0 @

+

1

2

1

0

xt xt @

1jt 1 2jt 1 0

and @vec( @

@vec

1)

tjt 0

= (I4

K22 )

+

1

2

1

0

I2

t 1jt 1

1

2

1

2

1

0

1

0

@vec(

2

0 0

@ 1) :

t 1jt 0

@

1

1

These last two expressions can be considerably simpli…ed if we di¤erentiate (A1) and (A2) directly. Speci…cally, @xtjt @ 0 @! 11tjt @ 0

1

1

= xt

= 2( 1 ! 11t +

@! 21tjt @ 0

1

@ @

1 0

1jt 1

+

1jt 1

2 @! 11t 1jt 1 1 0

@

= ! 11t

1jt

@ 1 @

1 0

+ xt

2jt 1

@ @

2 0

2 ! 21t 1jt 1 )

+2

+

@ @

1 0

@! 21t 1jt @ 0 @ 2 + 1jt 1 @ 0

1

1

+

+ 2( 1 ! 21t 1

1 2

+ ! 21t

@xt 1jt @ 0

2

1jt 1

@xt 2jt @ 0

+

1

;

2 ! 22t 1jt 1 )

@ @

2 0

2 @! 22t 1jt 1 ; 2 0

+

@

@! 11t 1 @

1jt 1 0

+

2

@! 21t @

1jt 1 ; 0

1 tjt 1 (

)(yt

and @! 22tjt @ 0

1

@! 11t @

=

1jt 1 : 0

In any case, we require expressions for @( xt

xt

1jt 1

2jt 1

)

@ and

@vec0 (

t 1jt 1 )

@

;

which we can obtain by di¤erentiating (A3) and (A4). Speci…cally, dxtjt = dxtjt

1

+ d! 11tjt

1

! 11tjt ! 11tjt

c0

1c

1c

0

1 tjt 1 ( 1 tjt 1 (

1 tjt 1 (

0

)(yt ) d

1)

cxtjt tjt 1 (

) dc xtjt

)

+ ! 11tjt 1 tjt 1 (

! 11tjt

1

1c

0

dc0

1

)(yt

cxtjt

1 tjt 1 (

)c dxtjt

cxtjt

1)

1) 1

and dxt

1jt

= dxt

1jt 1

+ d! 21tjt

0 1c

! 21tjt ! 21tjt

1c

1

0

c0

1 tjt 1 (

1 tjt 1 (

1 tjt 1 (

)(yt

) d

cxtjt

tjt 1 (

) dc xtjt

1

27

)

1)

+ ! 21tjt

1 tjt 1 (

! 21tjt

0 1c

1

dc0

1 tjt 1 (

)(yt

cxtjt

1)

1 tjt 1 (

)c dxtjt

1;

)(yt

cxtjt

1)

whence @xtjt @xtjt 0 = @ @ 0

1

+ c0

1 tjt 1 (

[(yt

cxtjt

xtjt

)(yt

1)

cxtjt

0 1)

1 tjt 1 (

)

0

1 tjt 1 (

)

1 ! 11tjt 1 c

@! 11tjt @ 0 ! 11tjt

@c @ 0

1

+ ! 11tjt

1 (yt

1 tjt 1 (

0 1c

! 11tjt

1c

1)

0

1 tjt 1 (

)

@c @ 0

@vec[ t ( )] @ 0 @xtjt 1 1 ( )c @ 0

)]

1 tjt

0

cxtjt

(A7)

and @xt @

1jt 0

=

@xt 1jt @ 0

1

1 tjt 1 (

+ c0

[(yt xtjt

cxtjt

)(yt

0 1)

1 tjt 1 (

)

0

1 tjt 1 (

)

1c

0

1 ! 21tjt 1 c

1)

cxtjt

@! 21tjt @ 0

! 21tjt @c @ 0

1

+ ! 21tjt

@c @ 0

)

)]

1 tjt

0 1c

! 21tjt

1 tjt 1 (

0 1)

cxtjt

@vec[ t ( )] @ 0 @xtjt 1 1 ( )c @ 0 :

1 tjt 1 (

0 1c

1 (yt

Similarly, d! 11tjt = (1 +! 211tjt

1 tjt 1 (

0 1c

d! 21tjt = (1 ! 21tjt

2! 11tjt

1 ! 11tjt 1

) d

1 tjt 1 ( tjt 1 (

1 tjt 1 (

0

! 11tjt

1c

dc0

1 tjt 1 (

1 tjt 1 (

)

)c)d! 21tjt

)c + ! 21tjt

! 21tjt

1

! 211tjt

1

)c

! 211tjt

0 1c

)c)d! 11tjt

! 21tjt

1

1 ! 11tjt 1 c

1 ! 11tjt 1 c

1 tjt 1 (

0

1 tjt 1 (

0

)c

1 tjt 1 (

) dc;

c0

1 tjt 1 (

d! 11tjt

1

1 tjt 1 (

dc0

) d

1

)c

1 tjt 1 (

tjt 1 (

)

dc0

1 tjt 1 (

)c

) dc

and d! 22tjt = d! 22tjt

2! 21tjt

1

+! 221tjt

1c

1c

1 tjt 1 (

0

1 tjt 1 (

0

) d

)cd! 21tjt

tjt 1 (

1 tjt 1 (

)

! 221tjt

1

1

! 221tjt

)c

1c

1 tjt 1 (

0

)c ) dc;

whence @! 11tjt = (1 @ 0

1 tjt 1 (

+[c0 @! 21tjt = (1 @ 0 2! 21tjt

1 ! 11tjt 1 c

0 1c

2! 11tjt

! 11tjt 0

1c

1 tjt 1 (

)

0

)

1 tjt 1 (

)c)

! 211tjt

1 tjt 1 (

)c)

@c + [c0 @ 0

@! 11tjt @ 0

1c

1 tjt 1 (

0

@! 21tjt @ 0

1 tjt 1 (

1

)

1

2! 211tjt )]

@vec[ @

! 21tjt

! 21tjt

1c

1c

t( 0 0

0

1 tjt 1 (

)

@c @ 0

)]

(A8)

1 tjt 1 (

0 1 ! 11tjt 1 c

1 tjt

@! 11tjt 1 @ 0 @vec[ 1 ( )] @

)c

t( 0

and @! 22tjt @ 0

=

@! 22tjt @ 0 +[c0

1

1 tjt 1 (

2! 21tjt )

1c

! 221tjt

0

1 tjt 1 ( 1c

0

1 tjt

28

@! 21tjt 1 2! 221tjt @ 0 @vec[ t ( )] : 1 ( )] @ 0 )c

0 1c

1 tjt 1 (

)

@c @ 0

)]

However, in order to derive the LM test we only need to evaluate these derivatives under the null of H_0: ψ_2 = 0. In that case, x_{t|t-1} = ψ_1 x_{t-1|t-1} and
\[
\frac{\partial x_{t|t-1}}{\partial\theta'} = x_{t-1|t-1}\frac{\partial\psi_1}{\partial\theta'} + x_{t-2|t-1}\frac{\partial\psi_2}{\partial\theta'} + \psi_1\frac{\partial x_{t-1|t-1}}{\partial\theta'}.
\]

Similarly,
\[
\Omega_{t|t-1} = \begin{pmatrix} \psi_1^2\,\omega_{11t-1|t-1}+1 & \psi_1\,\omega_{11t-1|t-1} \\ \psi_1\,\omega_{11t-1|t-1} & \omega_{11t-1|t-1} \end{pmatrix}
\]
and
\[
\frac{\partial\mathrm{vec}(\Omega_{t|t-1})}{\partial\theta'} = D_2
\begin{bmatrix}
2\psi_1\omega_{11t-1|t-1}\,\partial\psi_1/\partial\theta' + 2\psi_1\omega_{21t-1|t-1}\,\partial\psi_2/\partial\theta' + \psi_1^2\,\partial\omega_{11t-1|t-1}/\partial\theta' \\
\omega_{11t-1|t-1}\,\partial\psi_1/\partial\theta' + \omega_{21t-1|t-1}\,\partial\psi_2/\partial\theta' + \psi_1\,\partial\omega_{11t-1|t-1}/\partial\theta' \\
\partial\omega_{11t-1|t-1}/\partial\theta'
\end{bmatrix},
\]
where D_2 is the duplication matrix of order 2.
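To fix ideas, here is a small numerical check of the duplication-matrix identity used above (a generic illustration, not code from the paper): D_2 maps the distinct elements vech(A) of a symmetric 2 x 2 matrix A into the full vec(A).

```python
import numpy as np

# Duplication matrix of order 2: vec(A) = D2 @ vech(A) for symmetric A,
# with vech(A) = (a11, a21, a22)' and vec(A) = (a11, a21, a12, a22)'.
D2 = np.array([[1, 0, 0],
               [0, 1, 0],
               [0, 1, 0],
               [0, 0, 1]])

A = np.array([[2.0, 0.5],
              [0.5, 3.0]])
vech_A = np.array([A[0, 0], A[1, 0], A[1, 1]])
vec_A = A.flatten(order="F")   # column-major stacking
```

Checking `D2 @ vech_A` against `vec_A` confirms the mapping.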

Hence, when ψ_2 = 0 we simply require expressions for ∂x_{t-1|t-1}/∂θ' and ∂ω_{11t-1|t-1}/∂θ', which unfortunately we can only obtain recursively on the basis of expressions (A7) and (A8). We also require expressions for x_{t-2|t-1} and ω_{21t-1|t-1} under the null, as these quantities are associated to the derivatives with respect to ψ_2. In this sense, it is interesting to obtain the derivatives of Σ_t(θ) and of the filtered quantities with respect to this parameter when ψ_2 = 0, which will be given by

\[
\frac{\partial\mathrm{vec}[\Sigma_t(\theta)]}{\partial\psi_2} = (c\otimes c)\frac{\partial\omega_{11t|t-1}}{\partial\psi_2},
\qquad\text{(A9)}
\]
with
\[
\frac{\partial x_{t|t-1}}{\partial\psi_2} = x_{t-2|t-1} + \psi_1\frac{\partial x_{t-1|t-1}}{\partial\psi_2}
\qquad\text{and}\qquad
\frac{\partial\omega_{11t|t-1}}{\partial\psi_2} = 2\psi_1\omega_{21t-1|t-1} + \psi_1^2\frac{\partial\omega_{11t-1|t-1}}{\partial\psi_2}.
\]
In turn, specialising (A7) and (A8) to this parameter, and noting that ∂c/∂ψ_2 = 0, we obtain
\[
\frac{\partial x_{t|t}}{\partial\psi_2} = \bigl(1 - \omega_{11t|t-1}c'\Sigma_{t|t-1}^{-1}(\theta)c\bigr)
\Bigl[\frac{\partial x_{t|t-1}}{\partial\psi_2} + c'\Sigma_{t|t-1}^{-1}(\theta)(y_t - cx_{t|t-1})\frac{\partial\omega_{11t|t-1}}{\partial\psi_2}\Bigr]
\]
and
\[
\frac{\partial\omega_{11t|t}}{\partial\psi_2} = \bigl(1 - \omega_{11t|t-1}c'\Sigma_{t|t-1}^{-1}(\theta)c\bigr)^{2}\frac{\partial\omega_{11t|t-1}}{\partial\psi_2},
\]

where we have used (A9). Interestingly, if we use expressions (A5) and (A6), we can finally write
\[
\frac{\partial x_{t|t}}{\partial\psi_2} = \varpi_{t|t}\frac{\partial x_{t|t-1}}{\partial\psi_2} + f_{t|t}\frac{\partial\omega_{11t|t-1}}{\partial\psi_2}
\qquad\text{and}\qquad
\frac{\partial\omega_{11t|t}}{\partial\psi_2} = \varpi_{t|t}^{2}\,\frac{\partial\omega_{11t|t-1}}{\partial\psi_2}.
\]

B    Spectral scores

As we saw before, the spectral approximation to the log-likelihood function requires the computation of the sample periodogram matrix I_yy(λ_j). Expression (2), though, is far from ideal from a computational point of view, and for that reason we make use of the Fast Fourier Transform (FFT). Specifically, given the T × N original real data matrix Y = (y_1, …, y_t, …, y_T)', the FFT creates the centred and orthogonalised T × N complex data matrix Z = (z_0, …, z_j, …, z_{T-1})' by effectively premultiplying Y − ℓ_T μ' by the T × T Fourier matrix W. On this basis, we can easily compute I_yy(λ_j) as (2π)^{-1} z_j z_j^∗, where z_j^∗ is the complex conjugate transpose of z_j. Hence, the spectral approximation to the log-likelihood function for a non-singular G_yy(λ) becomes
\[
-\frac{NT}{2}\ln(2\pi) - \frac{1}{2}\sum_{j=0}^{T-1}\ln\bigl|G_{yy}(\lambda_j)\bigr| - \frac{1}{2}\sum_{j=0}^{T-1}(2\pi)^{-1}z_j^{*}G_{yy}^{-1}(\lambda_j)z_j,
\]
which can be regarded as the log-likelihood function of T independent but heteroskedastic complex Gaussian observations. But since z_j does not depend on μ for j = 1, …, T−1 because ℓ_T is proportional to the first column of the orthogonal Fourier matrix, and z_0 = √T(ȳ_T − μ), where ȳ_T is the sample mean of y_t, it immediately follows that the ML estimator of μ will be ȳ_T. As for the remaining parameters, the score function will be given by:
\[
d_j(\theta) = \frac{1}{2}\frac{\partial\mathrm{vec}'[G_{yy}(\lambda_j)]}{\partial\theta}\bigl[G_{yy}^{-1}(\lambda_j)\otimes G_{yy}^{\prime-1}(\lambda_j)\bigr]\mathrm{vec}\bigl[(2\pi)^{-1}z_j^{c}z_j' - G_{yy}'(\lambda_j)\bigr],
\]
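The FFT-based evaluation described above can be sketched as follows. This is an illustrative stand-in, not the paper's code: the periodogram and 2π conventions here are one common normalization and may differ from the paper's, and `G` is a caller-supplied (hypothetical) function returning the N × N spectral density matrix at a frequency.

```python
import numpy as np

def whittle_loglik(Y, G):
    """Spectral (Whittle) Gaussian log-likelihood sketch: centre the data,
    apply the unitarily scaled FFT, and treat the frequency-domain
    observations as independent heteroskedastic complex Gaussians whose
    variance is proportional to G_yy(lambda_j)."""
    T, N = Y.shape
    Z = np.fft.fft(Y - Y.mean(axis=0), axis=0) / np.sqrt(T)  # unitary scaling
    ll = -0.5 * N * T * np.log(2 * np.pi)
    for j in range(1, T):   # j = 0 only identifies the mean, estimated by ybar
        lam = 2 * np.pi * j / T
        Gj = G(lam)
        zj = Z[j]
        ll -= 0.5 * np.log(np.linalg.det(Gj).real)
        ll -= 0.5 * (np.conj(zj) @ np.linalg.solve(Gj, zj)).real / (2 * np.pi)
    return ll
```

Whatever additive constants the chosen normalization introduces, the maximizer over the spectral parameters is unaffected, which is what the score test exploits.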

where z_j^c = \bar z_j is the complex conjugate of z_j. Given that
\[
dG_{yy}(\lambda) = dc(e^{-i\lambda})G_{xx}(\lambda)c'(e^{i\lambda}) + c(e^{-i\lambda})\,dG_{xx}(\lambda)\,c'(e^{i\lambda}) + c(e^{-i\lambda})G_{xx}(\lambda)\,dc'(e^{i\lambda}) + dG_{uu}(\lambda)
\]
(see Magnus and Neudecker (1988)), it immediately follows that
\[
d\mathrm{vec}[G_{yy}(\lambda)] = \bigl[c(e^{i\lambda})G_{xx}(\lambda)\otimes I_N\bigr]dc(e^{-i\lambda})
+ \bigl[I_N\otimes c(e^{-i\lambda})G_{xx}(\lambda)\bigr]dc(e^{i\lambda})
+ \bigl[c(e^{i\lambda})\otimes c(e^{-i\lambda})\bigr]dG_{xx}(\lambda)
+ E_N\,d\mathrm{vecd}[G_{uu}(\lambda)],
\]
where E_N is the unique N² × N "diagonalisation" matrix which transforms vec(A) into vecd(A) as vecd(A) = E_N'vec(A), and K_{mn} is the commutation matrix of orders m and n (see Magnus (1988)). But
\[
c(e^{-i\lambda}) = \sum_{\ell=-F}^{L} c_\ell(\theta)e^{-i\ell\lambda},
\qquad\text{(B10)}
\]
so
\[
dc(e^{-i\lambda}) = \sum_{\ell=-F}^{L} dc_\ell(\theta)e^{-i\ell\lambda}.
\]

Consequently, we can write
\[
d\mathrm{vec}[G_{yy}(\lambda)] = \sum_{\ell=-F}^{L}\Bigl\{\bigl[e^{-i\ell\lambda}c(e^{i\lambda})G_{xx}(\lambda)\otimes I_N\bigr] + \bigl[I_N\otimes e^{i\ell\lambda}c(e^{-i\lambda})G_{xx}(\lambda)\bigr]\Bigr\}dc_\ell(\theta)
+ \bigl[c(e^{i\lambda})\otimes c(e^{-i\lambda})\bigr]dG_{xx}(\lambda) + E_N\,d\mathrm{vecd}[G_{uu}(\lambda)].
\]
Hence, the Jacobian of vec[G_yy(λ)] will be
\[
\frac{\partial\mathrm{vec}[G_{yy}(\lambda)]}{\partial\theta'} = \sum_{\ell=-F}^{L}\Bigl\{\bigl[e^{-i\ell\lambda}c(e^{i\lambda})G_{xx}(\lambda)\otimes I_N\bigr] + \bigl[I_N\otimes e^{i\ell\lambda}c(e^{-i\lambda})G_{xx}(\lambda)\bigr]\Bigr\}\frac{\partial c_\ell}{\partial\theta'}
+ \bigl[c(e^{i\lambda})\otimes c(e^{-i\lambda})\bigr]\frac{\partial G_{xx}(\lambda)}{\partial\theta'} + E_N\frac{\partial\mathrm{vecd}[G_{uu}(\lambda)]}{\partial\theta'}.
\]

If we combine this expression with the fact that
\[
\bigl[G_{yy}^{-1}(\lambda_j)\otimes G_{yy}^{\prime-1}(\lambda_j)\bigr]\mathrm{vec}\bigl[(2\pi)^{-1}z_j^{c}z_j' - G_{yy}'(\lambda_j)\bigr]
= \mathrm{vec}\bigl[(2\pi)^{-1}G_{yy}^{\prime-1}(\lambda)z_j^{c}z_j'G_{yy}^{\prime-1}(\lambda) - G_{yy}^{\prime-1}(\lambda)\bigr]
\]
and I'_{yy}(λ) = (2π)^{-1}z_j^{c}z_j', we obtain:
\[
\begin{aligned}
2d_j(\theta) ={}& \sum_{\ell=-F}^{L}\frac{\partial c_\ell'}{\partial\theta}\,\mathrm{vec}\Bigl[
G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})G_{xx}(\lambda)e^{-i\ell\lambda}
- G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})G_{xx}(\lambda)e^{-i\ell\lambda}\\
&\qquad\qquad + e^{i\ell\lambda}G_{xx}(\lambda)c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda)
- e^{i\ell\lambda}G_{xx}(\lambda)c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)\Bigr]\\
&+ \frac{\partial G_{xx}(\lambda)}{\partial\theta}\bigl[c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda}) - c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})\bigr]\\
&+ \frac{\partial\mathrm{vecd}'[G_{uu}(\lambda)]}{\partial\theta}\,E_N'\,\mathrm{vec}\bigl[G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda) - G_{yy}^{\prime-1}(\lambda)\bigr].
\end{aligned}
\]

Let us now try to interpret the different components of this expression. The first thing to note is that
\[
e^{-i\ell\lambda}\,\mathrm{vec}\bigl[G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})G_{xx}(\lambda) - G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})G_{xx}(\lambda)\bigr]
\]
and
\[
e^{i\ell\lambda}\,\mathrm{vec}\bigl[G_{xx}(\lambda)c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda) - G_{xx}(\lambda)c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)\bigr]
\]
are complex conjugates because the conjugate of a product is the product of the conjugates, so it suffices to analyse one of them. The transfer function of the Wiener-Kolmogorov smoothed values of the common factor is given by G_xx(λ)c'(e^{iλ})G_yy^{-1}(λ). As a result, the periodogram and spectral density of the smoothed values of the common factor will be
\[
I_{x^Kx^K}(\lambda) = G_{xx}^2(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)I_{yy}(\lambda)G_{yy}^{-1}(\lambda)c(e^{-i\lambda}),
\qquad
G_{x^Kx^K}(\lambda) = G_{xx}^2(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)c(e^{-i\lambda}),
\]
respectively, while the spectral density of its final estimation error x_t − x^K_{t|∞} is
\[
\omega(\lambda) = G_{xx}(\lambda) - G_{x^Kx^K}(\lambda).
\]
Similarly, the transfer function of the Wiener-Kolmogorov smoothed values of the specific factors will be
\[
G_{uu}(\lambda)G_{yy}^{-1}(\lambda) = I_N - c(e^{-i\lambda})G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda).
\]
As a result, the periodogram and spectral density matrix of the smoothed values of the specific factors are given by
\[
I_{u^Ku^K}(\lambda) = G_{uu}(\lambda)G_{yy}^{-1}(\lambda)I_{yy}(\lambda)G_{yy}^{-1}(\lambda)G_{uu}(\lambda),
\qquad
G_{u^Ku^K}(\lambda) = G_{uu}(\lambda)G_{yy}^{-1}(\lambda)G_{uu}(\lambda),
\]
respectively, while the spectral density of their final estimation errors u_t − u^K_{t|∞} is
\[
G_{uu}(\lambda) - G_{u^Ku^K}(\lambda) = \omega(\lambda)c(e^{-i\lambda})c'(e^{i\lambda}).
\]

Finally, the co-periodogram and co-spectrum between x^K_{t|∞} and u^K_{t|∞} will be
\[
I_{x^Ku^K}(\lambda) = G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)I_{yy}(\lambda)G_{yy}^{-1}(\lambda)G_{uu}(\lambda),
\qquad
G_{x^Ku^K}(\lambda) = G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)G_{uu}(\lambda).
\]
On this basis, if we further assume that G_xx(λ) > 0 and G_uu(λ) > 0, we can write
\[
G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})G_{xx}(\lambda)e^{-i\ell\lambda} - G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})G_{xx}(\lambda)e^{-i\ell\lambda}
= G_{uu}^{\prime-1}(\lambda)\bigl[e^{-i\ell\lambda}I_{x^Ku^K}'(\lambda) - e^{-i\ell\lambda}G_{x^Ku^K}'(\lambda)\bigr],
\]
\[
c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda}) - c'(e^{-i\lambda})G_{yy}^{\prime-1}(\lambda)c(e^{i\lambda})
= G_{xx}^{-2}(\lambda)\bigl[I_{x^Kx^K}(\lambda) - G_{x^Kx^K}(\lambda)\bigr]
\]
and
\[
G_{yy}^{\prime-1}(\lambda)I_{yy}'(\lambda)G_{yy}^{\prime-1}(\lambda) - G_{yy}^{\prime-1}(\lambda)
= G_{uu}^{\prime-1}(\lambda)\bigl[I_{u^Ku^K}'(\lambda) - G_{u^Ku^K}'(\lambda)\bigr]G_{uu}^{\prime-1}(\lambda).
\]

Therefore, the component of the score associated to c_ℓ will be the sum across frequencies of terms of the form
\[
G_{uu}^{\prime-1}(\lambda)\bigl[e^{-i\ell\lambda}I_{x^Ku^K}'(\lambda) - e^{-i\ell\lambda}G_{x^Ku^K}'(\lambda)\bigr]
\]
(and their conjugate transposes), which capture the difference between the cross-periodogram and cross-spectrum of x^K_{t−ℓ} and u^K_{it}, inversely weighted by the spectral density of u_{it}. As a result, we can understand this term as arising from the normal equation in the spectral regression of y_{it} onto x_{t−ℓ}, but taking into account the unobservability of the regressor.

Similarly, the component of the score associated to the parameters that determine G_xx(λ) will be the cross-product across frequencies of the derivatives of the spectral density of x_t with the difference between the periodogram and spectrum of x^K_t, inversely weighted by the squared spectral density of x_t. In this case, we can interpret this term as arising from a marginal log-likelihood function for x_t that takes into account the unobservability of x_t.

Finally, the component of the score associated to the parameters that determine G_{u_iu_i}(λ) will be the cross-product across frequencies of the derivatives of the spectral density of u_{it} with the difference between the periodogram and spectrum of u^K_{it}, inversely weighted by the squared spectral density of u_{it}. Once again, we can interpret this term as arising from the conditional log-likelihood function of u_{it} given x_t that takes into account the unobservability of u_{it}.

Similarly, the component of the score associated to the parameters that determine Gxx ( ) will be the cross-product across frequencies of the product of the derivatives of the spectral density of xt with the di¤erence between the periodogram and spectrum of xK t inversely weighted by the squared spectral density of xt . In this case, we can interpret this term as arising from a marginal log-likelihood function for xt that takes into account the unobservability of xt . Finally, the component of the score associated to the parameters that determine Gui ui ( ) will be the cross-product across frequencies of the product of the derivatives of the spectral density of uit with the di¤erence between the periodogram and spectrum of uK it inversely weighted by the squared spectral density of uit . Once again, we can interpret this term as arising from the conditional log-likelihood function of uit given xt that takes into account the unobservability of uti . We can then exploit the Woodbury formula Gyy1 ( ) = Guu1 ( )

!( )Guu1 ( )c(e

i

!( ) = [Gxx1 ( ) + c0 (ei t )Guu1 ( )c(e

)c0 (ei )Guu1 ( ); i t

)]

1

;

which greatly simpli…es the computations (see Sentana (2000)). Speci…cally, we will have that h i Gxx ( )c0 (ei )Gyy1 ( ) = Gxx ( )c0 (ei ) Guu1 ( ) !( )Guu1 ( )c(e i )c0 (ei )Guu1 ( ) h i = 1 !( )c0 (ei )Guu1 ( )c(e i ) Gxx ( )c0 (ei )Guu1 ( ) = !( )c0 (ei )Guu1 ( ); so IxK xK ( ) = 2 ! 2 ( )c0 (ei )Guu1 ( )Iyy ( )Guu1 ( )c(e

33

i

)

and
\[
\begin{aligned}
G_{x^Kx^K}(\lambda) &= G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)c(e^{-i\lambda})G_{xx}(\lambda)\\
&= \bigl[1 - \omega(\lambda)c'(e^{i\lambda})G_{uu}^{-1}(\lambda)c(e^{-i\lambda})\bigr]G_{xx}^2(\lambda)c'(e^{i\lambda})G_{uu}^{-1}(\lambda)c(e^{-i\lambda})
= \omega(\lambda)G_{xx}(\lambda)c'(e^{i\lambda})G_{uu}^{-1}(\lambda)c(e^{-i\lambda}).
\end{aligned}
\]
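The Woodbury simplification is easy to verify numerically. The following is a generic real-valued illustration at a single frequency (not code from the paper), with made-up loadings and spectral densities:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
c = rng.normal(size=N)                     # stand-in for the loadings at one frequency
G_uu = np.diag(rng.uniform(0.5, 2.0, N))   # idiosyncratic spectral density (diagonal)
G_xx = 1.3                                 # common-factor spectral density

G_yy = G_xx * np.outer(c, c) + G_uu
Gu_inv = np.linalg.inv(G_uu)
omega = 1.0 / (1.0 / G_xx + c @ Gu_inv @ c)   # omega(lambda)

# Woodbury: G_yy^{-1} = G_uu^{-1} - omega G_uu^{-1} c c' G_uu^{-1}
G_yy_inv = Gu_inv - omega * Gu_inv @ np.outer(c, c) @ Gu_inv

# and the smoothing gain collapses: G_xx c' G_yy^{-1} = omega c' G_uu^{-1}
gain_direct = G_xx * c @ G_yy_inv
gain_simplified = omega * c @ Gu_inv
```

The payoff is that only the diagonal G_uu has to be inverted at each frequency.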

Similarly,
\[
G_{uu}(\lambda)G_{yy}^{-1}(\lambda) = G_{uu}(\lambda)\bigl[G_{uu}^{-1}(\lambda) - \omega(\lambda)G_{uu}^{-1}(\lambda)c(e^{-i\lambda})c'(e^{i\lambda})G_{uu}^{-1}(\lambda)\bigr]
= I_N - \omega(\lambda)c(e^{-i\lambda})c'(e^{i\lambda})G_{uu}^{-1}(\lambda)
= I_N - c(e^{-i\lambda})G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda),
\]
so
\[
I_{u^Ku^K}(\lambda) = \bigl[I_N - \omega(\lambda)c(e^{-i\lambda})c'(e^{i\lambda})G_{uu}^{-1}(\lambda)\bigr]I_{yy}(\lambda)\bigl[I_N - \omega(\lambda)c(e^{i\lambda})c'(e^{-i\lambda})G_{uu}^{-1}(\lambda)\bigr]
\]
and
\[
G_{u^Ku^K}(\lambda) = G_{uu}(\lambda)G_{yy}^{-1}(\lambda)G_{uu}(\lambda) = G_{uu}(\lambda) - \omega(\lambda)c(e^{-i\lambda})c'(e^{i\lambda}).
\]
Finally,
\[
I_{x^Ku^K}(\lambda) = \omega(\lambda)c'(e^{i\lambda})G_{uu}^{-1}(\lambda)I_{yy}(\lambda)\bigl[I_N - \omega(\lambda)c(e^{i\lambda})c'(e^{-i\lambda})G_{uu}^{-1}(\lambda)\bigr]
\]
and
\[
G_{x^Ku^K}(\lambda) = G_{xx}(\lambda)c'(e^{i\lambda})\bigl[G_{uu}^{-1}(\lambda) - \omega(\lambda)G_{uu}^{-1}(\lambda)c(e^{-i\lambda})c'(e^{i\lambda})G_{uu}^{-1}(\lambda)\bigr]G_{uu}(\lambda)
= G_{xx}(\lambda)c'(e^{i\lambda})\bigl[I_N - \omega(\lambda)G_{uu}^{-1}(\lambda)c(e^{-i\lambda})c'(e^{i\lambda})\bigr] = \omega(\lambda)c'(e^{i\lambda}).
\]
We can then use those expressions to compute the required quantities efficiently. In particular, we will get
\[
G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)I_{yy}(\lambda)G_{yy}^{-1}(\lambda) - G_{xx}(\lambda)c'(e^{i\lambda})G_{yy}^{-1}(\lambda)
= \bigl[I_{x^Ku^K}(\lambda) - \omega(\lambda)c'(e^{i\lambda})\bigr]G_{uu}^{-1}(\lambda),
\]

\[
G_{yy}^{-1}(\lambda)I_{yy}(\lambda)G_{yy}^{-1}(\lambda) - G_{yy}^{-1}(\lambda)
= G_{uu}^{-1}(\lambda)\bigl[I_{u^Ku^K}(\lambda) - G_{u^Ku^K}(\lambda)\bigr]G_{uu}^{-1}(\lambda)
\]
and
\[
c'(e^{i\lambda})G_{yy}^{-1}(\lambda)I_{yy}(\lambda)G_{yy}^{-1}(\lambda)c(e^{-i\lambda}) - c'(e^{i\lambda})G_{yy}^{-1}(\lambda)c(e^{-i\lambda})
= G_{xx}^{-1}(\lambda)\bigl[I_{x^Kx^K}(\lambda) - G_{x^Kx^K}(\lambda)\bigr]G_{xx}^{-1}(\lambda).
\]

C    Autocorrelation structure of a simple Markov switching model

Let s_t denote a binary Markov chain characterised by the following two parameters:
\[
P(s_t = 0\,|\,s_{t-1} = 0) = p,
\qquad
P(s_t = 1\,|\,s_{t-1} = 1) = q.
\]

As is well known, the stationary distribution of the chain is characterised by
\[
\pi = P(s_t = 1) = \frac{1-p}{2-p-q}.
\]
It is easy to see that we can then write
\[
(s_t - \pi) = (p+q-1)(s_{t-1} - \pi) + \zeta_t,
\qquad\text{(C11)}
\]
where
\[
E(\zeta_t\,|\,s_{t-1}=0) = E(\zeta_t\,|\,s_{t-1}=1) = 0.
\]
The proof of this statement follows from computing the four possible values that ζ_t can take, and the corresponding probabilities conditional on the relevant value of s_{t-1}. In this sense, tedious algebra shows that ζ_t is equal to
\[
\zeta_t = \begin{cases}
p-1 & \text{when } s_t = 0,\ s_{t-1} = 0,\\
p & \text{when } s_t = 1,\ s_{t-1} = 0,\\
-q & \text{when } s_t = 0,\ s_{t-1} = 1,\\
1-q & \text{when } s_t = 1,\ s_{t-1} = 1.
\end{cases}
\]
Therefore, it follows from (C11) that s_t has the autocorrelation structure of an Ar(1) with autoregressive coefficient p + q − 1.
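A quick simulation makes the two results above concrete (an illustrative sketch with arbitrary p and q, not code from the paper): the sample mean of the chain should approach (1−p)/(2−p−q), and its first-order autocorrelation should approach p + q − 1.

```python
import random

# Simulate the binary Markov chain with P(stay in 0) = p, P(stay in 1) = q.
random.seed(12345)
p, q = 0.9, 0.8
T = 200_000
s = [0]
for _ in range(T - 1):
    stay = p if s[-1] == 0 else q
    s.append(s[-1] if random.random() < stay else 1 - s[-1])

mean = sum(s) / T                       # should be close to (1-p)/(2-p-q)
num = sum((s[t] - mean) * (s[t - 1] - mean) for t in range(1, T))
den = sum((x - mean) ** 2 for x in s)
rho1 = num / den                        # should be close to p + q - 1
```

With p = 0.9 and q = 0.8 the stationary probability is 1/3 and the implied Ar(1) coefficient is 0.7.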

Now let us define the following process:
\[
x_t = \mu(s_t) + \varepsilon_t,
\qquad
\mu(s_t) = \begin{cases} \mu_l & \text{if } s_t = 0,\\ \mu_h & \text{if } s_t = 1, \end{cases}
\]
where ε_t ~ N(0, σ²) independently of the past, present and future values of s_t, as well as of the past values of ε_t. Given that we can write μ(s_t) as an affine transformation of s_t (i.e. μ(s_t) = μ_l + (μ_h − μ_l)s_t), it follows that μ(s_t) also has the autocorrelation structure of an Ar(1). Finally, the results on contemporaneous aggregation of Arma models imply that x_t, which is the sum of an Ar(1) and uncorrelated white noise, will have the autocorrelation structure of an Arma(1,1). Specifically, given that the autocovariance generating function of an Ar(1) with autoregressive coefficient ρ is
\[
\frac{\omega^2}{(1-\rho L)(1-\rho L^{-1})},
\]
where ω² is the variance of the innovations, the autocovariance generating function of the contemporaneously aggregated process will be
\[
\frac{\omega^2}{(1-\rho L)(1-\rho L^{-1})} + \sigma^2
= \frac{\omega^2 + \sigma^2(1-\rho L)(1-\rho L^{-1})}{(1-\rho L)(1-\rho L^{-1})}
= \frac{\sigma_*^2(1-\theta L)(1-\theta L^{-1})}{(1-\rho L)(1-\rho L^{-1})},
\]
where θ and σ_*², which are easily obtained by equating coefficients, correspond to the root of the Ma polynomial and the variance of the univariate Wold residuals, respectively.
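The coefficient-matching step can be sketched explicitly (an illustration under the stated setup, not code from the paper): equating the L⁰ and L¹ coefficients of the two moving-average polynomials gives a quadratic in θ, whose invertible root is the Ma root of the aggregated Arma(1,1).

```python
import math

def aggregate_ar1_plus_noise(rho, omega2, sigma2):
    """Recover the MA root theta and Wold innovation variance sigma_star2
    of the ARMA(1,1) implied by an AR(1) (coefficient rho, innovation
    variance omega2) plus independent white noise with variance sigma2."""
    # omega2 + sigma2 (1 - rho L)(1 - rho L^{-1}) = sigma_star2 (1 - theta L)(1 - theta L^{-1})
    # Matching coefficients of L^0 and L^1:
    #   sigma_star2 (1 + theta^2) = omega2 + sigma2 (1 + rho^2)
    #   sigma_star2 * theta       = sigma2 * rho
    m = (omega2 + sigma2 * (1 + rho ** 2)) / (sigma2 * rho)
    theta = (m - math.sqrt(m * m - 4)) / 2   # invertible root, |theta| < 1
    sigma_star2 = sigma2 * rho / theta
    return theta, sigma_star2
```

For instance, with ρ = 0.7, ω² = 1 and σ² = 0.5, the invertible Ma root is roughly 0.21.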

Figure 1: P-value discrepancy plots of dynamic specification tests (discrepancy against nominal size for the tests of first order residual correlation in the common factor, second order residual correlation in the common factor, first order residual correlation in the specific factors, and the joint test of first order residual correlation in all factors).

Figure 2: Rejection rates for ARMA(2,1) common factor (ψ = .5).

Figure 3: Rejection rates for AR(2) specific factors (ψi = −.2).

Figure 4: Rejection rates for AR(2) specific factors (ψi = −.6).

Figure 5: Rejection rates for ARMA(2,1) common factor, AR(2) specific factors (ψ = .5, ψi = −.2).

(Figures 2-5 plot rejection percentages against nominal size.)
