Parameter Estimation of GARCH Model

Yiyang Yang

12/28/2012
1 GARCH Model

Definition 1. The Autoregressive Moving Average model ARMA(p, q) can be represented as

$$y_t = C + \sum_{i=1}^{p} a_i y_{t-i} + \epsilon_t + \sum_{i=1}^{q} b_i \epsilon_{t-i},$$

where $\{\epsilon_t\}$ is a white noise term. Expanding the residual term from white noise to an ARMA(p, q)-type process, we obtain the GARCH model.
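As an illustration, an ARMA(p, q) path can be simulated directly from the definition above. This is a minimal sketch assuming NumPy; the function name and coefficient values are illustrative, not from the text:

```python
import numpy as np

def simulate_arma(C, a, b, n, seed=0):
    """Simulate y_t = C + sum_i a_i y_{t-i} + eps_t + sum_i b_i eps_{t-i}."""
    rng = np.random.default_rng(seed)
    p, q = len(a), len(b)
    eps = rng.standard_normal(n)  # white noise term {eps_t}
    y = np.zeros(n)
    for t in range(n):
        ar = sum(a[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(b[i] * eps[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
        y[t] = C + ar + eps[t] + ma
    return y

# ARMA(1,1) with arbitrary coefficients; stationary mean is C / (1 - a_1) = 0.2
y = simulate_arma(C=0.1, a=[0.5], b=[0.3], n=1000)
```
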
Definition 2. The Generalized Autoregressive Conditional Heteroskedastic model GARCH(p, q) can be represented as

$$\epsilon_t = v_t \sqrt{h_t},$$

where $v_t$ is a white noise term and

$$h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i h_{t-i}$$

defines the conditional variance.
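A GARCH(p, q) path can likewise be simulated from the two defining equations. A sketch assuming NumPy; pre-sample values of $h_t$ and $\epsilon_t^2$ are set to the unconditional variance, a convention not specified in the text:

```python
import numpy as np

def simulate_garch(alpha0, alpha, beta, n, seed=0):
    """Simulate eps_t = v_t*sqrt(h_t), h_t = a0 + sum a_i eps^2_{t-i} + sum b_i h_{t-i}."""
    rng = np.random.default_rng(seed)
    q, p = len(alpha), len(beta)
    h0 = alpha0 / (1.0 - sum(alpha) - sum(beta))  # unconditional variance
    eps = np.zeros(n)
    h = np.full(n, h0)
    for t in range(n):
        arch = sum(alpha[i] * (eps[t-1-i]**2 if t-1-i >= 0 else h0) for i in range(q))
        garch = sum(beta[i] * (h[t-1-i] if t-1-i >= 0 else h0) for i in range(p))
        h[t] = alpha0 + arch + garch
        eps[t] = rng.standard_normal() * np.sqrt(h[t])  # v_t is standard normal here
    return eps, h

# GARCH(1,1); unconditional variance is 0.1 / (1 - 0.1 - 0.8) = 1.0
eps, h = simulate_garch(0.1, [0.1], [0.8], 2000)
```
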
2 Theory Background

2.1 Maximum Likelihood Method
Given a sample $\{x_1, x_2, \cdots, x_n\}$ of $n$ independent and identically distributed observations drawn from a distribution $f(x)$ with unknown parameters $\theta$, the joint density function is

$$f(x_1, x_2, \cdots, x_n \mid \theta) = f(x_1 \mid \theta) \times f(x_2 \mid \theta) \times \cdots \times f(x_n \mid \theta).$$
By considering the observed values $x_1, x_2, \cdots, x_n$ to be fixed parameters of this function, $\theta$ becomes the function's variable and is allowed to vary freely. This function is called the likelihood:

$$L(\theta \mid x_1, \cdots, x_n) = f(x_1, x_2, \cdots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta).$$

In practice, it is often more convenient to work with the logarithm of the likelihood function, called the log-likelihood:

$$\ln L(\theta \mid x_1, \cdots, x_n) = \sum_{i=1}^{n} \ln f(x_i \mid \theta).$$
Assume the observations $\{x_1, x_2, \cdots, x_n\}$ follow a normal distribution with unknown parameters $\theta = (\mu, \sigma^2)$; then

$$\ln L(\theta \mid x_1, \cdots, x_n) = \sum_{i=1}^{n} \ln\left( \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}} \right) = \sum_{i=1}^{n} \left( -\frac{1}{2}\ln 2\pi - \ln\sigma - \frac{(x_i - \mu)^2}{2\sigma^2} \right) = -\frac{n}{2}\ln 2\pi - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2.$$
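Setting the derivatives of this log-likelihood to zero gives the familiar closed-form estimates $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_i (x_i - \bar{x})^2$. A quick numerical sketch (assuming NumPy; the sample parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(loc=2.0, scale=1.5, size=10_000)

def log_likelihood(mu, sigma, x):
    """Normal log-likelihood, matching the derivation above."""
    n = len(x)
    return -n/2*np.log(2*np.pi) - n*np.log(sigma) - np.sum((x - mu)**2)/(2*sigma**2)

mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat)**2))  # MLE divides by n, not n - 1
```

Since $(\hat{\mu}, \hat{\sigma})$ maximizes the likelihood, perturbing either parameter can only lower `log_likelihood`.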
2.2 Optimization Method

Newton's method solves an optimization problem through a local quadratic approximation. In the case of multidimensional optimization, we seek a zero of the gradient. Thus, the iteration scheme for Newton's method has the form

$$\vec{x}_{k+1} = \vec{x}_k - H_f^{-1}(\vec{x}_k)\, \nabla f(\vec{x}_k),$$

where $H_f(\vec{x})$ is the Hessian matrix of second partial derivatives of $f$:

$$\{H_f(\vec{x})\}_{ij} = \frac{\partial^2 f(\vec{x})}{\partial x_i\, \partial x_j}.$$
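The iteration can be sketched in a few lines (assuming NumPy; the test function and its derivatives are illustrative, not from the text). Note that the linear system $H s = \nabla f$ is solved rather than $H$ inverted explicitly:

```python
import numpy as np

def newton(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton's method: x_{k+1} = x_k - H^{-1}(x_k) grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(x), grad(x))  # solve H s = grad, avoid inverting H
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# minimize f(x, y) = (x - 1)^2 + 10 (y + 2)^2, whose minimizer is (1, -2)
grad = lambda v: np.array([2*(v[0] - 1), 20*(v[1] + 2)])
hess = lambda v: np.array([[2.0, 0.0], [0.0, 20.0]])
x_star = newton(grad, hess, [0.0, 0.0])
```

For a quadratic objective the quadratic model is exact, so a single Newton step reaches the minimizer.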
Thus, for the maximum likelihood problem

$$\hat{\theta} = \arg\max_{\theta \in \Theta} L(\theta),$$

we can obtain an approximate value of $\theta$, denoted $\theta_k$ after the $k$-th iteration:

$$\theta_{k+1} = \theta_k - J^{-1}(\theta_k)\, \nabla L(\theta_k)$$

with gradient

$$\nabla L = \frac{\partial L}{\partial \theta}$$

and Fisher information matrix

$$J = E\left[ \frac{\partial^2 L}{\partial \theta\, \partial \theta^T} \right].$$
2.3 Estimation of GARCH Model

To estimate the parameters of the GARCH model with given $k, p, q$, we have

$$y_t = C + \sum_{i=1}^{k} a_i y_{t-i} + \epsilon_t$$
$$\epsilon_t = v_t \sqrt{h_t}$$
$$h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i h_{t-i},$$

where $v_t$ is a white noise term. Then $\epsilon_t$ is normally distributed with mean zero and conditional variance $h_t$, that is,

$$p(\epsilon_t \mid \epsilon_{t-1}, \cdots, \epsilon_0) = \frac{1}{\sqrt{2\pi h_t}}\, e^{-\frac{\epsilon_t^2}{2 h_t}}.$$

The log-likelihood function of the parameter vector $\theta = (\alpha_0, \alpha_1, \cdots, \alpha_q, \beta_1, \cdots, \beta_p)$ is

$$L(\theta) = \sum_{t=q+1}^{n} l_t(\theta) = \sum_{t=q+1}^{n} \left( -\frac{1}{2}\ln 2\pi - \frac{1}{2}\ln h_t - \frac{\epsilon_t^2}{2 h_t} \right).$$
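This log-likelihood can be evaluated by building $h_t$ recursively from the residuals. A sketch assuming NumPy; pre-sample values of $h_t$ are initialized to the sample variance, a common convention the text does not specify:

```python
import numpy as np

def garch_loglik(theta, eps, p, q):
    """Gaussian log-likelihood of GARCH(p,q); theta = (a0, a1..aq, b1..bp)."""
    alpha0, alpha, beta = theta[0], theta[1:1+q], theta[1+q:1+q+p]
    n = len(eps)
    h = np.full(n, eps.var())  # pre-sample h_t set to sample variance (assumption)
    m = max(p, q)
    for t in range(m, n):
        # eps[t-q:t][::-1] is (eps_{t-1}, ..., eps_{t-q}); likewise for h
        h[t] = alpha0 + alpha @ (eps[t-q:t][::-1]**2) + beta @ h[t-p:t][::-1]
    ll = -0.5*np.log(2*np.pi*h[m:]) - eps[m:]**2/(2*h[m:])
    return ll.sum()

rng = np.random.default_rng(0)
eps = rng.standard_normal(300)   # illustrative residual series
ll = garch_loglik(np.array([0.1, 0.1, 0.8]), eps, p=1, q=1)
```
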
Therefore, we have

$$\frac{\partial l_t(\theta)}{\partial \theta} = \left( \frac{\epsilon_t^2}{2 h_t^2} - \frac{1}{2 h_t} \right) \frac{\partial h_t}{\partial \theta}$$

$$\frac{\partial^2 l_t(\theta)}{\partial \theta\, \partial \theta^T} = \left( \frac{\epsilon_t^2}{2 h_t^2} - \frac{1}{2 h_t} \right) \frac{\partial^2 h_t}{\partial \theta\, \partial \theta^T} + \left( \frac{1}{2 h_t^2} - \frac{\epsilon_t^2}{h_t^3} \right) \frac{\partial h_t}{\partial \theta} \frac{\partial h_t}{\partial \theta^T},$$

where

$$\frac{\partial h_t}{\partial \theta} = \left(1, \epsilon_{t-1}^2, \cdots, \epsilon_{t-q}^2, h_{t-1}, \cdots, h_{t-p}\right)^T + \sum_{i=1}^{p} \beta_i \frac{\partial h_{t-i}}{\partial \theta}.$$

Thus the gradient is

$$\nabla L(\theta) = \frac{1}{2} \sum_{t=q+1}^{n} \left( \frac{\epsilon_t^2}{h_t^2} - \frac{1}{h_t} \right) \frac{\partial h_t}{\partial \theta}$$

and the Fisher information matrix is

$$J = \sum_{t=q+1}^{n} E\left[ \left( \frac{\epsilon_t^2}{2 h_t^2} - \frac{1}{2 h_t} \right) \frac{\partial^2 h_t}{\partial \theta\, \partial \theta^T} + \left( \frac{1}{2 h_t^2} - \frac{\epsilon_t^2}{h_t^3} \right) \frac{\partial h_t}{\partial \theta} \frac{\partial h_t}{\partial \theta^T} \right] = -\frac{1}{2} \sum_{t=q+1}^{n} E\left[ \frac{1}{h_t^2} \frac{\partial h_t}{\partial \theta} \frac{\partial h_t}{\partial \theta^T} \right].$$
Example 3. Consider the most widely applied GARCH(1, 1) model

$$\epsilon_t = v_t \sqrt{h_t}, \qquad h_t = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 h_{t-1}.$$

To estimate the unknown coefficients $\theta = (\alpha_0, \alpha_1, \beta_1)^T$, we have

$$\nabla L(\theta) = \frac{1}{2} \sum_{t=2}^{n} \left( \frac{\epsilon_t^2}{h_t^2} - \frac{1}{h_t} \right) \frac{\partial h_t}{\partial \theta}, \qquad J = -\frac{1}{2} \sum_{t=2}^{n} E\left[ \frac{1}{h_t^2} \frac{\partial h_t}{\partial \theta} \frac{\partial h_t}{\partial \theta^T} \right],$$

where

$$\frac{\partial h_t}{\partial \theta} = \left(1, \epsilon_{t-1}^2, h_{t-1}\right)^T + \beta_1 \frac{\partial h_{t-1}}{\partial \theta}.$$
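The recursion for $\partial h_t / \partial \theta$ can be accumulated alongside $h_t$ itself. A sketch for GARCH(1, 1) assuming NumPy; as before, $h_1$ is initialized to the sample variance (an assumption), so its derivative starts at zero:

```python
import numpy as np

def garch11_loglik(theta, eps):
    """Gaussian log-likelihood of GARCH(1,1); theta = (alpha0, alpha1, beta1)."""
    alpha0, alpha1, beta1 = theta
    n = len(eps)
    h = np.empty(n)
    h[0] = eps.var()  # initialization convention (assumption)
    for t in range(1, n):
        h[t] = alpha0 + alpha1*eps[t-1]**2 + beta1*h[t-1]
    return np.sum(-0.5*np.log(2*np.pi*h[1:]) - eps[1:]**2/(2*h[1:]))

def garch11_grad(theta, eps):
    """Gradient of the log-likelihood via the dh_t/dtheta recursion above."""
    alpha0, alpha1, beta1 = theta
    n = len(eps)
    h = np.empty(n)
    dh = np.zeros((n, 3))          # dh[0] = 0: h[0] does not depend on theta
    h[0] = eps.var()
    for t in range(1, n):
        h[t] = alpha0 + alpha1*eps[t-1]**2 + beta1*h[t-1]
        dh[t] = np.array([1.0, eps[t-1]**2, h[t-1]]) + beta1*dh[t-1]
    # grad L = 1/2 * sum_t (eps_t^2/h_t^2 - 1/h_t) * dh_t
    w = eps[1:]**2/h[1:]**2 - 1.0/h[1:]
    return 0.5 * (w[:, None] * dh[1:]).sum(axis=0)
```

A useful sanity check is that this analytic gradient agrees with central finite differences of `garch11_loglik`.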
3 Algorithm

• Given observations $\{y_t\}_{t=1}^{n}$, we can obtain $\hat{C}, \hat{a}_1, \cdots, \hat{a}_k$ from the best-fitting autoregressive model $AR(k)$: $y_t = \hat{C} + \sum_{i=1}^{k} \hat{a}_i y_{t-i} + \hat{\epsilon}_t$.

• Since $v_t$ is a white noise term, $\epsilon_t = v_t \sqrt{h_t}$ is normally distributed with mean zero and variance $h_t$. Thus the likelihood function of $\epsilon_{q+1}, \cdots, \epsilon_n$ is

$$L = \prod_{t=q+1}^{n} \frac{1}{\sqrt{2\pi h_t}} \exp\left( -\frac{\epsilon_t^2}{2 h_t} \right)$$

and the log-likelihood function is

$$\log L(\theta) = \sum_{t=q+1}^{n} \left( -\frac{1}{2}\ln 2\pi - \frac{1}{2}\ln h_t - \frac{\epsilon_t^2}{2 h_t} \right).$$

• Given an initial guess $\theta_0 = (\alpha_0, \alpha_1, \cdots, \alpha_q, \beta_1, \cdots, \beta_p)^T$ and reasonable estimates of $\frac{\partial h_1}{\partial \theta}, \cdots, \frac{\partial h_q}{\partial \theta}$, we iterate

$$\theta_{k+1} = \theta_k - J^{-1}(\theta_k)\, \nabla L(\theta_k),$$

where

$$\nabla L(\theta) = \frac{1}{2} \sum_{t=q+1}^{n} \left( \frac{\epsilon_t^2}{h_t^2} - \frac{1}{h_t} \right) \frac{\partial h_t}{\partial \theta},$$

$$\frac{\partial h_t}{\partial \theta} = \left(1, \epsilon_{t-1}^2, \cdots, \epsilon_{t-q}^2, h_{t-1}, \cdots, h_{t-p}\right)^T + \sum_{i=1}^{p} \beta_i \frac{\partial h_{t-i}}{\partial \theta},$$

$$J = -\frac{1}{2} \sum_{t=q+1}^{n} E\left[ \frac{1}{h_t^2} \frac{\partial h_t}{\partial \theta} \frac{\partial h_t}{\partial \theta^T} \right],$$

and $\{\epsilon_t\}$ are derived from the first step.
• Repeat the iteration process until we obtain a convergent $\theta$.
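The scoring iteration above can be sketched end-to-end for GARCH(1, 1). This is an illustrative sketch assuming NumPy, not a production estimator: the expectation in $J$ is replaced by its sample counterpart, the step is damped by backtracking, and positivity/stationarity constraints ($\alpha_0 > 0$, $\alpha_1, \beta_1 \ge 0$, $\alpha_1 + \beta_1 < 1$) are enforced for numerical stability — choices the text does not specify:

```python
import numpy as np

def garch11_fit(eps, theta0=np.array([0.1, 0.1, 0.5]), max_iter=100, tol=1e-6):
    """Fisher-scoring estimation of GARCH(1,1): theta <- theta - J^{-1} grad L."""
    def pieces(theta):
        alpha0, alpha1, beta1 = theta
        n = len(eps)
        h = np.empty(n)
        dh = np.zeros((n, 3))
        h[0] = eps.var()                       # initialization convention (assumption)
        for t in range(1, n):
            h[t] = alpha0 + alpha1*eps[t-1]**2 + beta1*h[t-1]
            dh[t] = np.array([1.0, eps[t-1]**2, h[t-1]]) + beta1*dh[t-1]
        ll = np.sum(-0.5*np.log(2*np.pi*h[1:]) - eps[1:]**2/(2*h[1:]))
        w = eps[1:]**2/h[1:]**2 - 1.0/h[1:]
        grad = 0.5 * (w[:, None] * dh[1:]).sum(axis=0)
        # sample version of J = -1/2 sum (1/h^2) dh dh^T
        J = -0.5 * np.einsum('t,ti,tj->ij', 1.0/h[1:]**2, dh[1:], dh[1:])
        return ll, grad, J

    theta = theta0.astype(float)
    ll, grad, J = pieces(theta)
    for _ in range(max_iter):
        step = -np.linalg.solve(J, grad)       # theta_{k+1} = theta_k - J^{-1} grad
        s, improved = 1.0, False
        while s > 1e-8:                        # backtrack until the likelihood improves
            cand = theta + s*step
            if cand[0] > 0 and (cand[1:] >= 0).all() and cand[1] + cand[2] < 1:
                ll_new, grad_new, J_new = pieces(cand)
                if ll_new > ll:
                    improved = True
                    break
            s *= 0.5
        if not improved:
            break
        converged = ll_new - ll < tol
        theta, ll, grad, J = cand, ll_new, grad_new, J_new
        if converged:
            break
    return theta, ll
```

By construction the backtracking step only ever accepts parameter values that raise the log-likelihood, so the fitted value is never worse than the initial guess.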