LECTURE NOTES ON GARCH MODELS

EDUARDO ROSSI
University of Pavia
March, 2004

Abstract

In these notes we present a survey of the theory of univariate and multivariate GARCH models. ARCH, GARCH, EGARCH and other possible nonlinear extensions are examined. Conditions for stationarity (weak and strong) are presented. Inference and testing are presented in the quasi-maximum likelihood framework. Multivariate parameterizations are examined in detail.

Contents

1 Univariate ARCH models
  1.1 Empirical regularities
  1.2 Why do we need ARCH models?
  1.3 The ARCH(q) Model
    1.3.1 The ARCH Regression Model
    1.3.2 ARCH as a nonlinear model
  1.4 The GARCH(p,q) Model
    1.4.1 The Yule-Walker equations for the squared process
    1.4.2 The GARCH Regression Model
    1.4.3 Stationarity
    1.4.4 Forecasting volatility
    1.4.5 The IGARCH(p,q) model
    1.4.6 Persistence
    1.4.7 The Component Model
  1.5 Asymmetric Models
    1.5.1 The EGARCH(p,q) Model
    1.5.2 Other Asymmetric Models
  1.6 The News Impact Curve
  1.7 The GARCH-in-mean Model
  1.8 Long memory in stock returns
2 Estimation procedures
  2.1 Quasi-Maximum Likelihood Estimation
    2.1.1 Kullback Information Criterion
    2.1.2 Quasi-Maximum Likelihood Estimation Theory
  2.2 Testing in GARCH models
    2.2.1 The GARCH(1,1) case
  2.3 Testing for ARCH disturbances
  2.4 Test for Asymmetric Effects
3 Multivariate GARCH models
  3.1 Introduction
  3.2 Vech representation
    3.2.1 Diagonal vech model
  3.3 BEKK representation
    3.3.1 Covariance Stationarity
  3.4 Constant Correlations Model
  3.5 Factor ARCH model
  3.6 Asymmetric Multivariate GARCH-in-mean model
  3.7 Estimation procedure
4 References

Chapter 1
UNIVARIATE ARCH MODELS

1.1 Empirical regularities

GARCH models have been developed to account for empirical regularities in financial data. Many financial time series have a number of characteristics in common.

1. Asset prices are generally non-stationary. Returns are usually stationary. Some financial time series are fractionally integrated.
2. Return series usually show little or no autocorrelation.
3. Serial independence between the squared values of the series is often rejected, pointing towards the existence of non-linear relationships between subsequent observations.
4. Volatility of the return series appears to be clustered.
5. Normality has to be rejected in favor of some thick-tailed distribution.
6. Some series exhibit the so-called leverage effect, that is, changes in stock prices tend to be negatively correlated with changes in volatility. A firm with debt and equity outstanding typically becomes more highly leveraged when the value of the firm falls. This raises equity returns volatility if returns are constant. Black, however, argued that the response of stock volatility to the direction of returns is too large to be explained by leverage alone.
7. Volatilities of different securities very often move together.

1.2 Why do we need ARCH models?

Wold's decomposition theorem establishes that any covariance stationary process $\{y_t\}$ may be written as the sum of a linearly deterministic component and a linearly stochastic component with a square-summable, one-sided moving average representation. We can write
\[ y_t = d_t + u_t \]
where $d_t$ is linearly deterministic and $u_t$ is a linearly regular covariance stationary stochastic process, given by
\[ u_t = B(L)\varepsilon_t, \qquad B(L) = \sum_{i=0}^{\infty} b_i L^i, \qquad \sum_{i=0}^{\infty} b_i^2 < \infty, \qquad b_0 = 1, \]
\[ E[\varepsilon_t] = 0, \qquad E[\varepsilon_t \varepsilon_\tau] = \begin{cases} \sigma_\varepsilon^2 < \infty, & \text{if } t = \tau \\ 0, & \text{otherwise.} \end{cases} \]
The uncorrelated innovation sequence need not be Gaussian and therefore need not be independent. Non-independent innovations are characteristic of non-linear time series in general and conditionally heteroskedastic time series in particular.

Now suppose that $y_t$ is a linear covariance stationary process with i.i.d. innovations as opposed to merely white noise. The unconditional mean and variance are
\[ E[y_t] = 0, \qquad E[y_t^2] = \sigma_\varepsilon^2 \sum_{i=0}^{\infty} b_i^2, \]
which are both invariant in time. The conditional mean is time varying and is given by
\[ E[y_t \mid \Phi_{t-1}] = \sum_{i=1}^{\infty} b_i \varepsilon_{t-i} \]
where the information set is $\Phi_{t-1} = \{\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots\}$. This model is unable to capture the conditional variance dynamics. In fact, the conditional variance of $y_t$ is constant at
\[ E\left[ (y_t - E[y_t \mid \Phi_{t-1}])^2 \mid \Phi_{t-1} \right] = \sigma_\varepsilon^2. \]
This restriction manifests itself in the properties of the k-step-ahead conditional prediction error variance. The k-step-ahead conditional prediction is
\[ E[y_{t+k} \mid \Phi_t] = \sum_{i=0}^{\infty} b_{k+i} \varepsilon_{t-i} \]
and the associated prediction error is
\[ y_{t+k} - E[y_{t+k} \mid \Phi_t] = \sum_{i=0}^{k-1} b_i \varepsilon_{t+k-i} \]
which has a conditional prediction error variance
\[ E\left[ (y_{t+k} - E[y_{t+k} \mid \Phi_t])^2 \mid \Phi_t \right] = \sigma_\varepsilon^2 \sum_{i=0}^{k-1} b_i^2. \]
As $k \to \infty$ the conditional prediction error variance converges to the unconditional variance $\sigma_\varepsilon^2 \sum_{i=0}^{\infty} b_i^2$. For any $k$, the conditional prediction error variance depends only on $k$ and not on $\Phi_t$. In conclusion, the simple "i.i.d. innovations model" is unable to take into account the relevant information which is available at time $t$.

1.3 The ARCH(q) Model

Let $\{\varepsilon_t(\theta)\}$ denote a discrete time stochastic process with conditional mean and variance parametrized by the finite dimensional vector $\theta \in \Theta \subseteq \mathbb{R}^m$, where $\theta_0$ denotes the true value. We assume, for the moment, that $\varepsilon_t(\theta_0)$ is a scalar. $E_{t-1}[\cdot]$ denotes the conditional expectation when the conditioning set is composed of the past values of the process along with other information available at time $t-1$ (denoted by $\Phi_{t-1}$):
\[ E_{t-1}[\cdot] \equiv E[\cdot \mid \Phi_{t-1}] \]
and analogously for the conditional variance:
\[ \operatorname{Var}_{t-1}[\cdot] \equiv \operatorname{Var}[\cdot \mid \Phi_{t-1}]. \]

Definition 1 (Bollerslev, Engle and Nelson [5]) The process $\{\varepsilon_t(\theta_0)\}$ follows an ARCH model if
\[ E_{t-1}[\varepsilon_t(\theta_0)] = 0, \qquad t = 1, 2, \ldots \tag{1.1} \]
and the conditional variance
\[ \sigma_t^2(\theta_0) \equiv \operatorname{Var}_{t-1}[\varepsilon_t(\theta_0)] = E_{t-1}\left[\varepsilon_t^2(\theta_0)\right], \qquad t = 1, 2, \ldots \tag{1.2} \]
depends non-trivially on the $\sigma$-field generated by the past observations $\{\varepsilon_{t-1}(\theta_0), \varepsilon_{t-2}(\theta_0), \ldots\}$.

Let $\{y_t(\theta_0)\}$ denote the stochastic process of interest with conditional mean
\[ \mu_t(\theta_0) \equiv E_{t-1}(y_t), \qquad t = 1, 2, \ldots \tag{1.3} \]
By the time convention, both $\mu_t(\theta_0)$ and $\sigma_t^2(\theta_0)$ are measurable with respect to the time $t-1$ information set.* Define the $\{\varepsilon_t(\theta_0)\}$ process by
\[ \varepsilon_t(\theta_0) \equiv y_t - \mu_t(\theta_0). \tag{1.4} \]
It follows from eq. (1.1) and (1.2) that the standardized process
\[ z_t(\theta_0) \equiv \varepsilon_t(\theta_0)\, \sigma_t^2(\theta_0)^{-1/2}, \qquad t = 1, 2, \ldots \tag{1.5} \]
will have conditional mean zero ($E_{t-1}[z_t(\theta_0)] = 0$) and a time invariant conditional variance of unity. We can think of $\varepsilon_t(\theta_0)$ as generated by
\[ \varepsilon_t(\theta_0) = z_t(\theta_0)\, \sigma_t^2(\theta_0)^{1/2} \]
where $\varepsilon_t^2(\theta_0)$ is an unbiased estimator of $\sigma_t^2(\theta_0)$. Suppose $z_t(\theta_0) \sim NID(0,1)$ and independent of $\sigma_t^2(\theta_0)$. Then
\[ E_{t-1}\left[\varepsilon_t^2\right] = E_{t-1}\left[\sigma_t^2\right] E_{t-1}\left[z_t^2\right] = E_{t-1}\left[\sigma_t^2\right] \]
because $z_t^2 \mid \Phi_{t-1} \sim \chi^2_{(1)}$, which has mean one. The median of a $\chi^2_{(1)}$ is 0.455, so $\Pr\left\{\varepsilon_t^2 < \tfrac{1}{2}\sigma_t^2\right\} > \tfrac{1}{2}$.

If the conditional distribution of $z_t$ is time invariant with a finite fourth moment, the fourth moment of $\varepsilon_t$ is
\[ E\left[\varepsilon_t^4\right] = E\left[z_t^4\right] E\left[\sigma_t^4\right] \geq E\left[z_t^4\right] E\left[\sigma_t^2\right]^2 = E\left[z_t^4\right] E\left[\varepsilon_t^2\right]^2 \]
\[ E\left[\varepsilon_t^4\right] \geq E\left[z_t^4\right] E\left[\varepsilon_t^2\right]^2 \]
by Jensen's inequality.† The equality holds true for a constant conditional variance only. If $z_t \sim NID(0,1)$, then $E[z_t^4] = 3$, and the unconditional distribution for $\varepsilon_t$ is therefore leptokurtic:
\[ E\left[\varepsilon_t^4\right] \geq 3 E\left[\varepsilon_t^2\right]^2, \qquad E\left[\varepsilon_t^4\right] / E\left[\varepsilon_t^2\right]^2 \geq 3. \]

* Andersen distinguishes between deterministic, conditionally heteroskedastic, conditionally stochastic and contemporaneously stochastic volatility processes. The volatility process is deterministic if the information set ($\sigma$-field), which we denote with $\Phi$, is identical to the $\sigma$-field of all random vectors in the system up to and including time $t = 0$; the process is conditionally heteroskedastic if $\Phi$ contains information available and observable at time $t-1$; the process is conditionally stochastic if $\Phi$ contains the random vectors up to period $t-1$; whereas the volatility process is contemporaneously stochastic if the information set $\Phi$ contains the random vectors up to period $t$.

† Jensen's inequality: $E[g(x)] \leq g(E[x])$ if $g(\cdot)$ is concave; $E[g(x)] \geq g(E[x])$ if $g(\cdot)$ is convex.
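The statement that the squared innovation falls below half the conditional variance more often than not can be checked with a short calculation. The sketch below assumes Gaussian $z_t$ and uses the identity $\Pr\{z^2 \leq x\} = \operatorname{erf}(\sqrt{x/2})$; the function name `chi2_1_cdf` is a label chosen for this illustration, not something from the notes.

```python
import math

def chi2_1_cdf(x):
    """CDF of a chi-square with one degree of freedom.

    For z ~ N(0,1): P(z^2 <= x) = P(|z| <= sqrt(x)) = erf(sqrt(x / 2)).
    """
    if x <= 0:
        return 0.0
    return math.erf(math.sqrt(x / 2.0))

# Pr{eps_t^2 < sigma_t^2 / 2} = Pr{z_t^2 < 1/2}
prob_below_half = chi2_1_cdf(0.5)

# The quoted median 0.455 should give a probability close to one half.
median_check = chi2_1_cdf(0.455)

print(round(prob_below_half, 4), round(median_check, 4))
```

Because the median (about 0.455) lies below 0.5, the first probability exceeds one half, which is the claim made in the text.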

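The kurtosis can be expressed as a function of the variability of the conditional variance. In fact, if $\varepsilon_t \mid \Phi_{t-1} \sim N(0, \sigma_t^2)$,
\[ E_{t-1}\left[\varepsilon_t^4\right] = 3 \left( E_{t-1}\left[\varepsilon_t^2\right] \right)^2 \]
\[ E\left[\varepsilon_t^4\right] = 3 E\left[ \left( E_{t-1}(\varepsilon_t^2) \right)^2 \right] \geq 3 \left\{ E\left[ E_{t-1}(\varepsilon_t^2) \right] \right\}^2 = 3 \left( E\left[\varepsilon_t^2\right] \right)^2 \]
\[ E\left[\varepsilon_t^4\right] - 3 \left( E\left[\varepsilon_t^2\right] \right)^2 = 3 E\left[ \left( E_{t-1}(\varepsilon_t^2) \right)^2 \right] - 3 \left\{ E\left[ E_{t-1}(\varepsilon_t^2) \right] \right\}^2 \]
\[ E\left[\varepsilon_t^4\right] = 3 \left( E\left[\varepsilon_t^2\right] \right)^2 + 3 E\left[ \left( E_{t-1}(\varepsilon_t^2) \right)^2 \right] - 3 \left\{ E\left[ E_{t-1}(\varepsilon_t^2) \right] \right\}^2 \]
\[ k = \frac{E\left[\varepsilon_t^4\right]}{\left[ E(\varepsilon_t^2) \right]^2} = 3 + 3\, \frac{E\left[ \left( E_{t-1}(\varepsilon_t^2) \right)^2 \right] - \left\{ E\left[ E_{t-1}(\varepsilon_t^2) \right] \right\}^2}{\left[ E(\varepsilon_t^2) \right]^2} = 3 + 3\, \frac{\operatorname{Var}\left\{ E_{t-1}[\varepsilon_t^2] \right\}}{\left[ E(\varepsilon_t^2) \right]^2} = 3 + 3\, \frac{\operatorname{Var}\left\{ \sigma_t^2 \right\}}{\left[ E(\varepsilon_t^2) \right]^2}. \]

Another important property of the ARCH process is that the process is conditionally serially uncorrelated. Given that $E_{t-1}[\varepsilon_t] = 0$, the Law of Iterated Expectations gives
\[ E_{t-h}[\varepsilon_t] = E_{t-h}\left[ E_{t-1}(\varepsilon_t) \right] = E_{t-h}[0] = 0. \]
This orthogonality property implies that the $\{\varepsilon_t\}$ process is conditionally uncorrelated:
\[ \operatorname{Cov}_{t-h}[\varepsilon_t, \varepsilon_{t+k}] = E_{t-h}[\varepsilon_t \varepsilon_{t+k}] - E_{t-h}[\varepsilon_t]\, E_{t-h}[\varepsilon_{t+k}] = E_{t-h}[\varepsilon_t \varepsilon_{t+k}] \]
\[ = E_{t-h}\left[ E_{t+k-1}(\varepsilon_t \varepsilon_{t+k}) \right] = E_{t-h}\left[ \varepsilon_t\, E_{t+k-1}[\varepsilon_{t+k}] \right] = 0. \]

The ARCH model has proved to be particularly useful in modeling the temporal dependencies in asset returns. In the ARCH(q) model introduced by Engle [9] the conditional variance is a linear function of past squared disturbances:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2. \tag{1.6} \]
In this model, to assure a positive conditional variance the parameters have to satisfy the following constraints: $\omega > 0$ and $\alpha_1 \geq 0, \alpha_2 \geq 0, \ldots, \alpha_q \geq 0$. Defining
\[ \sigma_t^2 \equiv \varepsilon_t^2 - v_t \]
where $E_{t-1}(v_t) = 0$, we can write (1.6) as an AR(q) in $\varepsilon_t^2$:
\[ \varepsilon_t^2 = \omega + \alpha(L)\varepsilon_t^2 + v_t \]
where $\alpha(L) = \alpha_1 L + \alpha_2 L^2 + \ldots + \alpha_q L^q$ (and $L$ is the lag operator, i.e. $x_{t-1} = L x_t$). The process is weakly stationary if and only if $\sum_{i=1}^{q} \alpha_i < 1$; in this case the unconditional variance is given by
\[ E\left(\varepsilon_t^2\right) = \omega / (1 - \alpha_1 - \ldots - \alpha_q). \tag{1.7} \]
The process is characterised by excess kurtosis with respect to the normal distribution. In the case, for example, of ARCH(1) with $\varepsilon_t \mid \Phi_{t-1} \sim N(0, \sigma_t^2)$, the kurtosis is equal to
\[ E\left(\varepsilon_t^4\right) / E\left(\varepsilon_t^2\right)^2 = 3\left(1 - \alpha_1^2\right) / \left(1 - 3\alpha_1^2\right) \tag{1.8} \]
with $3\alpha_1^2 < 1$; when $3\alpha_1^2 = 1$ we have
\[ E\left(\varepsilon_t^4\right) / E\left(\varepsilon_t^2\right)^2 = \infty. \]
In both cases we obtain a kurtosis coefficient greater than 3, the value characteristic of the normal distribution. The result is readily obtained:
\[ E\left(\varepsilon_t^4\right) = 3 E\left(\sigma_t^4\right) \]
\[ E\left(\varepsilon_t^4\right) = 3\left[ \omega^2 + \alpha_1^2 E\left(\varepsilon_{t-1}^4\right) + 2\omega\alpha_1 E\left(\varepsilon_{t-1}^2\right) \right] \]
\[ E\left(\varepsilon_t^4\right) = \frac{3\left[ \omega^2 + 2\omega\alpha_1 E\left(\varepsilon_{t-1}^2\right) \right]}{1 - 3\alpha_1^2} = \frac{3\left[ \omega^2 + 2\omega\alpha_1 \sigma^2 \right]}{1 - 3\alpha_1^2} \]
substituting $\sigma^2 = \omega / (1 - \alpha_1)$:
\[ E\left(\varepsilon_t^4\right) = \frac{3\left[ \omega^2 (1 - \alpha_1) + 2\omega^2\alpha_1 \right]}{(1 - 3\alpha_1^2)(1 - \alpha_1)} = \frac{3\omega^2 (1 + \alpha_1)}{(1 - 3\alpha_1^2)(1 - \alpha_1)} \]
and finally
\[ E\left(\varepsilon_t^4\right) / E\left(\varepsilon_t^2\right)^2 = \frac{3\omega^2 (1 + \alpha_1)(1 - \alpha_1)^2}{(1 - 3\alpha_1^2)(1 - \alpha_1)\,\omega^2} = \frac{3\left(1 - \alpha_1^2\right)}{1 - 3\alpha_1^2}. \tag{1.9} \]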
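The ARCH(1) moment formulas can be cross-checked numerically. The following sketch assumes conditional normality and $3\alpha_1^2 < 1$. It iterates the fourth-moment recursion $E(\varepsilon_t^4) = 3[\omega^2 + \alpha_1^2 E(\varepsilon_{t-1}^4) + 2\omega\alpha_1\sigma^2]$ to its fixed point and compares the implied kurtosis with the closed form in (1.8). The function names and parameter values are illustrative choices, not part of the notes.

```python
def arch1_moments(omega, alpha):
    """Unconditional variance (1.7) and kurtosis (1.8) of a Gaussian ARCH(1)."""
    if not (omega > 0 and alpha >= 0 and 3 * alpha ** 2 < 1):
        raise ValueError("need omega > 0, alpha >= 0 and 3*alpha^2 < 1")
    variance = omega / (1 - alpha)
    kurtosis = 3 * (1 - alpha ** 2) / (1 - 3 * alpha ** 2)
    return variance, kurtosis

def arch1_fourth_moment_by_iteration(omega, alpha, n_iter=500):
    """Iterate m4 <- 3 * (omega^2 + alpha^2 * m4 + 2*omega*alpha*sigma2).

    The map is a contraction with factor 3*alpha^2 < 1, so it converges
    to the stationary fourth moment from any starting value.
    """
    sigma2 = omega / (1 - alpha)
    m4 = 0.0
    for _ in range(n_iter):
        m4 = 3 * (omega ** 2 + alpha ** 2 * m4 + 2 * omega * alpha * sigma2)
    return m4

omega, alpha = 0.2, 0.3          # illustrative values only
variance, kurtosis = arch1_moments(omega, alpha)
m4 = arch1_fourth_moment_by_iteration(omega, alpha)
print(kurtosis, m4 / variance ** 2)   # the two numbers should agree
```

Both routes give a kurtosis above 3, as expected for a process with time-varying conditional variance.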

1.3.1 The ARCH Regression Model

We have an ARCH regression model when the disturbances in a linear regression model follow an ARCH process:
\[ y_t = x_t' b + \varepsilon_t \]
\[ E_{t-1}(\varepsilon_t) = 0 \]
\[ E_{t-1}\left(\varepsilon_t^2\right) \equiv \sigma_t^2 = \omega + \alpha(L)\varepsilon_t^2 \]
\[ \varepsilon_t \mid \Psi_{t-1} \sim N\left(0, \sigma_t^2\right) \]
where $x_t$ may include lagged dependent and exogenous variables.

1.3.2 ARCH as a nonlinear model

The essential characteristic of the ARCH model is $\operatorname{Cov}\left(\varepsilon_t^2, \varepsilon_{t-j}^2\right) \neq 0$, although $\operatorname{Cov}(\varepsilon_t, \varepsilon_{t-j}) = 0$ for $j \neq 0$. We examine the relation of the ARCH model with the bilinear model. A time series $\{\varepsilon_t\}$ is said to follow a bilinear model if it satisfies
\[ \varepsilon_t = \sum_{i=1}^{p} \phi_i \varepsilon_{t-i} + \sum_{j=1}^{r} \sum_{k=1}^{s} b_{jk}\, \varepsilon_{t-j} u_{t-k} + u_t \]
where $u_t$ is a sequence of i.i.d. $(0, \sigma_u^2)$ variables. The first two conditional moments are
\[ E_{t-1}(\varepsilon_t) = \sum_{i=1}^{p} \phi_i \varepsilon_{t-i} + \sum_{j=1}^{r} \sum_{k=1}^{s} b_{jk}\, \varepsilon_{t-j} u_{t-k} \]
\[ \operatorname{Var}_{t-1}(\varepsilon_t) = \sigma_u^2. \]
In contrast with the ARCH model, in which the conditional variance is time varying, in the bilinear model the conditional variance is constant. Their unconditional moments, however, might be similar. The bilinear model
\[ \varepsilon_t = b_{21}\, \varepsilon_{t-2} u_{t-1} + u_t \]
has
\[ E(\varepsilon_t) = 0, \qquad \operatorname{Cov}\left(\varepsilon_t^2, \varepsilon_{t-1}^2\right) = b_{21}^2 \sigma_u^2. \]
As this process is autocorrelated in squares, it will exhibit temporal clustering of large and small deviations like an ARCH process.

1.4 The GARCH(p,q) Model

In order to model the conditional heteroskedasticity in a parsimonious way, Bollerslev [2] proposed the Generalised ARCH model, i.e. GARCH(p,q):
\[ \sigma_t^2 = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)\sigma_t^2 \tag{1.10} \]
where $\alpha(L) = \alpha_1 L + \ldots + \alpha_q L^q$ and $\beta(L) = \beta_1 L + \ldots + \beta_p L^p$. The GARCH(1,1) is the most popular model in the empirical literature:‡
\[ \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2. \tag{1.11} \]
To ensure that the conditional variance is well defined in a GARCH(p,q) model, all the coefficients in the corresponding linear ARCH($\infty$) should be positive. Rewriting the GARCH(p,q) model as an ARCH($\infty$):
\[ \sigma_t^2 = \left( 1 - \sum_{i=1}^{p} \beta_i L^i \right)^{-1} \left[ \omega + \sum_{j=1}^{q} \alpha_j \varepsilon_{t-j}^2 \right] = \omega^* + \sum_{k=0}^{\infty} \phi_k \varepsilon_{t-k-1}^2 \tag{1.12} \]
$\sigma_t^2 \geq 0$ if $\omega^* \geq 0$ and all $\phi_k \geq 0$. The non-negativity of $\omega^*$ and $\phi_k$ is also a necessary condition for the non-negativity of $\sigma_t^2$. In order to make $\omega^*$ and $\{\phi_k\}_{k=0}^{\infty}$ well defined, assume that:

i. the roots of the polynomial $\beta(x) = 1$ lie outside the unit circle, and $\omega \geq 0$; this is a condition for $\omega^*$ to be finite and positive.
ii. $\alpha(x)$ and $1 - \beta(x)$ have no common roots.

These conditions establish neither that $\sigma_t^2 < \infty$ nor that $\{\sigma_t^2\}_{t=-\infty}^{\infty}$ is strictly stationary. For the simple GARCH(1,1), almost sure positivity of $\sigma_t^2$ requires, together with conditions (i) and (ii), that (Nelson and Cao [25])
\[ \omega \geq 0, \qquad \beta_1 \geq 0, \qquad \alpha_1 \geq 0. \tag{1.13} \]

‡ The GARCH model belongs to the class of deterministic conditional heteroskedasticity models in which the conditional variance is a function of variables that are in the information set available at time $t$.
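The ARCH($\infty$) weights in (1.12) can be generated by matching coefficients in $\alpha(L) / (1 - \beta(L))$, which gives a simple recursion. The sketch below is one way to do this, assuming $1 - \sum_i \beta_i \neq 0$ so that $\omega^*$ is finite; the function name and the parameter values are hypothetical and chosen only for illustration.

```python
def arch_infinity_weights(omega, alphas, betas, n_terms=50):
    """Return omega_star and the first n_terms weights phi_k of (1.12).

    alphas = [alpha_1, ..., alpha_q], betas = [beta_1, ..., beta_p].
    phi_k multiplies eps^2_{t-k-1}, so phi_k is the coefficient on L^(k+1)
    in alpha(L) / (1 - beta(L)). Matching powers of L gives
        c_n = alpha_n + sum_i beta_i * c_{n-i},   with phi_k = c_{k+1}.
    """
    q, p = len(alphas), len(betas)
    omega_star = omega / (1.0 - sum(betas))
    phi = []
    for k in range(n_terms):
        lag = k + 1
        value = alphas[lag - 1] if lag <= q else 0.0
        for i in range(1, p + 1):
            if lag - i >= 1:
                value += betas[i - 1] * phi[lag - i - 1]
        phi.append(value)
    return omega_star, phi

# GARCH(1,1): the weights are alpha_1 * beta_1^k.
omega_star, phi = arch_infinity_weights(0.1, [0.1], [0.8], n_terms=5)
print(omega_star, phi)

# GARCH(1,2) with a negative alpha_2: weights can still be non-negative.
_, phi_12 = arch_infinity_weights(0.1, [0.2, -0.05], [0.5], n_terms=20)
print(min(phi_12) >= 0)
```

Checking the sign of the computed weights is a direct, if truncated, way to see whether a given parameter vector keeps the conditional variance non-negative.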

For the GARCH(1,q) and GARCH(2,q) models these constraints can be relaxed; e.g. in the GARCH(1,2) model the necessary and sufficient conditions become:
\[ \omega \geq 0, \qquad 0 \leq \beta_1 < 1, \qquad \beta_1 \alpha_1 + \alpha_2 \geq 0, \qquad \alpha_1 \geq 0. \tag{1.14} \]
For the GARCH(2,1) model the conditions are:
\[ \omega \geq 0, \qquad \alpha_1 \geq 0, \qquad \beta_1 \geq 0, \qquad \beta_1 + \beta_2 < 1, \qquad \beta_1^2 + 4\beta_2 \geq 0. \tag{1.15} \]
These constraints are less stringent than those proposed by Bollerslev [2]:
\[ \omega \geq 0, \qquad \beta_i \geq 0 \quad i = 1, \ldots, p, \qquad \alpha_j \geq 0 \quad j = 1, \ldots, q. \tag{1.16} \]
These results cannot be adopted in the multivariate case, where, as we will see below, the requirement of positivity for $\{\sigma_t^2\}$ means the positive definiteness of the conditional variance-covariance matrix.

From the point of view of the maximum likelihood estimation of a GARCH(p,q) model, we need to calculate $\{\sigma_t^2\}_{t=0}^{\infty}$ recursively, starting from $t = 0$ and applying (1.10), assuming arbitrary values for the pre-sample period $\{\sigma_{-1}^2, \ldots, \sigma_{-p}^2, \varepsilon_{-1}^2, \ldots, \varepsilon_{-q}^2\}$. The conditions (1.16) guarantee that $\{\sigma_t^2\}_{t=0}^{\infty}$ is not negative given arbitrary non-negative values for $\{\sigma_{-1}^2, \ldots, \sigma_{-p}^2, \varepsilon_{-1}^2, \ldots, \varepsilon_{-q}^2\}$. On the contrary, the conditions which guarantee that $\omega^* \geq 0$ and $\phi_k \geq 0$ ((1.14) for the GARCH(1,2) model and (1.15) for the GARCH(2,1) model) do not. This problem can be solved by choosing starting values that maintain a non-negative $\{\sigma_t^2\}_{t=0}^{\infty}$ with probability 1, given non-negative $\omega^*$ and $\{\phi_k\}_{k=0}^{\infty}$. Nelson and Cao suggest arbitrarily picking an $\varepsilon^2 \geq 0$ and setting $\varepsilon_t^2 = \varepsilon^2$ for $t$ from $-1$ to $-\infty$, and $\sigma_t^2 = \sigma^2$ for $1 - p \leq t \leq 0$, where
\[ \sigma^2 = \left( 1 - \sum_{i=1}^{p} \beta_i \right)^{-1} \left[ \omega + \varepsilon^2 \sum_{j=1}^{q} \alpha_j \right] = \omega^* + \varepsilon^2 \sum_{k=0}^{\infty} \phi_k. \]
So doing we have a sequence $\{\sigma_t^2\} \geq 0$ for all $t \geq 0$ with probability 1, as
\[ \sigma_t^2 = \omega^* + \sum_{k=0}^{t-1} \phi_k \varepsilon_{t-k-1}^2 + \varepsilon^2 \sum_{k=t}^{\infty} \phi_k. \]
Supposing that $\sum_{i=1}^{p} \beta_i + \sum_{j=1}^{q} \alpha_j < 1$, we can set $\sigma^2$ and $\varepsilon^2$ equal to their common unconditional mean:
\[ \sigma^2 \equiv \varepsilon^2 \equiv \omega \Big/ \left( 1 - \sum_{i=1}^{p} \beta_i - \sum_{j=1}^{q} \alpha_j \right). \]
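The recursive calculation of the conditional variance with pre-sample values set to the unconditional mean can be sketched as follows for the GARCH(1,1) case. This is a minimal illustration, assuming the Bollerslev constraints with $\omega > 0$ and $\alpha_1 + \beta_1 < 1$; the function name `garch11_filter` and the sample residuals are hypothetical.

```python
def garch11_filter(eps, omega, alpha, beta):
    """Conditional variances sigma^2_t, t = 0..T-1, from (1.11).

    The pre-sample values eps^2_{-1} and sigma^2_{-1} are both set to the
    unconditional variance omega / (1 - alpha - beta).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    unconditional = omega / (1 - alpha - beta)
    prev_eps2 = unconditional      # stands in for eps^2_{-1}
    prev_sigma2 = unconditional    # stands in for sigma^2_{-1}
    sigma2 = []
    for e in eps:
        current = omega + alpha * prev_eps2 + beta * prev_sigma2
        sigma2.append(current)
        prev_eps2, prev_sigma2 = e * e, current
    return sigma2

residuals = [0.5, -1.2, 0.3, 2.0]   # illustrative residuals only
print(garch11_filter(residuals, omega=0.1, alpha=0.1, beta=0.8))
```

With this initialization the first filtered variance equals the unconditional variance, and every later value is a positive combination of positive terms.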

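1.4.1 The Yule-Walker equations for the squared process

In the GARCH(p,q) model the process $\{\varepsilon_t^2\}$ has an ARMA(m,p) representation, where $m = \max(p, q)$:
\[ \varepsilon_t^2 = \omega + \sum_{j=1}^{m} (\alpha_j + \beta_j)\, \varepsilon_{t-j}^2 + \left( \nu_t - \sum_{i=1}^{p} \beta_i \nu_{t-i} \right) \]
where $E_{t-1}[\nu_t] = 0$ and $\nu_t \in [-\sigma_t^2, \infty[$, so we can apply the classical results for ARMA models. We can study the autocovariance function, that is:
\[ \gamma^2(k) = \operatorname{cov}\left( \varepsilon_t^2, \varepsilon_{t-k}^2 \right) \]
\[ \gamma^2(k) = \operatorname{cov}\left[ \omega + \sum_{j=1}^{m} (\alpha_j + \beta_j)\, \varepsilon_{t-j}^2 + \left( \nu_t - \sum_{i=1}^{p} \beta_i \nu_{t-i} \right),\ \varepsilon_{t-k}^2 \right] \]
\[ \gamma^2(k) = \left[ \sum_{j=1}^{m} (\alpha_j + \beta_j) \operatorname{cov}\left( \varepsilon_{t-j}^2, \varepsilon_{t-k}^2 \right) \right] + \operatorname{cov}\left[ \nu_t - \sum_{i=1}^{p} \beta_i \nu_{t-i},\ \varepsilon_{t-k}^2 \right]. \tag{1.17} \]
When $k$ is big enough, the last term on the right of expression (1.17) is null. The sequence of autocovariances satisfies a linear difference equation of order $\max(p, q)$; for $k \geq p + 1$,
\[ \gamma^2(k) = \sum_{j=1}^{m} (\alpha_j + \beta_j)\, \gamma^2(k - j). \]
This system can be used to identify the lag orders $m$ and $p$, that is the orders $p$ and $q$ if $q \geq p$, and the order $p$ if $q < p$.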

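1.4.2 The GARCH Regression Model

Let $w_t = \left( 1, \varepsilon_{t-1}^2, \ldots, \varepsilon_{t-q}^2, \sigma_{t-1}^2, \ldots, \sigma_{t-p}^2 \right)'$, $\delta = \left( \omega, \alpha_1, \ldots, \alpha_q, \beta_1, \ldots, \beta_p \right)'$ and $\theta \in \Theta$, where $\theta = (b', \delta')'$ and $\Theta$ is a compact subspace of a Euclidean space such that $\varepsilon_t$ possesses finite second moments. We may write the GARCH regression model as:
\[ \varepsilon_t = y_t - x_t' b \]
\[ \varepsilon_t \mid \Psi_{t-1} \sim N\left( 0, \sigma_t^2 \right) \]
\[ \sigma_t^2 = w_t' \delta. \]

1.4.3 Stationarity

The process $\{\varepsilon_t\}$ which follows a GARCH(p,q) model is a martingale difference sequence. In order to study second-order stationarity it is sufficient to consider that
\[ \operatorname{Var}[\varepsilon_t] = \operatorname{Var}\left[ E_{t-1}(\varepsilon_t) \right] + E\left[ \operatorname{Var}_{t-1}(\varepsilon_t) \right] = E\left[ \sigma_t^2 \right] \]
and show that it is asymptotically constant in time (it does not depend upon time).

Proposition 2 A process $\{\varepsilon_t\}$ which satisfies a GARCH(p,q) model with non-negative coefficients $\omega \geq 0$, $\alpha_i \geq 0$, $i = 1, \ldots, q$, $\beta_i \geq 0$, $i = 1, \ldots, p$, is covariance stationary if and only if:
\[ \alpha(1) + \beta(1) < 1. \]
This is a sufficient but not necessary condition for strict stationarity. Because ARCH processes are thick tailed, the conditions for covariance stationarity are often more stringent than the conditions for strict stationarity.

Example 3 A GARCH(1,1) model can be written as
\[ \sigma_t^2 = \omega \left[ 1 + \sum_{k=1}^{\infty} \prod_{i=1}^{k} \left( \beta_1 + \alpha_1 z_{t-i}^2 \right) \right]. \]
In fact,
\[ \sigma_{t+1}^2 = \omega + \alpha_1 \varepsilon_t^2 + \beta_1 \sigma_t^2 = \omega + \sigma_t^2 \left( \alpha_1 z_t^2 + \beta_1 \right) \]
so that
\[ \sigma_t^2 = \omega + \sigma_{t-1}^2 \left( \alpha_1 z_{t-1}^2 + \beta_1 \right) \]
\[ \sigma_{t-1}^2 = \omega + \sigma_{t-2}^2 \left( \alpha_1 z_{t-2}^2 + \beta_1 \right) \]
\[ \sigma_t^2 = \omega + \left[ \omega + \sigma_{t-2}^2 \left( \alpha_1 z_{t-2}^2 + \beta_1 \right) \right] \left( \alpha_1 z_{t-1}^2 + \beta_1 \right) \]
\[ = \omega + \omega \left( \alpha_1 z_{t-1}^2 + \beta_1 \right) + \sigma_{t-2}^2 \left( \alpha_1 z_{t-2}^2 + \beta_1 \right) \left( \alpha_1 z_{t-1}^2 + \beta_1 \right) \]
\[ = \omega + \omega \left( \alpha_1 z_{t-1}^2 + \beta_1 \right) + \omega \left( \alpha_1 z_{t-1}^2 + \beta_1 \right) \left( \alpha_1 z_{t-2}^2 + \beta_1 \right) + \sigma_{t-3}^2 \left( \alpha_1 z_{t-3}^2 + \beta_1 \right) \left( \alpha_1 z_{t-2}^2 + \beta_1 \right) \left( \alpha_1 z_{t-1}^2 + \beta_1 \right). \]
Nelson [23] shows that when $\omega > 0$, $\sigma_t^2 < \infty$ a.s. and $\{\varepsilon_t, \sigma_t^2\}$ is strictly stationary if and only if $E\left[ \ln\left( \beta_1 + \alpha_1 z_t^2 \right) \right] < 0$. By Jensen's inequality,
\[ E\left[ \ln\left( \beta_1 + \alpha_1 z_t^2 \right) \right] \leq \ln E\left[ \beta_1 + \alpha_1 z_t^2 \right] = \ln\left( \alpha_1 + \beta_1 \right) \]
so when $\alpha_1 + \beta_1 = 1$ the model is strictly stationary. $E\left[ \ln\left( \beta_1 + \alpha_1 z_t^2 \right) \right] < 0$ is a weaker requirement than $\alpha_1 + \beta_1 < 1$.

Example 4 ARCH(1), with $\alpha_1 = 1$, $\beta_1 = 0$, $z_t \sim N(0,1)$:
\[ E\left[ \ln\left( z_t^2 \right) \right] \leq \ln E\left[ z_t^2 \right] = \ln(1) = 0. \]
The inequality is strict because $z_t^2$ is not degenerate, so the process is strictly but not covariance stationary. The ARCH(q) is covariance stationary if and only if the sum of the positive parameters is less than one.

1.4.4 Forecasting volatility

A GARCH(p,q) can be represented as an ARMA process, given that $\varepsilon_t^2 = \sigma_t^2 + \nu_t$, where $E_{t-1}[\nu_t] = 0$ and $\nu_t \in [-\sigma_t^2, \infty[$:
\[ \varepsilon_t^2 = \omega + \sum_{j=1}^{\max(p,q)} (\alpha_j + \beta_j)\, \varepsilon_{t-j}^2 + \left( \nu_t - \sum_{i=1}^{p} \beta_i \nu_{t-i} \right) \]
so $\varepsilon_t^2 \sim$ ARMA(m,p) with $m = \max(p, q)$. Forecasting with a GARCH(p,q) (Engle and Bollerslev [11]):
\[ \sigma_{t+k}^2 = \omega + \sum_{i=1}^{n} \left[ \alpha_i \varepsilon_{t+k-i}^2 + \beta_i \sigma_{t+k-i}^2 \right] + \sum_{i=k}^{m} \left[ \alpha_i \varepsilon_{t+k-i}^2 + \beta_i \sigma_{t+k-i}^2 \right] \]
where $n = \min\{m, k-1\}$ and, by definition, summations from 1 to 0 and from $k > m$ to $m$ are both equal to zero. Thus
\[ E_t\left[ \sigma_{t+k}^2 \right] = \omega + \sum_{i=1}^{n} (\alpha_i + \beta_i)\, E_t\left[ \sigma_{t+k-i}^2 \right] + \sum_{i=k}^{m} \left[ \alpha_i \varepsilon_{t+k-i}^2 + \beta_i \sigma_{t+k-i}^2 \right]. \]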

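In particular for a GARCH(1,1) and $k > 2$:
\[ E_t\left[ \sigma_{t+k}^2 \right] = \sum_{i=0}^{k-2} (\alpha_1 + \beta_1)^i\, \omega + (\alpha_1 + \beta_1)^{k-1} \sigma_{t+1}^2 \]
\[ = \omega\, \frac{1 - (\alpha_1 + \beta_1)^{k-1}}{1 - (\alpha_1 + \beta_1)} + (\alpha_1 + \beta_1)^{k-1} \sigma_{t+1}^2 \]
\[ = \sigma^2 \left[ 1 - (\alpha_1 + \beta_1)^{k-1} \right] + (\alpha_1 + \beta_1)^{k-1} \sigma_{t+1}^2 \]
\[ = \sigma^2 + (\alpha_1 + \beta_1)^{k-1} \left[ \sigma_{t+1}^2 - \sigma^2 \right]. \]
When the process is covariance stationary, it follows that $E_t\left[ \sigma_{t+k}^2 \right]$ converges to $\sigma^2$ as $k \to \infty$.

1.4.5 The IGARCH(p,q) model

Definition 5 The GARCH(p,q) process characterised by the first two conditional moments:
\[ E_{t-1}[\varepsilon_t] = 0 \]
\[ \sigma_t^2 \equiv E_{t-1}\left[ \varepsilon_t^2 \right] = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \]
where $\omega \geq 0$, $\alpha_i \geq 0$ and $\beta_i \geq 0$ for all $i$, and the polynomial
\[ 1 - \alpha(x) - \beta(x) = 0 \]
has $d > 0$ unit root(s) and $\max\{p, q\} - d$ root(s) outside the unit circle, is said to be:

i) Integrated in variance of order $d$ if $\omega = 0$
ii) Integrated in variance of order $d$ with trend if $\omega > 0$.

The Integrated GARCH(p,q) models, both with and without trend, are therefore part of a wider class of models with a property called "persistent variance", in which the current information remains important for the forecasts of the conditional variances for all horizons. So we have the Integrated GARCH(p,q) model when (necessary condition)
\[ \alpha(1) + \beta(1) = 1. \]
To illustrate, consider the IGARCH(1,1), which is characterised by $\alpha_1 + \beta_1 = 1$:
\[ \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + (1 - \alpha_1)\, \sigma_{t-1}^2 \]
\[ \sigma_t^2 = \omega + \sigma_{t-1}^2 + \alpha_1 \left( \varepsilon_{t-1}^2 - \sigma_{t-1}^2 \right), \qquad 0 < \alpha_1 \leq 1. \]
For this particular model the conditional variance $k$ steps in the future is:
\[ E_t\left[ \sigma_{t+k}^2 \right] = (k - 1)\, \omega + \sigma_{t+1}^2. \]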
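The GARCH(1,1) results of Sections 1.4.3 to 1.4.5 lend themselves to a quick numerical check. The sketch below does two things under stated assumptions. It approximates Nelson's log-moment $E[\ln(\beta_1 + \alpha_1 z_t^2)]$ for Gaussian $z_t$ with a midpoint rule truncated at $|z| \leq 10$, an integration scheme chosen here for simplicity. It also compares the one-step forecast recursion with the closed forms for the stationary and IGARCH cases. All function names and parameter values are illustrative.

```python
import math

def log_moment(alpha, beta, half_width=10.0, n=100_000):
    """Midpoint-rule approximation of E[ln(beta + alpha * z^2)], z ~ N(0,1).

    With n even the midpoints never hit z = 0, so the integrable log
    singularity of the ARCH(1) case (beta = 0) causes no evaluation error.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        z = -half_width + (i + 0.5) * h
        total += math.log(beta + alpha * z * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def garch11_forecast(omega, alpha, beta, sigma2_next, k):
    """E_t[sigma^2_{t+k}] by iterating E_t[s_{j}] = omega + (alpha+beta) E_t[s_{j-1}]."""
    forecast = sigma2_next            # sigma^2_{t+1} is known at time t
    for _ in range(k - 1):
        forecast = omega + (alpha + beta) * forecast
    return forecast

def garch11_forecast_closed_form(omega, alpha, beta, sigma2_next, k):
    """Closed forms from the text: stationary case and IGARCH(1,1) case."""
    persistence = alpha + beta
    if math.isclose(persistence, 1.0):
        return (k - 1) * omega + sigma2_next
    sigma2_bar = omega / (1.0 - persistence)
    return sigma2_bar + persistence ** (k - 1) * (sigma2_next - sigma2_bar)

print(log_moment(1.0, 0.0))     # ARCH(1) with alpha_1 = 1: negative
print(log_moment(0.25, 0.75))   # IGARCH(1,1): also negative
print(garch11_forecast(0.1, 0.1, 0.8, 2.0, 10),
      garch11_forecast_closed_form(0.1, 0.1, 0.8, 2.0, 10))
```

A negative log-moment with $\alpha_1 + \beta_1 \geq 1$ illustrates how a process can be strictly stationary without being covariance stationary, while the forecast comparison shows mean reversion when $\alpha_1 + \beta_1 < 1$ and linear growth in the horizon for the IGARCH case.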

1.4.6 Persistence

In many studies of the time series behavior of asset volatility the question has been how long shocks to conditional variance persist. If volatility shocks persist indefinitely, they may move the whole term structure of risk premia. There are many notions of convergence in probability theory (almost sure, in probability, in $L^p$), so whether a shock is transitory or persistent may depend on the definition of convergence. In linear models it typically makes no difference which of the standard definitions we use, since the definitions usually agree. In GARCH models the situation is more complicated. In the IGARCH(1,1):
\[ \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \]
where $\alpha_1 + \beta_1 = 1$. Given that $\varepsilon_t^2 = z_t^2 \sigma_t^2$, we can rewrite the IGARCH(1,1) process as
\[ \sigma_t^2 = \omega + \sigma_{t-1}^2 \left[ (1 - \alpha_1) + \alpha_1 z_{t-1}^2 \right], \qquad 0 < \alpha_1 \leq 1. \]
When $\omega = 0$, $\sigma_t^2$ is a martingale. Based on the nature of persistence in linear models, it seems that IGARCH(1,1) with $\omega > 0$ and $\omega = 0$ are analogous to random walks with and without drift, respectively, and are therefore natural models of "persistent" shocks. This turns out to be misleading, however: in IGARCH(1,1) with $\omega = 0$, $\sigma_t^2$ collapses to zero almost surely, and in IGARCH(1,1) with $\omega > 0$, $\sigma_t^2$ is strictly stationary and ergodic and therefore does not behave like a random walk, since random walks diverge almost surely.

Two notions of persistence.

1. Suppose $\sigma_t^2$ is strictly stationary and ergodic. Let $F(\sigma_t^2)$ be the unconditional cdf for $\sigma_t^2$, and $F_s(\sigma_t^2)$ the conditional cdf for $\sigma_t^2$, given information at time $s < t$. For any $s$, $F(\sigma_t^2) - F_s(\sigma_t^2) \to 0$ at all continuity points as $t \to \infty$. There is no persistence when $\{\sigma_t^2\}$ is stationary and ergodic.
2. Persistence is defined in terms of forecast moments. For some $\eta > 0$, the shocks to $\sigma_t^2$ fail to persist if and only if, for every $s$, $E_s\left( \sigma_t^{2\eta} \right)$ converges, as $t \to \infty$, to a finite limit independent of the time $s$ information set.

Whether or not shocks to $\{\sigma_t^2\}$ "persist" depends very much on which definition is adopted. The conditional moment may diverge to infinity for some $\eta$, but converge to a well-behaved limit independent of initial conditions for other $\eta$, even when $\{\sigma_t^2\}$ is stationary and ergodic.

Example 6 GARCH(1,1)
\[ \sigma_{t+1}^2 = \omega + \alpha_1 \varepsilon_t^2 + \beta_1 \sigma_t^2 = \omega + \sigma_t^2 \left( \alpha_1 z_t^2 + \beta_1 \right) \]
\[ E_{t-3}\left( \sigma_t^2 \right) = \omega \left[ \sum_{k=0}^{t-(t-3)-1} (\alpha_1 + \beta_1)^k \right] + \sigma_{t-3}^2\, (\alpha_1 + \beta_1)(\alpha_1 + \beta_1)(\alpha_1 + \beta_1). \]
The volatility forecast for time $t$, conditioning on the information set at time $s$, is
\[ E_s\left( \sigma_t^2 \right) = \omega \left[ \sum_{k=0}^{t-s-1} (\alpha_1 + \beta_1)^k \right] + \sigma_s^2\, (\alpha_1 + \beta_1)^{t-s}. \]
$E_s\left( \sigma_t^2 \right)$ converges to the unconditional variance $\omega / (1 - \alpha_1 - \beta_1)$ as $t \to \infty$ if and only if $\alpha_1 + \beta_1 < 1$. In the IGARCH(1,1) model with $\omega > 0$ and $\alpha_1 + \beta_1 = 1$, $E_s(\sigma_t^2) \to \infty$ a.s. as $t \to \infty$. Nevertheless, IGARCH models are strictly stationary and ergodic.

1.4.7 The Component Model

This section presents a permanent and transitory component model of stock returns volatility (Engle and Lee, 1993). The finding of a unit root in the volatility process indicates that there is a stochastic trend as well as a transitory component in stock return volatility. The decomposition of the conditional variance of asset returns into a permanent and a transitory component is a way to investigate the long-run and the short-run movement of volatility in the stock market. The GARCH(1,1) model can also be written as
\[ \sigma_t^2 = (1 - \alpha_1 - \beta_1)\, \sigma^2 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 = \sigma^2 + \alpha_1 \left( \varepsilon_{t-1}^2 - \sigma^2 \right) + \beta_1 \left( \sigma_{t-1}^2 - \sigma^2 \right). \]
The last two terms have expected value zero. This model is extended to allow the possibility that volatility is not constant in the long run. Let $q_t$ be the permanent component of the conditional variance. The component model for the conditional variance is defined as
\[ \sigma_t^2 = q_t + \alpha_1 \left( \varepsilon_{t-1}^2 - q_{t-1} \right) + \beta_1 \left( \sigma_{t-1}^2 - q_{t-1} \right) \tag{1.18} \]
\[ = q_t - (\alpha_1 + \beta_1)\, q_{t-1} + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \]
\[ (1 - \beta_1 L)\, \sigma_t^2 = \left[ 1 - (\alpha_1 + \beta_1) L \right] q_t + \alpha_1 \varepsilon_{t-1}^2 \]
\[ q_t = \omega + q_{t-1} + \phi \left( \varepsilon_{t-1}^2 - \sigma_{t-1}^2 \right). \]

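The constant volatility $\sigma^2$ has been replaced by the time-varying trend, $q_t$, and its past value. The forecasting error, $\varepsilon_{t-1}^2 - \sigma_{t-1}^2$, serves as a driving force for the time-dependent movement of the trend. The difference between the conditional variance and its trend, $\sigma_{t-1}^2 - q_{t-1}$, is the transitory component of the conditional variance. The multistep forecast of the trend is just the current trend plus a constant drift:
\[ q_{t+k} = \omega + q_{t+k-1} + \phi \left( \varepsilon_{t+k-1}^2 - \sigma_{t+k-1}^2 \right) \]
\[ E_{t-1}[q_{t+k}] = \omega + E_{t-1}[q_{t+k-1}] + \phi\, E_{t-1}\left[ \varepsilon_{t+k-1}^2 - \sigma_{t+k-1}^2 \right] \]
but $E_{t-1}\left( \varepsilon_{t+k-1}^2 \right) = E_{t-1}\left( \sigma_{t+k-1}^2 \right)$, so that $E_{t-1}\left[ \varepsilon_{t+k-1}^2 - \sigma_{t+k-1}^2 \right] = 0$. Hence
\[ E_{t-1}[q_{t+k}] = \omega + \omega + E_{t-1}[q_{t+k-2}] + \phi\, E_{t-1}\left[ \varepsilon_{t+k-2}^2 - \sigma_{t+k-2}^2 \right] = \ldots = k\omega + q_t. \tag{1.19} \]
From (1.18)
\[ \sigma_{t+1}^2 - q_{t+1} = \alpha_1 \left( \varepsilon_t^2 - q_t \right) + \beta_1 \left( \sigma_t^2 - q_t \right) \]
\[ E_{t-1}\left( \sigma_{t+1}^2 \right) - E_{t-1}(q_{t+1}) = \alpha_1 E_{t-1}\left( \varepsilon_t^2 - q_t \right) + \beta_1 E_{t-1}\left( \sigma_t^2 - q_t \right) = (\alpha_1 + \beta_1) \left( \sigma_t^2 - q_t \right) \]
\[ \sigma_{t+2}^2 - q_{t+2} = \alpha_1 \left( \varepsilon_{t+1}^2 - q_{t+1} \right) + \beta_1 \left( \sigma_{t+1}^2 - q_{t+1} \right) \]
\[ \sigma_{t+3}^2 - q_{t+3} = \alpha_1 \left( \varepsilon_{t+2}^2 - q_{t+2} \right) + \beta_1 \left( \sigma_{t+2}^2 - q_{t+2} \right) \]
\[ E_t\left( \sigma_{t+3}^2 - q_{t+3} \right) = \alpha_1 E_t\left( \varepsilon_{t+2}^2 - q_{t+2} \right) + \beta_1 E_t\left( \sigma_{t+2}^2 - q_{t+2} \right) \]
\[ = \alpha_1 E_t\left( \varepsilon_{t+2}^2 \right) - \alpha_1 E_t(q_{t+2}) + \beta_1 E_t\left( \sigma_{t+2}^2 \right) - \beta_1 E_t(q_{t+2}) \]
\[ = (\alpha_1 + \beta_1) \left[ E_t\left( \sigma_{t+2}^2 \right) - E_t(q_{t+2}) \right] \]
\[ = (\alpha_1 + \beta_1)\, E_t\left[ \alpha_1 \left( \varepsilon_{t+1}^2 - q_{t+1} \right) + \beta_1 \left( \sigma_{t+1}^2 - q_{t+1} \right) \right] \]
\[ E_{t-1}\left( \sigma_{t+3}^2 - q_{t+3} \right) = (\alpha_1 + \beta_1)\, E_{t-1}\left[ \alpha_1 \left( \varepsilon_{t+1}^2 - q_{t+1} \right) + \beta_1 \left( \sigma_{t+1}^2 - q_{t+1} \right) \right] \]
\[ = (\alpha_1 + \beta_1) \left[ (\alpha_1 + \beta_1)\, E_{t-1}\left( \sigma_{t+1}^2 \right) - (\alpha_1 + \beta_1)\, E_{t-1}(q_{t+1}) \right] \]
\[ = (\alpha_1 + \beta_1)(\alpha_1 + \beta_1) \left[ E_{t-1}\left( \sigma_{t+1}^2 \right) - E_{t-1}(q_{t+1}) \right] \]
\[ = (\alpha_1 + \beta_1)^3 \left( \sigma_t^2 - q_t \right). \]
In general,
\[ E_{t-1}\left( \sigma_{t+k}^2 \right) - E_{t-1}(q_{t+k}) = (\alpha_1 + \beta_1) \left[ E_{t-1}\left( \sigma_{t+k-1}^2 \right) - E_{t-1}(q_{t+k-1}) \right] = (\alpha_1 + \beta_1)^k \left( \sigma_t^2 - q_t \right). \]
The forecast $E_{t-1}\left( \sigma_{t+k}^2 \right) - E_{t-1}(q_{t+k})$ will eventually converge to zero as the forecasting horizon extends into the remote future:
\[ E_{t-1}\left( \sigma_{t+k}^2 \right) - E_{t-1}(q_{t+k}) \to 0 \quad \text{as } k \to \infty. \tag{1.20} \]
Therefore there will be no difference between the conditional variance and the trend in the long run. This is the motivation for $q_t$ being called the permanent component of the conditional variance. Combining (1.20) and (1.19), the long run forecast of the conditional variance is just the current expectation of the trend plus a constant drift,
\[ E_{t-1}\left( \sigma_{t+k}^2 \right) = k\omega + q_t \quad \text{as } k \to \infty. \]
The component model can be extended to include non-unit-root processes. The general component model becomes
\[ \sigma_t^2 = q_t + \alpha_1 \left( \varepsilon_{t-1}^2 - q_{t-1} \right) + \beta_1 \left( \sigma_{t-1}^2 - q_{t-1} \right) \tag{1.21} \]
\[ q_t = \omega + \rho\, q_{t-1} + \phi \left( \varepsilon_{t-1}^2 - \sigma_{t-1}^2 \right). \tag{1.22} \]
$q_t$ still represents the component of the conditional variance with the longer memory, as long as $\rho > (\alpha_1 + \beta_1)$. The multistep forecasts of the conditional variance and the trend are
\[ E_{t-1}\left( \sigma_{t+k}^2 \right) - E_{t-1}(q_{t+k}) = (\alpha_1 + \beta_1)^k \left( \sigma_t^2 - q_t \right) \tag{1.23} \]
\[ q_{t+k} = \omega + \rho\, q_{t+k-1} + \phi \left( \varepsilon_{t+k-1}^2 - \sigma_{t+k-1}^2 \right) \]
\[ E_{t-1}[q_{t+k}] = \omega + \rho\, E_{t-1}[q_{t+k-1}] + \phi\, E_{t-1}\left[ \varepsilon_{t+k-1}^2 - \sigma_{t+k-1}^2 \right] \]
\[ = \omega + \rho \left[ \omega + \rho\, E_{t-1}[q_{t+k-2}] \right] \]
\[ = \ldots \]
\[ = \left( 1 + \rho + \ldots + \rho^{k-1} \right) \omega + \rho^k q_t \]
\[ E_{t-1}[q_{t+k}] = \frac{1 - \rho^k}{1 - \rho}\, \omega + \rho^k q_t \tag{1.24} \]
for $\rho < 1$ and $(\alpha_1 + \beta_1) < 1$. If $\rho > (\alpha_1 + \beta_1)$, the transitory component in (1.23) decays faster than the trend in (1.24), so that the trend will dominate the forecast of the conditional variance as the forecasting horizon extends. The conditional variance will eventually converge to a constant since the trend itself is stationary,
\[ E_{t-1}\left( \sigma_{t+k}^2 \right) = E_{t-1}(q_{t+k}) = \omega / (1 - \rho) \quad \text{as } k \to \infty. \]
By rewriting (1.21) as
\[ \sigma_t^2 = (1 - \alpha_1 L - \beta_1 L)\, q_t + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \]
and (1.22) as
\[ (1 - \rho L)\, q_t = \omega + \phi \left( \varepsilon_{t-1}^2 - \sigma_{t-1}^2 \right) \tag{1.25} \]
and multiplying by $(1 - \rho L)$, the general component model reduces to
\[ (1 - \rho L)\, \sigma_t^2 = (1 - \rho L) \left[ (1 - \alpha_1 L - \beta_1 L)\, q_t + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \right]. \tag{1.26} \]
Substituting (1.25) into (1.26):
\[ (1 - \rho L)\, \sigma_t^2 = (1 - \alpha_1 L - \beta_1 L) \left[ \omega + \phi \left( \varepsilon_{t-1}^2 - \sigma_{t-1}^2 \right) \right] + (1 - \rho L) \left( \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \right) \]
\[ (1 - \rho L)\, \sigma_t^2 = (1 - \alpha_1 - \beta_1)\, \omega + (1 - \alpha_1 L - \beta_1 L)\, \phi \left( \varepsilon_{t-1}^2 - \sigma_{t-1}^2 \right) + (1 - \rho L) \left( \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \right) \]
and, moving $\rho\, \sigma_{t-1}^2$ to the right-hand side,
\[ \sigma_t^2 = (1 - \alpha_1 - \beta_1)\, \omega + (\phi + \alpha_1)\, \varepsilon_{t-1}^2 + \left( -\alpha_1 \rho - (\alpha_1 + \beta_1)\, \phi \right) \varepsilon_{t-2}^2 + (\rho - \phi + \beta_1)\, \sigma_{t-1}^2 + \left( \phi\, (\alpha_1 + \beta_1) - \beta_1 \rho \right) \sigma_{t-2}^2. \]
A GARCH(2,2) process represents the underlying data generating process for the conditional variance defined in the component model. When $\rho = \phi = 0$, the component model reduces to the GARCH(1,1). So the GARCH(1,1) only describes a single dynamic component of the conditional variance.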
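The reduction of the general component model to a GARCH(2,2) can be verified numerically. The sketch below runs the two-equation recursion (1.21)-(1.22) forward on simulated Gaussian shocks and checks that each $\sigma_t^2$ also satisfies the single-equation form with coefficients $(\phi + \alpha_1)$, $-(\alpha_1\rho + (\alpha_1 + \beta_1)\phi)$, $(\rho - \phi + \beta_1)$ and $(\phi(\alpha_1 + \beta_1) - \beta_1\rho)$. The parameter values, starting values, seed and function names are all assumptions made for the illustration.

```python
import random

def component_variance(z, omega, rho, phi, alpha, beta, q0, s0):
    """Run (1.21)-(1.22) forward; eps_t^2 = z_t^2 * sigma_t^2.

    Returns the lists of conditional variances and squared innovations.
    """
    q, s, e2 = [q0], [s0], [z[0] ** 2 * s0]
    for t in range(1, len(z)):
        q_t = omega + rho * q[-1] + phi * (e2[-1] - s[-1])
        s_t = q_t + alpha * (e2[-1] - q[-1]) + beta * (s[-1] - q[-1])
        q.append(q_t)
        s.append(s_t)
        e2.append(z[t] ** 2 * s_t)
    return s, e2

def garch22_form(s, e2, omega, rho, phi, alpha, beta, t):
    """Right-hand side of the implied GARCH(2,2) equation at time t >= 2."""
    return ((1 - alpha - beta) * omega
            + (phi + alpha) * e2[t - 1]
            - (alpha * rho + (alpha + beta) * phi) * e2[t - 2]
            + (rho - phi + beta) * s[t - 1]
            + (phi * (alpha + beta) - beta * rho) * s[t - 2])

rng = random.Random(0)                       # fixed seed for reproducibility
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]
params = dict(omega=0.02, rho=0.98, phi=0.05, alpha=0.08, beta=0.85)
s, e2 = component_variance(shocks, q0=1.0, s0=1.0, **params)
max_gap = max(abs(s[t] - garch22_form(s, e2, t=t, **params))
              for t in range(2, len(s)))
print(max_gap)   # numerically zero if the reduction is correct
```

The two representations agree up to floating-point rounding from $t = 2$ onwards, since the reduction uses two lags of the trend equation.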

Asymmetric Models

1.5

19

Asymmetric Models

1.5.1 The EGARCH(p,q) Model

The simple structure of (1.10) imposes important limitations on GARCH models.

• Stock returns are negatively correlated with changes in returns volatility, i.e. volatility tends to rise in response to "bad news" (excess returns lower than expected) and to fall in response to "good news" (excess returns higher than expected). GARCH models, however, assume that only the magnitude, and not the sign, of unanticipated excess returns determines $\sigma_t^2$. If the distribution of $z_t$ is symmetric, the change in variance tomorrow is conditionally uncorrelated with excess returns today (Nelson [24]). If we write $\sigma_t^2$ as a function of lagged $\sigma_t^2$ and lagged $z_t^2$, where $\varepsilon_t^2 = z_t^2 \sigma_t^2$,
\[ \sigma_t^2 = \omega + \sum_{j=1}^{q} \alpha_j z_{t-j}^2 \sigma_{t-j}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2, \]
it is evident that the conditional variance is invariant to changes in sign of the $z_t$'s. Moreover, the innovations $z_{t-j}^2 \sigma_{t-j}^2$ are not i.i.d.

• Another limitation of GARCH models results from the nonnegativity constraints on $\omega^*$ and $\phi_k$ in (1.12), which are imposed to ensure that $\sigma_t^2$ remains nonnegative for all $t$ with probability one. These constraints imply that increasing $z_t^2$ in any period increases $\sigma_{t+m}^2$ for all $m \geq 1$, ruling out random oscillatory behavior in the $\sigma_t^2$ process.

• The GARCH models are not able to explain the observed covariance between $\varepsilon_t^2$ and $\varepsilon_{t-j}$. This is possible only if the conditional variance is expressed as an asymmetric function of $\varepsilon_{t-j}$.

• In the GARCH(1,1) model, shocks may persist in one norm and die out in another, so the conditional moments of GARCH(1,1) may explode even when the process is strictly stationary and ergodic.

• GARCH models essentially specify the behavior of the square of the data. In this case a few large observations can dominate the sample.

The asymmetric models provide an explanation for the so-called leverage effect, i.e. an unexpected price drop increases volatility more than an analogous unexpected price increase. The EGARCH(p,q) model (Exponential GARCH(p,q)) put forward by Nelson [24] is the first model in which $\sigma_t^2$ depends on both the size and the sign of lagged residuals. This is the first example of an asymmetric model:
\[ \ln\left(\sigma_t^2\right) = \omega + \sum_{i=1}^{p} \beta_i \ln\left(\sigma_{t-i}^2\right) + \sum_{i=1}^{q} \alpha_i \left[\phi z_{t-i} + \psi\left(|z_{t-i}| - E|z_{t-i}|\right)\right] \qquad (1.27) \]

with $\alpha_1 \equiv 1$ and $E|z_t| = (2/\pi)^{1/2}$ given that $z_t \sim NID(0,1)$, where the parameters $\omega$, $\beta_i$, $\alpha_i$ are not restricted to be nonnegative. Let us define
\[ g(z_t) \equiv \phi z_t + \psi\left[|z_t| - E|z_t|\right]. \]
By construction $\{g(z_t)\}_{t=-\infty}^{\infty}$ is a zero-mean, i.i.d. random sequence. The components of $g(z_t)$ are $\phi z_t$ and $\psi\left[|z_t| - E|z_t|\right]$, each with mean zero. If the distribution of $z_t$ is symmetric, the components are orthogonal, though they are not independent. Over the range $0 < z_t < \infty$, $g(z_t)$ is linear in $z_t$ with slope $\phi + \psi$, and over the range $-\infty < z_t \leq 0$, $g(z_t)$ is linear with slope $\phi - \psi$. Thus, $g(z_t)$ allows the conditional variance process $\{\sigma_t^2\}$ to respond asymmetrically to rises and falls in stock price.

The term $\psi\left[|z_t| - E|z_t|\right]$ represents a magnitude effect. If $\psi > 0$ and $\phi = 0$, the innovation in $\ln(\sigma_{t+1}^2)$ is positive (negative) when the magnitude of $z_t$ is larger (smaller) than its expected value. If $\psi = 0$ and $\phi < 0$, the innovation in conditional variance is positive (negative) when returns innovations are negative (positive). A negative shock to the returns, which would increase the debt to equity ratio and therefore increase the uncertainty of future returns, could be accounted for when $\alpha_i > 0$ and $\phi < 0$.

In the EGARCH model $\ln(\sigma_{t+1}^2)$ is homoskedastic conditional on $\sigma_t^2$, and the partial correlation between $z_t$ and $\ln(\sigma_{t+1}^2)$ is constant conditional on $\sigma_t^2$. An alternative possible specification of the news impact curve is the following (Bollerslev, Engle, Nelson (1994)):
\[ g(z_t, \sigma_t^2) = \sigma_t^{-2\theta_0}\,\frac{\theta_1 z_t}{1 + \theta_2 |z_t|} + \sigma_t^{-2\gamma_0}\left[\frac{\gamma_1 |z_t|^{\rho}}{1 + \gamma_2 |z_t|^{\rho}} - E_t\left(\frac{\gamma_1 |z_t|^{\rho}}{1 + \gamma_2 |z_t|^{\rho}}\right)\right]. \]
The $\gamma_0$ and $\theta_0$ parameters allow both the conditional variance of $\ln(\sigma_{t+1}^2)$ and its conditional correlation with $z_t$ to vary with the level of $\sigma_t^2$. If $\theta_1 < 0$ then $Corr_t\left(\ln(\sigma_{t+1}^2), z_t\right) < 0$: leverage effect. The EGARCH model constrains $\theta_0 = \gamma_0 = 0$, so that the conditional correlation is constant, as is the conditional variance of $\ln(\sigma_t^2)$. The $\rho$, $\gamma_2$ and $\theta_2$ parameters give the model flexibility in how much weight to assign to the tail observations: e.g., with $\gamma_2 > 0$, $\theta_2 > 0$, the model downweights large $|z_t|$'s.

A number of authors, e.g., Nelson ([24]), have found that standardized residuals from estimated GARCH models are leptokurtic relative to the normal; see also Engle and Gonzalez-Rivera ([18]). Nelson [24] assumes that $z_t$ has a GED distribution (also called the exponential power family). The density of a GED random variable normalized to have a mean of zero and a variance of one is given by:
\[ f(z; \nu) = \frac{\nu \exp\left[-\frac{1}{2}|z/\lambda|^{\nu}\right]}{\lambda\, 2^{(1+1/\nu)}\, \Gamma(1/\nu)}, \qquad -\infty < z < \infty, \quad 0 < \nu \leq \infty \]

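where $\Gamma(\cdot)$ is the gamma function,
\[ \lambda \equiv \left[2^{(-2/\nu)}\, \Gamma(1/\nu) / \Gamma(3/\nu)\right]^{1/2} \]
and $\nu$ is a tail thickness parameter. When $\nu = 2$, $z$ has a standard normal distribution. For $\nu < 2$, the distribution of $z$ has thicker tails than the normal (e.g. when $\nu = 1$, $z$ has a double exponential distribution) and for $\nu > 2$, the distribution of $z$ has thinner tails than the normal (e.g., for $\nu = \infty$, $z$ is uniformly distributed on the interval $\left[-3^{1/2}, 3^{1/2}\right]$) (Hamilton, [21]). With this density, we obtain that
\[ E|z_t| = \frac{\lambda\, 2^{1/\nu}\, \Gamma(2/\nu)}{\Gamma(1/\nu)}. \]
More general than the GED is the Generalized t Distribution, which takes the form:
\[ f\left(\varepsilon_t \sigma_t^{-1}; \eta, \kappa\right) = \frac{\eta}{2 \sigma_t\, b\, \kappa^{1/\eta}\, B(1/\eta, \kappa)\left[1 + |\varepsilon_t|^{\eta} / \left(\kappa\, b^{\eta} \sigma_t^{\eta}\right)\right]^{\kappa + 1/\eta}} \]
where $B(1/\eta, \kappa) \equiv \Gamma(1/\eta)\, \Gamma(\kappa) / \Gamma(1/\eta + \kappa)$ denotes the beta function,
\[ b \equiv \left[\Gamma(\kappa)\, \Gamma(1/\eta) / \Gamma(3/\eta)\, \Gamma(\kappa - 2/\eta)\right]^{1/2} \]
and $\kappa\eta > 2$, $\eta > 0$ and $\kappa > 0$. The factor $b$ makes $Var\left(\varepsilon_t \sigma_t^{-1}\right) = 1$. The Generalized t nests both the Student's t distribution and the GED. The GED is obtained for $\kappa = \infty$. The GED has only one shape parameter $\nu$, which is apparently insufficient to fit both the central part and the tails of the conditional distribution.

Stationarity

In order to state the stationarity conditions simply, we write the EGARCH(p,q) model as:
\[ \left[1 - \sum_{i=1}^{p} \beta_i L^i\right] \ln\left(\sigma_t^2\right) = \omega + \sum_{i=1}^{q} \alpha_i L^i \left[\phi z_t + \psi\left(|z_t| - E|z_t|\right)\right] \]
\[ \ln\left(\sigma_t^2\right) = \left[1 - \sum_{i=1}^{p} \beta_i\right]^{-1} \omega + \left[1 - \sum_{i=1}^{p} \beta_i L^i\right]^{-1} \left[\sum_{i=1}^{q} \alpha_i L^i\right] g(z_t) \]
\[ \ln\left(\sigma_t^2\right) = \omega^* + \sum_{i=1}^{\infty} \varphi_i\, g(z_{t-i}). \]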
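The weights $\varphi_i$ of the infinite moving-average representation follow from expanding the ratio of the two lag polynomials. A minimal sketch, assuming the helper names `egarch_ma_weights` and `egarch_omega_star` (both hypothetical, introduced here only for illustration): matching coefficients of $L^i$ gives the recursion $\varphi_i = \alpha_i + \sum_{j} \beta_j \varphi_{i-j}$, with $\alpha_i = 0$ for $i > q$.

```python
def egarch_ma_weights(alpha, beta, n):
    """First n weights varphi_1..varphi_n of
    [1 - sum_i beta_i L^i]^{-1} [sum_i alpha_i L^i].

    alpha = [alpha_1, ..., alpha_q], beta = [beta_1, ..., beta_p].
    """
    weights = []
    for i in range(1, n + 1):
        w = alpha[i - 1] if i <= len(alpha) else 0.0
        # add beta_j * varphi_{i-j} for every lag j with i - j >= 1
        for j in range(1, min(len(beta), i - 1) + 1):
            w += beta[j - 1] * weights[i - j - 1]
        weights.append(w)
    return weights

def egarch_omega_star(omega, beta):
    """Constant term omega* = omega / (1 - sum_i beta_i)."""
    return omega / (1.0 - sum(beta))

# EGARCH(1,1) with alpha_1 = 1: the weights decay geometrically, varphi_i = beta_1^(i-1)
weights_11 = egarch_ma_weights([1.0], [0.9], 5)
omega_star_11 = egarch_omega_star(-0.1, [0.9])
```

For the EGARCH(1,1) case the squared weights are summable whenever $|\beta_1| < 1$, which is the root condition on $1 - \beta_1 x$.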

In the EGARCH(p,q) model $\ln(\sigma_t^2)$ is a linear process, and its stationarity (covariance or strict) and ergodicity are easily checked. Given $\phi \neq 0$ or $\psi \neq 0$,
\[ \left|\ln\left(\sigma_t^2\right) - \omega^*\right| < \infty \ \text{a.s. when} \ \sum_{i=1}^{\infty} \varphi_i^2 < \infty, \]
which follows from the independence and finite variance of the $g(z_t)$ and from Billingsley (1986, Theorem 22.6). From this we have that
\[ \left|\ln\left(\frac{\sigma_t^2}{\exp(\omega^*)}\right)\right| < \infty \ \text{a.s.}, \qquad \left|\frac{\sigma_t^2}{\exp(\omega^*)}\right| < \infty \ \text{a.s.} \]
The processes $\{\exp(-\omega^*)\sigma_t^2\}$ and $\{\exp(-\omega^*/2)\varepsilon_t\}$, where $\varepsilon_t = z_t \sigma_t$ and $z_t$ is i.i.d., are ergodic and strictly stationary. For all $t$, $E\left[\ln(\sigma_t^2) - \omega^*\right] = 0$ and the variance is
\[ Var\left[\ln(\sigma_t^2) - \omega^*\right] = Var\left(g(z_t)\right) \sum_{i=1}^{\infty} \varphi_i^2. \]
Since $Var(g(z_t))$ is finite and the distribution of $\left(\ln(\sigma_t^2) - \omega^*\right)$ is independent of $t$, the first two moments of $\left(\ln(\sigma_t^2) - \omega^*\right)$ are finite and time invariant, so $\left(\ln(\sigma_t^2) - \omega^*\right)$ is covariance stationary if $\sum_{i=1}^{\infty} \varphi_i^2 < \infty$. If $\sum_{i=1}^{\infty} \varphi_i^2 = \infty$, then $\left|\ln(\sigma_t^2) - \omega^*\right| = \infty$ almost surely.

Since $\ln(\sigma_t^2)$ is written in ARMA(p,q) form, when $\left[1 - \sum_{i=1}^{p} \beta_i x^i\right]$ and $\left[\sum_{i=1}^{q} \alpha_i x^i\right]$ have no common roots, the conditions for strict stationarity of $\ln(\sigma_t^2)$ are equivalent to all the roots of $\left[1 - \sum_{i=1}^{p} \beta_i x^i\right]$ lying outside the unit circle.

The strict stationarity of $\{\exp(-\omega^*)\sigma_t^2\}$ and $\{\exp(-\omega^*/2)\varepsilon_t\}$ need not imply covariance stationarity, since these processes may fail to have finite unconditional means and variances. For some distributions of $\{z_t\}$ (e.g., the Student t with finite degrees of freedom§), $\{\exp(-\omega^*)\sigma_t^2\}$ and $\{\exp(-\omega^*/2)\varepsilon_t\}$ typically have no finite unconditional moments. If the distribution of $z_t$ is GED and is thinner-tailed than the double exponential, and if $\sum_{i=1}^{\infty} \varphi_i^2 < \infty$, then $\{\exp(-\omega^*)\sigma_t^2\}$ and $\{\exp(-\omega^*/2)\varepsilon_t\}$ are not only strictly stationary and ergodic, but have finite moments of arbitrary order, which in turn implies that they are covariance stationary.

§ The Student t density, normalized to unit variance, is
\[ f[z; \nu] = \Gamma\left(\frac{\nu + 1}{2}\right) \Gamma\left(\frac{\nu}{2}\right)^{-1} \left[\pi(\nu - 2)\right]^{-1/2} \left[1 + z^2 (\nu - 2)^{-1}\right]^{-(\nu+1)/2}. \]
As $\nu$ (the degrees of freedom) goes to infinity the t distribution converges to the normal. When $4 < \nu < \infty$, the kurtosis coefficient is $k = 3(\nu - 2)/(\nu - 4) > 3$.

1.5.2 Other Asymmetric Models

There is a long tradition in finance that models stock return volatility as negatively correlated with stock returns. The explanation for this phenomenon is based on leverage. A drop in the value of the stock (negative return) increases financial leverage, which makes the stock riskier and increases its volatility. News has asymmetric effects on volatility: in the asymmetric volatility models, good news and bad news have different predictive content for future volatility.

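The Non linear ARCH model (Engle - Bollerslev [11]):
\[ \sigma_t^{\gamma} = \omega + \sum_{i=1}^{q} \alpha_i |\varepsilon_{t-i}|^{\gamma} + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^{\gamma} \]
\[ \sigma_t^{\gamma} = \omega + \sum_{i=1}^{q} \alpha_i |\varepsilon_{t-i} - k|^{\gamma} + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^{\gamma} \]
For $k \neq 0$, the innovations in $\sigma_t^{\gamma}$ will depend on the size as well as the sign of lagged residuals, thereby allowing for the leverage effect in stock return volatility.

The Glosten - Jagannathan - Runkle model [19]:
\[ \sigma_t^2 = \omega + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 + \sum_{i=1}^{q} \left(\alpha_i \varepsilon_{t-i}^2 + \gamma_i S_{t-i}^{-} \varepsilon_{t-i}^2\right) \]
where
\[ S_t^{-} = \begin{cases} 1 & \text{if } \varepsilon_t < 0 \\ 0 & \text{if } \varepsilon_t \geq 0 \end{cases} \]

The Asymmetric GARCH(p,q) model (Engle, [10]):
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \left(\varepsilon_{t-i} + \gamma\right)^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \]

The QGARCH by Sentana (Sentana, [28]):
\[ \sigma_t^2 = \sigma^2 + \Psi' x_{t-q} + x_{t-q}' A\, x_{t-q} + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \]
where $x_{t-q} = (\varepsilon_{t-1}, \ldots, \varepsilon_{t-q})'$. The linear term ($\Psi' x_{t-q}$) allows for asymmetry. The off-diagonal elements of $A$ account for interaction effects of lagged values of $x_t$ on the conditional variance. The QGARCH nests several asymmetric models. The augmented GARCH assumes $\Psi = 0$ (Bera and Lee, 1990). The ARCH(q) model corresponds to $\Psi = 0$, $\beta_i = 0$ and $A$ diagonal. The asymmetric GARCH model assumes $A$ to be diagonal. The linear standard deviation model (Robinson, 1991) corresponds to $\beta_i = 0$, $\sigma^2 = \rho^2$, $\Psi = 2\rho\phi$ and $A = \phi\phi'$:
\[ \sigma_t^2 = \left(\rho + \phi' x_{t-q}\right)^2. \]

The Conditional Standard Deviation Model (Taylor, [29]):
\[ \sigma_t = \omega + \sum_{i=1}^{q} \alpha_i |\varepsilon_{t-i}| + \sum_{i=1}^{p} \beta_i \sigma_{t-i} \]
Here the conditional standard deviation is a distributed lag of absolute residuals.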
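To see how the sign indicator in the Glosten - Jagannathan - Runkle specification works in practice, the sketch below filters a short path of residuals through a GJR(1,1) recursion. It is only an illustration under stated assumptions: the function name `gjr_filter`, the starting value and the parameter values are chosen here, not taken from the notes.

```python
def gjr_filter(eps, omega, alpha, gamma, beta, sigma2_0):
    """GJR(1,1) conditional variances implied by a path of residuals.

    sigma2[0] is the assumed starting value; sigma2[t+1] is driven by eps[t].
    """
    sigma2 = [sigma2_0]
    for e in eps:
        s_minus = 1.0 if e < 0 else 0.0  # indicator of a negative shock
        sigma2.append(omega + beta * sigma2[-1]
                      + (alpha + gamma * s_minus) * e * e)
    return sigma2

pars = dict(omega=0.05, alpha=0.05, gamma=0.10, beta=0.85, sigma2_0=1.0)  # illustrative
after_drop = gjr_filter([-2.0, 0.0, 0.0], **pars)  # one negative shock, then calm
after_rise = gjr_filter([2.0, 0.0, 0.0], **pars)   # same-sized positive shock
```

With `gamma > 0` the variance path after the negative shock lies above the path after the positive shock of equal size, and the gap decays at rate `beta` once the shocks are zero.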

1.6 The News Impact Curve

News has asymmetric effects on volatility: in the asymmetric volatility models, good news and bad news have different predictive content for future volatility. The news impact curve characterizes the impact of past return shocks on the return volatility which is implicit in a volatility model. Holding constant the information dated $t-2$ and earlier, we can examine the implied relation between $\epsilon_{t-1}$ and $\sigma_t^2$, with $\sigma_{t-i}^2 = \sigma^2$, $i = 1, \ldots, p$. With all lagged conditional variances evaluated at the level of the unconditional variance of the stock return, this curve is called the news impact curve because it relates past return shocks (news) to current volatility. It measures how new information is incorporated into volatility estimates.

For the GARCH model the News Impact Curve (NIC) is centered on $\epsilon_{t-1} = 0$. In the case of the EGARCH model the curve has its minimum at $\epsilon_{t-1} = 0$ and is exponentially increasing in both directions but with different parameters.

GARCH(1,1):
\[ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \]
The news impact curve has the following expression:
\[ \sigma_t^2 = A + \alpha \epsilon_{t-1}^2, \qquad A \equiv \omega + \beta \sigma^2. \]

EGARCH(1,1):
\[ \ln\left(\sigma_t^2\right) = \omega + \beta \ln\left(\sigma_{t-1}^2\right) + \phi z_{t-1} + \psi\left(|z_{t-1}| - E|z_{t-1}|\right) \]
where $z_t = \epsilon_t / \sigma_t$. The news impact curve is
\[ \sigma_t^2 = \begin{cases} A \exp\left[\dfrac{\phi + \psi}{\sigma}\, \epsilon_{t-1}\right] & \text{for } \epsilon_{t-1} > 0 \\[2ex] A \exp\left[\dfrac{\phi - \psi}{\sigma}\, \epsilon_{t-1}\right] & \text{for } \epsilon_{t-1} < 0 \end{cases} \]
\[ A \equiv \sigma^{2\beta} \exp\left[\omega - \psi \sqrt{2/\pi}\right], \qquad \phi < 0, \qquad \psi + \phi > 0. \]

• The EGARCH allows good news and bad news to have different impact on volatility, while the standard GARCH does not.

• The EGARCH model allows big news to have a greater impact on volatility than the GARCH model. EGARCH would have higher variances in both directions because the exponential curve eventually dominates the quadratic.
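The two news impact curves can be evaluated directly. The following is a sketch under assumptions made here: the function names and parameter values are illustrative, and $z_{t-1}$ is taken to be standard normal so that $E|z_{t-1}| = \sqrt{2/\pi}$. The lagged variance is held at the level `sigma2`.

```python
import math

def nic_garch(eps, omega, alpha, beta, sigma2):
    """GARCH(1,1) news impact curve: A + alpha * eps^2 with A = omega + beta * sigma2."""
    return omega + beta * sigma2 + alpha * eps ** 2

def nic_egarch(eps, omega, beta, phi, psi, sigma2):
    """EGARCH(1,1) news impact curve, assuming E|z| = sqrt(2/pi)."""
    sigma = math.sqrt(sigma2)
    A = sigma2 ** beta * math.exp(omega - psi * math.sqrt(2.0 / math.pi))
    slope = (phi + psi) if eps > 0 else (phi - psi)
    return A * math.exp(slope * eps / sigma)

egarch_pars = dict(omega=-0.1, beta=0.9, phi=-0.1, psi=0.2, sigma2=1.5)  # illustrative
bad_news = nic_egarch(-1.0, **egarch_pars)
good_news = nic_egarch(1.0, **egarch_pars)
```

With $\phi < 0$ and $\psi + \phi > 0$ the curve rises on both sides of zero, but faster for negative shocks, so `bad_news` exceeds `good_news`; the GARCH curve returns the same value for shocks of either sign.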

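The Asymmetric GARCH(1,1) (Engle, 1990):
\[ \sigma_t^2 = \omega + \alpha \left(\epsilon_{t-1} + \gamma\right)^2 + \beta \sigma_{t-1}^2 \]
The NIC is
\[ \sigma_t^2 = A + \alpha \left(\epsilon_{t-1} + \gamma\right)^2, \qquad A \equiv \omega + \beta \sigma^2, \]
\[ \omega > 0, \quad 0 \leq \beta < 1, \quad \sigma > 0, \quad 0 \leq \alpha < 1, \]
which is asymmetric and centered at $\epsilon_{t-1} = -\gamma$.

The Glosten-Jagannathan-Runkle model:
\[ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma S_{t-1}^{-} \epsilon_{t-1}^2, \qquad S_{t-1}^{-} = \begin{cases} 1 & \text{if } \epsilon_{t-1} < 0 \\ 0 & \text{otherwise} \end{cases} \]
The NIC is
\[ \sigma_t^2 = \begin{cases} A + \alpha \epsilon_{t-1}^2 & \text{if } \epsilon_{t-1} > 0 \\ A + (\alpha + \gamma)\, \epsilon_{t-1}^2 & \text{if } \epsilon_{t-1} < 0 \end{cases} \qquad A \equiv \omega + \beta \sigma^2, \]
\[ \omega > 0, \quad 0 \leq \beta < 1, \quad \sigma > 0, \quad 0 \leq \alpha < 1, \quad \alpha + \beta < 1. \]
Since both branches attain their minimum $A$ at zero, this curve is centered at $\epsilon_{t-1} = 0$; the asymmetry comes from the steeper slope on the negative side when $\gamma > 0$.

These differences between the news impact curves of the models have important implications for portfolio selection and asset pricing. Since predictable market volatility is related to the market premium, different models imply very different market risk premiums, and hence different risk premiums for individual stocks under the conditional version of the CAPM. Differences in predicted volatility after the arrival of some major news lead to a significant difference in the current option price and to different dynamic hedging strategies.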
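Where each curve is centered can be checked on a grid of shocks. This is a rough sketch with assumed function names, grid and parameter values: it evaluates the Asymmetric GARCH and GJR news impact curves and records the shock at which each is smallest.

```python
def nic_agarch(eps, omega, alpha, beta, gamma, sigma2):
    """Asymmetric GARCH(1,1) news impact curve."""
    return omega + beta * sigma2 + alpha * (eps + gamma) ** 2

def nic_gjr(eps, omega, alpha, beta, gamma, sigma2):
    """GJR(1,1) news impact curve; negative shocks get the extra weight gamma."""
    s_minus = 1.0 if eps < 0 else 0.0
    return omega + beta * sigma2 + (alpha + gamma * s_minus) * eps ** 2

grid = [i / 100 for i in range(-300, 301)]  # shocks from -3 to 3 in steps of 0.01
par = dict(omega=0.1, alpha=0.1, beta=0.8, gamma=0.5, sigma2=1.0)  # illustrative values
argmin_agarch = min(grid, key=lambda e: nic_agarch(e, **par))
argmin_gjr = min(grid, key=lambda e: nic_gjr(e, **par))
```

Under these assumed values the Asymmetric GARCH curve is a parabola shifted to $-\gamma$, while the GJR curve keeps its minimum at zero and is tilted rather than shifted.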

1.7 The GARCH-in-mean Model

The GARCH-in-mean (GARCH-M) model proposed by Engle, Lilien and Robins (1987) consists of the system:
\[ y_t = \mu_0 + \mu_1 x_t + \mu_2\, g\left(\sigma_t^2\right) + \epsilon_t \]
\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \]
\[ \epsilon_t \mid \Phi_{t-1} \sim N(0, \sigma_t^2) \]
where $y_t$ is a financial return. This model characterizes the evolution of the mean and the variance of a time series simultaneously. The process specifying the conditional variance is a GARCH(p,q) process.

Engle, Lilien and Robins ([13]) extend Engle's ARCH model to allow the conditional variance to be a determinant of the conditional mean of the process, i.e., of the expected risk premium. They consider an economy where risk averse economic agents choose between two kinds of financial investment in order to maximize their expected utility. The first possibility is represented by a risky asset with normally distributed returns, i.e., the risk is measured by the asset return variance and the compensation by a rise in the expected returns. The second investment choice is represented by a riskless asset. The maximization of the agents' utility function subject to the market clearing conditions leads to the traditional relation between the mean and the variance of the risky asset return. Engle, Lilien and Robins investigate this relation when the risky asset variance changes over time, so that the risky asset price changes as well. The above assumptions determine a relation between the mean and the variance of the asset return that is still positive but not constant. The GARCH-M model therefore makes it possible to analyze a time-varying risk premium.

When $y_t \equiv (r_t - r_f)$, where $(r_t - r_f)$ is the risk premium on holding the asset, the GARCH-M represents a simple way to model the relation between the risk premium and its conditional variance:
\[ y_t = \mu_0 + \mu_1 x_t + \mu_2\, g\left(\sigma_t^2\right) + \epsilon_t \]
\[ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \]
\[ \epsilon_t \mid \Phi_{t-1} \sim N(0, \sigma_t^2) \]
It turns out that:
\[ y_t \mid \Phi_{t-1} = (r_t - r_f) \mid \Phi_{t-1} \sim N\left(\mu_0 + \mu_1 x_t + \mu_2\, g\left(\sigma_t^2\right),\ \sigma_t^2\right). \]
In applications, $g(\sigma_t^2) = \sqrt{\sigma_t^2}$, $g(\sigma_t^2) = \ln(\sigma_t^2)$ and $g(\sigma_t^2) = \sigma_t^2$ have been used. Let $\phi = (\mu_0, \mu_1, \mu_2, \alpha_0, \alpha_1, \ldots, \alpha_q, \beta_1, \ldots, \beta_p)$ be the parameter vector. The procedure used to estimate $\phi$ is the maximization of the conditional log-likelihood function which, under the assumption of an i.i.d. distribution of the error process and up to an additive constant, becomes:
\[ L(\phi) = \sum_{t=1}^{T} L_t(\phi) = \sum_{t=1}^{T} \frac{1}{2}\left(-\log\left(\sigma_t^2\right) - \frac{\epsilon_t^2}{\sigma_t^2}\right). \]
Moreover, the consistency of the parameter estimates requires that the first two conditional moments are both correctly specified and simultaneously estimated. The GARCH-in-mean model can be used to estimate the conditional CAPM.
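A minimal sketch of the conditional log-likelihood above for a GARCH(1,1)-in-mean model without the regressor $x_t$. Everything named here is an assumption made for illustration: the function `garch_m_loglik`, its argument names, and in particular the choice to start the variance recursion at the unconditional level $\alpha_0/(1-\alpha_1-\beta_1)$, which the notes do not specify.

```python
import math

def garch_m_loglik(y, mu0, mu2, a0, a1, b1, g=lambda s2: s2):
    """Gaussian log-likelihood (constant omitted) of a GARCH(1,1)-in-mean model.

    Mean:     y_t = mu0 + mu2 * g(sigma2_t) + eps_t
    Variance: sigma2_t = a0 + a1 * eps_{t-1}^2 + b1 * sigma2_{t-1}
    Assumption: the recursion starts at the unconditional variance,
    so a1 + b1 < 1 is required.
    """
    sigma2 = a0 / (1.0 - a1 - b1)
    loglik = 0.0
    for y_t in y:
        eps = y_t - mu0 - mu2 * g(sigma2)          # residual given today's variance
        loglik += 0.5 * (-math.log(sigma2) - eps * eps / sigma2)
        sigma2 = a0 + a1 * eps * eps + b1 * sigma2  # variance for the next period
    return loglik

# Toy data: with a1 = b1 = 0 the variance is constant at a0 = 2
y_toy = [0.1 + 0.5 * 2.0 + 1.0, 0.1 + 0.5 * 2.0 - 1.0]
ll_true = garch_m_loglik(y_toy, mu0=0.1, mu2=0.5, a0=2.0, a1=0.0, b1=0.0)
ll_wrong = garch_m_loglik(y_toy, mu0=0.1, mu2=0.0, a0=2.0, a1=0.0, b1=0.0)
```

Passing `g=math.log` or `g=math.sqrt` switches to the other two functional forms mentioned in the text. In practice the function would be handed to a numerical optimizer over all parameters at once, in line with the requirement that mean and variance be estimated simultaneously.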

1.8 Long memory in stock returns

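The Asymmetric Power ARCH (Ding, Engle and Granger, 1993):
\[ r_t = \mu + \epsilon_t, \qquad \epsilon_t = \sigma_t z_t, \qquad z_t \sim N(0,1) \]
\[ \sigma_t^{\delta} = \omega + \sum_{i=1}^{q} \alpha_i \left(|\epsilon_{t-i}| - \gamma_i \epsilon_{t-i}\right)^{\delta} + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^{\delta} \]
where
\[ \omega > 0, \qquad \delta \geq 0, \qquad \alpha_i \geq 0 \ (i = 1, \ldots, q), \qquad -1 < \gamma_i < 1 \ (i = 1, \ldots, q), \qquad \beta_j \geq 0 \ (j = 1, \ldots, p). \]
This model imposes a Box-Cox transformation of the conditional standard deviation process and the asymmetric absolute residuals. The asymmetric response of volatility to positive and negative "shocks" is the well known leverage effect. If we assume the distribution of $r_t$ is conditionally normal, then the condition for existence of $E\left[\sigma_t^{\delta}\right]$ and $E|\epsilon_t|^{\delta}$ is
\[ \frac{1}{\sqrt{2\pi}} \sum_{i=1}^{q} \alpha_i \left\{(1 + \gamma_i)^{\delta} + (1 - \gamma_i)^{\delta}\right\} 2^{\frac{\delta - 1}{2}}\, \Gamma\left(\frac{\delta + 1}{2}\right) + \sum_{j=1}^{p} \beta_j < 1. \]
If this condition is satisfied, then when $\delta \geq 2$ we have $\epsilon_t$ covariance stationary. Note that $\delta \geq 2$ is only a sufficient condition for $\epsilon_t$ to be covariance stationary. This generalized version of the ARCH model includes seven other models as special cases.

1. ARCH(q) model: just let $\delta = 2$ and $\gamma_i = 0$, $i = 1, \ldots, q$, $\beta_j = 0$, $j = 1, \ldots, p$.

2. GARCH(p,q) model: just let $\delta = 2$ and $\gamma_i = 0$, $i = 1, \ldots, q$.

3. Taylor/Schwert's GARCH in standard deviation model: just let $\delta = 1$ and $\gamma_i = 0$, $i = 1, \ldots, q$.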

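4. GJR model: just let $\delta = 2$. When $\delta = 2$ and $0 \leq \gamma_i < 1$,
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \left(|\epsilon_{t-i}| - \gamma_i \epsilon_{t-i}\right)^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \]
\[ \phantom{\sigma_t^2} = \omega + \sum_{i=1}^{q} \alpha_i \left(|\epsilon_{t-i}|^2 + \gamma_i^2 \epsilon_{t-i}^2 - 2\gamma_i |\epsilon_{t-i}|\, \epsilon_{t-i}\right) + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \]
\[ \sigma_t^2 = \begin{cases} \omega + \sum_{i=1}^{q} \alpha_i (1 + \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 & \epsilon_{t-i} < 0 \\[1ex] \omega + \sum_{i=1}^{q} \alpha_i (1 - \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 & \epsilon_{t-i} > 0 \end{cases} \]
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i (1 - \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{i=1}^{q} \alpha_i \left\{(1 + \gamma_i)^2 - (1 - \gamma_i)^2\right\} S_i^{-} \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \]
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i (1 - \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 + \sum_{i=1}^{q} 4 \alpha_i \gamma_i S_i^{-} \epsilon_{t-i}^2 \]
\[ S_i^{-} = \begin{cases} 1 & \text{if } \epsilon_{t-i} < 0 \\ 0 & \text{otherwise} \end{cases} \]
If we define
\[ \alpha_i^* = \alpha_i (1 - \gamma_i)^2, \qquad \gamma_i^* = 4 \alpha_i \gamma_i \]
then we have
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i^* \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 + \sum_{i=1}^{q} \gamma_i^* S_i^{-} \epsilon_{t-i}^2 \]
which is the GJR model.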

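When $-1 < \gamma_i < 0$ we have
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \left(|\epsilon_{t-i}| - \gamma_i \epsilon_{t-i}\right)^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \]
\[ \phantom{\sigma_t^2} = \omega + \sum_{i=1}^{q} \alpha_i \left(|\epsilon_{t-i}|^2 + \gamma_i^2 \epsilon_{t-i}^2 - 2\gamma_i |\epsilon_{t-i}|\, \epsilon_{t-i}\right) + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 \]
\[ \phantom{\sigma_t^2} = \begin{cases} \omega + \sum_{i=1}^{q} \alpha_i (1 - \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 & \epsilon_{t-i} > 0 \\[1ex] \omega + \sum_{i=1}^{q} \alpha_i (1 + \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 & \epsilon_{t-i} < 0 \end{cases} \]
\[ \phantom{\sigma_t^2} = \omega + \sum_{i=1}^{q} \alpha_i (1 + \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 + \sum_{i=1}^{q} \alpha_i \left\{(1 - \gamma_i)^2 - (1 + \gamma_i)^2\right\} S_i^{+} \epsilon_{t-i}^2 \]
\[ \phantom{\sigma_t^2} = \omega + \sum_{i=1}^{q} \alpha_i (1 + \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 + \sum_{i=1}^{q} \alpha_i \left\{1 + \gamma_i^2 - 2\gamma_i - 1 - \gamma_i^2 - 2\gamma_i\right\} S_i^{+} \epsilon_{t-i}^2 \]
\[ \phantom{\sigma_t^2} = \omega + \sum_{i=1}^{q} \alpha_i (1 + \gamma_i)^2 \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 - \sum_{i=1}^{q} 4 \alpha_i \gamma_i S_i^{+} \epsilon_{t-i}^2 \]
\[ S_i^{+} = \begin{cases} 1 & \text{if } \epsilon_{t-i} > 0 \\ 0 & \text{otherwise} \end{cases} \]
Defining
\[ \alpha_i^* = \alpha_i (1 + \gamma_i)^2, \qquad \gamma_i^* = -4 \alpha_i \gamma_i \]
we have
\[ \sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i^* \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 + \sum_{i=1}^{q} \gamma_i^* S_i^{+} \epsilon_{t-i}^2 \]
which allows positive shocks to have a stronger effect on volatility.

5. Zakoian's TARCH model (Zakoian (1991)): let $\delta = 1$ and $\beta_j = 0$, $j = 1, \ldots, p$. We have
\[ \sigma_t = \omega + \sum_{i=1}^{q} \alpha_i \left(|\epsilon_{t-i}| - \gamma_i \epsilon_{t-i}\right) = \omega + \sum_{i=1}^{q} \alpha_i (1 - \gamma_i)\, \epsilon_{t-i}^{+} - \sum_{i=1}^{q} \alpha_i (1 + \gamma_i)\, \epsilon_{t-i}^{-} \]
where
\[ \epsilon_{t-i}^{+} = \begin{cases} \epsilon_{t-i} & \text{if } \epsilon_{t-i} > 0 \\ 0 & \text{otherwise} \end{cases} \]
and $\epsilon_{t-i}^{-} = \epsilon_{t-i} - \epsilon_{t-i}^{+}$. Defining
\[ \alpha_i^{+} = \alpha_i (1 - \gamma_i), \qquad \alpha_i^{-} = \alpha_i (1 + \gamma_i) \]
we obtain
\[ \sigma_t = \omega + \sum_{i=1}^{q} \alpha_i^{+} \epsilon_{t-i}^{+} - \sum_{i=1}^{q} \alpha_i^{-} \epsilon_{t-i}^{-}. \]
If we let $\beta_j \neq 0$, $j = 1, \ldots, p$, then we get a more general class of TARCH models.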
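The nesting results in items 4 and 5 can be verified numerically for one lag. The sketch below is illustrative only: the function names and parameter values are assumptions made here. It computes one step of the Asymmetric Power ARCH recursion and compares it with the GJR form (for $\delta = 2$, $0 \leq \gamma < 1$) and the TARCH form (for $\delta = 1$, $\beta = 0$).

```python
def aparch_step(eps_prev, sigma_prev, omega, alpha, gamma, beta, delta):
    """One step of APARCH(1,1): returns sigma_t given eps_{t-1} and sigma_{t-1}."""
    s_delta = (omega
               + alpha * (abs(eps_prev) - gamma * eps_prev) ** delta
               + beta * sigma_prev ** delta)
    return s_delta ** (1.0 / delta)

def gjr_step(eps_prev, sigma2_prev, omega, alpha_star, gamma_star, beta):
    """One step of GJR(1,1): returns sigma_t^2."""
    s_minus = 1.0 if eps_prev < 0 else 0.0
    return (omega + alpha_star * eps_prev ** 2
            + gamma_star * s_minus * eps_prev ** 2 + beta * sigma2_prev)

def tarch_step(eps_prev, omega, alpha_plus, alpha_minus):
    """One step of TARCH(1): returns sigma_t from the positive and negative parts."""
    e_plus = eps_prev if eps_prev > 0 else 0.0
    e_minus = eps_prev - e_plus
    return omega + alpha_plus * e_plus - alpha_minus * e_minus

omega, alpha, gamma, beta = 0.05, 0.10, 0.30, 0.80  # illustrative values
gjr_gaps, tarch_gaps = [], []
for eps in (-1.3, 0.7):
    ap2 = aparch_step(eps, 1.2, omega, alpha, gamma, beta, delta=2.0) ** 2
    gj = gjr_step(eps, 1.2 ** 2, omega, alpha * (1 - gamma) ** 2, 4 * alpha * gamma, beta)
    gjr_gaps.append(abs(ap2 - gj))
    ap1 = aparch_step(eps, 1.2, omega, alpha, gamma, 0.0, delta=1.0)
    ta = tarch_step(eps, omega, alpha * (1 - gamma), alpha * (1 + gamma))
    tarch_gaps.append(abs(ap1 - ta))
```

Both lists of gaps should contain only rounding-level numbers, for a negative and for a positive lagged shock.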

Chapter 2 ESTIMATION PROCEDURES

The procedure most often used in estimating $\theta_0$ in ARCH models involves the maximization of a likelihood function constructed under the auxiliary assumption of an i.i.d. distribution for the standardized innovation $z_t(\theta)$. Let $f(z_t(\theta); \eta)$ denote the density function for $z_t(\theta) \equiv \varepsilon_t(\theta)/\sigma_t(\theta)$, with mean zero and variance one, where $\eta$ is the nuisance parameter, $\eta \in H \subseteq R^k$. Let $(y_T, y_{T-1}, \ldots, y_1)$ be a sample realization from an ARCH model as defined by equations (1.1) through (1.5), and $\psi' \equiv (\theta', \eta')$ the combined $(m + k) \times 1$ parameter vector to be estimated for the conditional mean, variance and density functions. The log-likelihood function for the t-th observation is then given by
\[ l_t(y_t; \psi) = \ln\left\{f\left[z_t(\theta); \eta\right]\right\} - \frac{1}{2} \ln\left[\sigma_t^2(\theta)\right], \qquad t = 1, 2, \ldots \qquad (2.1) \]
The term $-\frac{1}{2} \ln\left[\sigma_t^2(\theta)\right]$ on the right hand side is the Jacobian that arises in the transformation from the standardized innovations, $z_t(\theta)$, to the observables $y_t$: $f(y_t; \psi) = f(z_t(\theta); \eta)\, |J|$, where $J = \partial z_t / \partial y_t = 1/\sigma_t(\theta)$. The log-likelihood function for the full sample equals the sum of the conditional log-likelihoods in eq. (2.1):
\[ L_T(y_T, y_{T-1}, \ldots, y_1; \psi) = \sum_{t=1}^{T} l_t(y_t; \psi). \qquad (2.2) \]

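The maximum likelihood estimator for the true parameters $\psi_0' \equiv (\theta_0', \eta_0')$, say $\hat{\psi}_T$, is found by the maximization of eq. (2.2). Assuming the conditional density and the $\mu_t(\theta)$ and $\sigma_t^2(\theta)$ functions to be differentiable for all $\psi \in \Theta \times H \equiv \Psi$, the maximum likelihood estimator is the solution to
\[ S_T(y_T, y_{T-1}, \ldots, y_1; \psi) \equiv \sum_{t=1}^{T} s_t(y_t; \psi) = 0 \qquad (2.3) \]
where $s_t \equiv \partial l_t(y_t, \psi)/\partial \psi$ is the score vector for the t-th observation. In particular, for the conditional mean and variance parameters
\[ \frac{\partial l_t(y_t, \psi)}{\partial \theta} = f\left[z_t(\theta); \eta\right]^{-1} f'\left[z_t(\theta); \eta\right] \frac{\partial z_t(\theta)}{\partial \theta} - \frac{1}{2}\left[\sigma_t^2(\theta)\right]^{-1} \frac{\partial \sigma_t^2}{\partial \theta} \qquad (2.4) \]
where $f'\left[z_t(\theta); \eta\right] \equiv \partial f(z_t(\theta); \eta)/\partial z_t$ and
\[ \frac{\partial z_t(\theta)}{\partial \theta} = \frac{\partial}{\partial \theta}\left(\frac{\varepsilon_t(\theta)}{\sqrt{\sigma_t^2}}\right) = \frac{\dfrac{\partial \varepsilon_t(\theta)}{\partial \theta} \sqrt{\sigma_t^2} - \dfrac{1}{2}\left(\sigma_t^2\right)^{-1/2} \dfrac{\partial \sigma_t^2}{\partial \theta}\, \varepsilon_t(\theta)}{\sigma_t^2} \]
\[ \phantom{\frac{\partial z_t(\theta)}{\partial \theta}} = -\frac{\partial \mu_t}{\partial \theta}\left(\sigma_t^2(\theta)\right)^{-1/2} - \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-3/2} \frac{\partial \sigma_t^2}{\partial \theta}\, \varepsilon_t(\theta) \]
where $\varepsilon_t(\theta) \equiv y_t - \mu_t(\theta)$.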

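In practice the solution to the set of $m + k$ non-linear equations in (2.3) is found by numerical optimization techniques. In order to implement the maximum likelihood procedure, an explicit assumption regarding the conditional density in eq. (2.1) is needed. The most commonly employed distribution in the literature is the normal:
\[ f\left[z_t(\theta); \eta\right] = (2\pi)^{-1/2} \exp\left\{-\frac{z_t(\theta)^2}{2}\right\}. \]
Since the normal distribution is uniquely determined by its first two moments, only the conditional mean and variance parameters enter the log-likelihood function in equation (2.2); i.e. $\psi = \theta$. The log-likelihood is:
\[ l_t = -\frac{1}{2} \ln(2\pi) - \frac{1}{2} z_t(\theta)^2 - \frac{1}{2} \ln\left(\sigma_t^2\right). \]
It follows that the score vector in eq. (2.4) takes the form:
\[ s_t = -z_t \frac{\partial z_t}{\partial \theta} - \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \]
\[ \phantom{s_t} = -\frac{\varepsilon_t(\theta)}{\sqrt{\sigma_t^2}}\left[-\frac{\partial \mu_t(\theta)}{\partial \theta}\left(\sigma_t^2\right)^{-1/2} - \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-3/2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta}\, \varepsilon_t(\theta)\right] - \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \]
\[ \phantom{s_t} = \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\varepsilon_t(\theta)}{\sigma_t^2(\theta)} + \frac{1}{2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\varepsilon_t^2(\theta)}{\left(\sigma_t^2(\theta)\right)^2} - \frac{1}{2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta}\left(\sigma_t^2(\theta)\right)^{-1} \]
\[ s_t = \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\varepsilon_t(\theta)}{\sigma_t^2(\theta)} + \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \sigma_t^2(\theta)}{\partial \theta}\left[\frac{\varepsilon_t^2(\theta)}{\sigma_t^2(\theta)} - 1\right] \qquad (2.5) \]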
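The closed-form score (2.5) can be compared with a numerical derivative of the Gaussian log-likelihood. The sketch below uses a deliberately small model chosen here for illustration, not taken from the notes: a constant mean $\mu$ and an ARCH(1) variance $\sigma_t^2 = \omega + \alpha (y_{t-1} - \mu)^2$, so that $\theta = (\mu, \omega, \alpha)$. The function names and the evaluation point are assumptions.

```python
import math

def loglik_t(theta, y_t, y_prev):
    """Gaussian log-likelihood of one observation (constant omitted)."""
    mu, omega, alpha = theta
    s2 = omega + alpha * (y_prev - mu) ** 2
    e = y_t - mu
    return -0.5 * math.log(s2) - 0.5 * e * e / s2

def score_t(theta, y_t, y_prev):
    """Analytic score from (2.5) for theta = (mu, omega, alpha)."""
    mu, omega, alpha = theta
    e_prev = y_prev - mu
    s2 = omega + alpha * e_prev ** 2
    e = y_t - mu
    dmu = [1.0, 0.0, 0.0]                           # d mu_t / d theta
    ds2 = [-2.0 * alpha * e_prev, 1.0, e_prev ** 2]  # d sigma_t^2 / d theta
    return [dmu[k] * e / s2 + 0.5 * ds2[k] / s2 * (e * e / s2 - 1.0)
            for k in range(3)]

def numeric_score(theta, y_t, y_prev, h=1e-6):
    """Central finite-difference approximation of the same score."""
    grad = []
    for k in range(3):
        up = list(theta); up[k] += h
        dn = list(theta); dn[k] -= h
        grad.append((loglik_t(up, y_t, y_prev) - loglik_t(dn, y_t, y_prev)) / (2 * h))
    return grad

theta = (0.1, 0.5, 0.3)  # illustrative evaluation point
analytic = score_t(theta, y_t=1.4, y_prev=-0.8)
numeric = numeric_score(theta, y_t=1.4, y_prev=-0.8)
score_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
```

Note that the mean parameter enters the variance through the lagged residual, so the first element of `ds2` is not zero; the formula (2.5) handles this through the term in $\partial \sigma_t^2 / \partial \theta$.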

Several other conditional distributions have been employed in the literature to capture the degree of tail fatness in speculative prices. We have seen above the Student's t, the GED and the Generalized t.

When $\theta = (\alpha', \beta')'$, where $\alpha$ are the conditional mean parameters and $\beta$ are the conditional variance parameters, the score takes the form:
\[ s_t = \begin{pmatrix} \partial l_t / \partial \alpha \\ \partial l_t / \partial \beta \end{pmatrix} \]
where
\[ \frac{\partial l_t}{\partial \alpha} = \frac{\partial \mu_t(\alpha)}{\partial \alpha} \frac{\varepsilon_t(\alpha)}{\sigma_t^2(\beta)}, \qquad \frac{\partial l_t}{\partial \beta} = \frac{1}{2}\left(\sigma_t^2(\beta)\right)^{-1} \frac{\partial \sigma_t^2(\beta)}{\partial \beta}\left[\frac{\varepsilon_t^2(\alpha)}{\sigma_t^2(\beta)} - 1\right]. \]

Weiss (1986) provided the first study of the asymptotic properties of the ARCH MLE. He showed that the MLE is consistent and asymptotically normal, requiring that the normalized data have finite fourth moments. This rules out IGARCH models. Bollerslev and Wooldridge derive the large sample distribution of the QMLE under high-level assumptions: asymptotic normality of the score vector and uniform weak convergence of the likelihood and its second derivatives. They do not verify these conditions or show how they might be verified for GARCH models. Lumsdaine (1996) imposed assumptions upon the rescaled variable, $\varepsilon_t/\sigma_t$, rather than upon the observed data. As auxiliary assumptions, Lumsdaine assumed that the rescaled variable is independent and identically distributed (i.i.d.) and drawn from a symmetric unimodal density with finite 32nd moment.

Lee and Hansen (1994) extended this literature to encompass a much broader class of GARCH processes. They focus on QMLE properties. They assume that the conditional mean and variance equations have been specified correctly and that a likelihood is used as a vehicle to estimate the parameters. Lee and Hansen stress that there is no reason to assume that all of the conditional dependence is contained in the conditional mean and variance, so the rescaled variable $\varepsilon_t/\sigma_t$ need not be independent over time. They allow for some time dependency, specifying that the rescaled variable is strictly stationary and ergodic. For the IGARCH case they are only able to prove the existence of a consistent root of the likelihood. For this result the conditional $2 + \delta$ moment of the rescaled variable must be uniformly bounded. Asymptotic normality is proved (including the IGARCH case) by adding the assumption that the conditional fourth moment of the rescaled variable is uniformly bounded.

2.1 Quasi-Maximum Likelihood Estimation

2.1.1 Kullback Information Criterion

In order to study the properties of a model given a set of observations, one can view a model as a good or bad approximation to the "true" but unknown distribution $P_0$ of the observations. In the first case, one assumes that the distribution $P_0$ generating the observations belongs to the family of distributions associated with the model, i.e., one assumes that $P_0 \in \mathcal{P}$. When the family is parametric, so that $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$, the distribution $P_0$ can be defined through a value $\theta_0$ of the parameter and one has $P_0 = P_{\theta_0}$. This value is called the true value of the parameter. The distribution $P_0$ uniquely defines $\theta_0$ if the mapping $\theta \mapsto P_\theta$ is bijective, i.e., the model is identified.

When one believes a priori that the true distribution $P_0$ does not belong to $\mathcal{P}$, one says that there may be specification errors. Then it is interesting to find the element $P_0^*$ in $\mathcal{P}$ that is closest to $P_0$ in order to assess the type of specification errors by comparing $P_0$ to $P_0^*$. To do this, one must have a measure of the proximity or discrepancy between probability distributions. This is provided by the Kullback Information Criterion.

Definition 7 (Kullback Information Criterion). Given two distributions $P = (f(y) \cdot \mu)$ and $P^* = (f^*(y) \cdot \mu)$, the quantity
\[ I(P/P^*) = E^*\left(\log \frac{f^*(Y)}{f(Y)}\right) = \int_{\mathcal{Y}} \log\left(\frac{f^*(y)}{f(y)}\right) f^*(y)\, \mu(dy), \]
where $\mu$ is the common dominating measure, is called the Kullback Information Criterion.

2.1.2 Quasi-Maximum Likelihood Estimation Theory

To study the relationship between an endogenous variable $y$ and some exogenous variables $x$, one considers a conditional model specifying the form of the conditional distribution of $y_1, \ldots, y_T$ given $x_1, \ldots, x_T$. It is assumed that the model is parametrized by $\theta \in \Theta$, which is an open subset of $R^p$. The densities can be written as
\[ L(y_1, \ldots, y_T \mid x_1, \ldots, x_T; \theta) = \prod_{i=1}^{T} f(y_i \mid x_i; \theta), \qquad \theta \in \Theta. \]
Thus the model implies the mutual independence of the variables $Y_1, \ldots, Y_T$ conditionally on $X_1, \ldots, X_T$ and the equality of the conditional densities $f(y_i \mid x_i; \theta)$ across observations. We consider the case where the model is misspecified. The true distribution of the observations is given by the density
\[ L_0(y_1, \ldots, y_T \mid x_1, \ldots, x_T) = \prod_{i=1}^{T} f_0(y_i \mid x_i) \]
where $f_0(y_i \mid x_i)$ does not belong to the specified density family, $f_0(y_i \mid x_i) \notin \{f(y \mid x; \theta), \theta \in \Theta\}$. It is possible to evaluate the discrepancy between the true density $f_0$ and the model $\{f(y \mid x; \theta), \theta \in \Theta\}$ by the Kullback Information Criterion. This leads naturally to the concept of the quasi true value $\theta_0^*$ of the parameter $\theta$, which corresponds to the distribution in the model that is closest to $f_0$. This quasi true value is a solution to
\[ \max_{\theta \in \Theta} E_X E_0 \log f(Y \mid X; \theta) \]
where $E_0$ denotes the conditional expectation of $Y$ given $X$ under $f_0$. We assume $\theta_0^*$ is unique.

Definition 8. A quasi (or pseudo) maximum likelihood (QML) estimator $\hat{\theta}_T$ of $\theta$ is a solution to
\[ \max_{\theta \in \Theta} \sum_{i=1}^{T} \log f(y_i \mid x_i; \theta). \]
Thus $\hat{\theta}_T$ is a maximum likelihood estimator based on a misspecified model. Under regularity conditions, the QML estimator converges almost surely to the pseudo true value $\theta_0^*$.

Proposition 9. Under regularity conditions, the QML estimator is asymptotically normally distributed with
\[ \sqrt{T}\left(\hat{\theta}_T - \theta_0^*\right) \xrightarrow{d} N\left(0, A^{-1} B A^{-1}\right). \]
The matrices $A$ and $B$ are, respectively, equal to:
\[ A = -\frac{1}{T} E_0\left[\frac{\partial^2 \log L(\theta)}{\partial \theta\, \partial \theta'}\right], \qquad B = \frac{1}{T} E_0\left[\frac{\partial \log L(\theta)}{\partial \theta} \frac{\partial \log L(\theta)}{\partial \theta'}\right]. \]
The matrices $A$ and $B$ are not, in general, equal when specification errors are present. Thus comparing estimates of the matrices $A$ and $B$ can be useful for detecting specification errors. In the case of univariate GARCH models, and when the parameter vector $\theta$ is decomposable as $\theta = (\alpha', \beta')'$, where $\alpha$ are the conditional mean parameters and $\beta$ are the conditional variance parameters, we can show that $A = B$ only under special circumstances.
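A one-parameter example makes the sandwich form $A^{-1} B A^{-1}$ concrete. The sketch below is an illustration built here, not the notes' own example: zero-mean data are fitted with a Gaussian quasi-likelihood whose only parameter is the variance $s$, so that $l_t = -\frac{1}{2}\log s - \frac{1}{2} y_t^2 / s$. The function name and the sample are assumptions.

```python
def qml_variance_sandwich(y):
    """QML estimate of a constant variance with two asymptotic variances.

    Returns (s_hat, naive, robust), where naive = A^{-1} and
    robust = A^{-1} B A^{-1}, both for sqrt(T) * (s_hat - s).
    """
    T = len(y)
    s_hat = sum(v * v for v in y) / T                       # solves the score equation
    scores = [0.5 / s_hat * (v * v / s_hat - 1.0) for v in y]
    hessians = [0.5 / s_hat ** 2 - v * v / s_hat ** 3 for v in y]
    A = -sum(hessians) / T                                   # minus the average Hessian
    B = sum(g * g for g in scores) / T                       # average outer product of scores
    return s_hat, 1.0 / A, B / (A * A)

s_hat, naive_var, robust_var = qml_variance_sandwich([1.0, -1.0, 2.0, -2.0])
```

In this example the robust variance works out to $\hat{s}^2 (\hat{K} - 1)$, with $\hat{K}$ the sample kurtosis, while the naive one is $2\hat{s}^2$. The two agree only when $\hat{K} = 3$, the Gaussian value, which is the sense in which comparing $A$ and $B$ can reveal misspecification.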

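The second derivatives matrix of the t-th log-likelihood function is equal to
\[ \frac{\partial^2 l_t}{\partial \theta\, \partial \theta'} = \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'} - \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial^2 \sigma_t^2(\theta)}{\partial \theta\, \partial \theta'} - \frac{\varepsilon_t^2(\theta)}{\left(\sigma_t^2(\theta)\right)^3} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'} \]
\[ \qquad + \frac{1}{2} \frac{\varepsilon_t^2(\theta)}{\left(\sigma_t^2(\theta)\right)^2} \frac{\partial^2 \sigma_t^2(\theta)}{\partial \theta\, \partial \theta'} - \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'} \frac{\varepsilon_t(\theta)}{\left(\sigma_t^2(\theta)\right)^2} + \frac{\varepsilon_t(\theta)}{\sigma_t^2(\theta)} \frac{\partial^2 \mu_t(\theta)}{\partial \theta\, \partial \theta'} \]
\[ \qquad - \left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'} - \frac{\varepsilon_t(\theta)}{\left(\sigma_t^2(\theta)\right)^2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'}. \]
Given that
\[ E_{t-1}\left[\frac{\varepsilon_t(\theta)}{\left(\sigma_t^2(\theta)\right)^{1/2}}\right] = 0 \quad \text{and} \quad E_{t-1}\left[\frac{\varepsilon_t^2(\theta)}{\sigma_t^2(\theta)}\right] = 1, \]
we have that
\[ A_t = -E_0\left[\frac{\partial^2 l_t}{\partial \theta\, \partial \theta'}\right] = E_0\left[-\frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'} + \frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial^2 \sigma_t^2(\theta)}{\partial \theta\, \partial \theta'}\right] \]
\[ \qquad + E_0\left[\frac{1}{\left(\sigma_t^2(\theta)\right)^2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'}\, E_{t-1}\left[\frac{\varepsilon_t^2(\theta)}{\sigma_t^2(\theta)}\right]\right] + E_0\left[-\frac{1}{2} \frac{1}{\sigma_t^2(\theta)} \frac{\partial^2 \sigma_t^2(\theta)}{\partial \theta\, \partial \theta'}\, E_{t-1}\left[\frac{\varepsilon_t^2(\theta)}{\sigma_t^2(\theta)}\right]\right] \]
\[ \qquad + E_0\left[\frac{1}{\left(\sigma_t^2(\theta)\right)^{3/2}} \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'}\, E_{t-1}\left[\frac{\varepsilon_t(\theta)}{\left(\sigma_t^2(\theta)\right)^{1/2}}\right]\right] \]
\[ \qquad + E_0\left[-\frac{1}{\left(\sigma_t^2(\theta)\right)^{1/2}} \frac{\partial^2 \mu_t(\theta)}{\partial \theta\, \partial \theta'}\, E_{t-1}\left[\frac{\varepsilon_t(\theta)}{\left(\sigma_t^2(\theta)\right)^{1/2}}\right] + \left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'}\right] \]
\[ \qquad + E_0\left[\frac{1}{\left(\sigma_t^2(\theta)\right)^{3/2}} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'}\, E_{t-1}\left[\frac{\varepsilon_t(\theta)}{\left(\sigma_t^2(\theta)\right)^{1/2}}\right]\right] \]
so that
\[ A_t = E_0\left[\frac{1}{2}\left(\sigma_t^2(\theta)\right)^{-2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'} + \left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'}\right]. \]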

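The information matrix is
\[ B_t = E_0\left[\frac{\partial l_t}{\partial \theta} \frac{\partial l_t}{\partial \theta'}\right] = E_0\left[\left(\frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\varepsilon_t(\theta)}{\sigma_t^2(\theta)} + \frac{1}{2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\varepsilon_t^2(\theta)}{\left(\sigma_t^2(\theta)\right)^2} - \frac{1}{2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta}\left(\sigma_t^2(\theta)\right)^{-1}\right) \times \left(\frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\varepsilon_t(\theta)}{\sigma_t^2(\theta)} + \frac{1}{2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\varepsilon_t^2(\theta)}{\left(\sigma_t^2(\theta)\right)^2} - \frac{1}{2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta}\left(\sigma_t^2(\theta)\right)^{-1}\right)'\right]. \]
Taking conditional expectations term by term gives
\[ B_t = E_0\left[\frac{1}{4} \frac{1}{\left(\sigma_t^2(\theta)\right)^2} \frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'}\left(K_t(\theta) - 1\right) + \left(\sigma_t^2(\theta)\right)^{-1} \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'}\right] \]
\[ \qquad + \frac{1}{2} E_0\left[\frac{1}{\left(\sigma_t^2(\theta)\right)^3}\left(\frac{\partial \sigma_t^2(\theta)}{\partial \theta} \frac{\partial \mu_t(\theta)}{\partial \theta'} + \frac{\partial \mu_t(\theta)}{\partial \theta} \frac{\partial \sigma_t^2(\theta)}{\partial \theta'}\right) M_{3t}(\theta)\right] \]
where
\[ M_{3t}(\theta) = E_{t-1}\left[\varepsilon_t^3(\theta)\right] \quad \text{and} \quad K_t(\theta) = \frac{1}{\left(\sigma_t^2(\theta)\right)^2} E_{t-1}\left[\varepsilon_t^4(\theta)\right]. \]
Whenever it is possible to decompose the parameter vector as $\theta = (\alpha', \beta')'$, with $\mu_t$ depending only on $\alpha$ and $\sigma_t^2$ only on $\beta$, the Hessian matrix for the t-th observation is:
\[ A_t = \begin{bmatrix} E\left[\left(\sigma_t^2(\beta)\right)^{-1} \dfrac{\partial \mu_t(\alpha)}{\partial \alpha} \dfrac{\partial \mu_t(\alpha)}{\partial \alpha'}\right] & 0 \\[2ex] 0 & E\left[\dfrac{1}{2}\left(\sigma_t^2(\beta)\right)^{-2} \dfrac{\partial \sigma_t^2(\beta)}{\partial \beta} \dfrac{\partial \sigma_t^2(\beta)}{\partial \beta'}\right] \end{bmatrix} \]
\[ B_t = \begin{bmatrix} E\left[\left(\sigma_t^2(\beta)\right)^{-1} \dfrac{\partial \mu_t(\alpha)}{\partial \alpha} \dfrac{\partial \mu_t(\alpha)}{\partial \alpha'}\right] & E\left[\dfrac{1}{2\left(\sigma_t^2(\beta)\right)^3} \dfrac{\partial \mu_t(\alpha)}{\partial \alpha} \dfrac{\partial \sigma_t^2(\beta)}{\partial \beta'}\, M_{3t}(\theta)\right] \\[2ex] E\left[\dfrac{1}{2\left(\sigma_t^2(\beta)\right)^3} \dfrac{\partial \sigma_t^2(\beta)}{\partial \beta} \dfrac{\partial \mu_t(\alpha)}{\partial \alpha'}\, M_{3t}(\theta)\right] & E\left[\dfrac{1}{4\left(\sigma_t^2(\beta)\right)^2} \dfrac{\partial \sigma_t^2(\beta)}{\partial \beta} \dfrac{\partial \sigma_t^2(\beta)}{\partial \beta'}\left(K_t(\theta) - 1\right)\right] \end{bmatrix} \]
The asymptotic variance-covariance matrices of the QML estimators $\hat{\alpha}_T$ and $\hat{\beta}_T$ are
\[ Var_{asy}\left[\sqrt{T}\left(\hat{\alpha}_T - \alpha\right)\right] = \left[\frac{1}{T} \sum_t E\left[\left(\sigma_t^2(\beta)\right)^{-1} \frac{\partial \mu_t(\alpha)}{\partial \alpha} \frac{\partial \mu_t(\alpha)}{\partial \alpha'}\right]\right]^{-1} \]
\[ Var_{asy}\left[\sqrt{T}\left(\hat{\beta}_T - \beta\right)\right] = \left[\frac{1}{T} \sum_t E\left[\frac{1}{2}\left(\sigma_t^2(\beta)\right)^{-2} \frac{\partial \sigma_t^2(\beta)}{\partial \beta} \frac{\partial \sigma_t^2(\beta)}{\partial \beta'}\right]\right]^{-1} \times \left[\frac{1}{T} \sum_t E\left[\frac{1}{4\left(\sigma_t^2(\beta)\right)^2} \frac{\partial \sigma_t^2(\beta)}{\partial \beta} \frac{\partial \sigma_t^2(\beta)}{\partial \beta'}\left(K_t(\theta) - 1\right)\right]\right] \times \left[\frac{1}{T} \sum_t E\left[\frac{1}{2}\left(\sigma_t^2(\beta)\right)^{-2} \frac{\partial \sigma_t^2(\beta)}{\partial \beta} \frac{\partial \sigma_t^2(\beta)}{\partial \beta'}\right]\right]^{-1}. \]
When the true conditional distribution is normal, $M_{3t}(\theta) = 0$ and $K_t(\theta) = 3$. In this case, the expressions for $A_t$ and $B_t$ coincide. The asymptotic variance-covariance matrix of the QML estimator $\hat{\beta}_T$ reduces to:
\[ Var_{asy}\left[\sqrt{T}\left(\hat{\beta}_T - \beta\right)\right] = \left[\frac{1}{T} \sum_t E\left[\frac{1}{2}\left(\sigma_t^2(\beta)\right)^{-2} \frac{\partial \sigma_t^2(\beta)}{\partial \beta} \frac{\partial \sigma_t^2(\beta)}{\partial \beta'}\right]\right]^{-1}. \]

2.2 Testing in GARCH models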

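2.2.1 The GARCH(1,1) case

Suppose the true model is:

$$
\begin{aligned}
y_t &= \mu_0 + \varepsilon_{0t} \\
\sigma_{0t}^2 &= \omega_0 + \alpha_0 \varepsilon_{0t-1}^2 + \beta_0 \sigma_{0t-1}^2 \\
\varepsilon_{0t} \mid \Psi_{t-1} &\sim N\left(0, \sigma_{0t}^2\right) \\
\theta_0 &= (\mu_0, \omega_0, \alpha_0, \beta_0)'
\end{aligned}
$$

The estimated model is:

$$
\begin{aligned}
y_t &= \mu + \varepsilon_t \\
\sigma_t^2 &= \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \\
\theta &= (\mu, \omega, \alpha, \beta)'
\end{aligned}
$$

$$
L_T = \sum_t l_t(\theta), \qquad
A_0 = -\frac{1}{T} E\left[ \sum_t \frac{\partial^2 l_t(\theta_0)}{\partial\theta\,\partial\theta'} \right]
$$

$$
A_T(\theta) = -\frac{1}{T} \sum_t \frac{\partial^2 l_t(\theta)}{\partial\theta\,\partial\theta'}
$$

$$
A_T(\theta) = \frac{1}{T} \left[ \frac{1}{2} \sum_{t=1}^T \left(\sigma_t^2\right)^{-2} \frac{\partial \sigma_t^2}{\partial\theta} \frac{\partial \sigma_t^2}{\partial\theta'} + \sum_{t=1}^T \left(\sigma_t^{-2}\right) \frac{\partial \varepsilon_t}{\partial\theta} \frac{\partial \varepsilon_t}{\partial\theta'} \right]
$$

$$
B_0 = \frac{1}{T} E\left[ \frac{\partial L(\theta_0)}{\partial\theta} \frac{\partial L(\theta_0)}{\partial\theta'} \right], \qquad
B_T(\theta) = \frac{1}{T} \sum_{t=1}^T \frac{\partial l_t}{\partial\theta} \frac{\partial l_t}{\partial\theta'}
$$

$A_T(\hat\theta)$ and $B_T(\hat\theta)$ are consistent estimators of $A$ and $B$, where $\hat\theta$ is the maximum likelihood estimator of $\theta_0$. When $\varepsilon_0$ is conditionally normal, $A = B$. Moreover

$$
G \sqrt{T} \left(\hat\theta - \theta_0\right) \xrightarrow{d} N(0, I_k)
$$

$G$ can be $A^{1/2}$ in the conditionally normal case or $B^{-1/2} A$ in the general case. The robust form of the t statistics is

$$
B_T^{-1/2} A_T \sqrt{T} \left(\hat\theta - \theta_0\right)
$$

and the non-robust form is:

$$
A_T^{1/2} \sqrt{T} \left(\hat\theta - \theta_0\right)
$$

The t statistics have to be compared to a standard normal distribution. Define the null hypothesis:

$$
H_0 : \alpha_0 + \beta_0 = 1, \qquad \text{that is} \qquad H_0 : g(\theta_0) = \alpha_0 + \beta_0 - 1 = 0.
$$

Define $\hat\theta_U$ the ML estimator of the unrestricted model and $\hat\theta_R$ the estimator of the restricted model. The statistics are:

$$
\xi_{LR} = -2 \left[ L\left(\hat\theta_R\right) - L\left(\hat\theta_U\right) \right]
$$

$$
\xi_{LM}^{NR} = \frac{1}{T} \left( \frac{\partial L(\hat\theta_R)}{\partial\theta} \right)' A_T^{-1}\left(\hat\theta_R\right) \left( \frac{\partial L(\hat\theta_R)}{\partial\theta} \right)
$$

$$
\begin{aligned}
\xi_{LM}^{R} ={}& \frac{1}{T} \left[ \left( \frac{\partial L(\hat\theta_R)}{\partial\theta} \right)' A_T^{-1}\left(\hat\theta_R\right) \left( \frac{\partial g(\hat\theta_R)}{\partial\theta} \right) \right] \\
& \times \left[ \left( \frac{\partial g(\hat\theta_R)}{\partial\theta'} \right) A_T^{-1}\left(\hat\theta_R\right) B_T\left(\hat\theta_R\right) A_T^{-1}\left(\hat\theta_R\right) \left( \frac{\partial g(\hat\theta_R)}{\partial\theta} \right) \right]^{-1} \\
& \times \left[ \left( \frac{\partial g(\hat\theta_R)}{\partial\theta'} \right) A_T^{-1}\left(\hat\theta_R\right) \left( \frac{\partial L(\hat\theta_R)}{\partial\theta} \right) \right]
\end{aligned}
$$

$$
\xi_{W}^{R} = T\, g\left(\hat\theta_U\right)' \left[ \left( \frac{\partial g(\hat\theta_U)}{\partial\theta'} \right) A_T^{-1}\left(\hat\theta_U\right) B_T\left(\hat\theta_U\right) A_T^{-1}\left(\hat\theta_U\right) \left( \frac{\partial g(\hat\theta_U)}{\partial\theta} \right) \right]^{-1} g\left(\hat\theta_U\right)
$$

$$
\xi_{W}^{NR} = T\, g\left(\hat\theta_U\right)' \left[ \left( \frac{\partial g(\hat\theta_U)}{\partial\theta'} \right) A_T^{-1}\left(\hat\theta_U\right) \left( \frac{\partial g(\hat\theta_U)}{\partial\theta} \right) \right]^{-1} g\left(\hat\theta_U\right)
$$

The Wald statistics ($\xi_W^{R}$ and $\xi_W^{NR}$) are the squares of the (robust and non-robust, respectively) t statistics for $g(\theta) = 0$.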
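The robust and non-robust covariance estimates behind these t statistics can be sketched numerically. The following is a minimal illustration, assuming that per-observation scores and Hessians evaluated at $\hat\theta$ are already available as arrays; the function name `qml_covariances` and the toy data are hypothetical and not part of the notes.

```python
import numpy as np

def qml_covariances(scores, hessians):
    """Return A_T, B_T and the two covariance estimates of theta_hat.

    scores   : (T, k) array, row t is dl_t/dtheta at theta_hat
    hessians : (T, k, k) array, slice t is d2l_t/dtheta dtheta' at theta_hat
    """
    T = scores.shape[0]
    A = -hessians.mean(axis=0)           # A_T = -(1/T) sum of Hessians
    B = scores.T @ scores / T            # B_T = (1/T) sum of outer products of scores
    A_inv = np.linalg.inv(A)
    V_nonrobust = A_inv / T              # valid when A = B (conditional normality)
    V_robust = A_inv @ B @ A_inv / T     # sandwich form, valid more generally
    return A, B, V_nonrobust, V_robust

# Toy case: Gaussian quasi-likelihood for a mean with unit variance imposed,
# l_t = -0.5 * (y_t - mu)^2, so the score is (y_t - mu) and the Hessian is -1.
y = np.array([-1.0, 3.0, -1.0, 3.0])
mu_hat = y.mean()
scores = (y - mu_hat).reshape(-1, 1)
hessians = -np.ones((len(y), 1, 1))

A, B, V_nr, V_r = qml_covariances(scores, hessians)
t_nonrobust = mu_hat / np.sqrt(V_nr[0, 0])
t_robust = mu_hat / np.sqrt(V_r[0, 0])
```

In this toy case the data are more dispersed than the imposed unit variance, so the robust standard error is larger and the robust t statistic is smaller than the non-robust one.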

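2.3 Testing for ARCH disturbances

We want to test for the presence of ARCH effects. This can be done with an LM test, which is based on the score and the information matrix under the null. The null hypothesis is

$$
\alpha_1 = \alpha_2 = \dots = \alpha_q = 0.
$$

Consider the ARCH model with $\sigma_t^2 = \sigma^2(z_t \alpha)$, where $\sigma^2(\cdot)$ is a differentiable function, $z_t = \left(1, \hat\varepsilon_{t-1}^2, \dots, \hat\varepsilon_{t-q}^2\right)$, $\alpha = (\alpha_0, \alpha_1, \dots, \alpha_q)'$ and $\hat\varepsilon_t$ are the OLS residuals. Under the null, $\sigma_t^2$ is a constant, $\sigma_t^2 = \sigma_0^2$. The derivative of $\sigma_t^2$ with respect to $\alpha$ is

$$
\frac{\partial \sigma_t^2}{\partial \alpha} = \sigma^{2\prime} z_t'
$$

where $\sigma^{2\prime}$ is the scalar derivative of $\sigma^2(z_t \alpha)$. Recalling that the log-likelihood function is

$$
L_T = \sum_{t=1}^T l_t(\theta) = \sum_{t=1}^T \left[ -\frac{1}{2} \log\left(\sigma_t^2\right) - \frac{1}{2} \frac{\hat\varepsilon_t^2}{\sigma_t^2} \right]
$$

the derivative of $l_t$ with respect to $\alpha$ is:

$$
\frac{\partial l_t}{\partial \alpha} = \frac{\sigma^{2\prime} z_t'}{2 \sigma_t^2} \left[ \frac{\hat\varepsilon_t^2}{\sigma_t^2} - 1 \right]
$$

The score under the null is

$$
\left. \frac{\partial L_T}{\partial \alpha} \right|_0 = \frac{\sigma^{2\prime}}{2 \sigma_0^2} \sum_t z_t' \left( \frac{\hat\varepsilon_t^2}{\sigma_0^2} - 1 \right) = \frac{\sigma^{2\prime}}{2 \sigma_0^2} Z' f^0
$$

where $f^0 = \left[ \left( \frac{\hat\varepsilon_1^2}{\sigma_0^2} - 1 \right), \dots, \left( \frac{\hat\varepsilon_T^2}{\sigma_0^2} - 1 \right) \right]'$ and $Z' = (z_1', \dots, z_T')$ is a $((q+1) \times T)$ matrix. The second derivatives matrix is

$$
\begin{aligned}
\frac{\partial^2 l_t}{\partial \alpha\, \partial \alpha'} &= -\frac{\sigma^{2\prime} z_t' \, \sigma^{2\prime} z_t}{2 \sigma_t^4} \left[ \frac{\hat\varepsilon_t^2}{\sigma_t^2} - 1 \right] + \frac{\sigma^{2\prime} z_t'}{2 \sigma_t^2} \left[ -\frac{\sigma^{2\prime} z_t \hat\varepsilon_t^2}{\sigma_t^4} \right] \\
&= -\frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 \frac{\hat\varepsilon_t^2}{\sigma_t^2} z_t' z_t + \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 z_t' z_t - \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 \frac{\hat\varepsilon_t^2}{\sigma_t^2} z_t' z_t \\
&= -\left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 \frac{\hat\varepsilon_t^2}{\sigma_t^2} z_t' z_t + \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 z_t' z_t
\end{aligned}
$$

This yields the information matrix under the null:

$$
\begin{aligned}
A_{\alpha\alpha,0} &= -\frac{1}{T} E\left[ \left. \frac{\partial^2 L_T}{\partial \alpha\, \partial \alpha'} \right|_0 \right] = -\frac{1}{T} E\left[ \sum_t E\left[ \left. \frac{\partial^2 l_t}{\partial \alpha\, \partial \alpha'} \right| \Phi_{t-1} \right] \Big|_0 \right] \\
&= -\frac{1}{T} E\left[ \sum_t E\left[ \left. -\left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 \frac{\hat\varepsilon_t^2}{\sigma_t^2} z_t' z_t + \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_t^2} \right)^2 z_t' z_t \right| \Phi_{t-1} \right] \Big|_0 \right] \\
&= -\frac{1}{T} \sum_{t=1}^T \left\{ -\left( \frac{\sigma^{2\prime}}{\sigma_0^2} \right)^2 E[z_t' z_t] + \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_0^2} \right)^2 E[z_t' z_t] \right\} \\
&= \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_0^2} \right)^2 \frac{1}{T} \sum_{t=1}^T E[z_t' z_t]
\end{aligned}
$$

The LM statistic is given by

$$
\begin{aligned}
\xi_{LM} &= \frac{1}{T} \left( \left. \frac{\partial L_T}{\partial \alpha} \right|_0 \right)' A_{\alpha\alpha,0}^{-1} \left( \left. \frac{\partial L_T}{\partial \alpha} \right|_0 \right) \\
&= \frac{1}{T} f^{0\prime} Z \frac{\sigma^{2\prime}}{2 \sigma_0^2} \left[ \frac{1}{2} \left( \frac{\sigma^{2\prime}}{\sigma_0^2} \right)^2 \frac{1}{T} \sum_{t=1}^T E[z_t' z_t] \right]^{-1} \frac{\sigma^{2\prime}}{2 \sigma_0^2} Z' f^0 \\
&= f^{0\prime} Z \left( \sum_{t=1}^T E[z_t' z_t] \right)^{-1} Z' f^0 / 2
\end{aligned}
$$

It can be consistently estimated by

$$
\xi_{LM} = f^{0\prime} Z (Z'Z)^{-1} Z' f^0 / 2.
$$

When we assume normality, $\mathrm{plim}\left( f^{0\prime} f^0 / T \right) = 2$. Thus an asymptotically equivalent statistic would be

$$
\xi^* = T\, f^{0\prime} Z (Z'Z)^{-1} Z' f^0 / f^{0\prime} f^0 = T R^2
$$

where $R^2$ is the squared multiple correlation between $f^0$ and $Z$. Since adding a constant and multiplying by a scalar will not change the $R^2$ of a regression, this is also the $R^2$ of the regression of $\hat\varepsilon_t^2$ on an intercept and $q$ lagged values of $\hat\varepsilon_t^2$. The statistic will be asymptotically distributed as chi-square with $q$ degrees of freedom when the null hypothesis is true. The test procedure is to run the OLS regression and save the residuals, then regress the squared residuals on a constant and $q$ lags and test $T R^2$ as a $\chi^2_q$. This will be an asymptotically locally most powerful test. Lee and King (1993) derive a locally most mean powerful (LMMP) based score test for the presence of ARCH and GARCH disturbances. Wald and likelihood ratio (LR) criteria could be used to test the hypothesis of conditional homoskedasticity, e.g. against a GARCH(1,1) alternative. The statistics associated with $H_0 : \alpha_1 = \beta_1 = 0$ against $H_1 : \alpha_1 \geq 0$ or $\beta_1 \geq 0$, with at least one strict inequality, do not have a $\chi^2$ distribution with two degrees of freedom; using the $\chi^2_2$ critical values can be shown to be conservative.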
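The $T R^2$ procedure described above can be sketched in a few lines. This is a minimal illustration using NumPy least squares, assuming the OLS residuals are already available; the function name `arch_lm_test` and the simulated series are hypothetical, and the value 3.84 used below is the 5% critical value of a $\chi^2_1$.

```python
import numpy as np

def arch_lm_test(resid, q):
    """Regress squared residuals on a constant and q of their own lags; return (T*R^2, R^2)."""
    e2 = np.asarray(resid, dtype=float) ** 2
    n = len(e2)
    y = e2[q:]                                          # dependent variable: e_t^2
    lags = [e2[q - j:n - j] for j in range(1, q + 1)]   # e_{t-1}^2, ..., e_{t-q}^2
    X = np.column_stack([np.ones(len(y))] + lags)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_aux = y - X @ beta
    r2 = 1.0 - resid_aux @ resid_aux / np.sum((y - y.mean()) ** 2)
    return len(y) * r2, r2

# Illustration on simulated data: an ARCH(1) series and an i.i.d. series.
rng = np.random.default_rng(0)
T = 2000
z = rng.standard_normal(T)
eps_arch = np.zeros(T)
for t in range(1, T):
    sigma2_t = 0.5 + 0.5 * eps_arch[t - 1] ** 2         # ARCH(1) conditional variance
    eps_arch[t] = np.sqrt(sigma2_t) * z[t]
eps_iid = rng.standard_normal(T)

stat_arch, r2_arch = arch_lm_test(eps_arch, q=1)
stat_iid, r2_iid = arch_lm_test(eps_iid, q=1)
```

With one lag, `stat_arch` is compared with a $\chi^2_1$ critical value; a value far above 3.84 rejects conditional homoskedasticity at the 5% level.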

2.4 Test for Asymmetric Effects

Implicit in any volatility model is a particular news impact curve. The standard GARCH model has a news impact curve which is symmetric and centered at $\varepsilon_{t-1} = 0$. That is, positive and negative return shocks of the same magnitude produce the same amount of volatility. Also, larger return shocks forecast more volatility at a rate proportional to the square of the size of the return shock. If a negative return shock causes more volatility than a positive return shock of the same size, the GARCH model underpredicts the amount of volatility following bad news and overpredicts the amount of volatility following good news. Furthermore, if large return shocks cause more volatility than a quadratic function allows, then the standard GARCH model underpredicts volatility after a large return shock and overpredicts volatility after a small return shock.

Engle and Ng [16] put forward three diagnostic tests for volatility models: the Sign Bias Test, the Negative Size Bias Test, and the Positive Size Bias Test. These tests examine whether we can predict the squared normalized residual by some variables observed in the past which are not included in the volatility model being used. If these variables can predict the squared normalized residual, then the variance model is misspecified. The sign bias test examines the impact of positive and negative return shocks on volatility not predicted by the model under consideration. The negative size bias test focuses on the different effects that large and small negative return shocks have on volatility which are not predicted by the volatility model. The positive size bias test focuses on the different impacts that large and small positive return shocks may have on volatility, which are not explained by the volatility model.

To derive the optimal form of these tests, we assume that the volatility model under the null hypothesis is a special case of a more general model of the following form:

$$
\log\left(\sigma_t^2\right) = \log\left(\sigma_{0t}^2\left(\delta_0' z_{0t}\right)\right) + \delta_a' z_{at} \qquad (2.6)
$$

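where $\sigma_{0t}^2(\delta_0' z_{0t})$ is the volatility model hypothesized under the null, $\delta_0$ is a $(k \times 1)$ vector of parameters under the null, $z_{0t}$ is a $(k \times 1)$ vector of explanatory variables under the null, $\delta_a$ is a $(m \times 1)$ vector of additional parameters, and $z_{at}$ is a $(m \times 1)$ vector of missing explanatory variables. This form encompasses both the GARCH and EGARCH models. For the GARCH(1,1) model

$$
\sigma_{0t}^2\left(\delta_0' z_{0t}\right) = \delta_0' z_{0t}, \qquad
z_{0t} \equiv \left[ 1, \sigma_{t-1}^2, \varepsilon_{t-1}^2 \right]', \qquad
\delta_0 \equiv [\omega, \beta, \alpha]'
$$

$$
\delta_a = [\beta^*, \phi^*, \psi^*]', \qquad
z_{at} = \left[ \log\left(\sigma_{t-1}^2\right), \frac{\varepsilon_{t-1}}{\sigma_{t-1}}, \left( \frac{|\varepsilon_{t-1}|}{\sigma_{t-1}} - \sqrt{2/\pi} \right) \right]'
$$

The encompassing model is

$$
\log\left(\sigma_t^2\right) = \log\left[ \omega + \beta \sigma_{t-1}^2 + \alpha \varepsilon_{t-1}^2 \right] + \beta^* \log\left(\sigma_{t-1}^2\right) + \phi^* \frac{\varepsilon_{t-1}}{\sigma_{t-1}} + \psi^* \left( \frac{|\varepsilon_{t-1}|}{\sigma_{t-1}} - \sqrt{2/\pi} \right)
$$

which is an EGARCH(1,1) when $\alpha = \beta = 0$, and a GARCH(1,1) model when $\beta^* = \phi^* = \psi^* = 0$. The null hypothesis is $\delta_a = 0$. Let $\nu_t$ be the normalized residual corresponding to observation $t$ under the volatility model hypothesized. That is, $\nu_t \equiv \varepsilon_t / \sigma_t$. The LM test statistic for $H_0 : \delta_a = 0$ in (2.6) is a test of $\delta_a = 0$ in the auxiliary regression

$$
\nu_t^2 = z_{0t}^{*\prime} \delta_0 + z_{at}^{*\prime} \delta_a + u_t \qquad (2.7)
$$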

where $z_{0t}^* \equiv \sigma_{0t}^{-2} \left( \dfrac{\partial \sigma_t^2}{\partial \delta_0} \right)$ and $z_{at}^* \equiv \sigma_{0t}^{-2} \left( \dfrac{\partial \sigma_t^2}{\partial \delta_a} \right)$. Both $\dfrac{\partial \sigma_t^2}{\partial \delta_0}$ and $\dfrac{\partial \sigma_t^2}{\partial \delta_a}$ are evaluated at $\delta_a = 0$ and $\hat\delta_0$ (the maximum likelihood estimator of $\delta_0$ under $H_0$). If the parameter restrictions are met, the right-hand side variables in (2.7) should have no explanatory power at all. Thus, the test is often computed as

$$
\xi_{LM} = T R^2
$$

where $R^2$ is the squared multiple correlation of (2.7), and $T$ is the number of observations in the sample*. The LM statistic is asymptotically distributed as chi-square with $m$ degrees of freedom when the null hypothesis is true, where $m$ is the number of parameter restrictions. Under the encompassing model (2.6), $\dfrac{\partial \sigma_t^2}{\partial \delta_a}$ evaluated under the null is equal to† $\sigma_{0t}^2 z_{at}$, hence $z_{at}^* = z_{at}$. The regression actually involves regressing $\nu_t^2$ on a constant, $z_{0t}^*$ and $z_{at}$. The variables in $z_{at}$ are $S_{t-1}^-$, $S_{t-1}^- \varepsilon_{t-1}$ and $S_{t-1}^+ \varepsilon_{t-1}$. The optimal form for conducting the sign bias test is:

$$
\nu_t^2 = a + b_1 S_{t-1}^- + \gamma' z_{0t}^* + e_t
$$

where

$$
S_{t-1}^- = \begin{cases} 1 & \varepsilon_{t-1} < 0 \\ 0 & \text{otherwise} \end{cases}
$$

The regression for the negative size bias test is:

$$
\nu_t^2 = a + b_2 S_{t-1}^- \varepsilon_{t-1} + \gamma' z_{0t}^* + e_t
$$

and for the positive size bias test:

$$
\nu_t^2 = a + b_3 S_{t-1}^+ \varepsilon_{t-1} + \gamma' z_{0t}^* + e_t, \qquad
S_{t-1}^+ = \begin{cases} 1 & \varepsilon_{t-1} > 0 \\ 0 & \text{otherwise} \end{cases}
$$

* However, for highly nonlinear models, the numerical optimization algorithm generally does not guarantee exact orthogonality of $\nu_t^2$ to $z_{0t}^*$. Engle and Ng ([16]) propose to regress $\nu_t^2$ on $z_{0t}^*$ alone, and use the residuals from this regression (which are now guaranteed to be orthogonal to $z_{0t}^*$) in place of $\nu_t^2$.

† In fact,

$$
\sigma_t^2 = \sigma_{0t}^2\left(\delta_0' z_{0t}\right) \exp\left(\delta_a' z_{at}\right), \qquad
\frac{\partial \sigma_t^2}{\partial \delta_a} = \sigma_{0t}^2 z_{at} \exp\left(\delta_a' z_{at}\right)
$$

and under the null, $\delta_a = 0$, so that $\dfrac{\partial \sigma_t^2}{\partial \delta_a} = \sigma_{0t}^2 z_{at}$.

The t-ratios for $b_1$, $b_2$ and $b_3$ are the sign bias, the negative size bias, and the positive size bias test statistics, respectively. The joint test is the LM test for adding the three variables in the variance equation (2.6) under the maintained specification:

$$
\nu_t^2 = a + b_1 S_{t-1}^- + b_2 S_{t-1}^- \varepsilon_{t-1} + b_3 S_{t-1}^+ \varepsilon_{t-1} + \gamma' z_{0t}^* + e_t
$$

The test statistic is $T R^2$. If the volatility model is correct then $b_1 = b_2 = b_3 = 0$, $\gamma = 0$ and $e_t$ is i.i.d. If $z_{0t}^*$ is not included the test will be conservative; the size will be less than or equal to the nominal size, and the power may be reduced.
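The four regressions above can be sketched as follows. This is a simplified illustration, assuming residuals and fitted conditional variances from some volatility model are given; it omits the regressors $z_{0t}^*$, which, as noted above, makes the tests conservative. The function names `ols_t_and_r2` and `engle_ng_tests`, and the simulated data, are hypothetical.

```python
import numpy as np

def ols_t_and_r2(y, X):
    """OLS of y on X; return the t-ratios of the coefficients and the R^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta / se, r2

def engle_ng_tests(eps, sigma2):
    """Sign bias, negative and positive size bias t-ratios, and the joint T*R^2."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    v2 = eps[1:] ** 2 / sigma2[1:]            # squared normalized residuals
    lag = eps[:-1]                            # eps_{t-1}
    s_neg = (lag < 0).astype(float)           # S^-_{t-1}
    s_pos = (lag > 0).astype(float)           # S^+_{t-1}
    const = np.ones_like(lag)
    out = {}
    singles = [("sign_bias_t", s_neg),
               ("negative_size_bias_t", s_neg * lag),
               ("positive_size_bias_t", s_pos * lag)]
    for name, x in singles:
        t_ratios, _ = ols_t_and_r2(v2, np.column_stack([const, x]))
        out[name] = t_ratios[1]
    X_joint = np.column_stack([const, s_neg, s_neg * lag, s_pos * lag])
    _, r2 = ols_t_and_r2(v2, X_joint)
    out["joint_TR2"] = len(v2) * r2           # compare with a chi-square(3)
    return out

# Illustration: only negative shocks raise next-period variance, but the
# (misspecified) model under test assumes a constant variance.
rng = np.random.default_rng(1)
T = 3000
z = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    s2 = 0.2 + 0.5 * eps[t - 1] ** 2 * (eps[t - 1] < 0)
    eps[t] = np.sqrt(s2) * z[t]
tests = engle_ng_tests(eps, np.full(T, eps.var()))
```

Here a strongly negative `negative_size_bias_t` and a `joint_TR2` well above the 5% critical value of a $\chi^2_3$ (7.81) would indicate that the constant-variance model misses the asymmetric response to bad news.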

Chapter 3 MULTIVARIATE GARCH MODELS

3.1 Introduction

The extension from a univariate GARCH model to an N-variate model requires allowing the conditional variance-covariance matrix of the N-dimensional zero mean random variables $\varepsilon_t$ to depend on the elements of the information set. Let $\{z_t\}$ be a sequence of $(N \times 1)$ i.i.d. random vectors with the following characteristics:

$$
E[z_t] = 0, \qquad E[z_t z_t'] = I_N, \qquad z_t \sim G(0, I_N)
$$

with $G$ a continuous density function. Let $\{\varepsilon_t\}$ be a sequence of $(N \times 1)$ random vectors generated as:

$$
\varepsilon_t = H_t^{1/2} z_t
$$

where

$$
E_{t-1}(\varepsilon_t) = 0, \qquad E_{t-1}(\varepsilon_t \varepsilon_t') = H_t
$$

and $H_t$ is an $(N \times N)$ positive definite matrix, measurable with respect to the information set $\Psi_{t-1}$, that is the $\sigma$-field generated by the past observations $\{\varepsilon_{t-1}, \varepsilon_{t-2}, \dots\}$. The parametrization of $H_t$ as a multivariate GARCH, which means as a function of the information set $\Psi_{t-1}$, allows each element of $H_t$ to depend on q lagged values of the squares and cross-products of $\varepsilon_t$, as well as p lagged values of the elements of $H_t$, and a $(J \times 1)$ vector of dummies. So the elements of the covariance matrix follow a vector ARMA process in squares and cross-products of the disturbances.

3.2 Vech representation

Let vech denote the vector-half operator, which stacks the lower triangular elements of an $N \times N$ matrix as an $[N(N+1)/2] \times 1$ vector. Since the conditional covariance matrix $H_t$ is symmetric, $vech(H_t)$ contains all the unique elements in $H_t$. Following Bollerslev et al. [6], a natural multivariate extension of the univariate GARCH(p,q) model is

$$
\begin{aligned}
vech(H_t) &= W + \sum_{i=1}^q A_i^* \, vech\left(\varepsilon_{t-i} \varepsilon_{t-i}'\right) + \sum_{j=1}^p B_j^* \, vech(H_{t-j}) \\
&= W + A^*(L) \, vech(\varepsilon_t \varepsilon_t') + B^*(L) \, vech(H_t)
\end{aligned}
\qquad (3.1)
$$

$W$ is a $[N(N+1)/2] \times 1$ vector, and the $A_i^*$ and $B_j^*$ are $[(N(N+1)/2) \times (N(N+1)/2)]$ matrices. This general formulation is termed vec representation by Engle and Kroner [15]. The number of parameters is $N(N+1)/2 + (p+q)[N(N+1)/2]^2$. Even for low dimensions of N and small values of p and q the number of parameters is very large; for $N = 5$ and $p = q = 1$ the unrestricted version of (3.1) contains 465 parameters. For any parametrization to be sensible, we require that $H_t$ be positive definite for all values of $\varepsilon_t$ in the sample space; in the vech representation this restriction can be difficult to check, let alone impose during estimation.

3.2.1 Diagonal vech model

A natural restriction that was first used in the ARCH context by Engle, Granger and Kraft [14] and in the GARCH context by Bollerslev et al. [6] is the diagonal representation, in which each element of the covariance matrix depends only on past values of itself and past values of $\varepsilon_{jt} \varepsilon_{kt}$. In the diagonal model the $A_i^*$ and $B_j^*$ matrices are all taken to be diagonal. For $N = 2$ and $p = q = 1$, the diagonal model is written as:

$$
\begin{bmatrix} h_{11,t} \\ h_{21,t} \\ h_{22,t} \end{bmatrix}
= \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}
+ \begin{bmatrix} a_{11}^* & 0 & 0 \\ 0 & a_{22}^* & 0 \\ 0 & 0 & a_{33}^* \end{bmatrix}
\begin{bmatrix} \varepsilon_{1,t-1}^2 \\ \varepsilon_{1,t-1} \varepsilon_{2,t-1} \\ \varepsilon_{2,t-1}^2 \end{bmatrix}
+ \begin{bmatrix} b_{11}^* & 0 & 0 \\ 0 & b_{22}^* & 0 \\ 0 & 0 & b_{33}^* \end{bmatrix}
\begin{bmatrix} h_{11,t-1} \\ h_{21,t-1} \\ h_{22,t-1} \end{bmatrix}
$$

Thus the $(i,j)$th element in $H_t$ depends on the corresponding $(i,j)$th element in $\varepsilon_{t-1} \varepsilon_{t-1}'$ and $H_{t-1}$. This restriction reduces the number of parameters to $[N(N+1)/2](1 + p + q)$. This model does not allow for causality in variance, co-persistence in variance and asymmetries.
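The two parameter counts quoted in this section are easy to check by direct computation. The helper names below are hypothetical and simply encode the formulas $N(N+1)/2 + (p+q)[N(N+1)/2]^2$ and $[N(N+1)/2](1+p+q)$.

```python
def n_params_vech(N, p, q):
    """Number of parameters in the unrestricted vech model (3.1)."""
    m = N * (N + 1) // 2              # length of vech(H_t)
    return m + (p + q) * m * m        # W plus (p + q) full m x m matrices

def n_params_diagonal_vech(N, p, q):
    """Number of parameters when all A*_i and B*_j are diagonal."""
    m = N * (N + 1) // 2
    return m * (1 + p + q)            # W plus (p + q) diagonals of length m

full_count = n_params_vech(5, 1, 1)
diag_count = n_params_diagonal_vech(5, 1, 1)
```

For $N = 5$ and $p = q = 1$ the unrestricted count reproduces the 465 parameters mentioned above, while the diagonal restriction leaves 45.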

3.3 BEKK representation

Engle and Kroner ([15]) propose a parametrization that imposes positive definiteness restrictions. Consider the following model

$$
H_t = C C' + \sum_{k=1}^K \sum_{i=1}^q A_{ik} \varepsilon_{t-i} \varepsilon_{t-i}' A_{ik}' + \sum_{k=1}^K \sum_{i=1}^p B_{ik} H_{t-i} B_{ik}' \qquad (3.2)
$$

where $C$, $A_{ik}$ and $B_{ik}$ are $(N \times N)$ matrices. The intercept matrix is decomposed into $C C'$, where $C$ is a lower triangular matrix. Without any further assumption $C C'$ is positive semidefinite. This representation is general in that it includes all positive definite diagonal representations and nearly all positive definite vech representations. For simplicity of exposition we will assume that $K = 1$:

$$
H_t = C C' + \sum_{i=1}^q A_i \varepsilon_{t-i} \varepsilon_{t-i}' A_i' + \sum_{i=1}^p B_i H_{t-i} B_i' \qquad (3.3)
$$

To illustrate the BEKK model, consider the simple GARCH(1,1) model:

$$
H_t = C C' + A_1 \varepsilon_{t-1} \varepsilon_{t-1}' A_1' + B_1 H_{t-1} B_1' \qquad (3.4)
$$

Proposition 10 (Engle and Kroner [15]) Suppose that the diagonal elements in $C$ are restricted to be positive and that $a_{11}$ and $b_{11}$ are also restricted to be positive. Then if $K = 1$ there exists no other $C$, $A_1$, $B_1$ in the model (3.4) that will give an equivalent representation.

The purpose of the restrictions is to eliminate all other observationally equivalent structures. For example, as relates to the term $A_1 \varepsilon_{t-1} \varepsilon_{t-1}' A_1'$ the only other observationally equivalent structure is obtained by replacing $A_1$ by $-A_1$. The restriction that $a_{11}$ ($b_{11}$) be positive could be replaced with the condition that $a_{ij}$ ($b_{ij}$) be positive for a given $i$ and $j$, as this condition is also sufficient to eliminate $-A_1$ from the set of admissible structures. In the bivariate case the BEKK becomes

$$
H_t = C C' +
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} \varepsilon_{1t-1}^2 & \varepsilon_{1t-1} \varepsilon_{2t-1} \\ \varepsilon_{2t-1} \varepsilon_{1t-1} & \varepsilon_{2t-1}^2 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}'
+ \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\begin{bmatrix} h_{11t-1} & h_{12t-1} \\ h_{21t-1} & h_{22t-1} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}'
$$

As far as the positive definiteness of $H_t$ is concerned, we have the following result.

Proposition 11 (Engle and Kroner [15]) (Sufficient condition) In a GARCH(p,q) model, if $H_0, H_{-1}, \dots, H_{-p+1}$ are all positive definite, then the BEKK parametrization (with $K = 1$) yields a positive definite $H_t$ for all possible values of $\varepsilon_t$ if $C$ is a full rank matrix or if any $B_i$, $i = 1, \dots, p$, is a full rank matrix (the intersection of the null spaces of $C'$ and $B_i'$, $i = 1, \dots, p$, is null).

Proof. For simplicity consider the GARCH(1,1) model. The BEKK parameterization is

$$
H_t = C C' + A_1 \varepsilon_{t-1} \varepsilon_{t-1}' A_1' + B_1 H_{t-1} B_1'
$$

The proof proceeds by induction. First, $H_t$ is p.d. for $t = 1$: the term $A_1 \varepsilon_0 \varepsilon_0' A_1'$ is positive semidefinite because $\varepsilon_0 \varepsilon_0'$ is positive semidefinite. Also, if the null spaces of the matrices $C'$ and $B_1'$ intersect only at the origin, then $C C' + B_1 H_0 B_1'$ is positive definite. This is true in particular if $C$ or $B_1$ has full rank. To show that the null space condition is sufficient, note that $C C' + B_1 H_0 B_1'$ is p.d. if and only if

$$
x' \left( C C' + B_1 H_0 B_1' \right) x > 0 \qquad \forall x \neq 0
$$

or

$$
(C'x)'(C'x) + \left( H_0^{1/2} B_1' x \right)' \left( H_0^{1/2} B_1' x \right) > 0 \qquad \forall x \neq 0 \qquad (3.5)
$$

where $H_0 = H_0^{1/2\prime} H_0^{1/2}$ and $H_0^{1/2}$ is full rank. Defining $N(P)$ to be the null space of the matrix $P$, (3.5) is true if and only if

$$
N(C') \cap N\left( H_0^{1/2} B_1' \right) = \{0\}.
$$

$N\left( H_0^{1/2} B_1' \right) = N(B_1')$ because $H_0^{1/2}$ is full rank. This implies that $C C' + B_1 H_0 B_1'$ is positive definite if and only if $N(C') \cap N(B_1') = \{0\}$. Now suppose that $H_t$ is positive definite for $t = \tau$. Then

$$
H_{\tau+1} = C C' + A_1 \varepsilon_\tau \varepsilon_\tau' A_1' + B_1 H_\tau B_1'
$$

is positive definite if the null space condition holds, given that $A_1 \varepsilon_\tau \varepsilon_\tau' A_1'$ is positive semidefinite, because $H_\tau$ is positive definite by the induction assumption.

We now examine the relationship between the BEKK and vech parametrizations. The mathematical relationship between the parameters of the two models can be found by simply vectorizing equation (3.3):

$$
vec(H_t) = vec(C C') + \sum_{i=1}^q vec\left( A_i \varepsilon_{t-i} \varepsilon_{t-i}' A_i' \right) + \sum_{i=1}^p vec\left( B_i H_{t-i} B_i' \right)
$$

where $vec(\cdot)$ is an operator such that, given a matrix $A$ $(n \times n)$, $vec(A)$ is a $(n^2 \times 1)$ vector. The $vec(\cdot)$ operator satisfies

$$
vec(ABC) = (C' \otimes A) \, vec(B)
$$

then

$$
vec(H_t) = vec(C C') + \sum_{i=1}^q (A_i \otimes A_i) \, vec\left( \varepsilon_{t-i} \varepsilon_{t-i}' \right) + \sum_{i=1}^p (B_i \otimes B_i) \, vec(H_{t-i})
$$

For $A$ $(n \times n)$ symmetric, $vech(A)$ contains precisely the $n(n+1)/2$ distinct elements of $A$, and the elements of $vec(A)$ are those of $vech(A)$ with some repetitions. Hence there exists a unique $n^2 \times n(n+1)/2$ matrix which transforms, for symmetric $A$, $vech(A)$ into $vec(A)$. This matrix is called the duplication matrix and is denoted $D_n$:

$$
vec(A) = D_n \, vech(A)
$$

Then

$$
D_N \, vech(H_t) = D_N \, vech(C C') + \sum_{i=1}^q (A_i \otimes A_i) D_N \, vech\left( \varepsilon_{t-i} \varepsilon_{t-i}' \right) + \sum_{i=1}^p (B_i \otimes B_i) D_N \, vech(H_{t-i})
$$

Since $D_N$ is a full column rank matrix we can define the generalized inverse of $D_N$ as:

$$
D_N^+ = (D_N' D_N)^{-1} D_N'
$$

that is a $(N(N+1)/2) \times (N^2)$ matrix, where

$$
D_N^+ D_N = I_{N(N+1)/2}
$$

This implies that, premultiplying by $D_N^+$,

$$
vech(H_t) = vech(C C') + \sum_{i=1}^q D_N^+ (A_i \otimes A_i) D_N \, vech\left( \varepsilon_{t-i} \varepsilon_{t-i}' \right) + \sum_{i=1}^p D_N^+ (B_i \otimes B_i) D_N \, vech(H_{t-i})
$$

One implication of this result is that the vech model implied by any given BEKK model is unique, while the converse is not true. The transformation from a vech model to a BEKK model (when it exists) is not unique, because for a given $A_1^*$ the choice of $A_1$ is not unique. This can be seen by recognizing that $(A_i \otimes A_i) = (-A_i \otimes -A_i)$, so while $A_i^* = D_N^+ (A_i \otimes A_i) D_N$ is unique, the choice of $A_i$ is not unique. It can also be shown that all positive definite diagonal vech models can be written in the BEKK framework. Given a diagonal matrix $A_i$, $D_N^+ (A_i \otimes A_i) D_N$ is also diagonal, with diagonal elements given by $a_{ii} a_{jj}$ $(1 \leq j \leq i \leq N)$ (see Magnus [22]).

3.3.1 Covariance Stationarity

Given the vech model

$$
vech(H_t) = W + A^*(L) \, vech(\varepsilon_t \varepsilon_t') + B^*(L) \, vech(H_t)
$$

the necessary and sufficient condition for covariance stationarity of $\{\varepsilon_t\}$ is that all the eigenvalues of $A^*(1) + B^*(1)$ are less than one in modulus. Defining $A^*(1) = D_N^+ \left( \sum_{i=1}^q (A_i \otimes A_i) \right) D_N$ and $B^*(1) = D_N^+ \left( \sum_{i=1}^p (B_i \otimes B_i) \right) D_N$, this implies also that in the BEKK model $\{\varepsilon_t\}$ is covariance stationary if and only if all the eigenvalues of

$$
D_N^+ \left( \sum_{i=1}^q (A_i \otimes A_i) \right) D_N + D_N^+ \left( \sum_{i=1}^p (B_i \otimes B_i) \right) D_N
$$

are less than one in modulus. Let $\lambda_1, \dots, \lambda_N$ be the eigenvalues of $A_i$; the eigenvalues of $D_N^+ (A_i \otimes A_i) D_N$ are $\lambda_j \lambda_k$ $(1 \leq k \leq j \leq N)$ (Magnus [22]). For a GARCH(p,q) in vech form, the unconditional covariance matrix, when it exists, is given by*

$$
E\left( vech(\varepsilon_t \varepsilon_t') \right) = vech\left( E(\varepsilon_t \varepsilon_t') \right) = \left[ I_{N^*} - A^*(1) - B^*(1) \right]^{-1} W
$$

and in the BEKK model†

$$
vech\left( E(\varepsilon_t \varepsilon_t') \right) = \left[ I_{N^*} - D_N^+ (A_1 \otimes A_1) D_N - D_N^+ (B_1 \otimes B_1) D_N \right]^{-1} vech(C C')
$$

where $N^* = N(N+1)/2$.

* Given that $vech(\varepsilon_t \varepsilon_t') = vech(H_t) + vech(V_t)$ with $E(vech(V_t)) = vech(E(V_t)) = 0$,

$$
\begin{aligned}
vech(\varepsilon_t \varepsilon_t') &= W + A_1^* \, vech\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) + B_1^* \left[ vech\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) - vech(V_{t-1}) \right] \\
vech\left( E(\varepsilon_t \varepsilon_t') \right) &= W + A_1^* \, vech\left( E\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) \right) + B_1^* \, vech\left( E\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) \right)
\end{aligned}
$$

† Given that $\varepsilon_t \varepsilon_t' = H_t + V_t$

The diagonal vech model is stationary if and only if $a_{ii}^* + b_{ii}^* < 1$ for all $i$. In the diagonal BEKK model the covariance stationarity condition is that $a_{ii}^2 + b_{ii}^2 < 1$ for all $i$. Only in the case of diagonal models are the stationarity properties determined solely by the diagonal elements of the $A_i$ and $B_i$ matrices.
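The mapping from a BEKK(1,1) model to its vech form, the eigenvalue condition, and the unconditional covariance can be checked numerically. The sketch below builds the duplication matrix directly from its definition, assuming column-major $vec$ and a $vech$ that stacks the lower triangle column by column; the function names and the parameter values for $A$, $B$ and $C$ are illustrative assumptions, not taken from the notes.

```python
import numpy as np

def duplication_matrix(n):
    """D_n such that vec(S) = D_n vech(S) for any symmetric n x n matrix S."""
    m = n * (n + 1) // 2
    D = np.zeros((n * n, m))
    col = 0
    for j in range(n):                 # vech stacks the lower triangle by columns
        for i in range(j, n):
            D[j * n + i, col] = 1.0    # position of element (i, j) in column-major vec
            D[i * n + j, col] = 1.0    # position of element (j, i)
            col += 1
    return D

def vec(M):
    return M.reshape(-1, order="F")    # column-major stacking

N = 2
D = duplication_matrix(N)
D_plus = np.linalg.inv(D.T @ D) @ D.T  # generalized inverse of D_N

# Hypothetical BEKK(1,1) parameter matrices
A = np.array([[0.30, 0.10], [0.05, 0.25]])
B = np.array([[0.80, 0.00], [0.05, 0.75]])
C = np.array([[0.50, 0.00], [0.20, 0.40]])

A_star = D_plus @ np.kron(A, A) @ D    # implied vech coefficient matrices
B_star = D_plus @ np.kron(B, B) @ D

# Covariance stationarity: all eigenvalues of A* + B* inside the unit circle
spectral_radius = np.max(np.abs(np.linalg.eigvals(A_star + B_star)))
stationary = spectral_radius < 1

# Unconditional covariance: vech(Sigma) = [I - A* - B*]^{-1} vech(CC')
m = N * (N + 1) // 2
vech_sigma = np.linalg.solve(np.eye(m) - A_star - B_star, D_plus @ vec(C @ C.T))
Sigma = (D @ vech_sigma).reshape(N, N, order="F")
```

Replacing `A` by `-A` leaves `A_star` unchanged, which is the non-uniqueness discussed above, and `Sigma` satisfies the fixed-point relation $\Sigma = CC' + A \Sigma A' + B \Sigma B'$.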

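† (continued) with $E(V_t) = 0$:

$$
\begin{aligned}
vec(\varepsilon_t \varepsilon_t') &= vec(C C') + (A_1 \otimes A_1) \, vec\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) + (B_1 \otimes B_1) \left[ vec\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) - vec(V_{t-1}) \right] \\
E\left[ vec(\varepsilon_t \varepsilon_t') \right] &= vec(C C') + \left[ (A_1 \otimes A_1) + (B_1 \otimes B_1) \right] E\left[ vec\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) \right] \\
E\left[ vec(\varepsilon_t \varepsilon_t') \right] &= \left[ I_{N^2} - (A_1 \otimes A_1) - (B_1 \otimes B_1) \right]^{-1} vec(C C')
\end{aligned}
$$

or in vech representation as

$$
D_N \, vech\left( E(\varepsilon_t \varepsilon_t') \right) = D_N \, vech(C C') + (A_1 \otimes A_1) D_N \, vech\left( E\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) \right) + (B_1 \otimes B_1) D_N \, vech\left( E\left( \varepsilon_{t-1} \varepsilon_{t-1}' \right) \right)
$$

3.4 Constant Correlations Model

In the constant correlations model put forward by Bollerslev ([3]) the time-varying conditional covariances are parametrized to be proportional to the product of the corresponding conditional standard deviations. This assumption greatly simplifies the computational burden in estimation, and conditions for $H_t$ to be positive definite a.s. for all $t$ are easy to impose. The model assumptions are:

$$
\begin{aligned}
E_{t-1}[\varepsilon_t] &= 0 \\
E_{t-1}[\varepsilon_t \varepsilon_t'] &= H_t \\
\{H_t\}_{ii} &= \sigma_{it}^2 \\
\{H_t\}_{ij} &= \sigma_{ijt} = \rho_{ij} \sigma_{it} \sigma_{jt} \qquad i \neq j
\end{aligned}
$$

Let $D_t$ denote the $(N \times N)$ diagonal matrix with the conditional variances along the diagonal, $\{D_t\}_{ii} = \sigma_{it}^2$. Let $\Gamma_t$ denote the matrix of conditional correlations with $ij$-th element given by

$$
\{\Gamma_t\}_{ij} = \{H_t\}_{ij} \left[ \{H_t\}_{ii} \{H_t\}_{jj} \right]^{-1/2} \qquad i, j = 1, \dots, N
$$

The model assumes $\Gamma_t = \Gamma$, so that

$$
H_t = D_t^{1/2} \Gamma D_t^{1/2}
$$

$$
H_t =
\begin{bmatrix} \sigma_{1t} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_{Nt} \end{bmatrix}
\begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1N} \\ \rho_{21} & 1 & & \vdots \\ \vdots & & \ddots & \rho_{N-1\,N} \\ \rho_{N1} & \cdots & \rho_{N\,N-1} & 1 \end{bmatrix}
\begin{bmatrix} \sigma_{1t} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_{Nt} \end{bmatrix}
$$

When $N = 2$

$$
H_t =
\begin{bmatrix} \sigma_{1t} & 0 \\ 0 & \sigma_{2t} \end{bmatrix}
\begin{bmatrix} 1 & \rho_{12} \\ \rho_{21} & 1 \end{bmatrix}
\begin{bmatrix} \sigma_{1t} & 0 \\ 0 & \sigma_{2t} \end{bmatrix}
= \begin{bmatrix} \sigma_{1t}^2 & \rho_{12} \sigma_{1t} \sigma_{2t} \\ \rho_{12} \sigma_{1t} \sigma_{2t} & \sigma_{2t}^2 \end{bmatrix}
$$

If the conditional variances along the diagonal in the $D_t$ matrices are all positive, and the conditional correlation matrix $\Gamma$ is positive definite, the sequence of conditional covariance matrices $\{H_t\}$ is guaranteed to be positive definite a.s. for all $t$. Furthermore the inverse of $H_t$ is given by

$$
H_t^{-1} = D_t^{-1/2} \Gamma^{-1} D_t^{-1/2}.
$$

When calculating the log-likelihood function only one matrix inversion is required for each evaluation.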
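The decomposition $H_t = D_t^{1/2} \Gamma D_t^{1/2}$ and the shortcut for the inverse can be illustrated with a small numerical sketch. The function names and the numbers below are hypothetical; the point is that $\Gamma^{-1}$ is computed once and reused for every $t$.

```python
import numpy as np

def ccc_covariance(sigma_t, Gamma):
    """H_t = D_t^{1/2} Gamma D_t^{1/2}, with sigma_t the conditional standard deviations."""
    S = np.diag(sigma_t)
    return S @ Gamma @ S

def ccc_inverse(sigma_t, Gamma_inv):
    """H_t^{-1} = D_t^{-1/2} Gamma^{-1} D_t^{-1/2}, reusing a precomputed Gamma^{-1}."""
    S_inv = np.diag(1.0 / sigma_t)
    return S_inv @ Gamma_inv @ S_inv

Gamma = np.array([[1.0, 0.3], [0.3, 1.0]])   # constant correlation matrix
Gamma_inv = np.linalg.inv(Gamma)             # the only matrix inversion needed

sigma_t = np.array([0.5, 2.0])               # conditional standard deviations at time t
H_t = ccc_covariance(sigma_t, Gamma)
H_t_inv = ccc_inverse(sigma_t, Gamma_inv)
```

In a likelihood evaluation, `sigma_t` would come from $N$ univariate GARCH recursions, and only the diagonal scaling changes from one observation to the next.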

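3.5 Factor ARCH model

The Factor GARCH model, introduced by Engle et al. ([17]), can be thought of as an alternative simple parametrization of the BEKK model. Suppose that the $(N \times 1)$ vector $y_t$ has a factor structure with $K$ factors given by the $K \times 1$ vector $f_t$ and time invariant factor loadings given by the $N \times K$ matrix $B$:

$$
y_t = B f_t + \varepsilon_t \qquad (3.6)
$$

Assume that the idiosyncratic shocks $\varepsilon_t$ have conditional covariance matrix $\Psi$ which is constant in time and positive semidefinite, and that the common factors are characterized by

$$
E_{t-1}(f_t) = 0, \qquad E_{t-1}(f_t f_t') = \Lambda_t, \qquad \Lambda_t = diag(\lambda_{1t}, \dots, \lambda_{Kt})
$$

with $\Lambda_t$ positive definite. The conditioning set is $\{y_{t-1}, f_{t-1}, \dots, y_1, f_1\}$. Also suppose that $E(f_t \varepsilon_t') = 0$. The conditional covariance matrix of $y_t$ equals

$$
E_{t-1}(y_t y_t') = H_t = \Psi + B \Lambda_t B' = \Psi + \sum_{k=1}^K \beta_k \beta_k' \lambda_{kt}
$$

where $\beta_k$ denotes the $k$th column in $B$. Thus, there are $K$ statistics which determine the full covariance matrix. Forecasts of the variances and covariances, or of any portfolio of assets, will be based only on the forecasts of these $K$ statistics. There exist factor-representing portfolios with portfolio weights that are orthogonal to all but one set of factor loadings:

$$
r_{kt} = \phi_k' y_t, \qquad
\phi_k' \beta_j = \begin{cases} 1 & k = j \\ 0 & \text{otherwise} \end{cases}
$$

The vector of factor-representing portfolios is

$$
r_t = \Phi' y_t
$$

where the columns of the matrix $\Phi$ are the $\phi_k$ vectors. The conditional variance of $r_{kt}$ is given by

$$
Var_{t-1}(r_{kt}) = \phi_k' E_{t-1}(y_t y_t') \phi_k = \phi_k' H_t \phi_k = \phi_k' (\Psi + B \Lambda_t B') \phi_k = \psi_k + \lambda_{kt}
$$

where $\psi_k = \phi_k' \Psi \phi_k$. The portfolio has exactly the same time variation as the factors, which is why they are called factor-representing portfolios. In order to estimate this model, the dependence of the $\lambda_{kt}$'s upon the past information set must also be parametrized:

$$
\theta_{kt} \equiv \phi_k' H_t \phi_k = Var_{t-1}(r_{kt}) = \psi_k + \lambda_{kt}
$$

So we get that

$$
\begin{aligned}
\sum_{k=1}^K \beta_k \beta_k' \theta_{kt} &= \sum_{k=1}^K \beta_k \beta_k' \psi_k + \sum_{k=1}^K \beta_k \beta_k' \lambda_{kt} \\
\sum_{k=1}^K \beta_k \beta_k' \lambda_{kt} &= \sum_{k=1}^K \beta_k \beta_k' \theta_{kt} - \sum_{k=1}^K \beta_k \beta_k' \psi_k
\end{aligned}
$$

$$
\begin{aligned}
H_t &= \Psi + \sum_{k=1}^K \beta_k \beta_k' \lambda_{kt} = \Psi + \sum_{k=1}^K \beta_k \beta_k' \theta_{kt} - \sum_{k=1}^K \beta_k \beta_k' \psi_k \\
&= \Psi^* + \sum_{k=1}^K \beta_k \beta_k' \theta_{kt}
\end{aligned}
$$

where $\Psi^* = \left( \Psi - \sum_{k=1}^K \beta_k \beta_k' \psi_k \right)$. The simplest assumption is that there is a set of factor-representing portfolios with univariate GARCH(1,1) representations. The conditional variance $\theta_{kt}$ follows a GARCH(1,1) process

$$
\begin{aligned}
\theta_{kt} &= \omega_k + \alpha_k \left( \phi_k' \varepsilon_{t-1} \right)^2 + \gamma_k E_{t-2}\left( r_{kt-1}^2 \right) \\
\theta_{kt} &= \omega_k + \alpha_k \left( \phi_k' \varepsilon_{t-1} \varepsilon_{t-1}' \phi_k \right) + \gamma_k E_{t-2}\left[ \left( \phi_k' y_{t-1} \right) \left( \phi_k' y_{t-1} \right) \right] \\
\theta_{kt} &= \omega_k + \alpha_k \left( \phi_k' \varepsilon_{t-1} \varepsilon_{t-1}' \phi_k \right) + \gamma_k \left[ \phi_k' E_{t-2}\left( y_{t-1} y_{t-1}' \right) \phi_k \right] \\
\theta_{kt} &= \omega_k + \alpha_k \left( \phi_k' \varepsilon_{t-1} \varepsilon_{t-1}' \phi_k \right) + \gamma_k \left[ \phi_k' H_{t-1} \phi_k \right]
\end{aligned}
$$

(In the lecture it was assumed that $\omega_k = \psi_k$.) The conditional variance-covariance matrix of $y_t$ can be written as

$$
\begin{aligned}
H_t &= \Psi^* + \sum_{k=1}^K \beta_k \beta_k' \theta_{kt} \\
&= \Psi^* + \sum_{k=1}^K \beta_k \beta_k' \left\{ \omega_k + \alpha_k \left[ \phi_k' \varepsilon_{t-1} \varepsilon_{t-1}' \phi_k \right] + \gamma_k \left[ \phi_k' H_{t-1} \phi_k \right] \right\} \\
&= \left( \Psi^* + \sum_{k=1}^K \beta_k \beta_k' \omega_k \right) + \sum_{k=1}^K \beta_k \beta_k' \left\{ \alpha_k \left[ \phi_k' \varepsilon_{t-1} \varepsilon_{t-1}' \phi_k \right] + \gamma_k \left[ \phi_k' H_{t-1} \phi_k \right] \right\}
\end{aligned}
$$

Defining $\Gamma = \Psi^* + \sum_{k=1}^K \beta_k \beta_k' \omega_k$, we therefore have

$$
H_t = \Gamma + \sum_{k=1}^K \alpha_k \left[ \beta_k \phi_k' \varepsilon_{t-1} \varepsilon_{t-1}' \phi_k \beta_k' \right] + \sum_{k=1}^K \gamma_k \left[ \beta_k \phi_k' H_{t-1} \phi_k \beta_k' \right]
$$

so that the factor GARCH model is a special case of the BEKK parametrization. Estimation of the factor GARCH model is carried out by maximum likelihood estimation. It is often convenient to assume that the factor-representing portfolios are known a priori.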

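3.6 Asymmetric Multivariate GARCH-in-mean model

A general multivariate model can be written as:

$$
y_t = \mu + \Pi(L) y_{t-1} + \Phi x_{t-1} + \Lambda \, vech(H_t) + \varepsilon_t \qquad (3.7)
$$

where $y_t$ is a $(N \times 1)$ vector of weakly stationary variables (that is, asset returns), $\Pi(L) = \Pi_1 + \Pi_2 L + \dots + \Pi_k L^{k-1}$, and $x_{t-1}$ contains predetermined variables. $\varepsilon_t$ is the vector of innovations with respect to the information set formed exclusively of past realizations of $y_t$. $\Lambda$ is a $(N \times N(N+1)/2)$ matrix:

$$
H_t = E_{t-1}(\varepsilon_t \varepsilon_t')
$$

$$
H_t = C C' + \sum_{i=1}^q A_i (\varepsilon_{t-i} + \gamma)(\varepsilon_{t-i} + \gamma)' A_i' + \sum_{j=1}^p B_j H_{t-j} B_j' \qquad (3.8)
$$

We can consider a multivariate generalization of the size effect and sign effect:

$$
H_t = C C' + A_1 \varepsilon_{t-1} \varepsilon_{t-1}' A_1' + B_1 H_{t-1} B_1' + D v_{t-1} v_{t-1}' D' + G^* \varepsilon_{t-1} \varepsilon_{t-1}' G^{*\prime}
$$

where $v_t = |z_t| - E|z_t|$, with $z_{it} = \varepsilon_{it} / \sqrt{h_{ii,t}}$ and

$$
G^* = \begin{bmatrix}
I(\varepsilon_{1t-1} < 0) \, g_{11} & 0 & \cdots & 0 \\
0 & \ddots & & \vdots \\
\vdots & & \ddots & 0 \\
0 & \cdots & 0 & I(\varepsilon_{Nt-1} < 0) \, g_{NN}
\end{bmatrix}
$$

When $N = 2$

$$
\begin{aligned}
v_{t-1} v_{t-1}' &=
\begin{bmatrix}
\left| \varepsilon_{1t-1} / \sqrt{h_{11,t-1}} \right| - E\left| \varepsilon_{1t-1} / \sqrt{h_{11,t-1}} \right| \\
\left| \varepsilon_{2t-1} / \sqrt{h_{22,t-1}} \right| - E\left| \varepsilon_{2t-1} / \sqrt{h_{22,t-1}} \right|
\end{bmatrix}
\begin{bmatrix}
\left| \varepsilon_{1t-1} / \sqrt{h_{11,t-1}} \right| - E\left| \varepsilon_{1t-1} / \sqrt{h_{11,t-1}} \right| \\
\left| \varepsilon_{2t-1} / \sqrt{h_{22,t-1}} \right| - E\left| \varepsilon_{2t-1} / \sqrt{h_{22,t-1}} \right|
\end{bmatrix}' \\
&= \begin{bmatrix}
\left( |z_{1t-1}| - E|z_{1t-1}| \right)^2 & \left( |z_{1t-1}| - E|z_{1t-1}| \right) \left( |z_{2t-1}| - E|z_{2t-1}| \right) \\
\left( |z_{2t-1}| - E|z_{2t-1}| \right) \left( |z_{1t-1}| - E|z_{1t-1}| \right) & \left( |z_{2t-1}| - E|z_{2t-1}| \right)^2
\end{bmatrix}
\end{aligned}
$$

$$
\begin{aligned}
G^* \varepsilon_{t-1} \varepsilon_{t-1}' G^{*\prime} &=
\begin{bmatrix}
g_{11}^{*2} \varepsilon_{1t-1}^2 & g_{11}^* g_{22}^* \varepsilon_{1t-1} \varepsilon_{2t-1} \\
g_{11}^* g_{22}^* \varepsilon_{1t-1} \varepsilon_{2t-1} & g_{22}^{*2} \varepsilon_{2t-1}^2
\end{bmatrix} \\
&= \begin{bmatrix}
I(\varepsilon_{1t-1} < 0) \, g_{11}^2 \varepsilon_{1t-1}^2 & \iota_{12} \, g_{11} g_{22} \varepsilon_{1t-1} \varepsilon_{2t-1} \\
\iota_{12} \, g_{11} g_{22} \varepsilon_{1t-1} \varepsilon_{2t-1} & I(\varepsilon_{2t-1} < 0) \, g_{22}^2 \varepsilon_{2t-1}^2
\end{bmatrix}
\end{aligned}
$$

where $g_{ii}^* = I(\varepsilon_{it-1} < 0) \, g_{ii}$ and

$$
\iota_{12} = I(\varepsilon_{1t-1} < 0) \, I(\varepsilon_{2t-1} < 0)
$$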

3.7 Estimation procedure

Given the model (3.7)-(3.8), the log-likelihood function for $\{\varepsilon_T, \dots, \varepsilon_1\}$ obtained under the assumption of conditional multivariate normality is:

$$
L_T(\varepsilon_T, \dots, \varepsilon_1; \theta) = -\frac{1}{2} \left[ T N \ln(2\pi) + \sum_{t=1}^T \left( \ln|H_t| + \varepsilon_t' H_t^{-1} \varepsilon_t \right) \right]
$$

The function corresponds directly to the conditional likelihood function for the univariate ARCH model, used in maximum likelihood or quasi-maximum likelihood estimation. Because maximum likelihood under normality is so widely used, it is important to investigate its properties in a general setting. In general, the assumption of conditional normality can be quite restrictive. The symmetry imposed under normality is difficult to justify, and the tails of even conditional distributions often seem fatter than those of the normal distribution.

Let $\{(y_t, x_t) : t = 1, 2, \dots\}$ be a sequence of observable random vectors with $y_t$ $(N \times 1)$ and $x_t$ $(L \times 1)$. The vector $y_t$ contains the "endogenous" variables and $x_t$ contains contemporaneous "exogenous" variables. Let $w_t = (x_t, y_{t-1}, x_{t-1}, \dots, y_1, x_1)$. The conditional mean and variance functions are jointly parametrized by a finite dimensional vector $\theta$:

$$
\{\mu_t(w_t, \theta), \ \theta \in \Theta\}, \qquad \{H_t(w_t, \theta), \ \theta \in \Theta\}
$$

where $\Theta$ is a subset of $\mathbb{R}^P$ and $\mu_t$ and $H_t$ are known functions of $w_t$ and $\theta$. In the analysis, the validity of most of the inference procedures is proven under the null hypothesis that the first two conditional moments are correctly specified, for some $\theta_0 \in \Theta$,

$$
E(y_t \mid w_t) = \mu_t(w_t, \theta_0) \qquad (3.9)
$$

$$
Var(y_t \mid w_t) = H_t(w_t, \theta_0), \qquad t = 1, 2, \dots \qquad (3.10)
$$

The procedure most often used to estimate $\theta_0$ is maximization of a likelihood function that is constructed under the assumption that $y_t \mid w_t \sim N(\mu_t, H_t)$. The approach taken here is the same, but the subsequent analysis does not assume that $y_t$ has a conditional normal distribution. For observation $t$ the quasi-conditional log-likelihood is

$$
l_t(\theta; y_t, w_t) = -\frac{N}{2} \ln(2\pi) - \frac{1}{2} \ln|H_t(w_t, \theta)| - \frac{1}{2} \left( y_t - \mu_t(w_t, \theta) \right)' H_t^{-1}(w_t, \theta) \left( y_t - \mu_t(w_t, \theta) \right)
$$

Letting $\varepsilon_t(y_t, w_t, \theta) \equiv y_t - \mu_t(w_t, \theta)$ denote the $N \times 1$ residual function, in a more concise notation

$$
l_t(\theta) = -\frac{N}{2} \ln(2\pi) - \frac{1}{2} \ln|H_t(\theta)| - \frac{1}{2} \varepsilon_t'(\theta) H_t^{-1}(\theta) \varepsilon_t(\theta) \qquad (3.11)
$$

$$
L_T(\theta) = \sum_{t=1}^T l_t(\theta)
$$
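Equation (3.11) translates directly into code. The following is a minimal sketch, assuming the residuals $\varepsilon_t(\theta)$ and conditional covariance matrices $H_t(\theta)$ have already been computed for a given $\theta$ by whatever model is being estimated; the function name `quasi_loglik` is hypothetical.

```python
import numpy as np

def quasi_loglik(eps, H):
    """Per-observation Gaussian quasi log-likelihoods l_t and their sum L_T.

    eps : (T, N) array of residuals eps_t(theta)
    H   : (T, N, N) array of conditional covariance matrices H_t(theta)
    """
    T, N = eps.shape
    l = np.empty(T)
    for t in range(T):
        _, logdet = np.linalg.slogdet(H[t])             # ln|H_t|, numerically stable
        quad = eps[t] @ np.linalg.solve(H[t], eps[t])   # eps_t' H_t^{-1} eps_t
        l[t] = -0.5 * (N * np.log(2.0 * np.pi) + logdet + quad)
    return l, l.sum()

# Small check with N = 2, identity covariances and zero residuals:
eps = np.zeros((3, 2))
H = np.stack([np.eye(2)] * 3)
l_t, L_T = quasi_loglik(eps, H)
```

Using `slogdet` and `solve` avoids forming $H_t^{-1}$ explicitly; a numerical optimizer would call such a function repeatedly while updating $\theta$.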

If $\mu_t(w_t, \theta)$ and $H_t(w_t, \theta)$ are differentiable on $\Theta$ for all relevant $w_t$, and if $H_t(w_t, \theta)$ is nonsingular with probability one for all $\theta \in \Theta$, then differentiation of (3.11) yields the $(1 \times P)$ score function $s_t(\theta)$:

$$
s_t(\theta)' = \nabla_\theta l_t(\theta)' = \nabla_\theta \mu_t(\theta)' H_t^{-1}(\theta) \varepsilon_t(\theta) + \frac{1}{2} \nabla_\theta H_t(\theta)' \left[ H_t^{-1}(\theta) \otimes H_t^{-1}(\theta) \right] vec\left[ \varepsilon_t(\theta) \varepsilon_t(\theta)' - H_t(\theta) \right]
$$

where $\nabla_\theta \mu_t(\theta)$ is the $(N \times P)$ derivative of $\mu_t(w_t, \theta)$ and $\nabla_\theta H_t(\theta)$ is the $(N^2 \times P)$ derivative of $H_t(\theta)$. If the first conditional moment is correctly specified, that is if (3.9) holds, then the true error vector is defined as $\varepsilon_t^0 \equiv \varepsilon_t(\theta_0) = y_t - \mu_t(w_t, \theta_0)$ and $E(\varepsilon_t^0 \mid w_t) = 0$. If, in addition, (3.10) holds then $E(\varepsilon_t^0 \varepsilon_t^{0\prime} \mid w_t) = H_t(w_t, \theta_0)$. It follows that under correct specification of the first two conditional moments of $y_t$ given $w_t$:

$$
E[s_t(\theta_0) \mid w_t] = 0
$$

The score evaluated at the true parameter is a vector of martingale differences with respect to the $\sigma$-fields $\{\sigma(y_t, w_t) : t = 1, 2, \dots\}$. This result can be used to establish weak consistency of the quasi-maximum likelihood estimator (QMLE). For robust inference we also need an expression for the hessian $h_t(\theta)$ of $l_t(\theta)$. Define the $(P \times P)$ positive semidefinite matrix $a_t(\theta_0) = -E[\nabla_\theta s_t(\theta_0) \mid w_t] = E[-h_t(\theta_0) \mid w_t]$. A straightforward calculation shows that, under (3.9) and (3.10),

$$
a_t(\theta_0) = \nabla_\theta \mu_t(\theta_0)' H_t^{-1}(\theta_0) \nabla_\theta \mu_t(\theta_0) + \frac{1}{2} \nabla_\theta H_t(\theta_0)' \left[ H_t^{-1}(\theta_0) \otimes H_t^{-1}(\theta_0) \right] \nabla_\theta H_t(\theta_0)
$$

When the normality assumption holds the matrix $a_t(\theta_0)$ is the conditional information matrix. However, if $y_t$ does not have a conditional normal distribution then $Var[s_t(\theta_0) \mid w_t]$ is generally not equal to $a_t(\theta_0)$ and the information matrix equality is violated. The QMLE has the following properties:

$$
\left[ A_T^{0-1} B_T^0 A_T^{0-1} \right]^{-1/2} \sqrt{T} \left( \hat\theta_T - \theta_0 \right) \xrightarrow{d} N(0, I_P)
$$

where

$$
A_T^0 \equiv -\frac{1}{T} \sum_{t=1}^T E[h_t(\theta_0)] = \frac{1}{T} \sum_{t=1}^T E[a_t(\theta_0)]
$$

and

$$
B_T^0 \equiv Var\left[ T^{-1/2} S_T(\theta_0) \right] = \frac{1}{T} \sum_{t=1}^T E\left[ s_t(\theta_0)' s_t(\theta_0) \right]
$$

In addition

$$
\hat A_T - A_T^0 \xrightarrow{p} 0, \qquad \hat B_T - B_T^0 \xrightarrow{p} 0
$$

The matrix $\hat A_T^{-1} \hat B_T \hat A_T^{-1}$ is a consistent estimator of the robust asymptotic covariance matrix of $\sqrt{T} \left( \hat\theta_T - \theta_0 \right)$. In practice, one treats $\hat\theta_T$ as if it is normally distributed with mean $\theta_0$ and variance $\hat A_T^{-1} \hat B_T \hat A_T^{-1} / T$. Under normality, the variance estimator can be replaced by $\hat A_T^{-1} / T$ (Hessian form) or $\hat B_T^{-1} / T$ (outer product of the gradient form). We can derive a robust form for Wald statistics for testing hypotheses about $\theta_0$. Assume that the null hypothesis can be stated as

$$
H_0 : r(\theta_0) = 0
$$

where $r : \Theta \to \mathbb{R}^Q$ is continuously differentiable on $int(\Theta)$ and $Q < P$. Let $R(\theta) = \nabla_\theta r(\theta)$ be the $(Q \times P)$ gradient of $r$ on $int(\Theta)$. If $\theta_0 \in int(\Theta)$ and $rank(R(\theta_0)) = Q$ then the Wald statistic satisfies, under $H_0$,

$$
\xi_W = T \, r\left(\hat\theta_T\right)' \left[ R\left(\hat\theta_T\right) \hat A_T^{-1} \hat B_T \hat A_T^{-1} R\left(\hat\theta_T\right)' \right]^{-1} r\left(\hat\theta_T\right) \xrightarrow{d} \chi^2_Q.
$$
Chapter 4 REFERENCES

1. Bera A.K., M.L. Higgins (1993) "ARCH Models: Properties, Estimation and Testing", Journal of Economic Surveys, 7(4), 305-62.
2. Bollerslev T. (1986) "Generalized Autoregressive Conditional Heteroskedasticity", Journal of Econometrics, 31, 307-327.
3. Bollerslev T. (1990) "Modelling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized ARCH Approach", Review of Economics and Statistics, 72, 498-505.
4. Bollerslev T., R.F. Engle (1993) "Common Persistence in Conditional Variances", Econometrica, 61, 166-187.
5. Bollerslev T., R.F. Engle, D.B. Nelson (1994) "ARCH Models", in Handbook of Econometrics, vol. IV, edited by R.F. Engle and D.L. McFadden. Elsevier Science.
6. Bollerslev T., R.F. Engle, J. Wooldridge (1988) "A Capital Asset Pricing Model with Time Varying Covariances", Journal of Political Economy, 96, 116-131.
7. Bollerslev T., J. Wooldridge (1992) "Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time-varying Covariances", Econometric Reviews, 11(2), 143-172.
8. Ding Z., C.W.J. Granger, R. Engle (1993) "A long memory property of stock market returns and a new model", Journal of Empirical Finance, 1, 83-106.
9. Engle R.F. (1982) "Autoregressive Conditional Heteroskedasticity with estimates of the Variance of U.K. Inflation", Econometrica, 50, 987-1008.
10. Engle R.F. (1990) "Discussion: Stock Market Volatility and the Crash of 87", Review of Financial Studies, 3, 103-106.
11. Engle R.F., T. Bollerslev (1986) "Modeling the persistence of conditional variances", Econometric Reviews, 5, 1-50.
12. Engle R., G.J. Lee (1999) "A Permanent and Transitory Component Model of Stock Return Volatility", in R. Engle and H. White (eds.), Cointegration, Causality, and Forecasting: A Festschrift in Honor of Clive W.J. Granger, Oxford University Press, 475-497.
13. Engle R.F., D.M. Lilien, R.P. Robins (1987) "Estimating time varying risk premia in the term structure: the ARCH-M model", Econometrica, 55, 391-407.
14. Engle R.F., C.W.J. Granger, D.F. Kraft (1984) "Combining competing forecasts of inflation using a bivariate ARCH model", Journal of Economic Dynamics and Control, 8, 151-165.
15. Engle R.F., K.F. Kroner (1995) "Multivariate Simultaneous Generalized ARCH", Econometric Theory, 11, 122-150.
16. Engle R.F., V.K. Ng (1993) "Measuring and Testing the Impact of News on Volatility", Journal of Finance, 48, 1749-1778.
17. Engle R.F., V.K. Ng, M. Rothschild (1990) "Asset pricing with a factor-ARCH covariance structure", Journal of Econometrics, 213-237.
18. Engle R.F., G. Gonzalez-Rivera (1991) "Semiparametric ARCH Models", Journal of Business and Economic Statistics, 19, 3-29.
19. Glosten L., R. Jagannathan, D. Runkle (1993) "On the Relationship between the Expected Value and the Volatility of the Nominal Excess Return on Stocks", Journal of Finance, 48, 1779-1801.
20. Gourieroux C. (1992) Modèles ARCH et applications financières, Economica: Paris.
21. Hamilton J. (1994) Time Series Analysis, Princeton University Press.
22. Magnus J.R. (1988) Linear Structures, Charles Griffin & Company Ltd., London.
23. Nelson D.B. (1990) "Stationarity and Persistence in the GARCH(1,1) Model", Econometric Theory, 6, 318-334.
24. Nelson D. (1991) "Conditional Heteroskedasticity in asset returns: A new approach", Econometrica, 59, 347-370.
25. Nelson D., C.Q. Cao (1992) "Inequality Constraints in the Univariate GARCH Model", Journal of Business and Economic Statistics, 10, 229-235.
26. Pagan A. (1996) "The Econometrics of financial markets", Journal of Empirical Finance, 3, 15-102.
27. Robinson P. (1991)
28. Sentana E. (1995) "Quadratic ARCH Models", Review of Economic Studies, 62(4), 639-61.
29. Taylor S. (1986) Modeling Financial Time Series, Wiley and Sons: New York, NY.
