Estimation of Extreme Risk Regions Under Multivariate Regular Variation

Author: Emil Cunningham
Juan-Juan Cai (1), John H. J. Einmahl (2), Laurens de Haan (3)

(1) Delft University of Technology
(2) Tilburg University
(3) University of Lisbon and Erasmus University Rotterdam

University of Strasbourg, December 2014

Cai, Einmahl, de Haan

Multivariate Extreme Risk Regions

1 / 25

Introduction


Let $Z$ be a random vector on $\mathbb{R}^d$ ($d \ge 2$). A risk region is a set $Q$ such that $P(Z \in Q) = p$, where $p$ is extremely small. Events in $Q$ hardly ever happen; the interest in them stems from their potentially large consequences.


Suppose $Z$ has probability density $f$, and denote the corresponding probability measure by $P$. The risk regions of interest have the form $Q = \{z \in \mathbb{R}^d : f(z) \le \beta\}$, where $\beta$ is an unknown number such that $P(Q) = p$. The complement is $Q^c = \{z \in \mathbb{R}^d : f(z) > \beta\}$, so $Q$ is the set of less likely points.
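As a concrete illustration that is not on the slides, take the standard bivariate Cauchy density $f(z) = (2\pi)^{-1}(1 + \|z\|^2)^{-3/2}$, for which $P(\|Z\| > r) = (1 + r^2)^{-1/2}$. Because $f$ decreases in $\|z\|$, the level set $Q$ is the complement of a disc, and both its radius and $\beta$ have closed forms. The function names below are hypothetical.

```python
import math

def cauchy2_density(r: float) -> float:
    """Standard bivariate Cauchy density at a point with Euclidean norm r."""
    return (1.0 + r * r) ** -1.5 / (2.0 * math.pi)

def cauchy2_risk_region(p: float):
    """Return (radius, beta) such that
    Q = {z : ||z|| >= radius} = {z : f(z) <= beta} and P(Q) = p."""
    # Solve P(||Z|| > r) = (1 + r^2)^(-1/2) = p for r.
    radius = math.sqrt(1.0 / (p * p) - 1.0)
    beta = cauchy2_density(radius)
    return radius, beta

radius, beta = cauchy2_risk_region(1e-4)
print(f"radius = {radius:.2f}, beta = {beta:.3e}")
```

For $p = 10^{-4}$ the radius is just under $10^4$ and $\beta = p^3 / (2\pi)$.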


The goal is to estimate $Q$ based on a random sample of size $n$ from $Z$. For asymptotics, we consider $p = p(n) \to 0$ as $n \to \infty$, and write $Q_n = \{z \in \mathbb{R}^d : f(z) \le \beta_n\}$.


Main Result

Assumption

Main Assumption (Multivariate Regular Variation): There exist a positive number $\alpha$ and a positive function $q$ such that

$$\lim_{t\to\infty} \frac{P(\|Z\| > tx)}{P(\|Z\| > t)} = x^{-\alpha} \quad \text{for all } x > 0,$$

and

$$\lim_{t\to\infty} \frac{f(tz)}{t^{-d}\, P(\|Z\| > t)} = q(z) \quad \text{for all } z \neq 0,$$

where $\|\cdot\|$ denotes the $L_2$ norm.


Examples: Cauchy distributions and all elliptical distributions with a heavy-tailed radius.
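The two limits in the assumption can be checked numerically for the standard bivariate Cauchy distribution. The sketch below assumes $d = 2$, $\alpha = 1$, $P(\|Z\| > t) = (1 + t^2)^{-1/2}$ and the candidate limit $q(z) = \|z\|^{-3}/(2\pi)$; these closed forms are worked out for this example and are not stated on the slides.

```python
import math

ALPHA, D = 1.0, 2  # tail index and dimension for the bivariate Cauchy example

def tail(t):
    """P(||Z|| > t) for the standard bivariate Cauchy distribution."""
    return (1.0 + t * t) ** -0.5

def density(z):
    """f(z) for the standard bivariate Cauchy distribution."""
    r2 = z[0] ** 2 + z[1] ** 2
    return (1.0 + r2) ** -1.5 / (2.0 * math.pi)

def q_limit(z):
    """Candidate limit function q(z) = ||z||^(-d-alpha) / (2 pi)."""
    return math.hypot(z[0], z[1]) ** (-D - ALPHA) / (2.0 * math.pi)

x, z = 3.0, (0.6, -1.7)
for t in (1e1, 1e3, 1e5):
    radial = tail(t * x) / tail(t)                              # first limit
    dens = density((t * z[0], t * z[1])) / (t ** -D * tail(t))  # second limit
    print(f"t={t:.0e}  radial ratio={radial:.8f}  density ratio={dens:.8f}")
print(f"limits:  x^-alpha={x ** -ALPHA:.8f}  q(z)={q_limit(z):.8f}")
```

Both ratios settle on their limits as $t$ grows, which is the behaviour the assumption asks for.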


Some results from the assumption

The distribution of the radius has a heavy right tail; $\alpha$ is the tail index.

$q$ is homogeneous: $q(az) = a^{-d-\alpha} q(z)$ for all $a > 0$.

Define $\nu(B) = \int_B q(z)\,dz$. Then, for a Borel set $B$ with positive distance from the origin,

$$\lim_{t\to\infty} \frac{P(Z \in tB)}{P(\|Z\| \ge t)} = \nu(B).$$


Estimation


Recall that we want to estimate $Q_n = \{z \in \mathbb{R}^d : f(z) \le \beta_n\}$, such that $P(Z \in Q_n) = p$.

Link $Q_n$ to $S = \{z \in \mathbb{R}^d : q(z) \le 1\}$.

Inflate $S$ by the factor $u_n$: $\tilde Q_n := u_n S$, where $u_n$ is such that $P(\|Z\| > u_n) = p / \nu(S)$.


$\tilde Q_n$ is a good approximation of $Q_n$. We show that, as $n \to \infty$,

$$\frac{P(Q_n \triangle \tilde Q_n)}{p} \to 0,$$

where $\triangle$ denotes the symmetric difference: $A \triangle B = (A \setminus B) \cup (B \setminus A)$.

To estimate $Q_n$ is now to estimate $\tilde Q_n$.
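For the standard bivariate Cauchy distribution the quality of this approximation can be computed in closed form. The sketch below is an illustration only; it assumes $q(z) = \|z\|^{-3}/(2\pi)$, so that $S = \{\|z\| \ge r_0\}$ with $r_0 = (2\pi)^{-1/3}$ and $\nu(S) = 1/r_0$, values derived for this example rather than taken from the slides.

```python
import math

def relative_error(p: float) -> float:
    """P(Q_n symmetric-difference Q~_n) / p for the standard bivariate Cauchy."""
    tail = lambda r: (1.0 + r * r) ** -0.5       # P(||Z|| > r)
    r0 = (2.0 * math.pi) ** (-1.0 / 3.0)         # S = {z : ||z|| >= r0}
    nu_S = 1.0 / r0                              # nu(S), the integral of q over S
    u_n = math.sqrt((nu_S / p) ** 2 - 1.0)       # solves tail(u_n) = p / nu(S)
    # Q_n and Q~_n are both complements of discs, hence nested, so the
    # symmetric difference is an annulus with probability |P(Q~_n) - p|.
    return abs(tail(u_n * r0) - p) / p

for p in (1e-1, 1e-2, 1e-3):
    print(f"p = {p:.0e}   relative error = {relative_error(p):.3e}")
```

The relative error shrinks as $p$ decreases, in line with the limit statement above.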


Estimation of $\tilde Q_n = u_n S$

Suppose we have $Z_1, \dots, Z_n$, i.i.d. copies of $Z$. Write $R_i = \|Z_i\|$ and $W_i = Z_i / R_i$, $i = 1, 2, \dots, n$.

Put $\Theta := \{z : \|z\| = 1\}$. Then $W_i \in \Theta$, $i = 1, 2, \dots, n$.
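The decomposition into radii and directions is a one-liner per observation. A minimal sketch using only the standard library; the function name `polar_decompose` and the toy points are made up for illustration.

```python
import math

def polar_decompose(sample):
    """Split a list of d-dimensional points into radii R_i and unit directions W_i."""
    radii, directions = [], []
    for z in sample:
        r = math.sqrt(sum(c * c for c in z))     # Euclidean (L2) norm
        if r == 0.0:
            raise ValueError("W_i is undefined for an observation at the origin")
        radii.append(r)
        directions.append(tuple(c / r for c in z))
    return radii, directions

R, W = polar_decompose([(3.0, 4.0), (-1.0, 0.0), (0.5, -0.5)])
print(R)
print(W)
```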


Estimation of $u_n$

Note that $u_n$ is a tail quantile of $R_1$: $P(R_1 > u_n) = p / \nu(S)$.

Suppose that we know $\nu(S)$. Applying the univariate extreme value technique, we define the estimator

$$\hat u_n = R_{n-k,n} \left(\frac{k\,\nu(S)}{np}\right)^{1/\hat\alpha},$$

where $k = k(n)$ is such that $k \to \infty$ and $k/n \to 0$ as $n \to \infty$, and $R_{n-k,n}$ is the $(n-k)$-th order statistic of $\{R_i, i = 1, \dots, n\}$.

We need to estimate $\nu(S)$. It is sufficient to estimate $q$.
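The quantile formula can be sketched in a few lines. The slides do not say how $\hat\alpha$ is obtained; the Hill estimator used below is an assumption, as are the function names, the toy radii and the placeholder value of $\nu(S)$.

```python
import math

def hill_alpha(radii, k):
    """Hill estimate of the tail index from the k largest radii
    (an assumed choice of alpha-hat; requires 1 <= k < len(radii))."""
    r = sorted(radii)
    n = len(r)
    threshold = r[n - k - 1]                     # R_{n-k,n}, 0-based index n-k-1
    gamma = sum(math.log(x / threshold) for x in r[n - k:]) / k
    return 1.0 / gamma

def u_hat(radii, k, p, nu_S, alpha_hat):
    """u-hat_n = R_{n-k,n} * (k * nu(S) / (n * p)) ** (1 / alpha-hat)."""
    r = sorted(radii)
    n = len(r)
    return r[n - k - 1] * (k * nu_S / (n * p)) ** (1.0 / alpha_hat)

# Toy data: exact quantiles of a Pareto radius with tail index 1.
n, k, p = 1000, 100, 1e-4
radii = [n / (n - i) for i in range(n)]
alpha_hat = hill_alpha(radii, k)
# nu_S = 1.0 is a placeholder; in the procedure it is estimated from q-hat.
print(alpha_hat, u_hat(radii, k, p, nu_S=1.0, alpha_hat=alpha_hat))
```

With $\nu(S)$ set to 1, $\hat u_n$ is simply an extrapolated estimate of the radius quantile with exceedance probability $p$.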


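Estimation of $q$

For a Borel set $A \subset \Theta$, $\lim_{t\to\infty} P(W_1 \in A \mid R_1 > t) =: \Psi(A)$ exists. The density of $\Psi$ exists: $\psi(w) = \frac{1}{\alpha} q(w)$, $w \in \Theta$. We propose a kernel density estimator $\hat\psi$ of $\psi$ that uses the observations with a large radius. As $n \to \infty$,

$$\sup_{w \in \Theta} |\hat\psi(w) - \psi(w)| \xrightarrow{P} 0.$$

Then $\hat q = \hat\alpha \hat\psi$. The estimators of $S$ and $\nu(S)$ follow directly.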
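A minimal sketch of such an estimator for $d = 2$, where a direction on the unit circle is identified with its angle. The kernel and bandwidth are not specified on the slides; the Epanechnikov kernel on the wrapped angular difference and $h = 0.4$ are illustrative assumptions, and the extension of $\hat q$ from $\Theta$ to the plane uses the homogeneity $q(az) = a^{-d-\alpha} q(z)$.

```python
import math

def psi_hat(theta, angles, h=0.4):
    """Kernel estimate of the angular density psi at angle theta (d = 2).

    `angles` holds the angles of the directions W_i whose radius exceeds a
    high threshold; kernel and bandwidth h (< pi) are illustrative choices.
    """
    total = 0.0
    for a in angles:
        # Wrapped angular difference in [-pi, pi).
        delta = (theta - a + math.pi) % (2.0 * math.pi) - math.pi
        if abs(delta) < h:
            total += 0.75 / h * (1.0 - (delta / h) ** 2)
    return total / len(angles)

def q_hat(z, angles, alpha_hat, h=0.4):
    """q-hat(z) = alpha-hat * ||z||^(-2 - alpha-hat) * psi-hat(angle of z), z != 0."""
    r = math.hypot(z[0], z[1])
    return alpha_hat * r ** (-2.0 - alpha_hat) * psi_hat(math.atan2(z[1], z[0]), angles, h)

# Toy input: evenly spread directions, for which psi is close to 1 / (2 pi).
angles = [2.0 * math.pi * j / 400 for j in range(400)]
print(psi_hat(1.0, angles), q_hat((2.0, 0.0), angles, alpha_hat=1.0))
```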


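We obtain our estimator:

$$\hat Q_n = \hat u_n \hat S = R_{n-k,n} \left(\frac{k\,\widehat{\nu(S)}}{np}\right)^{1/\hat\alpha} \{z : \hat q(z) < 1\}.$$

Theorem. Under some regularity conditions, we have, as $n \to \infty$,

$$\frac{P(\hat Q_n \triangle Q_n)}{p} \xrightarrow{P} 0.$$

Here $\triangle$ denotes the symmetric difference.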
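Once $\hat u_n$ and $\hat q$ are available, deciding whether a point lies in $\hat Q_n = \hat u_n \{z : \hat q(z) < 1\}$ only requires rescaling the point. The sketch below uses a hypothetical helper `in_estimated_region`; for the demonstration the bivariate Cauchy limit $q(z) = \|z\|^{-3}/(2\pi)$ and $u = \nu(S)/p$ stand in for the estimates, which is an assumption made for illustration.

```python
import math

def in_estimated_region(z, u_hat, q_hat):
    """True if z lies in u_hat * {w : q_hat(w) < 1}; z must not be the origin."""
    scaled = tuple(c / u_hat for c in z)
    return q_hat(scaled) < 1.0

# Stand-ins for the estimates (bivariate Cauchy, p = 1e-4).
q_cauchy = lambda z: math.hypot(z[0], z[1]) ** -3 / (2.0 * math.pi)
nu_S = (2.0 * math.pi) ** (1.0 / 3.0)
u = nu_S / 1e-4              # approximately solves P(||Z|| > u) = p / nu(S)

print(in_estimated_region((9000.0, 0.0), u, q_cauchy))    # inside the boundary circle
print(in_estimated_region((11000.0, 0.0), u, q_cauchy))   # outside the boundary circle
```

With these stand-ins the boundary of the region is the circle of radius $10^4$, so the first point is not in the region and the second one is.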


Simulation

One Sample Plot

Bivariate Cauchy Distribution

[Figure: scatter plot of the simulated sample, titled "Cauchy Density, n=5000, p=1/2000, 1/5000, 1/10000", showing the boundaries of the true region $Q$ and of the estimate $\hat Q_n$.]

Data are simulated from the bivariate Cauchy distribution; $n = 5000$. The area outside the solid line is the true risk region $Q$, with $P(Q) = 10^{-4}$. The area outside the dotted curve corresponds to the estimated risk region.

Clover Density

[Figure: scatter plot of the simulated sample, titled "Clover Density, n=5000", showing the boundaries of the true region $Q$ and of the estimate $\hat Q_n$.]

$n = 5000$. The area outside the solid line is the true risk region $Q$, with $P(Q) = 10^{-4}$. The area outside the dotted curve corresponds to the estimated risk region.

[Figure: scatter plot titled "Elliptical Density, n=5000, p=1/2000, 1/10000", showing $Q$ and $\hat Q_n$.]

[Figure: scatter plot titled "Asymmetric Shifted Density, n=5000, p=1/2000, 1/10000", showing $Q$ and $\hat Q_n$.]

Comparison: Boxplots

Two competitors

A "parametric" estimator: estimate $\nu(S)$ and $S$ by assuming
$$\psi(w_1, w_2) = \psi(\cos\theta, \sin\theta) = (4\pi)^{-1}\bigl(2 + \sin(2(\theta - \rho))\bigr), \quad \theta \in [0, 2\pi].$$
The method works for bivariate distributions only.

A non-parametric estimator: compute the smallest ellipsoid containing half of the data, the so-called MVE (minimum volume ellipsoid). Inflate this ellipsoid so that the largest observation lies on its boundary. It works for $p = 1/n$ only.
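A quick check, not on the slides, that the assumed parametric form is a proper density on the unit circle for every value of $\rho$: it integrates to one and is bounded away from zero.

```latex
\int_0^{2\pi} \frac{2 + \sin\bigl(2(\theta - \rho)\bigr)}{4\pi}\, d\theta
  = \frac{1}{4\pi}\Bigl(4\pi + \int_0^{2\pi} \sin\bigl(2(\theta - \rho)\bigr)\, d\theta\Bigr)
  = \frac{4\pi + 0}{4\pi} = 1,
\qquad
\psi(\cos\theta, \sin\theta) \ge \frac{2 - 1}{4\pi} = \frac{1}{4\pi} > 0 .
```

The sine term integrates to zero because the interval $[0, 2\pi]$ covers two full periods of $\sin(2(\theta - \rho))$.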


We simulate 100 data sets from four bivariate distributions and the trivariate Cauchy distribution. Each data set is of size 5000. The main theorem states

$$\frac{P(\hat Q_n \triangle Q_n)}{p} \xrightarrow{P} 0.$$

The boxplots compare the following relative errors:

$$e_{\mathrm{evt}} = \frac{P(\hat Q_n \triangle Q_n)}{p}, \quad p_1 = 1/5000 \text{ and } p_2 = 1/10000,$$

$$e_{\mathrm{np}} = \frac{P(\hat Q_{\mathrm{np}} \triangle Q_n)}{p}, \quad p_1 = 1/5000,$$

$$e_{\mathrm{par}} = \frac{P(\hat Q_{\mathrm{par}} \triangle Q_n)}{p}, \quad p_1 = 1/5000 \text{ and } p_2 = 1/10000.$$

[Figure: boxplots for the Bivariate Cauchy Density (EVT p1, Par p1, NP p1, EVT p2, Par p2) and the Trivariate Cauchy Density (EVT p1, NP p1, EVT p2).]

[Figure: boxplots for the Elliptical Density and the Clover Density (EVT p1, Par p1, NP p1, EVT p2, Par p2).]

[Figure: boxplots for the Asymmetric Shifted Density (EVT p1, Par p1, NP p1, EVT p2, Par p2).]

Real Data

We apply our method to foreign exchange rate data.

Data: daily exchange rates of yen-dollar and pound-dollar, from 4 January 1999 to 31 July 2009; $n = 2665$.

We consider the log-returns

$$X_{t,i} = \log\frac{Y_{t,i}}{Y_{t-1,i}}, \quad t = 1, \dots, 2664, \; i = 1, 2,$$

where $Y_{t,1}$ is the daily yen-dollar exchange rate and $Y_{t,2}$ the daily pound-dollar exchange rate.
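The log-return transformation can be sketched as follows. The rates in the example are made up for illustration and are not the data used in the talk.

```python
import math

def log_returns(rates):
    """X_t = log(Y_t / Y_{t-1}) for t = 1, ..., len(rates) - 1."""
    return [math.log(curr / prev) for prev, curr in zip(rates, rates[1:])]

yen_dollar = [113.0, 112.4, 111.9, 113.1]    # hypothetical daily rates
x = log_returns(yen_dollar)
print(len(x), x)
```

A series of $n$ rates yields $n - 1$ log-returns, which is why 2665 daily rates give 2664 observations per currency pair.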

$\hat\alpha = 3.9$

[Figure: scatter plot of the log-returns (axes: Yen-Dollar and Pound-Dollar) with the estimated risk regions $\hat Q_p$ for $p = 1/2000, 1/5000, 1/10000$; four observations are labelled 1 to 4.]

Future Research

Ordering multivariate extreme observations

- For an extreme observation $Z_i$, we associate a p-value given by $H(Z_i) = P(\{z : f(z) \le f(Z_i)\})$.
- Estimation of $H(Z_i)$ follows easily from the current procedure.
- Order multivariate extremes accordingly.
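To make the ordering idea concrete, the sketch below ranks a few points by their p-value under the standard bivariate Cauchy distribution, where $f$ decreases in $\|z\|$ and hence $H(z) = P(\|Z\| \ge \|z\|) = (1 + \|z\|^2)^{-1/2}$. It uses the true $H$ of this example rather than an estimate, and the points and function name are made up.

```python
def p_value_cauchy2(z):
    """H(z) = P(f(Z) <= f(z)) for the standard bivariate Cauchy distribution."""
    return (1.0 + z[0] ** 2 + z[1] ** 2) ** -0.5

points = [(120.0, -35.0), (4.0, 3.0), (-900.0, 10.0)]
ranked = sorted(points, key=p_value_cauchy2)   # most extreme (smallest p-value) first
for z in ranked:
    print(z, p_value_cauchy2(z))
```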


Outlier detection

[Figure: scatter plot titled "Cauchy Density, n=5000".]

An issue: It is not clear how to formulate an alternative hypothesis.
