Estimation of Extreme Risk Regions Under Multivariate Regular Variation
Juan-Juan Cai (1), John H. J. Einmahl (2), Laurens de Haan (3)
(1) Delft University of Technology
(2) Tilburg University
(3) University of Lisbon and Erasmus University Rotterdam
University of Strasbourg, December 2014
Cai, Einmahl, de Haan
Multivariate Extreme Risk Regions
1 / 25
Introduction
Let $Z$ be a random vector on $\mathbb{R}^d$ ($d \ge 2$). A risk region is a set $Q$ such that $P(Z \in Q) = p$, where $p$ is extremely small. Events in $Q$ rarely occur. They are of interest because of their potentially large consequences.
Suppose $Z$ has probability density $f$. Denote the corresponding probability measure by $P$. The risk regions of interest are of the form
$$Q = \{z \in \mathbb{R}^d : f(z) \le \beta\},$$
where $\beta$ is an unknown number such that $P(Q) = p$. The complement is $Q^c = \{z \in \mathbb{R}^d : f(z) > \beta\}$, so $Q$ is the set of less likely points.
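For intuition, the region can be written down explicitly in one special case. The sketch below assumes the standard bivariate Cauchy density $f(z) = (2\pi)^{-1}(1+\|z\|^2)^{-3/2}$ (the standardisation is an assumption; the talk only names the family), for which $P(\|Z\| > r) = (1+r^2)^{-1/2}$, so $Q$ is the exterior of a circle. The function name `cauchy_risk_region` is hypothetical.

```python
import math

def cauchy_risk_region(p):
    """Closed-form Q = {z : f(z) <= beta} for the standard bivariate Cauchy.

    Assumes f(z) = (2*pi)^-1 * (1 + ||z||^2)^(-3/2), for which
    P(||Z|| > r) = (1 + r^2)^(-1/2).
    Returns (radius, beta): Q is the exterior of the circle of that radius.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    radius = math.sqrt(p ** -2 - 1.0)  # solves (1 + r^2)^(-1/2) = p
    beta = p ** 3 / (2.0 * math.pi)    # density value on the boundary circle
    return radius, beta

# For p = 10**-4 the boundary radius is just under 10**4.
radius, beta = cauchy_risk_region(1e-4)
print(radius, beta)
```

Under this assumption the threshold $\beta = p^3/(2\pi)$ shrinks much faster than $p$, which is one reason a direct density estimate at that level is not practical.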
The goal is to estimate $Q$ based on a random sample of size $n$ from the distribution of $Z$. For the asymptotics, we consider $p = p(n) \to 0$ as $n \to \infty$. We write $Q_n = \{z \in \mathbb{R}^d : f(z) \le \beta_n\}$.
Main Result
Assumption
Main Assumption (Multivariate Regular Variation). There exist a positive number $\alpha$ and a positive function $q$ such that
$$\lim_{t\to\infty} \frac{P(\|Z\| > tx)}{P(\|Z\| > t)} = x^{-\alpha} \quad \text{for all } x > 0,$$
and
$$\lim_{t\to\infty} \frac{f(tz)}{t^{-d}\,P(\|Z\| > t)} = q(z) \quad \text{for all } z \ne 0,$$
where $\|\cdot\|$ denotes the $L_2$ norm.
Examples: Cauchy distributions and all elliptical distributions with a heavy-tailed radius.
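As a worked check of the first example, take the standard bivariate Cauchy density (the standardisation is an assumption made here for concreteness). Both limits can be computed directly, giving $\alpha = 1$ and $q(z) = \|z\|^{-3}/(2\pi)$ with $d = 2$:

```latex
\begin{align*}
f(z) &= \frac{1}{2\pi}\,(1+\|z\|^2)^{-3/2}, \qquad
P(\|Z\| > t) = (1+t^2)^{-1/2}, \\
\lim_{t\to\infty} \frac{P(\|Z\| > tx)}{P(\|Z\| > t)}
  &= \lim_{t\to\infty} \Bigl(\frac{1+t^2}{1+t^2x^2}\Bigr)^{1/2} = x^{-1}
  \quad\Rightarrow\quad \alpha = 1, \\
\lim_{t\to\infty} \frac{f(tz)}{t^{-2}\,P(\|Z\| > t)}
  &= \lim_{t\to\infty} \frac{(2\pi)^{-1}(1+t^2\|z\|^2)^{-3/2}}{t^{-2}(1+t^2)^{-1/2}}
   = \frac{\|z\|^{-3}}{2\pi} =: q(z).
\end{align*}
```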
Some results from the assumption
The distribution of the radius $\|Z\|$ has a heavy right tail; $\alpha$ is the tail index.
$q$ is homogeneous: $q(az) = a^{-d-\alpha} q(z)$ for all $a > 0$.
Define $\nu(B) = \int_B q(z)\,dz$. Then, for a Borel set $B$ with positive distance from the origin,
$$\lim_{t\to\infty} \frac{P(Z \in tB)}{P(\|Z\| \ge t)} = \nu(B).$$
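A short consistency check ties these results together. Taking $B = \{z : \|z\| > x\}$ and integrating in polar coordinates (with $\lambda$ denoting surface measure on the unit sphere, a notation introduced here) uses only the homogeneity of $q$:

```latex
\begin{align*}
\nu(\{z : \|z\| > x\})
  &= \int_{\{w : \|w\| = 1\}} \int_x^\infty q(rw)\, r^{d-1}\, dr\, d\lambda(w) \\
  &= \int_{\{w : \|w\| = 1\}} q(w)\, d\lambda(w) \int_x^\infty r^{-\alpha-1}\, dr
   = \frac{x^{-\alpha}}{\alpha} \int_{\{w : \|w\| = 1\}} q(w)\, d\lambda(w).
\end{align*}
```

Since the radial limit in the main assumption equals $x^{-\alpha}$, the two agree exactly when $q$ integrates to $\alpha$ over the unit sphere.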
Estimation
Recall that we try to estimate $Q_n = \{z \in \mathbb{R}^d : f(z) \le \beta_n\}$, where $\beta_n$ is such that $P(Z \in Q_n) = p$.
Link $Q_n$ to $S = \{z \in \mathbb{R}^d : q(z) \le 1\}$.
Inflate $S$ with the factor $u_n$: $\tilde{Q}_n := u_n S$, where $u_n$ is such that $P(\|Z\| > u_n) = p/\nu(S)$.
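The choice of $u_n$ can be motivated with a short computation. The sketch below assumes $q$ is bounded on the unit sphere $\Theta = \{w : \|w\| = 1\}$, so that $S$ stays away from the origin, and writes $\lambda$ for surface measure on $\Theta$ (notation introduced here). By homogeneity, $q(rw) \le 1$ exactly when $r \ge q(w)^{1/(d+\alpha)}$:

```latex
\begin{align*}
S &= \{ r w : w \in \Theta,\; r \ge q(w)^{1/(d+\alpha)} \}, \\
\nu(S) &= \int_\Theta \int_{q(w)^{1/(d+\alpha)}}^{\infty} q(w)\, r^{-d-\alpha}\, r^{d-1}\, dr\, d\lambda(w)
        = \frac{1}{\alpha} \int_\Theta q(w)^{d/(d+\alpha)}\, d\lambda(w), \\
P(Z \in u_n S) &\approx \nu(S)\, P(\|Z\| > u_n) = p .
\end{align*}
```

The last line applies the limit relation for $\nu$ with $t = u_n$ and $B = S$, so the inflated set carries probability approximately $p$.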
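$\tilde{Q}_n$ is a good approximation of $Q_n$. We show that, as $n \to \infty$,
$$\frac{P(Q_n \,\triangle\, \tilde{Q}_n)}{p} \to 0,$$
where $\triangle$ denotes the symmetric difference, $A \triangle B = (A \setminus B) \cup (B \setminus A)$.
Estimating $Q_n$ thus reduces to estimating $\tilde{Q}_n$.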
Estimation of $\tilde{Q}_n = u_n S$
Suppose we have $Z_1, \ldots, Z_n$, i.i.d. copies of $Z$. Write $R_i = \|Z_i\|$ and $W_i = Z_i / R_i$, $i = 1, 2, \ldots, n$.
Put $\Theta := \{z : \|z\| = 1\}$. Then $W_i \in \Theta$, $i = 1, 2, \ldots, n$.
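The polar decomposition is a one-liner per observation. A minimal sketch follows; the function name `polar_decompose` and the sample points are illustrative only.

```python
import math

def polar_decompose(points):
    """Return radii R_i = ||Z_i|| and directions W_i = Z_i / R_i for each point."""
    radii, directions = [], []
    for z in points:
        r = math.sqrt(sum(c * c for c in z))
        if r == 0.0:
            raise ValueError("direction is undefined at the origin")
        radii.append(r)
        directions.append(tuple(c / r for c in z))
    return radii, directions

radii, directions = polar_decompose([(3.0, 4.0), (0.0, -2.0)])
print(radii)       # [5.0, 2.0]
print(directions)  # [(0.6, 0.8), (0.0, -1.0)]
```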
Estimation of un
Note that $u_n$ is a tail quantile of $R_1$: $P(R_1 > u_n) = p/\nu(S)$.
Suppose that we know $\nu(S)$. Applying a univariate extreme value technique, we define the estimator
$$\hat{u}_n = R_{n-k,n} \left( \frac{k\,\nu(S)}{np} \right)^{1/\hat{\alpha}},$$
where $k = k(n)$ is such that $k \to \infty$ and $k/n \to 0$ as $n \to \infty$, $R_{n-k,n}$ is the $(n-k)$-th order statistic of $\{R_i, i = 1, \ldots, n\}$, and $\hat{\alpha}$ is an estimator of the tail index $\alpha$.
We need to estimate $\nu(S)$. It is sufficient to estimate $q$.
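The quantile formula can be sketched in a few lines. The talk does not say which estimator $\hat{\alpha}$ is used; the Hill estimator below is an assumption of this sketch, as are the function names and the synthetic radii (exact Pareto quantiles with tail index 2).

```python
import math

def hill_estimator(radii, k):
    """Hill estimate of the tail index from the k largest radii (an assumed choice)."""
    order = sorted(radii)
    n = len(order)
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    threshold = order[n - k - 1]  # R_{n-k,n}, the (n-k)-th order statistic
    mean_log_excess = sum(math.log(x / threshold) for x in order[n - k:]) / k
    return 1.0 / mean_log_excess

def estimate_un(radii, k, p, nu_s, alpha_hat):
    """u_hat_n = R_{n-k,n} * (k * nu(S) / (n * p)) ** (1 / alpha_hat)."""
    order = sorted(radii)
    n = len(order)
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    return order[n - k - 1] * (k * nu_s / (n * p)) ** (1.0 / alpha_hat)

# Synthetic radii: exact quantiles of a Pareto tail with index 2 (illustration only).
n, k = 1000, 100
radii = [(1.0 - i / (n + 1)) ** -0.5 for i in range(1, n + 1)]
alpha_hat = hill_estimator(radii, k)
u_hat = estimate_un(radii, k, p=1e-4, nu_s=1.0, alpha_hat=alpha_hat)
print(alpha_hat, u_hat)
```

The extrapolation factor $(k\nu(S)/(np))^{1/\hat{\alpha}}$ is what lets the estimate reach beyond the largest observation when $p < 1/n$.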
Estimation of q
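For a Borel set $A \subset \Theta$, $\lim_{t\to\infty} P(W_1 \in A \mid R_1 > t) =: \Psi(A)$ exists.
The density of $\Psi$ exists: $\psi(w) = \frac{1}{\alpha}\, q(w)$, $w \in \Theta$.
We propose a kernel density estimator $\hat{\psi}$ of $\psi$ that uses the observations with a large radius. As $n \to \infty$,
$$\sup_{w \in \Theta} \left| \hat{\psi}(w) - \psi(w) \right| \xrightarrow{P} 0.$$
Then $\hat{q} = \hat{\alpha}\, \hat{\psi}$. The estimators of $S$ and $\nu(S)$ follow directly.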
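A minimal sketch of such an estimator for $d = 2$, where $\Theta$ is the unit circle and directions are represented by angles. The kernel (Epanechnikov in angular distance), the bandwidth, and all function names are assumptions of this sketch; the talk does not specify them.

```python
import math

def angular_distance(a, b):
    """Shortest distance between two angles on the circle, in [0, pi]."""
    diff = abs(a - b) % (2.0 * math.pi)
    return min(diff, 2.0 * math.pi - diff)

def psi_hat(theta, points, k, bandwidth):
    """Kernel estimate of the angular density psi at angle theta (d = 2).

    Uses the k observations with the largest radius and an Epanechnikov
    kernel in angular distance; both choices are assumptions of this sketch.
    """
    if not 0 < k <= len(points):
        raise ValueError("need 0 < k <= number of points")
    if not 0.0 < bandwidth <= math.pi:
        raise ValueError("bandwidth must lie in (0, pi]")
    largest = sorted(points, key=lambda z: math.hypot(z[0], z[1]))[-k:]
    total = 0.0
    for x, y in largest:
        u = angular_distance(theta, math.atan2(y, x)) / bandwidth
        if u < 1.0:
            total += 0.75 * (1.0 - u * u) / bandwidth
    return total / k

def q_hat_on_circle(theta, points, k, bandwidth, alpha_hat):
    """q_hat(w) = alpha_hat * psi_hat(w) for w = (cos theta, sin theta)."""
    return alpha_hat * psi_hat(theta, points, k, bandwidth)

# Made-up sample: 20 points on a spiral, purely for illustration.
sample_points = [(r * math.cos(0.3 * r), r * math.sin(0.3 * r)) for r in range(1, 21)]
print(psi_hat(0.0, sample_points, k=10, bandwidth=0.5))
```

Because each kernel integrates to one over the circle when the bandwidth is at most $\pi$, the estimate is itself a density with respect to arc length.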
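We obtain our estimator:
$$\hat{Q}_n = \hat{u}_n \hat{S} = R_{n-k,n} \left( \frac{k\,\widehat{\nu(S)}}{np} \right)^{1/\hat{\alpha}} \{z : \hat{q}(z) < 1\}.$$
Theorem. Under some regularity conditions, we have, as $n \to \infty$,
$$\frac{P(\hat{Q}_n \,\triangle\, Q_n)}{p} \xrightarrow{P} 0.$$
Here $\triangle$ denotes the symmetric difference.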
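In practice one mostly needs a membership test for $\hat{Q}_n$. A minimal sketch, assuming $\hat{q}$ is available on the unit sphere and is extended to all of $\mathbb{R}^d$ by homogeneity; the function name and the constant direction function in the usage lines are hypothetical.

```python
import math

def in_estimated_region(z, u_hat, alpha_hat, q_hat_direction):
    """Return True if z lies in u_hat * {y : q_hat(y) < 1}.

    q_hat is extended from the unit sphere by homogeneity:
    q_hat(y) = ||y|| ** -(d + alpha_hat) * q_hat_direction(y / ||y||).
    """
    d = len(z)
    norm = math.sqrt(sum(c * c for c in z))
    if norm == 0.0:
        return False  # the origin is never part of the extreme region
    direction = tuple(c / norm for c in z)
    scaled_norm = norm / u_hat
    q_value = scaled_norm ** -(d + alpha_hat) * q_hat_direction(direction)
    return q_value < 1.0

# Illustration with a constant direction function 1 / (2*pi) and alpha = 1,
# the values obtained for a standard bivariate Cauchy (an assumed test case).
u = (2.0 * math.pi) ** (1.0 / 3.0) * 1e4
constant_q = lambda w: 1.0 / (2.0 * math.pi)
print(in_estimated_region((10001.0, 0.0), u, 1.0, constant_q))  # True
print(in_estimated_region((9999.0, 0.0), u, 1.0, constant_q))   # False
```

With these inputs the boundary sits at radius $u\,(2\pi)^{-1/3} = 10^4$, which is why the two test points fall on opposite sides.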
Simulation
One Sample Plot
Bivariate Cauchy Distribution
[Figure: scatter plot "Cauchy Density, n=5000, p=1/2000, 1/5000, 1/10000"; both axes run from −10000 to 10000; the boundary of the true region $Q$ is drawn as a solid line and the boundary of the estimate $\hat{Q}_n$ as a dotted curve.]
Data are simulated from the bivariate Cauchy distribution, $n = 5000$.
The area outside the solid line is the true risk region, $P(Q) = 10^{-4}$.
The area outside the dotted curve corresponds to the estimated risk region.
Clover Density
[Figure: scatter plot "Clover Density, n=5000"; horizontal axis from −30 to 30, vertical axis from −20 to 20; the boundary of the true region $Q$ is drawn as a solid line and the boundary of the estimate $\hat{Q}_n$ as a dotted curve.]
$n = 5000$.
The area outside the solid line is the true risk region $Q$, $P(Q) = 10^{-4}$.
The area outside the dotted curve corresponds to the estimated risk region.
[Figure: scatter plot "Elliptical Density, n=5000, p=1/2000, 1/10000"; both axes run from −40 to 40; boundaries of $Q$ and $\hat{Q}_n$ are shown.]
[Figure: scatter plot "Asymmetric Shifted Density, n=5000, p=1/2000, 1/10000"; both axes run from −10000 to 10000; boundaries of $Q$ and $\hat{Q}_n$ are shown.]
Comparison: Boxplots
Two competitors
A "parametric" estimator: estimate $\nu(S)$ and $S$ by assuming
$$\psi(w_1, w_2) = \psi(\cos\theta, \sin\theta) = (4\pi)^{-1}\bigl(2 + \sin(2(\theta - \rho))\bigr), \quad \theta \in [0, 2\pi].$$
The method works for bivariate distributions only.
A non-parametric estimator: compute the smallest ellipsoid containing half of the data, the so-called MVE. Inflate this ellipsoid such that the largest observation lies on its boundary. It works for $p = 1/n$ only.
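As a quick sanity check on the parametric family, the sketch below confirms numerically that the assumed $\psi$ is a probability density on the circle for a given $\rho$. The function names and the value of `rho` are illustrative, not taken from the talk.

```python
import math

def psi_parametric(theta, rho):
    """Angular density (4*pi)^-1 * (2 + sin(2*(theta - rho))) on [0, 2*pi]."""
    return (2.0 + math.sin(2.0 * (theta - rho))) / (4.0 * math.pi)

def integrate_over_circle(func, steps=10_000):
    """Midpoint-rule integral of func over [0, 2*pi]."""
    width = 2.0 * math.pi / steps
    return sum(func((i + 0.5) * width) for i in range(steps)) * width

rho = 0.7  # arbitrary illustrative value, not an estimate from the talk
total_mass = integrate_over_circle(lambda t: psi_parametric(t, rho))
print(total_mass)
```

The density stays between $1/(4\pi)$ and $3/(4\pi)$, so only the orientation $\rho$ has to be fitted.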
We simulate 100 data sets from four bivariate distributions and the trivariate Cauchy distribution. Each data set is of size 5000.
The main theorem states
$$\frac{P(\hat{Q}_n \,\triangle\, Q_n)}{p} \xrightarrow{P} 0.$$
$e_{evt} = \dfrac{P(\hat{Q}_n \,\triangle\, Q_n)}{p}$, $p_1 = 1/5000$ and $p_2 = 1/10000$.
$e_{np} = \dfrac{P(\hat{Q}_{np} \,\triangle\, Q_n)}{p}$, $p_1 = 1/5000$.
$e_{par} = \dfrac{P(\hat{Q}_{par} \,\triangle\, Q_n)}{p}$, $p_1 = 1/5000$ and $p_2 = 1/10000$.
[Figure: boxplots of the relative errors. Panel "Bivariate Cauchy Density": EVT p1, Par p1, NP p1, EVT p2, Par p2. Panel "Trivariate Cauchy Density": EVT p1, NP p1, EVT p2. Vertical axes run from 0.0 to 2.5.]
[Figure: boxplots of the relative errors $e_{evt}$, $e_{par}$ and $e_{np}$. Panels "Clover Density" and "Elliptical Density", each showing EVT p1, Par p1, NP p1, EVT p2, Par p2.]
[Figure: boxplots of the relative errors $e_{evt}$, $e_{par}$ and $e_{np}$. Panel "Asymmetric Shifted Density": EVT p1, Par p1, NP p1, EVT p2, Par p2. Vertical axis runs from 0.0 to 2.5.]
Real Data
We apply our method to foreign exchange rate data.
Data: daily exchange rates of yen-dollar and pound-dollar, from 4 Jan 1999 to 31 July 2009; $n = 2665$.
We consider the log-returns
$$X_{t,i} = \log \frac{Y_{t,i}}{Y_{t-1,i}},$$
where $t = 1, \ldots, 2664$, $i = 1, 2$, $Y_{t,1}$ is the daily yen-dollar exchange rate and $Y_{t,2}$ is the daily pound-dollar exchange rate.
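The log-return transformation is straightforward to sketch. The rates in the usage lines are made-up numbers, not the exchange rate data from the talk, and `log_returns` is an illustrative name.

```python
import math

def log_returns(rates):
    """X_t = log(Y_t / Y_{t-1}) for t = 1, ..., len(rates) - 1."""
    if any(y <= 0 for y in rates):
        raise ValueError("exchange rates must be positive")
    return [math.log(curr / prev) for prev, curr in zip(rates, rates[1:])]

example_rates = [100.0, 110.0, 99.0]  # made-up values for illustration
returns = log_returns(example_rates)
print(returns)
```

A series of $m$ rates yields $m - 1$ log-returns, consistent with 2665 daily rates giving 2664 return pairs.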
$\hat{\alpha} = 3.9$
[Figure: scatter plot of the log-returns, Yen-Dollar on the horizontal axis and Pound-Dollar on the vertical axis, both from −0.10 to 0.10, with the estimated risk regions $\hat{Q}_p$ for $p = 1/2000, 1/5000, 1/10000$; four extreme observations are labelled 1 to 4.]
Future Research
Ordering multivariate extreme observations
For an extreme observation $Z_i$, we associate a p-value given by $H(Z_i) = P(\{z : f(z) \le f(Z_i)\})$.
Estimation of $H(Z_i)$ follows easily from the current procedure.
Order multivariate extremes accordingly.
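One possible reading of "follows easily", offered as an interpretation rather than a formula from the talk: an extreme point $z$ with direction $w = z/\|z\|$ lies on the boundary of $\hat{u}\hat{S}$ exactly when $\|z\| = \hat{u}\,\hat{q}(w)^{1/(d+\hat{\alpha})}$. Solving the estimator's quantile formula for the level $p$ at that inflation factor gives a plug-in p-value:

```latex
\begin{align*}
\hat{u}(z) &= \|z\|\, \hat{q}(w)^{-1/(d+\hat{\alpha})}, \qquad w = z/\|z\|, \\
\hat{H}(z) &= \frac{k\,\widehat{\nu(S)}}{n}
  \left( \frac{\hat{u}(z)}{R_{n-k,n}} \right)^{-\hat{\alpha}} .
\end{align*}
```

Smaller values of $\hat{H}$ then correspond to more extreme observations.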
Outlier detection
[Figure: scatter plot "Cauchy Density, n=5000"; both axes run from −10000 to 10000.]
An issue: It is not clear how to formulate an alternative hypothesis.