Tail Risk of Multivariate Regular Variation

Harry Joe∗

Haijun Li†

June 2009

Abstract

Tail risk refers to the risk associated with extreme values and is often affected by extremal dependence among multivariate extremes. Multivariate tail risk, as measured by a coherent risk measure of tail conditional expectation, is analyzed for multivariate regularly varying distributions. Asymptotic expressions for tail risk are established in terms of the intensity measure that characterizes multivariate regular variation. Tractable bounds for tail risk are derived in terms of the tail dependence function that describes extremal dependence. Various examples involving Archimedean copulas are presented to illustrate the results and quality of the bounds.

Key words and phrases: Coherent risk, tail conditional expectation, regularly varying, copula, tail dependence.

MSC2000 classification: 62H20, 91B30.

1 Introduction

The performance (gain or loss, etc.) of a financial portfolio at the end of a given period is often evaluated by a real-valued random variable $X$. A risk measure $\varrho$ is defined as a measurable mapping, with some coherency principles, from the space of all the performance variables into $\mathbb{R}$ [21], and these coherency principles provide a set of operational axioms that $\varrho$ should satisfy in order to accurately characterize the risky behavior of portfolios. The coherent risk measure, introduced in [1] for analyzing economic risk of financial portfolios, is an example of such an axiomatic approach. Let $\mathcal{L}$ be the convex cone$^1$ consisting of all the performance variables which represent losses of financial portfolios at the end of a given period. Note that $-X$, where $X \in \mathcal{L}$, represents the net worth of a financial position. A mapping $\varrho : \mathcal{L} \to \mathbb{R}$ is called a coherent risk measure if $\varrho$ satisfies the following four economically coherent axioms:

1. (monotonicity) For $X_1, X_2 \in \mathcal{L}$ with $X_1 \le X_2$ almost surely, $\varrho(X_1) \le \varrho(X_2)$.
2. (subadditivity) For all $X_1, X_2 \in \mathcal{L}$, $\varrho(X_1 + X_2) \le \varrho(X_1) + \varrho(X_2)$.
3. (positive homogeneity) For all $X \in \mathcal{L}$ and every $\lambda > 0$, $\varrho(\lambda X) = \lambda \varrho(X)$.
4. (translation invariance) For all $X \in \mathcal{L}$ and every $l \in \mathbb{R}$, $\varrho(X + l) = \varrho(X) + l$.

∗ [email protected], Department of Statistics, University of British Columbia, Vancouver, BC, V6T 1Z2, Canada. This author is supported by NSERC Discovery Grant.
† [email protected], Department of Mathematics, Washington State University, Pullman, WA 99164, U.S.A. This author is supported in part by NSF grant CMMI 0825960.
1 A subset $\mathcal{L}$ of a linear space is a convex cone if $x_1 \in \mathcal{L}$ and $x_2 \in \mathcal{L}$ imply that $\lambda_1 x_1 + \lambda_2 x_2 \in \mathcal{L}$ for any $\lambda_1 > 0$ and $\lambda_2 > 0$. A convex cone is called salient if it does not contain both $x$ and $-x$ for any non-zero vector $x$.

The interpretations of these axioms have been well documented in the literature (see, e.g., [21] for details), and the risk $\varrho(X)$ for loss $X$ corresponds to the amount of extra capital that has to be invested in some secure instrument so that the resulting position $\varrho(X) - X$ is acceptable to regulators/supervisors. The general theory of coherent risk measures was developed for arbitrary real random variables in [9], and more general convex measures that combine subadditivity and positive homogeneity into the convexity property were extended to càdlàg processes in [7], and to abstract spaces in [11] that include deterministic, stochastic, single or multi-period cash-stream structures. It follows from the duality theory that any coherent risk measure $\varrho(X)$ arises as the supremum of expected values of $X$, taken with respect to a convex set of probability measures on environmental states, all of them being absolutely continuous with respect to the underlying physical measure. If the set is taken to be the set of all conditional probability measures conditioning on events with probability greater than or equal to $p$, $0 < p < 1$, then the corresponding coherent risk measure is known as the worst conditional expectation $\mathrm{WCE}_p(X)$, which, in the case that the loss variable $X$ is continuous, equals the tail conditional expectation (TCE) defined as follows,

$$\mathrm{TCE}_p(X) := E(X \mid X > \mathrm{VaR}_p(X)), \tag{1.1}$$

where $\mathrm{VaR}_p(X) := \inf\{x \in \mathbb{R} : \Pr\{X > x\} \le 1 - p\}$ is known as the Value-at-Risk (VaR) with confidence level $p$ (i.e., the $p$-quantile). The VaR has been widely used in risk management, but it violates the subadditivity axiom of coherency on the convex cone $\mathcal{L}$ and often underestimates risks. Although VaR is coherent on a much smaller convex cone consisting of only linearized portfolio losses from elliptically distributed risk factors, the non-subadditivity of VaR can occur in situations where portfolio losses are skewed or heavy-tailed with asymmetric dependence structures [21]. It can be shown that for continuous losses, TCE is the average of VaR over all confidence levels greater than $p$, focusing more than VaR does on extremal losses. Thus, TCE is more conservative than VaR at the same level of confidence (i.e., $\mathrm{TCE}_p(X) \ge \mathrm{VaR}_p(X)$) and provides an effective tool for analyzing tail risks. The TCE is also related to the expected residual lifetime, a performance measure widely used in reliability theory and survival analysis.
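The failure of subadditivity for VaR can be seen in a small discrete example. The Python sketch below is illustrative only: the two-point loan loss distribution and the helper names `var` and `convolve` are assumptions introduced here, not objects from the paper. It evaluates the definition $\mathrm{VaR}_p(X) = \inf\{x : \Pr\{X > x\} \le 1 - p\}$ exactly for two independent, identically distributed losses and for their sum.

```python
from itertools import product

def var(dist, p):
    """VaR_p(X) = inf{x : Pr{X > x} <= 1 - p} for a finite discrete loss.

    `dist` maps each possible loss value to its probability.
    """
    for x in sorted(dist):
        tail = sum(q for v, q in dist.items() if v > x)
        if tail <= 1 - p:
            return x
    raise ValueError("empty distribution")

def convolve(d1, d2):
    """Distribution of the sum of two independent discrete losses."""
    out = {}
    for (x1, q1), (x2, q2) in product(d1.items(), d2.items()):
        out[x1 + x2] = out.get(x1 + x2, 0.0) + q1 * q2
    return out

# Hypothetical loan: loses 100 with probability 0.04, otherwise nothing.
loan = {0: 0.96, 100: 0.04}
portfolio = convolve(loan, loan)

var_single = var(loan, 0.95)     # tail probability 0.04 is already below 0.05
var_sum = var(portfolio, 0.95)   # Pr{sum > 0} = 0.0784 exceeds 0.05
print(var_single, var_sum)
```

In this assumed example the VaR of the pooled position exceeds the sum of the stand-alone VaRs, so diversification appears to be penalized at the 95% level.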

For light-tailed loss distributions, such as normal distributions, TCE and VaR at the same level $p$ of confidence are asymptotically equal as $p \to 1$. Another example of light-tailed losses is the phase-type distribution$^2$. The explicit relation between TCE and VaR for the phase-type loss distributions was obtained in [6], from which asymptotic equivalence of TCE and VaR as $p \to 1$ is evident. It is precisely the heavy tails of loss distributions that make TCE more effective in analyzing tail risks. Formally, a non-negative loss variable $X$ with distribution function (df) $F$ has a heavy or regularly varying right tail at $\infty$ with heavy-tail index $\alpha$ if its survival function is of the following form (see, e.g., [5] for details),

$$\overline{F}(r) := 1 - F(r) = r^{-\alpha} L(r), \quad r > 0, \ \alpha > 0, \tag{1.2}$$

where $L$ is a slowly varying function; that is, $L$ is a positive function on $(0, \infty)$ with the property

$$\lim_{r \to \infty} \frac{L(cr)}{L(r)} = 1, \quad \text{for every } c > 0. \tag{1.3}$$
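As a quick numerical illustration of (1.3), the following Python sketch compares $L(r) = \log r$, which is slowly varying, with $L(r) = r^{0.1}$, which is not. Both test functions and the helper name `scaling_ratio` are choices made here for illustration.

```python
import math

def scaling_ratio(L, c, r):
    """Return L(c r) / L(r), the quantity whose limit appears in (1.3)."""
    return L(c * r) / L(r)

radii = [1e2, 1e20, 1e200]

# log is slowly varying: the ratio equals 1 + log(c) / log(r) and tends to 1.
log_ratios = [scaling_ratio(math.log, 10.0, r) for r in radii]

# r ** 0.1 is regularly varying with index 0.1: the ratio stays at c ** 0.1.
power_ratios = [scaling_ratio(lambda r: r ** 0.1, 10.0, r) for r in radii]

print(log_ratios)
print(power_ratios)
```

The slow convergence of the logarithmic ratio is a reminder that asymptotic statements such as (1.4) may only become accurate at very extreme levels when $L$ is not constant.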

For example, the Pareto distribution with survival function $\overline{F}(r) = (1 + r)^{-\alpha}$, $r \ge 0$, has a regularly varying tail. It can be easily verified that if $\alpha > 1$ for a Pareto loss variable $X$, then

$$\mathrm{TCE}_p(X) \approx \frac{\alpha}{\alpha - 1}\, \mathrm{VaR}_p(X), \quad \text{as } p \to 1. \tag{1.4}$$
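For the Pareto survival function $(1 + r)^{-\alpha}$ both sides of (1.4) are available in closed form, so the approximation can be checked directly. The sketch below is a minimal illustration; the function names and the choice $\alpha = 3$ are assumptions made here.

```python
def pareto_var(p, alpha):
    """Solve (1 + r) ** (-alpha) = 1 - p for r."""
    return (1.0 - p) ** (-1.0 / alpha) - 1.0

def pareto_tce(p, alpha):
    """E(X | X > v) with v = VaR_p(X), valid for alpha > 1.

    The integral of (1 + x) ** (-alpha) over (v, inf), divided by
    (1 + v) ** (-alpha), equals (1 + v) / (alpha - 1).
    """
    v = pareto_var(p, alpha)
    return v + (1.0 + v) / (alpha - 1.0)

alpha = 3.0
levels = (0.9, 0.99, 0.999999)
ratios = [pareto_tce(p, alpha) / pareto_var(p, alpha) for p in levels]
limit = alpha / (alpha - 1.0)
print(ratios, limit)
```

The ratio equals $\alpha/(\alpha-1) + 1/((\alpha-1)\mathrm{VaR}_p(X))$ in this case, so it decreases to the limit in (1.4) as $p \to 1$.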

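In fact, (1.4) holds for any loss distribution (1.2) that is regularly varying with heavy-tail index $\alpha > 1$. Observe that

$$\begin{aligned}
\mathrm{TCE}_p(X) &= \frac{E(X\, I\{X > \mathrm{VaR}_p(X)\})}{\Pr\{X > \mathrm{VaR}_p(X)\}} \\
&= \frac{1}{\Pr\{X > \mathrm{VaR}_p(X)\}} \left( \mathrm{VaR}_p(X) \Pr\{X > \mathrm{VaR}_p(X)\} + \int_{\mathrm{VaR}_p(X)}^{\infty} \Pr\{X > x\}\, dx \right),
\end{aligned} \tag{1.5}$$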

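where $I(A)$ hereafter denotes the indicator function of set $A$. By the Karamata theorem (see, e.g., [25]), we have

$$\int_{\mathrm{VaR}_p(X)}^{\infty} \Pr\{X > x\}\, dx \approx \frac{1}{\alpha - 1}\, \mathrm{VaR}_p(X) \Pr\{X > \mathrm{VaR}_p(X)\}, \quad \text{as } p \to 1. \tag{1.6}$$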

Plugging this estimate into (1.5), we obtain (1.4) for any regularly varying distribution with heavy-tail index $\alpha > 1$.

The asymptotic formula (1.4) of TCE for univariate tail risks is fairly straightforward, but the multivariate case remains unsettled and is the focus of this paper. Consider a random vector $X = (X_1, \dots, X_d)$ from a multi-asset portfolio at the end of a given period, where the $i$-th component $X_i$ corresponds to the loss of the financial position on the $i$-th market. A risk measure $R(X)$ for loss vector $X$ corresponds to a subset of $\mathbb{R}^d$ consisting of all the deterministic portfolios $x$ such that the modified position $x - X$ is acceptable to regulators/supervisors. The coherency principles that are similar to the univariate case were formulated in [15] for the multivariate risk measure $R(X)$, and it was further shown in [4] that for continuous loss vectors, multivariate TCEs are coherent in the sense of [15]. Note, however, that multivariate TCEs, to be formally defined in Section 2, are subsets of $\mathbb{R}^d$, which lack tractable expressions even for some widely used multivariate distributions, such as multivariate normals. The effect of dependence among losses $X_1, \dots, X_d$ in different assets on the multivariate TCE also remains difficult to understand.

In this paper, we study the asymptotic behavior of multivariate TCEs for multivariate regularly varying distributions. Our method, based on tail dependence functions developed in [23, 14], not only yields explicit asymptotic expressions of multivariate TCEs for various multivariate distributions, but also leads to better insights into how the dependence among extreme losses affects the analysis of tail risks.

The rest of the paper is organized as follows. In Section 2, we briefly discuss the multivariate coherent risk measures introduced in [15] and obtain the asymptotic expressions of multivariate TCEs for multivariate regularly varying distributions in terms of their intensity measures. In Section 3, we utilize tail dependence functions to obtain asymptotic bounds for multivariate TCEs. Section 4 has some concluding remarks. Throughout this paper, measurability of functions and sets is assumed without explicit mention, and the maximum operator is denoted by $\vee$.

2 That is, the hitting time distribution of a finite-state Markov chain.

2 Tail Risks of Multivariate Regular Variation

To explain the vector-valued coherent risk measures, we use the notation from [15]. Let $K$ be a closed, salient convex cone$^1$ of $\mathbb{R}^d$ such that $\mathbb{R}^d_+ \subseteq K$. The convex cone $K$ induces a partial order on $\mathbb{R}^d$: $x \le_K y$ if and only if $y \in x + K$. Note that a convex cone $K$ must be an upper set$^3$ with respect to the partial order $\le_K$ induced by itself. Moreover, if $A$ is an upper set with respect to the partial order $\le_K$, then for any $x \in A$ and $k \in K$, $x + k \ge_K x$, leading to $x + k \in A$ and thus $A + K \subseteq A$. Observe that we always have $A + K \supseteq A$ due to the fact that any closed convex cone must contain the origin. Conversely, if $A + K = A$ for some subset $A$, then for any $y \ge_K x$ with $x \in A$, $y \in x + K \subseteq A + K = A$, implying that $A$ must be upper with respect to the partial order $\le_K$. Hence, $A$ is an upper set with respect to the partial order $\le_K$ if and only if $A + K = A$. If $K = \mathbb{R}^d_+$, then the $\le_K$-order becomes the usual component-wise order. For any two loss random vectors $X$ and $Y$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, define $X \le_K Y$ if and only if $Y - X \in K$, $\mathbb{P}$-almost surely. Using the partial order $\ge_K$ rather than the usual component-wise partial order can account for some financial market frictions such as transaction costs. See [15] for details.

3 A set $S$ is called upper (lower) with respect to the partial order $\le_K$ if $s \le_K (\ge_K)\ s'$ and $s \in S$ imply that $s' \in S$.
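The partial order $\le_K$ can be made concrete for a polyhedral cone. In the Python sketch below the cone is represented as $K = \{z : Gz \ge 0\}$ for a matrix $G$; this representation, the particular matrix `G_WIDE`, and the function names are assumptions made for illustration and do not come from [15]. The chosen $K$ is closed, salient, and strictly larger than $\mathbb{R}^2_+$, so some pairs that are incomparable component-wise become comparable under $\le_K$.

```python
def in_cone(z, G):
    """Membership test for the polyhedral cone K = {z : G z >= 0}."""
    return all(sum(g * zi for g, zi in zip(row, z)) >= 0 for row in G)

def leq_K(x, y, G):
    """x <=_K y if and only if y - x lies in K."""
    return in_cone([b - a for a, b in zip(x, y)], G)

# K = R^2_+ gives the usual component-wise order.
G_ORTHANT = [(1.0, 0.0), (0.0, 1.0)]

# Hypothetical wider cone: z1 + 0.5 z2 >= 0 and 0.5 z1 + z2 >= 0.
G_WIDE = [(1.0, 0.5), (0.5, 1.0)]

x, y = (1.0, 0.0), (0.0, 3.0)
componentwise = leq_K(x, y, G_ORTHANT)   # False: the first coordinate decreases
under_wide_cone = leq_K(x, y, G_WIDE)    # True: y - x = (-1, 3) lies in K
print(componentwise, under_wide_cone)
```

One possible reading of the wider cone is that a shortfall in one coordinate may be offset by a sufficiently large surplus in the other, which is the kind of exchange that market frictions would make costly.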

Definition 2.1. Consider random loss vectors on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. A vector-valued coherent risk measure $R(\cdot)$ is a measurable set-valued map satisfying that $R(X) \subset \mathbb{R}^d$ is closed for any loss random vector $X$ and $\mathbf{0} \in R(\mathbf{0}) \ne \mathbb{R}^d$, as well as the following axioms:

1. (Monotonicity) For any $X$ and $Y$, $X \le_K Y$ implies that $R(X) \supseteq R(Y)$.
2. (Subadditivity) For any $X$ and $Y$, $R(X + Y) \supseteq R(X) + R(Y)$.
3. (Positive Homogeneity) For any $X$ and positive $s$, $R(sX) = sR(X)$.
4. (Translation Invariance) For any $X$ and any deterministic vector $l$, $R(X + l) = R(X) + l$.

When $d = 1$, $\varrho(X) := \inf\{r : r \in R(X)\}$ is a univariate coherent risk measure satisfying the four axioms discussed in Section 1, and thus $R(X) = [\varrho(X), \infty)$. It was shown in [15] that the worst conditional expectation for random vector $X$, defined as

$$\mathrm{WCE}_p(X) := \{x \in \mathbb{R}^d : E(x - X \mid B) \ge_K \mathbf{0}, \ \forall B \in \mathcal{F} \text{ with } \mathbb{P}(B) \ge 1 - p\}, \quad 0 < p < 1,$$

is a vector-valued coherent risk measure. Since $\mathrm{WCE}_p(X) = \bigcap_{B \in \mathcal{F},\, \mathbb{P}(B) \ge 1 - p} (E(X \mid B) + K)$ and $K$ is an upper set, $\mathrm{WCE}_p(X)$ is also an upper set. For any continuous random vector $X$, $\mathrm{WCE}_p(X)$ equals the tail conditional expectation (TCE) for $X$, defined as in [4] by

$$\begin{aligned}
\mathrm{TCE}_p(X) &:= \{x \in \mathbb{R}^d : E(x - X \mid X \in A) \ge_K \mathbf{0}, \ \forall A \in \mathcal{Q}_p(X)\} \\
&= \bigcap_{A \in \mathcal{Q}_p(X)} (E(X \mid X \in A) + K), \quad 0 < p < 1,
\end{aligned} \tag{2.1}$$

where $\mathcal{Q}_p(X) = \{A \subseteq \mathbb{R}^d : A \text{ is Borel-measurable and } A + K = A, \ \Pr\{X \in A\} \ge 1 - p\}$ is the set of all the upper sets (with respect to $\le_K$) with probability mass greater than or equal to $1 - p$. Observe that $\mathrm{TCE}_p(X)$ is a convex and upper set that consists of all the deterministic portfolios $x$ of capital reserves that can be used to cover the expected losses $E(X \mid X \in A)$ in the events that $X \in A$.

Note that the multivariate coherent risk measures discussed in [15, 4] are defined for essentially bounded random vectors. To discuss asymptotic properties, these measures have to be extended to the set of all random vectors on $\overline{\mathbb{R}}^d = [-\infty, \infty]^d$. This can be done using the idea in [9] that allows vectors in $R(X)$ to have components taking the value $\infty$; that is, the positions corresponding to these components are so risky, whatever that means, that no matter how much capital is added, the positions will remain unacceptable. We also need to exclude the situations where components of the vectors in $R(X)$ take the value $-\infty$, which would mean that arbitrary amounts of capital could be withdrawn without endangering the portfolios (see [9] for details). As a matter of fact, it can be easily verified that $\mathrm{TCE}_p(X)$ is coherent in the sense of Definition 2.1 if $X$, which may not be bounded, has a continuous density function.

The extreme value analysis of $\mathrm{TCE}_p(X)$ as $p \to 1$ boils down to analyzing the asymptotic behavior of $E(X \mid X \in rB)$ as $r \to \infty$ for various upper sets $B$, for which multivariate regular variation suits well. A non-negative random vector $X$ with joint df $F$ is said to have a multivariate regularly varying (MRV) distribution $F$ if there exists a Radon measure $\mu$ (i.e., finite on compact sets), called the intensity measure, on $\overline{\mathbb{R}}^d_+ \setminus \{\mathbf{0}\}$$^4$ and a common normalization sequence $\{b_n\}$ with $b_n \to \infty$ such that

$$n \Pr\left\{ \frac{X}{b_n} \in B \right\} \to \mu(B), \tag{2.2}$$

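for all relatively compact sets $B \subset \overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$ with $\mu(\partial B) = 0$. The following equivalent characterizations of MRV distributions can be found in [24, 25, 2, 3].

Theorem 2.2. (Multivariate Regular Variation) The following statements are equivalent:

1. Random vector $X$ has an MRV df $F$.

2. There exists a Radon measure $\mu$ on $\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$ such that

$$\lim_{r \to \infty} \frac{1 - F(rx)}{1 - F(r\mathbf{1})} = \lim_{r \to \infty} \frac{\Pr\{X/r \in [\mathbf{0}, x]^c\}}{\Pr\{X/r \in [\mathbf{0}, \mathbf{1}]^c\}} = \mu([\mathbf{0}, x]^c), \tag{2.3}$$

for all continuous points $x$ of $\mu$, where $\mu([\mathbf{0}, x]^c) = c \int_{\mathbb{S}^{d-1}_+} \max_{1 \le j \le d} (u_j / x_j)^{\alpha}\, S(du)$ for some positive constants $c$, $\alpha$ and a probability measure $S$ on $\mathbb{S}^{d-1}_+ := \{x \in \mathbb{R}^d_+ : \|x\| = 1\}$, where $\|\cdot\|$ denotes a norm on $\mathbb{R}^d$.

3. There exists a Radon measure $\mu$ on $\overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$ such that for every Borel set $B \subset \overline{\mathbb{R}}^d \setminus \{\mathbf{0}\}$ bounded away from the origin satisfying $\mu(\partial B) = 0$,

$$\lim_{r \to \infty} \frac{\Pr\{X \in rB\}}{\Pr\{\|X\| > r\}} = \mu(B), \tag{2.4}$$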

with the homogeneity condition $\mu(rB) = r^{-\alpha} \mu(B)$.

Observe from (2.3) that the margins $F_j$, $1 \le j \le d$, of an MRV df $F$ are regularly varying in the sense of (1.2). Since $F_1, \dots, F_d$ are usually assumed to be tail equivalent [25], we have that $\overline{F}_j(x) = L_j(x)/x^{\alpha}$, $1 \le j \le d$, where $L_i(x)/L_j(x) \to c_{ij}$ as $x \to \infty$, $0 < c_{ij} < \infty$. We assume hereafter that $c_{ij} = 1$ for notational convenience. If $c_{ij} \ne 1$ for some $i \ne j$, we can properly rescale the margins and the results still follow. We also assume that the heavy-tail index $\alpha > 1$ to ensure the existence of expectations.

It is well-known from [24] that a random vector $X$ has an MRV distribution $F$ if and only if $X$ is in the maximum domain of attraction of a multivariate extreme value (MEV) distribution with identical Fréchet margins $H(x; 1/\alpha) := \exp\{-x^{-\alpha}\}$ for $x > 0$ and $\alpha > 0$. That is, properly normalized component-wise maxima of random samples from df $F$ converge weakly, as the sample size goes to infinity, to an MEV distribution with identical Fréchet margins.

4 Here $\overline{\mathbb{R}}^d_+ = [0, \infty]^d$ is compact and the punctured version $\overline{\mathbb{R}}^d_+ \setminus \{\mathbf{0}\}$ is modified via the one-point uncompactification (see, e.g., [25]).

In general, the margins of an MEV distribution can be expressed in terms of the generalized extreme value family,

$$H(x; \xi) := \exp\{-(\max\{1 + \xi x, 0\})^{-1/\xi}\}, \quad x \in \mathbb{R}, \ \xi \in \mathbb{R},$$

and in particular, with Fréchet margins, the extremal dependence structure can be characterized by the intensity measure $\mu$. Note, however, that the parametric feature, enjoyed by the univariate EV distributions, is lost in the multivariate context.

Theorem 2.3. Let $X$ be a non-negative loss vector that has an MRV df with intensity measure $\mu$.

1. Let $B$ be an upper set bounded away from $\mathbf{0}$. Then

$$\lim_{r \to \infty} r^{-1} E(X_j \mid X \in rB) = \int_0^{\infty} \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw =: u_j(B; \mu),$$

where $A_j(w) := \{(x_1, \dots, x_d) \in \mathbb{R}^d : x_j > w\}$, $1 \le j \le d$.

2. As $p \to 1$,

$$\mathrm{TCE}_p(X) \approx \bigcap_{B \in \mathcal{Q}_{\|\cdot\|}} \mathrm{VaR}_{1 - (1-p)/\mu(B)}(\|X\|) \left( (u_1(B; \mu), \dots, u_d(B; \mu)) + K \right),$$

where $\mathcal{Q}_{\|\cdot\|} := \{B \subseteq \mathbb{R}^d : B + K = B, \ B \cap \mathbb{S}^{d-1}_+ \ne \emptyset, \ B \subseteq (\mathbb{B}^d)^c\}$, and $\mathbb{B}^d := \{x \in \mathbb{R}^d : \|x\| \le 1\}$ denotes the unit ball in $\mathbb{R}^d$ with respect to the norm $\|\cdot\|$.

Proof. To estimate $E(X \mid X \in rB)$ for any upper set $B$ bounded away from $\mathbf{0}$, consider, for any $1 \le j \le d$,

$$E(X_j \mid X \in rB) = \int_0^{\infty} \Pr\{X_j > x \mid X \in rB\}\, dx = r \int_0^{\infty} \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw. \tag{2.5}$$

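We first argue that we can pass the limit through the integration. Since the upper set $B \ne \emptyset$, we have that the complement $B^c \ne \mathbb{R}^d$ is a lower set$^3$ with respect to the $\le_K$-order. Since $K \supseteq \mathbb{R}^d_+$, there exists a vector $w = (w_1, \dots, w_d)$ such that the complement $B^c \supseteq w - K \supseteq w - \mathbb{R}^d_+$, and thus $B \subseteq \left( \prod_{i=1}^{d} (-\infty, w_i] \right)^c$. Therefore,

$$\begin{aligned}
\Pr\{X_j > rw, X \in rB\} &\le \Pr\left\{ X_j > rw, \ X \in \left( \prod_{i=1}^{d} (-\infty, r w_i] \right)^c \right\} \\
&\le \Pr\{X_j > rw, X_j > r w_j\} + \Pr\{r w_j \ge X_j > rw\}.
\end{aligned} \tag{2.6}$$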

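Observe from (2.4) and the generalized dominated convergence theorem that, as $r \to \infty$,

$$\int_0^{\infty} \frac{\Pr\{r w_j \ge X_j > rw\}}{\Pr\{X \in rB\}}\, dw = \int_0^{w_j} \frac{\Pr\{r w_j \ge X_j > rw\}}{\Pr\{X \in rB\}}\, dw \to \int_0^{w_j} \frac{\mu(A_j(w) \setminus A_j(w_j))}{\mu(B)}\, dw,$$

where $A_j(w) := \{(x_1, \dots, x_d) \in \mathbb{R}^d : x_j > w\}$. For the first summand of (2.6), we have

$$\int_0^{\infty} \frac{\Pr\{X_j > rw, X_j > r w_j\}}{\Pr\{X \in rB\}}\, dw = w_j \frac{\Pr\{X_j > r w_j\}}{\Pr\{X \in rB\}} + \int_{w_j}^{\infty} \frac{\Pr\{X_j > rw\}}{\Pr\{X \in rB\}}\, dw.$$

Using the Karamata theorem (1.6), we have, as $r \to \infty$,

$$\int_{w_j}^{\infty} \frac{\Pr\{X_j > rw\}}{\Pr\{X \in rB\}}\, dw = \int_{r w_j}^{\infty} \frac{\Pr\{X_j > x\}}{r \Pr\{X \in rB\}}\, dx \approx \frac{1}{\alpha - 1} \frac{r w_j \Pr\{X_j > r w_j\}}{r \Pr\{X \in rB\}}.$$

Thus, we have, via (2.4), as $r \to \infty$,

$$\int_0^{\infty} \frac{\Pr\{X_j > rw, X_j > r w_j\}}{\Pr\{X \in rB\}}\, dw \to w_j \frac{\mu(A_j(w_j))}{\mu(B)} + \frac{1}{\alpha - 1} \frac{w_j\, \mu(A_j(w_j))}{\mu(B)} = \int_0^{\infty} \frac{\mu(A_j(w \vee w_j))}{\mu(B)}\, dw,$$

where the last equality follows from the direct calculation via (2.3). Therefore, we have

$$\begin{aligned}
&\lim_{r \to \infty} \int_0^{\infty} \left( \frac{\Pr\{r w_j \ge X_j > rw\}}{\Pr\{X \in rB\}} + \frac{\Pr\{X_j > rw, X_j > r w_j\}}{\Pr\{X \in rB\}} \right) dw \\
&\quad = \int_0^{\infty} \left( \frac{\mu(A_j(w) \setminus A_j(w_j))}{\mu(B)} + \frac{\mu(A_j(w \vee w_j))}{\mu(B)} \right) dw \\
&\quad = \int_0^{\infty} \lim_{r \to \infty} \left( \frac{\Pr\{r w_j \ge X_j > rw\}}{\Pr\{X \in rB\}} + \frac{\Pr\{X_j > rw, X_j > r w_j\}}{\Pr\{X \in rB\}} \right) dw,
\end{aligned} \tag{2.7}$$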

where the second equality follows from (2.4). Because of (2.6), (2.7) and the generalized dominated convergence theorem, we have from (2.5) that

$$\begin{aligned}
\lim_{r \to \infty} \frac{1}{r} E(X_j \mid X \in rB) &= \lim_{r \to \infty} \int_0^{\infty} \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw \\
&= \int_0^{\infty} \lim_{r \to \infty} \frac{\Pr\{X_j > rw, X \in rB\}}{\Pr\{X \in rB\}}\, dw = \int_0^{\infty} \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw.
\end{aligned} \tag{2.8}$$

This concludes the proof of statement (1).

For statement (2), we simplify the asymptotic expression for (2.1). For any upper set $A \in \mathcal{Q}_p(X)$, there exists an upper set $B$ with $B \cap \mathbb{S}^{d-1}_+ \ne \emptyset$ and a positive number $r_B$ such that $A = r_B B$. Consider $p \to 1$. Since $\Pr\{X \in rB\}$ is decreasing in $r$, we can find $r_{B,p} \ge r_B$ for any $A = r_B B$ such that $\Pr\{X \in A\} \ge \Pr\{X \in r_{B,p} B\} = 1 - p$, as $p \to 1$. It follows from (2.8) that $E(X_j \mid X \in r_{B,p} B)$ is asymptotically increasing for sufficiently small $1 - p$ and goes to $+\infty$ as $p \to 1$, and thus we have $E(X \mid X \in A) \le E(X \mid X \in r_{B,p} B)$ for sufficiently small $1 - p$. Since $E(X \mid X \in A) + K \supseteq E(X \mid X \in r_{B,p} B) + K$ for sufficiently small $1 - p$, we have, as $p \to 1$,

$$(E(X \mid X \in A) + K) \cap (E(X \mid X \in r_{B,p} B) + K) = E(X \mid X \in r_{B,p} B) + K.$$

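Observe that $r_{B,p} B \in \mathcal{Q}_p(X)$, and we have

$$\lim_{p \to 1} \left[ \left( \bigcap_{B \in \mathcal{Q}} (E(X \mid X \in r_{B,p} B) + K) \right) \setminus \mathrm{TCE}_p(X) \right] = \lim_{p \to 1} \left[ \left( \bigcap_{A \in \mathcal{Q}_p(X)} (E(X \mid X \in A) + K) \right) \setminus \mathrm{TCE}_p(X) \right] = \emptyset,$$

where $\mathcal{Q} := \{B \subseteq \mathbb{R}^d : B + K = B, \ B \cap \mathbb{S}^{d-1}_+ \ne \emptyset, \ B \text{ is bounded away from } \mathbf{0}\}$ and $\Pr\{X \in r_{B,p} B\} = 1 - p$. That is, (2.1) can be rewritten as follows, for sufficiently small $1 - p$,

$$\mathrm{TCE}_p(X) \approx \bigcap_{B \in \mathcal{Q}} (E(X \mid X \in r_{B,p} B) + K). \tag{2.9}$$

For any $B \in \mathcal{Q}$, there exists a real number $r_B$ with $r_B \ge 1$ such that $r_B B \in \mathcal{Q}_{\|\cdot\|} = \{B \subseteq \mathbb{R}^d : B + K = B, \ B \cap \mathbb{S}^{d-1}_+ \ne \emptyset, \ B \subseteq (\mathbb{B}^d)^c\}$. That is, for any $B \in \mathcal{Q}$ with $\Pr\{X \in r_{B,p} B\} = 1 - p$, we can find a $B' \in \mathcal{Q}_{\|\cdot\|}$ and a real number $r_{B',p}$ (e.g., $r_{B',p} = r_{B,p}/r_B$) such that $r_{B,p} B = r_{B',p} B'$. Thus (2.9) can be rewritten further as

$$\mathrm{TCE}_p(X) \approx \bigcap_{B \in \mathcal{Q}_{\|\cdot\|},\ \Pr\{X \in r_{B,p} B\} = 1 - p} (E(X \mid X \in r_{B,p} B) + K), \tag{2.10}$$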

for sufficiently small $1 - p$. Observe that as $p \to 1$, $r_{B,p} \to \infty$, and thus it follows from (2.4) that for sufficiently small $1 - p$, $\mu(B) \Pr\{\|X\| > r_{B,p}\} \approx 1 - p$, which implies that $r_{B,p} \approx \mathrm{VaR}_{1 - (1-p)/\mu(B)}(\|X\|)$ as $p \to 1$. Therefore, (2.8) and (2.10) imply that

$$\mathrm{TCE}_p(X) \approx \bigcap_{B \in \mathcal{Q}_{\|\cdot\|}} \mathrm{VaR}_{1 - (1-p)/\mu(B)}(\|X\|) \left( (u_1(B; \mu), \dots, u_d(B; \mu)) + K \right)$$

as $p \to 1$, where $u_j(B; \mu) = \int_0^{\infty} \frac{\mu(A_j(w) \cap B)}{\mu(B)}\, dw$, $1 \le j \le d$.
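The intensity measure that drives Theorem 2.3 is defined through the limit (2.3), which can be evaluated exactly in simple models. The Python sketch below assumes, purely for illustration, that the components of $X$ are independent with the Pareto survival function $(1 + r)^{-\alpha}$ from Section 1; the function names are hypothetical. Under this assumption the ratio in (2.3) should approach $d^{-1} \sum_j x_j^{-\alpha}$, reflecting an intensity measure concentrated on the axes.

```python
import math

def prob_outside_box(r, x, alpha):
    """Pr{X / r in [0, x]^c} = 1 - prod_j F_j(r x_j), independent Pareto margins.

    log1p and expm1 keep the tiny tail probabilities accurate for large r.
    """
    log_inside = sum(math.log1p(-(1.0 + r * xj) ** (-alpha)) for xj in x)
    return -math.expm1(log_inside)

def ratio_2_3(r, x, alpha):
    """Finite-r version of the ratio in (2.3)."""
    ones = [1.0] * len(x)
    return prob_outside_box(r, x, alpha) / prob_outside_box(r, ones, alpha)

def mu_independent(x, alpha):
    """Candidate limit mu([0, x]^c) when extremes occur one component at a time."""
    return sum(xj ** (-alpha) for xj in x) / len(x)

alpha, x = 2.0, (2.0, 0.5)
approx = [ratio_2_3(r, x, alpha) for r in (1e1, 1e3, 1e6)]
limit = mu_independent(x, alpha)
print(approx, limit)
```

The normalization $\mu([\mathbf{0}, \mathbf{1}]^c) = 1$ is built into the ratio, so `mu_independent((1.0, 1.0), alpha)` returns one by construction.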



3 Bounds for Tail Risks via Tail Dependence Functions

Theorem 2.3 shows how extremal dependence, as described by the intensity measure, affects tail risks, but the asymptotic expression obtained in Theorem 2.3 is intractable for some multivariate distributions. In this section, we utilize the method of tail dependence functions introduced in [23, 14] to derive tractable bounds for TCE. For notational convenience, we only consider the case where $K = \mathbb{R}^d_+$.

The idea is to separate the margins from the dependence structure of df $F$, so that TCEs can be expressed asymptotically in terms of the marginal heavy-tail index and tail dependence of the copula of $F$. The copula-based approach for extremal dependence analysis is especially effective in developing versatile parametric dependence models [13, 22]. Assume that the df $F$ of random vector $X = (X_1, \dots, X_d)$ has continuous margins $F_1, \dots, F_d$; then, from [26], the copula $C$ of $F$ can be uniquely expressed as

$$C(u_1, \dots, u_d) = F(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d)), \quad (u_1, \dots, u_d) \in [0, 1]^d,$$

where $F_j^{-1}$, $1 \le j \le d$, are the quantile functions of the margins. The extremal dependence of a df $F$ can be described by various tail dependence parameters of its copula $C$. The lower or upper tail dependence parameters, for example, are the conditional probabilities that the random vector $(U_1, \dots, U_d) := (F_1(X_1), \dots, F_d(X_d))$ with standard uniform margins belongs to lower or upper tail orthants given that a univariate margin takes extreme values (small or large):

$$\begin{aligned}
\lambda_L &= \lim_{u \downarrow 0} \Pr\{U_1 \le u, \dots, U_d \le u \mid U_d \le u\} = \lim_{u \downarrow 0} \frac{C(u, \dots, u)}{u}, \\
\lambda_U &= \lim_{u \downarrow 0} \Pr\{U_1 > 1 - u, \dots, U_d > 1 - u \mid U_d > 1 - u\} = \lim_{u \downarrow 0} \frac{\overline{C}(1 - u, \dots, 1 - u)}{u},
\end{aligned} \tag{3.1}$$

where $\overline{C}$ denotes the survival function of $C$. Bivariate tail dependence has been widely studied [13], and various multivariate versions of tail dependence parameters have also been introduced and studied in [16, 18]. Observe from (3.1) that the tail dependence parameters of copula $C$ are the conditional tail probabilities that the components $U_i$ go to extremes at the same rate (same relative scale), and thus they describe only some aspects of extremal dependence. The tail dependence parameters also lack operational properties to facilitate the extremal dependence analysis of certain multivariate distributions, such as vine copulas, that are constructed from basic building blocks of bivariate distributions. To overcome these deficiencies, the lower and upper tail dependence functions, denoted by $b(\cdot; C)$ and $b^*(\cdot; C)$ respectively, were introduced in [16, 23, 14] as follows,

$$\begin{aligned}
b(w; C) &:= \lim_{u \downarrow 0} \frac{C(u w_j, 1 \le j \le d)}{u}, \quad \forall w = (w_1, \dots, w_d) \in \mathbb{R}^d_+; \\
b^*(w; C) &:= \lim_{u \downarrow 0} \frac{\overline{C}(1 - u w_j, 1 \le j \le d)}{u}, \quad \forall w = (w_1, \dots, w_d) \in \mathbb{R}^d_+.
\end{aligned} \tag{3.2}$$
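The limit defining $b^*$ in (3.2) can be approximated numerically for a copula with an explicit formula. The Python sketch below uses the bivariate Gumbel copula as an assumed example (it is not singled out at this point of the paper) and compares the finite-$u$ ratio with the known closed form $b^*(w_1, w_2) = w_1 + w_2 - (w_1^{\theta} + w_2^{\theta})^{1/\theta}$. The function names and parameter values are illustrative.

```python
import math

def gumbel_copula(u1, u2, theta):
    """Bivariate Gumbel copula C(u1, u2) with dependence parameter theta >= 1."""
    s = (-math.log(u1)) ** theta + (-math.log(u2)) ** theta
    return math.exp(-s ** (1.0 / theta))

def survival(u1, u2, theta):
    """C-bar(u1, u2) = Pr{U1 > u1, U2 > u2} = 1 - u1 - u2 + C(u1, u2)."""
    return 1.0 - u1 - u2 + gumbel_copula(u1, u2, theta)

def b_upper_numeric(w1, w2, theta, u=1e-6):
    """Finite-u approximation of b*(w; C) in (3.2)."""
    return survival(1.0 - u * w1, 1.0 - u * w2, theta) / u

def b_upper_closed(w1, w2, theta):
    """Closed-form upper tail dependence function of the Gumbel copula."""
    return w1 + w2 - (w1 ** theta + w2 ** theta) ** (1.0 / theta)

theta = 2.0
lambda_u_numeric = b_upper_numeric(1.0, 1.0, theta)
lambda_u_closed = 2.0 - 2.0 ** (1.0 / theta)   # b*(1, 1) recovers lambda_U in (3.1)
print(lambda_u_numeric, lambda_u_closed)
print(b_upper_numeric(0.5, 2.0, 3.0), b_upper_closed(0.5, 2.0, 3.0))
```

Evaluating $b^*$ away from the diagonal, as in the last line, is what distinguishes the tail dependence function from the single number $\lambda_U$.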

Since $b(w; \widehat{C}) = b^*(w; C)$, where $\widehat{C}(u_1, \dots, u_d) = \overline{C}(1 - u_1, \dots, 1 - u_d)$ is the survival copula of $C$, we focus only on upper tail dependence in this paper, and any result about upper tail dependence can be easily translated into the corresponding result for lower tail dependence. The explicit expression of $b^*$ for elliptical distributions was obtained in [16]. A theory of tail dependence functions was developed in [23, 14] based on Euler's formula for homogeneous functions:

$$b^*(w; C) = \sum_{j=1}^{d} w_j\, t_j(w_i, i \ne j \mid w_j), \quad \forall w = (w_1, \dots, w_d) \in \mathbb{R}^d_+, \tag{3.3}$$

where the upper conditional tail dependence functions $t_j(w_i, i \ne j \mid w_j) := \lim_{u \downarrow 0} \Pr\{U_i > 1 - u w_i, \forall i \ne j \mid U_j = 1 - u w_j\}$, $1 \le j \le d$, are homogeneous of order zero. For copulas with explicit expressions, the tail dependence functions can be obtained directly with relative ease. For copulas without explicit expressions, the tail dependence functions can be obtained via (3.3) by exploring closure properties of conditional distributions. For example, the tail dependence function of the $t$ distribution can be obtained by (3.3) (see [23]). It follows from (3.1)–(3.3) that the upper tail dependence parameter can be expressed as

$$\lambda_U = \sum_{j=1}^{d} \lim_{u \downarrow 0} \Pr\{U_i > 1 - u, \forall i \ne j \mid U_j = 1 - u\},$$

which extends the well-known formula (see, e.g., [10]) for bivariate tail dependence parameters to the multivariate case. It was shown in [14] that $b^*(w; C) > 0$ for all $w \in \mathbb{R}^d_+$ if and only if $\lambda_U > 0$. Unlike $\lambda_U$, however, the tail dependence function provides all the extremal dependence information of copula $C$ as specified by its extreme value copula [23, 14]. Using the inclusion-exclusion principle, we define the upper exponent function of $C$ as follows

$$a^*(w; C) := \sum_{S \subseteq \{1, \dots, d\},\, S \ne \emptyset} (-1)^{|S| - 1}\, b^*_S(w_i, i \in S; C_S), \tag{3.4}$$

where $b^*_S(w_i, i \in S; C_S)$ denotes the upper tail dependence function of the margin $C_S$ of $C$ with component indexes in $S$. Similar to tail dependence functions, the exponent function has the following homogeneous representation:

$$a^*(w; C) = \sum_{j=1}^{d} w_j\, \overline{t}_j(w_i, i \ne j \mid w_j), \quad \forall w = (w_1, \dots, w_d) \in \mathbb{R}^d_+, \tag{3.5}$$

where $\overline{t}_j(w_i, i \ne j \mid w_j) = \lim_{u \downarrow 0} \Pr\{U_i \le 1 - u w_i, \forall i \ne j \mid U_j = 1 - u w_j\}$, $1 \le j \le d$, is homogeneous of order zero. It was shown in [14] that the tail dependence functions $\{b^*_S(w_i, i \in S; C_S)\}$ and the exponent function $a^*(w; C)$ uniquely determine one another. It follows from Theorem 2.4 of [18] and (2.3) that

$$\mu([1, \infty]^d) = \frac{b^*(1, \dots, 1; C)}{a^*(1, \dots, 1; C)}, \quad \text{and} \quad \mu([\mathbf{0}, \mathbf{1}]^c) = 1. \tag{3.6}$$
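The inclusion-exclusion relation (3.4) and the orthant formula in (3.6) can be exercised on a model whose exponent function is known. The Python sketch below assumes a $d$-dimensional Gumbel copula, for which $a^*_S(w) = (\sum_{i \in S} w_i^{\theta})^{1/\theta}$ on every margin; this model choice, the value of `THETA`, and all function names are assumptions for illustration. It builds $b^*$ from $a^*$ by the dual inclusion-exclusion identity, recovers $a^*$ through (3.4), and evaluates the upper-orthant mass $b^*(w_1^{-\alpha}, \dots, w_d^{-\alpha}; C)/a^*(1, \dots, 1; C)$.

```python
from itertools import combinations

THETA = 2.0  # hypothetical Gumbel dependence parameter

def a_star(w):
    """Exponent function of the Gumbel copula (any margin): an l_theta norm."""
    return sum(x ** THETA for x in w) ** (1.0 / THETA)

def nonempty_subsets(indices):
    for k in range(1, len(indices) + 1):
        yield from combinations(indices, k)

def b_star(w):
    """b*(w) = sum over nonempty S of (-1)^(|S|-1) a*_S(w_S)."""
    idx = range(len(w))
    return sum((-1) ** (len(S) - 1) * a_star([w[i] for i in S])
               for S in nonempty_subsets(idx))

def a_star_from_b(w):
    """Right-hand side of (3.4): a* rebuilt from the marginal b*_S."""
    idx = range(len(w))
    return sum((-1) ** (len(S) - 1) * b_star([w[i] for i in S])
               for S in nonempty_subsets(idx))

def mu_upper_orthant(w, alpha):
    """mu of the orthant with lower corner w, normalized as in (3.6)."""
    d = len(w)
    return b_star([x ** (-alpha) for x in w]) / a_star([1.0] * d)

w = [0.5, 1.0, 2.0]
print(a_star(w), a_star_from_b(w))
print(mu_upper_orthant([1.0, 1.0], 2.5))
```

Because $b^*$ is homogeneous of order one, `mu_upper_orthant` inherits the scaling $\mu(rB) = r^{-\alpha}\mu(B)$ of Theorem 2.2, which gives a convenient consistency check.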

In fact, the detailed relations between the intensity measure $\mu$ and the tail dependence function $b^*$ have been established in [19], and in particular,

$$\mu([w, \infty]) = \frac{b^*(w_1^{-\alpha}, \dots, w_d^{-\alpha}; C)}{a^*(1, \dots, 1; C)}, \quad \forall w = (w_1, \dots, w_d) \in \mathbb{R}^d_+.$$

The Radon-Nikodym derivative of $\mu$ with respect to the Lebesgue measure is then given by

$$\left. \frac{d\mu}{dv} \right|_{v = w} = \frac{\alpha^d \prod_{i=1}^{d} w_i^{-\alpha - 1}}{a^*((1, \dots, 1); C)} \left. \frac{\partial^d b^*(v_1, \dots, v_d; C)}{\partial v_1 \cdots \partial v_d} \right|_{\{v_i = w_i^{-\alpha}\}_{i=1}^{d}}, \quad \forall w = (w_1, \dots, w_d) \in \mathbb{R}^d_+. \tag{3.7}$$

If the tail dependence function is explicitly known, such as in the case of Archimedean copulas [14], then the Radon-Nikodym derivative (3.7) of the corresponding intensity measure $\mu$ can be calculated explicitly, and

$$\mu(A_j(w) \cap B) = \int_{A_j(w) \cap B} \left. \frac{d\mu}{dv} \right|_{v = x} dx, \qquad \mu(B) = \int_{B} \left. \frac{d\mu}{dv} \right|_{v = x} dx.$$

Therefore, by Theorem 2.3 (1), $E(X \mid X \in rB)$ can be asymptotically expressed in terms of the tail dependence function $b^*$ for sufficiently large $r$. But the asymptotic estimation of $\mathrm{TCE}_p(X)$ via Theorem 2.3 (2) is still cumbersome because $B \in \mathcal{Q}_{\|\cdot\|}$ can be quite arbitrary. More tractable bounds for $\mathrm{TCE}_p(X)$ can be established using the tail dependence and exponent functions, as shown in the next theorem.

Theorem 3.1. Let $X$ be a non-negative loss vector with an MRV df $F$ and heavy-tail index $\alpha > 1$. Assume that the copula $C$ of $F$ has a positive upper tail dependence function $b^*(w; C) > 0$. Let $\|\cdot\|_{\max}$ denote the maximum norm.

1. For $1 \le j \le d$,

$$\lim_{r \to \infty} \frac{1}{r} E(X_j \mid X \in r(x, \infty]) = \int_0^{\infty} \frac{b^*(x_1^{-\alpha}, \dots, (w_j \vee x_j)^{-\alpha}, \dots, x_d^{-\alpha}; C)}{b^*(x_1^{-\alpha}, \dots, x_d^{-\alpha}; C)}\, dw_j.$$

2. For sufficiently small $1 - p$,

$$\mathrm{TCE}_p(X) \subseteq \mathrm{VaR}_{1 - (1-p) \frac{a^*(1, \dots, 1; C)}{b^*(1, \dots, 1; C)}}(\|X\|_{\max}) \left( (S_1(b^*, \alpha), \dots, S_d(b^*, \alpha)) + \mathbb{R}^d_+ \right),$$

where $S_j(b^*, \alpha) = \int_0^{\infty} \frac{b^*(1, \dots, 1, (w_j \vee 1)^{-\alpha}, 1, \dots, 1; C)}{b^*(1, \dots, 1; C)}\, dw_j$, $1 \le j \le d$.

3. For sufficiently small $1 - p$,

$$\mathrm{VaR}_p(\|X\|_{\max}) \left( (s_1(b^*, \alpha), \dots, s_d(b^*, \alpha)) + \mathbb{R}^d_+ \right) \subseteq \mathrm{TCE}_p(X),$$

where, for $1 \le j \le d$,

$$s_j(b^*, \alpha) := \frac{\alpha}{\alpha - 1} \frac{1}{b^*(1, \dots, 1; C)} + \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|}\, \frac{b^*_{\{j\} \cup S}(1, \dots, 1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \dots, 1; C_{\{j\} \cup S})\, dw_j}{b^*(1, \dots, 1; C)}.$$
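The constant $S_j(b^*, \alpha)$ in statement 2 is a one-dimensional integral and is easy to evaluate once $b^*$ is explicit. The Python sketch below is a rough illustration for $d = 2$, under assumptions made here: a Gumbel tail dependence function, a comonotone benchmark $b^*(w_1, w_2) = \min(w_1, w_2)$, and a plain trapezoid rule after the substitution $w = e^x$ (the truncation point `x_max` is adequate only when $\alpha$ is not too close to 1). It uses the form $S_j = 1 + \int_1^{\infty} b^*(w^{-\alpha}, 1; C)/b^*(1, 1; C)\, dw$, which follows from splitting the integral at $w_j = 1$.

```python
import math

def b_gumbel(w1, w2, theta):
    """Upper tail dependence function of the bivariate Gumbel copula."""
    return w1 + w2 - (w1 ** theta + w2 ** theta) ** (1.0 / theta)

def b_comonotone(w1, w2, theta=None):
    """Upper tail dependence function under perfect positive dependence."""
    return min(w1, w2)

def S_j(b, alpha, theta=None, x_max=60.0, n=60000):
    """S_j = 1 + int_1^inf b(w^-alpha, 1) / b(1, 1) dw.

    Substituting w = exp(x) gives the integrand b(exp(-alpha x), 1) * exp(x)
    on [0, inf), which is truncated at x_max and handled by the trapezoid rule.
    """
    h = x_max / n

    def f(x):
        return b(math.exp(-alpha * x), 1.0, theta) * math.exp(x)

    total = 0.5 * (f(0.0) + f(x_max)) + sum(f(i * h) for i in range(1, n))
    return 1.0 + h * total / b(1.0, 1.0, theta)

alpha = 2.0
S_comonotone = S_j(b_comonotone, alpha)        # closed form: alpha / (alpha - 1)
S_gumbel = S_j(b_gumbel, alpha, theta=2.0)
print(S_comonotone, S_gumbel)
```

In the comonotone benchmark the constant reduces to the univariate factor $\alpha/(\alpha - 1)$ of (1.4); for the assumed Gumbel model the value is larger, since $b^*(t, 1)/b^*(1, 1) \ge t$ on $[0, 1]$ by concavity of $b^*$.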

Proof. Let $F$ have margins $F_1, \dots, F_d$ that are regularly varying in the sense of (1.2). Since $F_1, \dots, F_d$ are tail equivalent [25], we have that $\overline{F}_j(x) = L_j(x)/x^{\alpha}$, $1 \le j \le d$, where $L_i(x)/L_j(x) \to 1$ as $x \to \infty$.

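(1) Without loss of generality, let $j = 1$. A straightforward calculation shows

$$\begin{aligned}
E(X_1 \mid X > rx) &= \int_0^{\infty} \frac{\Pr\{X_1 > x, X_1 > r x_1, \dots, X_d > r x_d\}}{\Pr\{X_1 > r x_1, \dots, X_d > r x_d\}}\, dx \\
&= r x_1 + \int_{r x_1}^{\infty} \frac{\Pr\{X_1 > x, X_2 > r x_2, \dots, X_d > r x_d\}}{\Pr\{X_1 > r x_1, \dots, X_d > r x_d\}}\, dx \\
&= r \left( x_1 + \int_{x_1}^{\infty} \frac{\Pr\{X_1 > rw, X_2 > r x_2, \dots, X_d > r x_d\}}{\Pr\{X_1 > r x_1, \dots, X_d > r x_d\}}\, dw \right) \\
&= r \left( x_1 + \int_{x_1}^{\infty} \frac{\Pr\{U_1 > F_1(rw), U_2 > F_2(r x_2), \dots, U_d > F_d(r x_d)\}}{\Pr\{U_1 > F_1(r x_1), \dots, U_d > F_d(r x_d)\}}\, dw \right).
\end{aligned}$$

Applying the Karamata theorem and the generalized dominated convergence theorem, we are allowed to pass the limit through the integral. Since $L_j$, $1 \le j \le d$, are slowly varying and the margins are tail equivalent, we have

$$\begin{aligned}
&\lim_{r \to \infty} \frac{1}{r} E(X_1 \mid X > rx) \\
&\quad = x_1 + \lim_{r \to \infty} \int_{x_1}^{\infty} \frac{\Pr\{U_1 > 1 - L_1(rw)/(rw)^{\alpha}, \dots, U_d > 1 - L_d(r x_d)/(r x_d)^{\alpha}\}}{\Pr\{U_1 > 1 - L_1(r x_1)/(r x_1)^{\alpha}, \dots, U_d > 1 - L_d(r x_d)/(r x_d)^{\alpha}\}}\, dw \\
&\quad = x_1 + \lim_{r \to \infty} \int_{x_1}^{\infty} \frac{\Pr\{U_1 > 1 - w^{-\alpha} L_1(r) r^{-\alpha}, U_2 > 1 - x_2^{-\alpha} L_1(r) r^{-\alpha}, \dots, U_d > 1 - x_d^{-\alpha} L_1(r) r^{-\alpha}\}}{\Pr\{U_1 > 1 - x_1^{-\alpha} L_1(r) r^{-\alpha}, \dots, U_d > 1 - x_d^{-\alpha} L_1(r) r^{-\alpha}\}}\, dw \\
&\quad = x_1 + \lim_{u \to 0} \int_{x_1}^{\infty} \frac{\Pr\{U_1 > 1 - w^{-\alpha} u, U_2 > 1 - x_2^{-\alpha} u, \dots, U_d > 1 - x_d^{-\alpha} u\}}{\Pr\{U_1 > 1 - x_1^{-\alpha} u, \dots, U_d > 1 - x_d^{-\alpha} u\}}\, dw \\
&\quad = x_1 + \int_{x_1}^{\infty} \frac{b^*(w^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; C)}{b^*(x_1^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; C)}\, dw = \int_0^{\infty} \frac{b^*((w_1 \vee x_1)^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; C)}{b^*(x_1^{-\alpha}, \dots, x_d^{-\alpha}; C)}\, dw_1.
\end{aligned}$$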

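(2) It follows from (2.10) that as $p \to 1$,

$$\mathrm{TCE}_p(X) \subseteq \bigcap_{x \in \mathbb{S}^{d-1}_+} (E(X \mid X \in r_{x,p}(x, \infty]) + \mathbb{R}^d_+),$$

where $r_{x,p}$ satisfies $\Pr\{X \in r_{x,p}(x, \infty]\} = 1 - p$. Since $b^*(\mathbf{1}; C) > 0$, it follows from Theorem 2.4 of [18] that $\mu((\mathbf{1}, \infty]) > 0$ (see also (3.6)). Since $\|X\|_{\max}$ is regularly varying at $\infty$, for sufficiently small $1 - p$ there exists $r_{\mathbf{1},p}$ such that $\mu((\mathbf{1}, \infty]) \Pr\{\|X\|_{\max} > r_{\mathbf{1},p}\} = 1 - p$, which implies that $r_{\mathbf{1},p} \approx \mathrm{VaR}_{1 - (1-p)/\mu((\mathbf{1}, \infty])}(\|X\|_{\max})$ as $p \to 1$. Observe that as $p \to 1$, $r_{\mathbf{1},p} \to \infty$, and thus it follows from (2.4) that for sufficiently small $1 - p$,

$$\Pr\{X \in r_{\mathbf{1},p}(\mathbf{1}, \infty]\} \approx \mu((\mathbf{1}, \infty]) \Pr\{\|X\|_{\max} > r_{\mathbf{1},p}\} = 1 - p.$$

Therefore, as $p \to 1$,

$$\mathrm{TCE}_p(X) \subseteq \bigcap_{x \in \mathbb{S}^{d-1}_+} (E(X \mid X \in r_{x,p}(x, \infty]) + \mathbb{R}^d_+) \subseteq E(X \mid X \in r_{\mathbf{1},p}(\mathbf{1}, \infty]) + \mathbb{R}^d_+. \tag{3.8}$$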

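Observe from (1) and (3.6) that as $p \to 1$,

$$E(X \mid X \in r_{\mathbf{1},p}(\mathbf{1}, \infty]) \approx \mathrm{VaR}_{1 - (1-p) \frac{a^*(1, \dots, 1; C)}{b^*(1, \dots, 1; C)}}(\|X\|_{\max})\, (S_1(b^*, \alpha), \dots, S_d(b^*, \alpha)),$$

where $S_j(b^*, \alpha) = 1 + \int_1^{\infty} \frac{b^*(1, \dots, 1, w_j^{-\alpha}, 1, \dots, 1; C)}{b^*(1, \dots, 1; C)}\, dw_j$, $1 \le j \le d$. Plugging this estimate into (3.8), we obtain (2).

(3) In light of (2.10), consider, for any $B \in \mathcal{Q}_{\|\cdot\|_{\max}}$ with $\Pr\{X \in r_{B,p} B\} = 1 - p$,

$$E(X_j \mid X \in r_{B,p} B) = \frac{E(X_j\, I\{X \in r_{B,p} B\})}{\Pr\{X \in r_{B,p} B\}}.$$

Since $(1, \infty]^d \subseteq B \subseteq [\mathbf{0}, \mathbf{1}]^c$ for any $B \in \mathcal{Q}_{\|\cdot\|_{\max}}$, we have

$$E(X_j \mid X \in r_{B,p} B) \le \frac{E(X_j\, I\{X \in r_{B,p} [\mathbf{0}, \mathbf{1}]^c\})}{\Pr\{X \in r_{B,p} (1, \infty]^d\}} = \int_0^{\infty} \frac{\Pr\{\{X_j > x\} \cap \{X \in r_{B,p} [\mathbf{0}, \mathbf{1}]^c\}\}}{\Pr\{X \in r_{B,p} (1, \infty]^d\}}\, dx. \tag{3.9}$$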

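If $x > r_{B,p}$ then $\Pr\{\{X_j > x\} \cap \{X \in r_{B,p} [\mathbf{0}, \mathbf{1}]^c\}\} = \Pr\{X_j > x\}$. If $x \le r_{B,p}$ then

$$\begin{aligned}
&\Pr\{\{X_j > x\} \cap \{X \in r_{B,p} [\mathbf{0}, \mathbf{1}]^c\}\} = \Pr\left\{ \{X_j > x\} \cap \left( \cup_{i=1}^{d} \{X_i > r_{B,p}\} \right) \right\} \\
&\quad = \Pr\left\{ \cup_{i=1}^{d} (\{X_j > x\} \cap \{X_i > r_{B,p}\}) \right\} = \Pr\left\{ \left( \cup_{i \ne j} \{X_j > x, X_i > r_{B,p}\} \right) \cup \{X_j > r_{B,p}\} \right\} \\
&\quad = \sum_{S \subseteq \{i : i \ne j\}} (-1)^{|S|} \Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\} - \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \Pr\{X_j > x, X_i > r_{B,p}, i \in S\} \\
&\quad = \Pr\{X_j > r_{B,p}\} + \sum_{\emptyset \ne S \subseteq \{i : i \ne j\}} (-1)^{|S|} \left( \Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\} - \Pr\{X_j > x, X_i > r_{B,p}, i \in S\} \right).
\end{aligned} \tag{3.10}$$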

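Since the margins are tail equivalent and the functions $L_j$ are slowly varying, we have, for any $0 \le w_j \le 1$ and any $\emptyset \ne S \subseteq \{i : i \ne j\}$,

$$\begin{aligned}
&\lim_{p \to 1} \frac{\Pr\{X_j > r_{B,p} w_j, X_i > r_{B,p}, i \in S\}}{\Pr\{X \in r_{B,p} (1, \infty]^d\}} \\
&\quad = \lim_{p \to 1} \frac{\Pr\{U_j > 1 - w_j^{-\alpha} r_{B,p}^{-\alpha} L_j(r_{B,p} w_j), U_i > 1 - r_{B,p}^{-\alpha} L_i(r_{B,p}), i \in S\}}{\Pr\{U_i > 1 - r_{B,p}^{-\alpha} L_i(r_{B,p}), 1 \le i \le d\}} \\
&\quad = \lim_{r_{B,p} \to \infty} \frac{\Pr\{U_j > 1 - w_j^{-\alpha} r_{B,p}^{-\alpha} L_1(r_{B,p}), U_i > 1 - r_{B,p}^{-\alpha} L_1(r_{B,p}), i \in S\}}{\Pr\{U_i > 1 - r_{B,p}^{-\alpha} L_1(r_{B,p}), 1 \le i \le d\}} \\
&\quad = \frac{b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \dots, 1; C_{\{j\} \cup S})}{b^*(1, \dots, 1; C)},
\end{aligned}$$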

where b∗{j}∪S (wj−α , 1, . . . , 1; C{j}∪S ) denotes the upper tail dependence function of the multivariate margin C{j}∪S evaluated with the j-th argument being wj−α and others being one. Similarly, b∗{j}∪S (1, . . . , 1; C{j}∪S ) Pr{Xj > rB,p , Xi > rB,p , i ∈ S} = , p→1 b∗ (1, . . . , 1; C) Pr{X ∈ rB,p (1, ∞]d } Pr{Xj > rB,p } 1 = ∗ lim . p→1 Pr{X ∈ rB,p (1, ∞]d } b (1, . . . , 1; C) lim

(3.11)

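Using the bounded convergence theorem, we then have, for sufficiently small $1-p$,

\[
\begin{aligned}
&\int_0^1 \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \frac{\Pr\{X_j > r_{B,p}, X_i > r_{B,p}, i \in S\} - \Pr\{X_j > r_{B,p} w_j, X_i > r_{B,p}, i \in S\}}{\Pr\{\mathbf{X} \in r_{B,p}(1,\infty]^d\}} \, dw_j \\
&\qquad \approx \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1,\dots,1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \dots, 1; C_{\{j\} \cup S}) \, dw_j}{b^*(1,\dots,1;C)}.
\end{aligned}
\tag{3.12}
\]

Plugging (3.11) and (3.12) into (3.10), we have, for sufficiently small $1-p$,

\[
\begin{aligned}
\int_0^{r_{B,p}} \frac{\Pr\{\{X_j > x\} \cap \{\mathbf{X} \in r_{B,p}[\mathbf{0},\mathbf{1}]^c\}\}}{\Pr\{\mathbf{X} \in r_{B,p}(1,\infty]^d\}} \, dx
&\approx \frac{r_{B,p}}{b^*(1,\dots,1;C)} \\
&\quad + r_{B,p} \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1,\dots,1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \dots, 1; C_{\{j\} \cup S}) \, dw_j}{b^*(1,\dots,1;C)}.
\end{aligned}
\tag{3.13}
\]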

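On the other hand, using the Karamata theorem (1.6), we have, for sufficiently small $1-p$,

\[
\begin{aligned}
\int_{r_{B,p}}^\infty \frac{\Pr\{\{X_j > x\} \cap \{\mathbf{X} \in r_{B,p}[\mathbf{0},\mathbf{1}]^c\}\}}{\Pr\{\mathbf{X} \in r_{B,p}(1,\infty]^d\}} \, dx
&= \int_{r_{B,p}}^\infty \frac{\Pr\{X_j > x\}}{\Pr\{\mathbf{X} \in r_{B,p}(1,\infty]^d\}} \, dx \\
&\approx \frac{r_{B,p}}{\alpha - 1} \, \frac{\Pr\{X_j > r_{B,p}\}}{\Pr\{\mathbf{X} \in r_{B,p}(1,\infty]^d\}}
\approx \frac{r_{B,p}}{\alpha - 1} \, \frac{1}{b^*(1,\dots,1;C)}.
\end{aligned}
\tag{3.14}
\]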
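As a quick check of the Karamata step in (3.14), consider the simplest case, assumed here purely for illustration, in which the margin is exactly Pareto, $\Pr\{X_j > x\} = x^{-\alpha}$ for $x \ge 1$, so that the slowly varying factor is constant. The approximation then holds with equality:

```latex
\[
\int_{r}^{\infty} \Pr\{X_j > x\}\,dx
  = \int_{r}^{\infty} x^{-\alpha}\,dx
  = \frac{r^{1-\alpha}}{\alpha-1}
  = \frac{r}{\alpha-1}\,\Pr\{X_j > r\},
  \qquad \alpha > 1,\ r \ge 1 .
\]
```

For a general regularly varying margin the same relation holds only asymptotically as $r \to \infty$, which is why (3.14) is stated for sufficiently small $1-p$.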

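Combining (3.13) and (3.14) into (3.9), we have, for sufficiently small $1-p$,

\[
E(X_j \mid \mathbf{X} \in r_{B,p}B) \le \frac{\alpha}{\alpha - 1} \, \frac{r_{B,p}}{b^*(1,\dots,1;C)}
+ r_{B,p} \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1,\dots,1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \dots, 1; C_{\{j\} \cup S}) \, dw_j}{b^*(1,\dots,1;C)}.
\]

As $p \to 1$,

\[ r_{B,p} \approx \mathrm{VaR}_{1-(1-p)/\mu(B)}(\|\mathbf{X}\|_{\max}) \le \mathrm{VaR}_{1-(1-p)/\mu([\mathbf{0},\mathbf{1}]^c)}(\|\mathbf{X}\|_{\max}) = \mathrm{VaR}_p(\|\mathbf{X}\|_{\max}). \]

Thus, for sufficiently small $1-p$,

\[
\frac{E(X_j \mid \mathbf{X} \in r_{B,p}B)}{\mathrm{VaR}_p(\|\mathbf{X}\|_{\max})} \le \frac{\alpha}{\alpha - 1} \, \frac{1}{b^*(1,\dots,1;C)}
+ \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \frac{b^*_{\{j\} \cup S}(1,\dots,1; C_{\{j\} \cup S}) - \int_0^1 b^*_{\{j\} \cup S}(w_j^{-\alpha}, 1, \dots, 1; C_{\{j\} \cup S}) \, dw_j}{b^*(1,\dots,1;C)} =: s_j(b^*, \alpha),
\]

for any $B \in \mathcal{Q}_{\|\cdot\|_{\max}}$. Therefore,

\[ \mathrm{TCE}_p(\mathbf{X}) \supseteq \mathrm{VaR}_p(\|\mathbf{X}\|_{\max}) \left( (s_1(b^*,\alpha),\dots,s_d(b^*,\alpha)) + \mathbb{R}^d_+ \right), \]

for sufficiently small $1-p$.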



Remark 3.2. Observe that if $d = 1$, then Theorem 3.1 (2) and (3) reduce to (1.4). In multivariate risk management, the upper (subset) bound presented in Theorem 3.1 (3) is more important, because it provides a set of portfolios of conservative reserves so that even in worst-case scenarios the resulting positions are still acceptable to regulators/supervisors. In the remainder of this section, we present some examples to examine the quality of the results in Theorem 3.1 when used as approximations. The examples show that the approximations are better with more tail dependence and a larger $\zeta$, where $\zeta$ is in the exponent of the second order expansion

\[ \overline{C}(1 - u w_j, 1 \le j \le d) \approx u \, b^*(\mathbf{w}; C) + u^{1+\zeta} b_2^*(\mathbf{w}; C), \qquad u \to 0. \tag{3.15} \]

It is intuitive that if $\zeta$ is larger (especially if $\zeta \ge 1$), then the second order term is less important. It remains an open question whether it can be proved in any generality that $\zeta$ increases as the amount of extremal dependence in a parametric family increases. Note that for the Fréchet upper bound copula, $\overline{C}_U(1 - u\mathbf{w}) = u \min\{w_1, \dots, w_d\}$, and there is no second order term (i.e., $\zeta = \infty$).

Example 3.3.

(a) Analysis of complete dependence (the Fréchet upper bound). Let $C_U$ be the Fréchet upper bound copula of dimension $d$. Then $b^*(\mathbf{w}; C_U) = \min\{w_1, \dots, w_d\}$ and $b^*(\mathbf{1}; C_U) = 1$, $a^*(\mathbf{1}; C_U) = 1$. In part (2) of Theorem 3.1, $1 - (1-p)a^*/b^* = p$, and for $\alpha > 1$,

\[ S_j(b^*, \alpha) = 1 + \int_1^\infty \min\{1, w^{-\alpha}\} \, dw = 1 + (\alpha - 1)^{-1} = \alpha/(\alpha - 1). \]

In part (3) of Theorem 3.1, for $\alpha > 1$,

\[ s_j(b^*, \alpha) = \alpha/(\alpha - 1) + \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \cdot 0 = \alpha/(\alpha - 1). \]

That is, the expressions in parts (2) and (3) coincide.

(b) Analysis of near independence. As the $d$-variate copula $C$ (with tail dependence) moves towards independence, $b^*(\mathbf{1}; C) \to 0$ and $a^*(\mathbf{1}; C) \to d$, and $1 - (1-p)a^*(\mathbf{1}; C)/b^*(\mathbf{1}; C) > 0$ only if $p > 1 - b^*(\mathbf{1}; C)/a^*(\mathbf{1}; C)$, so that for small $b^*(\mathbf{1}; C)$ the result in part (2) is non-trivial only for large $p$ near 1. This is a hint that all of the limiting results of Theorem 3.1 are worse for weak tail dependence. In this case, one has to use Theorem 2.3 to approximate the multivariate TCE.

Example 3.4. We show some details for three copula families to illustrate Theorem 3.1. The first copula is the exchangeable MTCJ copula (or Mardia-Takahasi-Cook-Johnson copula, see [20, 27, 8]), the second is a mixture of the MTCJ copula and the independence copula, and the third is a non-exchangeable trivariate copula. Second order expansions of the tail dependence functions are obtained, and the approximation from part (1) of Theorem 3.1 is summarized in Tables 1 and 2 for some special cases.

(a) The MTCJ copula in dimension $d$, with dependence increasing in $\delta$, is:

\[ C(\mathbf{u}; \delta) = \left[ u_1^{-\delta} + \cdots + u_d^{-\delta} - (d-1) \right]^{-1/\delta}, \qquad \delta > 0. \tag{3.16} \]

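Let $w_j > 0$ for $j = 1, \dots, d$, and let $W = w_1^{-\delta} + \cdots + w_d^{-\delta}$. Then

\[
\begin{aligned}
C(u\mathbf{w}; \delta) &= u \left[ w_1^{-\delta} + \cdots + w_d^{-\delta} - (d-1)u^\delta \right]^{-1/\delta} = u W^{-1/\delta} \left[ 1 - (d-1)u^\delta / W \right]^{-1/\delta} \\
&\approx u W^{-1/\delta} \left[ 1 + (d-1)\delta^{-1} u^\delta / W \right] = u \, b(\mathbf{w}; \delta) + u^{1+\delta} b_2(\mathbf{w}; \delta),
\end{aligned}
\]

as $u \to 0$, where

\[
\begin{aligned}
b(\mathbf{w}; \delta) &= b(\mathbf{w}; C) = W^{-1/\delta} = (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta}, \\
b_2(\mathbf{w}; \delta) &= b_2(\mathbf{w}; C) = (d-1)\delta^{-1} (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta - 1}.
\end{aligned}
\]

The second order term of $C(u\mathbf{w}; \delta)$ is $O(u^{1+\zeta})$, where $\zeta = \delta$ increases with more dependence.

Suppose $(X_1, \dots, X_d)$ is multivariate Pareto of the form used in [20]; the univariate survival function is $x^{-\alpha}$ for $x > 1$ for all $d$ margins and the copula is given in (3.16). That is,

\[ \overline{F}(\mathbf{x}) = C(x_1^{-\alpha}, \dots, x_d^{-\alpha}; \delta) = \left[ x_1^{\alpha\delta} + \cdots + x_d^{\alpha\delta} - (d-1) \right]^{-1/\delta}, \qquad x_j > 1, \; j = 1, \dots, d. \tag{3.17} \]

An expression for the conditional expectation (given for the first component only because of symmetry) is:

\[ E[X_1 \mid X_1 > x_1, \dots, X_d > x_d] = x_1 + \frac{\int_0^\infty \overline{F}(x_1 + z_1, x_2, \dots, x_d) \, dz_1}{\overline{F}(x_1, \dots, x_d)}, \]

leading to the TCE

\[ r^{-1} E[X_1 \mid X_1 > r x_1, \dots, X_d > r x_d] = x_1 + \frac{\int_0^\infty \overline{F}(r x_1 + r w_1, r x_2, \dots, r x_d) \, dw_1}{\overline{F}(r\mathbf{x})}. \tag{3.18} \]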

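The above expectations exist for $\alpha > 1$. Because we are using the copula with survival functions, we will use the above $b$, $b_2$ in their upper tail dependence form of $b^*$, $b_2^*$.

• Exact calculation of the last summand in (3.18):

\[
\frac{\int_0^\infty C\big( (r[x_1 + w_1])^{-\alpha}, (r x_2)^{-\alpha}, \dots, (r x_d)^{-\alpha}; \delta \big) \, dw_1}{C\big( (r x_1)^{-\alpha}, \dots, (r x_d)^{-\alpha}; \delta \big)}
= \frac{\int_0^\infty \left[ (r[x_1 + w_1])^{\alpha\delta} + (r x_2)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1) \right]^{-1/\delta} dw_1}{\left[ (r x_1)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1) \right]^{-1/\delta}}.
\]

• First order approximation of the last summand in (3.18):

\[
\frac{\int_0^\infty b^*\big( (x_1 + w_1)^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; \delta \big) \, dw_1}{b^*\big( x_1^{-\alpha}, \dots, x_d^{-\alpha}; \delta \big)}
= \frac{\int_0^\infty \left[ (x_1 + w_1)^{\alpha\delta} + x_2^{\alpha\delta} + \cdots + x_d^{\alpha\delta} \right]^{-1/\delta} dw_1}{\left[ x_1^{\alpha\delta} + \cdots + x_d^{\alpha\delta} \right]^{-1/\delta}}.
\]

This can be computed via numerical integration. Let the numerator and denominator of the above be denoted as $N_1 = N_1(\mathbf{x}; \alpha, \delta)$ and $D_1 = D_1(\mathbf{x}; \alpha, \delta)$.

• Second order approximation of the last summand in (3.18):

\[
\frac{r^{-\alpha} N_1 + r^{-\alpha(1+\delta)} \int_0^\infty b_2^*\big( (x_1 + w_1)^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; \delta \big) \, dw_1}{r^{-\alpha} D_1 + r^{-\alpha(1+\delta)} b_2^*\big( x_1^{-\alpha}, \dots, x_d^{-\alpha}; \delta \big)}
= \frac{N_1 + (d-1) r^{-\alpha\delta} \delta^{-1} \int_0^\infty \left[ (x_1 + w_1)^{\alpha\delta} + x_2^{\alpha\delta} + \cdots + x_d^{\alpha\delta} \right]^{-1/\delta - 1} dw_1}{D_1 + (d-1) r^{-\alpha\delta} \delta^{-1} \left[ x_1^{\alpha\delta} + \cdots + x_d^{\alpha\delta} \right]^{-1/\delta - 1}}.
\]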
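The exact expression and its first and second order approximations can be evaluated with elementary numerical integration. The following is a minimal sketch, not the authors' implementation: the function names `integrate_0_inf` and `mtcj_last_summand`, the midpoint rule, and the substitution $w = t/(1-t)$ are all choices made here for illustration, and the simple quadrature is only adequate for moderate parameter values (for example $\alpha \ge 2$).

```python
def integrate_0_inf(f, n=20000):
    """Midpoint rule for the integral of f over [0, inf), using w = t / (1 - t)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        w = t / (1.0 - t)
        total += f(w) / (1.0 - t) ** 2  # dw = dt / (1 - t)^2
    return total * h

def mtcj_last_summand(alpha, delta, p, x=(1.0, 1.0)):
    """Exact value, first order and second order approximation of the last summand in (3.18)."""
    d = len(x)
    r = (1.0 - p) ** (-1.0 / alpha)
    ad = alpha * delta
    rest = sum(xi ** ad for xi in x[1:])          # x_2^{ad} + ... + x_d^{ad}
    rest_r = sum((r * xi) ** ad for xi in x[1:])  # the same with r * x_i
    c = (d - 1) * r ** (-ad) / delta              # weight of the second order term

    # Exact ratio built from the survival function (3.17).
    num = integrate_0_inf(
        lambda w: ((r * (x[0] + w)) ** ad + rest_r - (d - 1)) ** (-1.0 / delta))
    den = ((r * x[0]) ** ad + rest_r - (d - 1)) ** (-1.0 / delta)
    exact = num / den

    # First order approximation N1 / D1.
    n1 = integrate_0_inf(lambda w: ((x[0] + w) ** ad + rest) ** (-1.0 / delta))
    d1 = (x[0] ** ad + rest) ** (-1.0 / delta)
    appr1 = n1 / d1

    # Second order approximation.
    n2 = integrate_0_inf(lambda w: ((x[0] + w) ** ad + rest) ** (-1.0 / delta - 1.0))
    d2 = (x[0] ** ad + rest) ** (-1.0 / delta - 1.0)
    appr2 = (n1 + c * n2) / (d1 + c * d2)
    return exact, appr1, appr2

exact, appr1, appr2 = mtcj_last_summand(alpha=2.0, delta=0.5, p=0.999)
print(exact, appr1, appr2)
```

With $d = 2$, $x_1 = x_2 = 1$, $p = 0.999$, $\alpha = 2$ and $\delta = 0.5$, the three values should land close to the corresponding row of Table 1 (1.968, 2.000, 1.969).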

Table 1 has some (representative) results to show how the approximations compare; we take $r = (1-p)^{-1/\alpha}$, $d = 2$, $x_1 = x_2 = 1$, $p = 0.999$, $\alpha = 2$ and $5$, and $\delta \in [0.1, 1.9]$. The table shows that the first order approximation is worse only when the dependence is weak and the exponent $\zeta$ of the second order term is much less than 1; in these cases, the second order term of the expansion is useful.

(b) Mixture model with MTCJ and independence copulas. Now, the second order term is between $O(u)$ and $O(u^2)$, depending on the amount of dependence in the copula. Let

\[ C(\mathbf{u}; \delta, \beta) = (1-\beta) \prod_{j=1}^d u_j + \beta \left[ u_1^{-\delta} + \cdots + u_d^{-\delta} - (d-1) \right]^{-1/\delta}, \qquad \delta > 0, \; 0 < \beta < 1, \]

so that dependence increases as $\delta$ and $\beta$ increase. Let $W = w_1^{-\delta} + \cdots + w_d^{-\delta}$. Then

\[
C(u\mathbf{w}; \delta, \beta) \approx (1-\beta) u^d \prod_{j=1}^d w_j + \beta u W^{-1/\delta} \left[ 1 + (d-1)\delta^{-1} u^\delta / W \right]
= u \, b(\mathbf{w}; \delta, \beta) + u^{1+\zeta} b_2(\mathbf{w}; \delta, \beta),
\]

where

\[
\begin{aligned}
b(\mathbf{w}; \delta, \beta) &= \beta W^{-1/\delta} = \beta (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta}, \\
b_2(\mathbf{w}; \delta, \beta) &=
\begin{cases}
(d-1)\beta\delta^{-1} (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta - 1} & \text{if } \delta < d-1, \\
(1-\beta) \prod_{j=1}^d w_j + (d-1)\beta\delta^{-1} (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta - 1} & \text{if } \delta = d-1, \\
(1-\beta) \prod_{j=1}^d w_j & \text{if } \delta > d-1,
\end{cases}
\end{aligned}
\]

and $\zeta = \delta$ if $\delta < d-1$ and $\zeta = d-1$ if $\delta \ge d-1$. The second order term is not far from the first order term if $\delta$ is near 0 (i.e., weak dependence). Similar to part (a), we list the exact TCE and the first/second order approximations for the last summand in (3.18).

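• Exact (assuming $\alpha > 1$ as before): with $P_{\mathbf{x}} = \prod_{j=1}^d x_j^{-\alpha}$,

\[
\frac{\beta \int_0^\infty \left[ (r[x_1 + w_1])^{\alpha\delta} + (r x_2)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1) \right]^{-1/\delta} dw_1 + (1-\beta) r^{-d\alpha} P_{\mathbf{x}} \, x_1/(\alpha - 1)}{\beta \left[ (r x_1)^{\alpha\delta} + \cdots + (r x_d)^{\alpha\delta} - (d-1) \right]^{-1/\delta} + (1-\beta) r^{-d\alpha} P_{\mathbf{x}}},
\]

since $\int_0^\infty (x_1 + w)^{-\alpha} \, dw = x_1^{-\alpha+1}/(\alpha - 1)$.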

• First order approximation: this is the same as in part (a) because $\beta$ cancels from the numerator and denominator.

• Second order approximation: this is the same as in part (a) for $\delta < d-1$. For $\delta \ge d-1$, one gets

\[
\frac{\int_0^\infty b^*\big( (x_1 + w_1)^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; \delta, \beta \big) \, dw_1 + r^{-\alpha(d-1)} \int_0^\infty b_2^*\big( (x_1 + w_1)^{-\alpha}, x_2^{-\alpha}, \dots, x_d^{-\alpha}; \delta, \beta \big) \, dw_1}{b^*\big( x_1^{-\alpha}, \dots, x_d^{-\alpha}; \delta, \beta \big) + r^{-\alpha(d-1)} b_2^*\big( x_1^{-\alpha}, \dots, x_d^{-\alpha}; \delta, \beta \big)}.
\]

Table 2 has some (representative) results to show how the approximations compare; we take $r = (1-p)^{-1/\alpha}$, $d = 2$, $x_1 = x_2 = 1$, $p = 0.999$, $\beta = 0.25$, $\alpha = 2$ and $5$, and $\delta \in [0.1, 1.9]$. The conclusions are similar to Table 1, except that the first and second order approximations are slightly off in the last decimal place shown even for $\delta > 1$. This is because the accuracy is of order $O(u^d) = O(u^2)$ rather than of order $O(u^{1+\delta})$ (see part (a)), which increases as $\delta > 1$ becomes larger.

(c) Consider a trivariate nested Archimedean copula (Joe 1993) that is non-exchangeable. Let

\[ C(\mathbf{u}; \delta_1, \delta_2) = \left[ \left( u_1^{-\delta_2} + u_2^{-\delta_2} - 1 \right)^{\delta_1/\delta_2} + u_3^{-\delta_1} - 1 \right]^{-1/\delta_1}, \qquad \delta_2 \ge \delta_1 > 0. \]

Similar to part (a), one gets $C(u\mathbf{w}; \delta_1, \delta_2) \approx u \, b(\mathbf{w}; \delta_1, \delta_2) + u^{1+\delta_1} b_2(\mathbf{w}; \delta_1, \delta_2)$, where

\[ b(\mathbf{w}; \delta_1, \delta_2) = \left[ \left( w_1^{-\delta_2} + w_2^{-\delta_2} \right)^{\delta_1/\delta_2} + w_3^{-\delta_1} \right]^{-1/\delta_1} =: W^{-1/\delta_1}, \]

\[ b_2(\mathbf{w}; \delta_1, \delta_2) = \delta_1^{-1} W^{-1/\delta_1 - 1}. \]

The second order term has a parameter $\delta_1$ which is the weakest dependence parameter of the bivariate margins.

Example 3.5. We show the quality of the approximations in parts (2) and (3) of Theorem 3.1 for (3.17) with copula (3.16). Since $b^*(\mathbf{w}; \widehat{C}) = b(\mathbf{w}; C) = (w_1^{-\delta} + \cdots + w_d^{-\delta})^{-1/\delta}$, the margins are

\[ b^*_S(w_j : j \in S) = \Big( \sum_{j \in S} w_j^{-\delta} \Big)^{-1/\delta}, \]

and these can be used to compute $s_j(b^*, \alpha)$ and $S_j(b^*, \alpha)$ via numerical integrations. The exponent function $a^*$ is in (3.4). If $(X_1, \dots, X_d)$ has the distribution in (3.17), the distribution of $X_{\max} =$

Table 1: Values of exact TCE minus $x_1$, together with first/second order approximations for the bivariate MTCJ copula with Pareto survival margins; $r = (1-p)^{-1/\alpha}$, $x_1 = x_2 = 1$, $p = 0.999$.

          α = 2                      α = 5
  δ     exact   appr1   appr2      exact    appr1    appr2
  0.1   2.114   4.063   3.349      0.3955   0.5556   0.5079
  0.3   2.257   2.464   2.290      0.4382   0.4639   0.4428
  0.5   1.968   2.000   1.969      0.4133   0.4180   0.4134
  0.7   1.761   1.766   1.761      0.3883   0.3892   0.3883
  0.9   1.622   1.624   1.622      0.3690   0.3692   0.3690
  1.1   1.526   1.526   1.526      0.3543   0.3543   0.3543
  1.3   1.456   1.456   1.456      0.3429   0.3429   0.3429
  1.5   1.402   1.402   1.402      0.3338   0.3338   0.3338
  1.7   1.360   1.360   1.360      0.3263   0.3263   0.3263
  1.9   1.326   1.326   1.326      0.3200   0.3200   0.3200

Table 2: Values of exact TCE minus $x_1$, together with first/second order approximations for the bivariate mixture of independence and MTCJ copulas, with Pareto survival margins; $r = (1-p)^{-1/\alpha}$, $x_1 = x_2 = 1$, $p = 0.999$, $\beta = 0.25$.

          α = 2                      α = 5
  δ     exact   appr1   appr2      exact    appr1    appr2
  0.1   1.951   4.063   3.349      0.3742   0.5556   0.5079
  0.3   2.227   2.464   2.290      0.4338   0.4639   0.4428
  0.5   1.957   2.000   1.969      0.4114   0.4180   0.4134
  0.7   1.755   1.766   1.761      0.3872   0.3892   0.3883
  0.9   1.622   1.624   1.622      0.3683   0.3692   0.3690
  1.1   1.523   1.526   1.526      0.3538   0.3544   0.3542
  1.3   1.453   1.456   1.455      0.3424   0.3429   0.3428
  1.5   1.400   1.402   1.402      0.3334   0.3338   0.3337
  1.7   1.358   1.360   1.360      0.3259   0.3263   0.3262
  1.9   1.324   1.326   1.325      0.3197   0.3200   0.3199

Table 3: Bounds for parts (2) and (3) of Theorem 3.1 for the MTCJ copula, with Pareto survival margins; $p = 0.999$, $(1-p)^{-1/\alpha} \alpha/(\alpha - 1) = 63.25$ and $4.98$ provides an intermediate value for $\alpha = 2$ and $5$ respectively.

          α = 2                                α = 5
  δ     LB2     UB2     LB3     UB3          LB2     UB2     LB3     UB3
  0.2   21.46   2908.   11.53   31340.       2.954   211.4   2.133   2208.
  0.5   47.21   375.8   41.61   1175.        4.270   30.05   3.967   105.2
  0.8   55.07   216.3   51.97   488.9        4.613   17.33   4.456   43.01
  1.0   57.48   177.0   55.23   353.8        4.718   14.16   4.605   30.76
  1.5   60.29   132.4   59.10   220.2        4.841   10.52   4.782   18.74
  2.0   61.45   112.9   60.72   169.2        4.893   8.944   4.857   14.21
  3.0   62.38   94.96   62.02   126.8        4.935   7.500   4.918   10.47
  4.0   62.74   86.54   62.53   108.4        4.952   6.826   4.942   8.872
  5.0   62.91   81.66   62.77   98.22        4.960   6.435   4.953   7.988
  8.0   63.11   74.54   63.05   84.07        4.970   5.869   4.967   6.764

$\max\{X_1, \dots, X_d\}$ is

\[ F_{X_{\max}}(x) = F(x, \dots, x) = 1 + \sum_{j=1}^d (-1)^j \binom{d}{j} \left( j x^{\alpha\delta} - j + 1 \right)^{-1/\delta}, \qquad x > 1. \]
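As an illustration of how quantiles of $X_{\max}$ might be obtained from this distribution function, the sketch below evaluates the sum directly and inverts it by bisection. The function names `cdf_xmax` and `var_xmax`, the bracketing strategy, and the convention that the distribution function is zero for $x \le 1$ (the margins in (3.17) live on $x > 1$) are assumptions of this example rather than anything prescribed in the text.

```python
import math

def cdf_xmax(x, d, alpha, delta):
    """Distribution function of X_max = max(X_1, ..., X_d) under (3.17)."""
    if x <= 1.0:
        return 0.0  # assumed convention: no mass at or below 1
    total = 1.0
    for j in range(1, d + 1):
        term = (j * x ** (alpha * delta) - j + 1) ** (-1.0 / delta)
        total += (-1) ** j * math.comb(d, j) * term
    return total

def var_xmax(q, d, alpha, delta):
    """Quantile (value-at-risk) of X_max at level q in (0, 1), by bisection."""
    lo, hi = 1.0, 2.0
    while cdf_xmax(hi, d, alpha, delta) < q:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf_xmax(mid, d, alpha, delta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v = var_xmax(0.999, d=2, alpha=2.0, delta=1.0)
print(v, cdf_xmax(v, d=2, alpha=2.0, delta=1.0))
```

Bisection is used only because the distribution function is monotone and cheap to evaluate; any standard root finder would serve equally well.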

Based on this distribution, expressions of the form $\mathrm{VaR}_{g(p)}(\|\mathbf{X}\|_{\max})$ can be computed numerically. Because of exchangeability, parts (2) and (3) have the form $UB_d \, [\mathbf{1}_d, \infty] \subseteq \mathrm{TCE}_p(\mathbf{X}) \subseteq LB_d \, [\mathbf{1}_d, \infty]$. Table 3 lists the values of $LB_d$ and $UB_d$ for $d = 2, 3$ with $\alpha = 2$ and $5$. As might be expected, the ratio $UB_d / LB_d$ decreases as $\delta$ and $\alpha$ increase, and increases as $d$ increases.

Example 3.6. We consider general Archimedean copulas which satisfy a regular variation condition. Consider a loss vector $(X_1, \dots, X_d)$ with copula $C$ and regularly varying margins having heavy-tail index $\alpha > 1$. Assume that the survival copula $\widehat{C}$ of $C$ is an Archimedean copula $\widehat{C}(\mathbf{u}; \phi) = \phi\big( \sum_{i=1}^d \phi^{-1}(u_i) \big)$, where the Laplace transform $\phi$ is regularly varying at $\infty$ in the sense of (1.2) with tail index $\beta > 0$. It follows from Proposition 2.8 of [14] that

\[ b^*(w_1, \dots, w_d; C) = \left( w_1^{-1/\beta} + \cdots + w_d^{-1/\beta} \right)^{-\beta}. \]

Observe that $(X_1, \dots, X_d)$ is more tail dependent as $\beta$ decreases. Thus, for $1 \le j \le d$,

\[
\begin{aligned}
S_j(b^*, \alpha) &= 1 + d^\beta \int_1^\infty \left( w^{\alpha/\beta} + d - 1 \right)^{-\beta} dw, \\
s_j(b^*, \alpha) &= \frac{\alpha}{\alpha - 1} d^\beta + d^\beta \sum_{\emptyset \ne S \subseteq \{i: i \ne j\}} (-1)^{|S|} \Big[ (|S| + 1)^{-\beta} - \int_0^1 \left( w^{\alpha/\beta} + |S| \right)^{-\beta} dw \Big].
\end{aligned}
\]
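The two constants above involve only one-dimensional integrals, so they are straightforward to evaluate. The sketch below is one possible way to do so, under assumptions made here for illustration: the function names `midpoint`, `S_j` and `s_j` are hypothetical, a plain midpoint rule is used, the integral over $(1, \infty)$ is mapped to $(0, 1)$ by $w = 1/t$, and the sum over subsets $S$ is grouped by size $|S| = k$ (there are $\binom{d-1}{k}$ subsets of each size). The quadrature is adequate for $\alpha \ge 2$; for $1 < \alpha < 2$ the transformed integrand has an integrable singularity at 0 and a better rule would be needed.

```python
import math

def midpoint(f, n=20000):
    """Midpoint rule for the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def S_j(alpha, beta, d):
    """Constant S_j(b*, alpha) for the Archimedean tail dependence function."""
    # With w = 1/t:  (w^{alpha/beta} + d - 1)^{-beta} dw
    #             =  t^{alpha - 2} (1 + (d - 1) t^{alpha/beta})^{-beta} dt  on (0, 1).
    integral = midpoint(
        lambda t: t ** (alpha - 2.0) * (1.0 + (d - 1) * t ** (alpha / beta)) ** (-beta))
    return 1.0 + d ** beta * integral

def s_j(alpha, beta, d):
    """Constant s_j(b*, alpha); subsets S are grouped by their size k."""
    total = alpha * d ** beta / (alpha - 1.0)
    for k in range(1, d):
        integral = midpoint(lambda w: (w ** (alpha / beta) + k) ** (-beta))
        total += d ** beta * math.comb(d - 1, k) * (-1) ** k * ((k + 1) ** (-beta) - integral)
    return total

for beta in (1.0, 0.1, 0.01):
    print(beta, S_j(2.0, beta, 2), s_j(2.0, beta, 2))
```

For $\alpha = 2$, $\beta = 1$, $d = 2$ the integrals have closed forms, giving $S_j = 1 + \pi/2$ and $s_j = 3 + \pi/2$, which offers a simple check of the numerics.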

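It follows from Theorem 3.1 that computable asymptotic bounds are given by

\[ (S_1(b^*, \alpha), \dots, S_d(b^*, \alpha)) + \mathbb{R}^d_+ \supseteq \lim_{p \to 1} \frac{\mathrm{TCE}_p(\mathbf{X})}{\mathrm{VaR}_p(\|\mathbf{X}\|_{\max})} \supseteq (s_1(b^*, \alpha), \dots, s_d(b^*, \alpha)) + \mathbb{R}^d_+. \]

Since

\[ \lim_{\beta \to 0} \int_1^\infty \left( w^{\alpha/\beta} + d - 1 \right)^{-\beta} dw = \int_1^\infty w^{-\alpha} \, dw = \frac{1}{\alpha - 1}, \quad \text{and} \quad \lim_{\beta \to 0} \int_0^1 \left( w^{\alpha/\beta} + |S| \right)^{-\beta} dw = 1, \]

we obtain that for fixed $\alpha > 1$,

\[ \lim_{\beta \to 0} \frac{s_j(b^*, \alpha)}{S_j(b^*, \alpha)} = 1, \qquad \text{for } 1 \le j \le d. \]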

That is, asymptotic subset and superset bounds for multivariate TCE are approximately identical for small β.
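The two limits used in this example can be justified in a few lines. The following is a brief sketch of the argument, written out here for convenience, using only the formulas for $S_j$ and $s_j$ of Example 3.6:

```latex
\begin{align*}
&\text{For } w > 1:\quad
  \bigl(w^{\alpha/\beta} + d - 1\bigr)^{-\beta}
  = w^{-\alpha}\bigl(1 + (d-1)\,w^{-\alpha/\beta}\bigr)^{-\beta}
  \le w^{-\alpha},
  \quad\text{and the left side} \to w^{-\alpha} \text{ as } \beta \to 0, \\
&\text{so by dominated convergence }
  \int_1^\infty \bigl(w^{\alpha/\beta} + d - 1\bigr)^{-\beta}\,dw \to \frac{1}{\alpha-1}. \\[4pt]
&\text{For } 0 < w < 1 \text{ and } |S| \ge 1:\quad
  (1 + |S|)^{-\beta} \le \bigl(w^{\alpha/\beta} + |S|\bigr)^{-\beta} \le |S|^{-\beta},
  \quad\text{and both bounds} \to 1 \text{ as } \beta \to 0. \\[4pt]
&\text{Hence } d^{\beta} \to 1,\quad
  (|S|+1)^{-\beta} - \int_0^1 \bigl(w^{\alpha/\beta} + |S|\bigr)^{-\beta}\,dw \to 0,\quad
  S_j(b^*,\alpha) \to \frac{\alpha}{\alpha-1},\quad
  s_j(b^*,\alpha) \to \frac{\alpha}{\alpha-1}.
\end{align*}
```

Both constants therefore converge to the comonotone value $\alpha/(\alpha-1)$ found in Example 3.3(a).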

4 Concluding Remarks

Our results illustrate how tail risk is quantitatively affected by extremal dependence and also show how the tool of tail dependence functions can be used to estimate such an asymptotic relation. Similar to the univariate case (1.4), the multivariate tail conditional expectation $\mathrm{TCE}_p(\mathbf{X})$ as $p \to 1$ is essentially linearly related to the value-at-risk of an aggregated norm of $\mathbf{X}$. In contrast to the univariate case, where the asymptotic proportionality constant is related to the heavy-tail index $\alpha$, the asymptotic proportionality constants in the multivariate case depend not only on the heavy-tail index $\alpha$ but also on the tail dependence structure (see (3.7) and Theorem 3.1). Weak tail dependence can occur at some margins in high-dimensional distributions such as vine copulas (see [14]), and the quality of the bounds presented in Theorem 3.1 is rather poor for distributions with weak tail dependence. In this situation, higher order expansions such as (3.15) should be used to reveal the dependence structure at sub-extreme levels so that more accurate, tractable bounds can be developed. Our numerical examples via the second order expansion show some significant improvements in the presence of weak tail dependence, but more theoretical study is needed in this area.


References

[1] Artzner, P., Delbaen, F., Eber, J.M. and Heath, D. (1999). Coherent measures of risks. Mathematical Finance, 9:203–228.
[2] Basrak, B., Davis, R.A. and Mikosch, T. (2002). A characterization of multivariate regular variation. Ann. Appl. Probab., 12:908–920.
[3] Basrak, B., Davis, R.A. and Mikosch, T. (2002). Regular variation of GARCH processes. Stoch. Proc. Appl., 99:95–116.
[4] Bentahar, I. (2006). Tail conditional expectation for vector-valued risks. Discussion paper 2006-029, http://sfb649.wiwi.hu-berlin.de, Technische Universität Berlin, Germany.
[5] Bingham, N.H., Goldie, C.M. and Teugels, J.L. (1987). Regular Variation. Cambridge University Press, Cambridge, UK.
[6] Cai, J. and Li, H. (2005). Conditional tail expectations for multivariate phase-type distributions. J. Appl. Prob., 42:810–825.
[7] Cheridito, P., Delbaen, F. and Klüppelberg, C. (2004). Coherent and convex monetary risk measures for bounded càdlàg processes. Stochastic Processes and their Applications, 112:1–22.
[8] Cook, R.D. and Johnson, M.E. (1981). A family of distributions for modelling non-elliptically symmetric multivariate data. J. Roy. Statist. Soc. B, 43:210–218.
[9] Delbaen, F. (2002). Coherent risk measure on general probability spaces. Advances in Finance and Stochastics: Essays in Honour of Dieter Sondermann, Eds. K. Sandmann, P.J. Schönbucher, Springer-Verlag, Berlin, 1–37.
[10] Embrechts, P., Lindskog, F. and McNeil, A. (2003). Modeling dependence with copulas and applications to risk management. Handbook of Heavy Tailed Distributions in Finance, ed. S. Rachev, Elsevier, Chapter 8, pp. 329–384.
[11] Föllmer, H. and Schied, A. (2002). Convex measures of risk and trading constraints. Finance and Stochastics, 6:426–447.
[12] Joe, H. (1993). Parametric family of multivariate distributions with given margins. J. Multivariate Anal., 46:262–282.
[13] Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman & Hall, London.
[14] Joe, H., Li, H. and Nikoloulopoulos, A.K. (2008). Tail dependence functions and vine copulas. Submitted to Journal of Multivariate Analysis.
[15] Jouini, E., Meddeb, M. and Touzi, N. (2004). Vector-valued coherent risk measures. Finance and Stochastics, 8:531–552.
[16] Klüppelberg, C., Kuhn, G. and Peng, L. (2008). Semi-parametric models for the multivariate tail dependence function – the asymptotically dependent. Scandinavian Journal of Statistics, 35(4):701–718.
[17] Landsman, Z. and Valdez, E. (2003). Tail conditional expectations for elliptical distributions. North American Actuarial Journal, 7:55–71.
[18] Li, H. (2009). Orthant tail dependence of multivariate extreme value distributions. Journal of Multivariate Analysis, 100:243–256.
[19] Li, H. and Sun, Y. (2008). Tail dependence for heavy-tailed scale mixtures of multivariate distributions. Technical Report, Department of Mathematics, Washington State University.
[20] Mardia, K.V. (1962). Multivariate Pareto distributions. Ann. Math. Statist., 33:1008–1015.
[21] McNeil, A.J., Frey, R. and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press, Princeton, New Jersey.
[22] Nelsen, R. (2006). An Introduction to Copulas. Springer, New York.
[23] Nikoloulopoulos, A.K., Joe, H. and Li, H. (2009). Extreme value properties of multivariate t copulas. Extremes, 12:129–148.
[24] Resnick, S. (1987). Extreme Values, Regular Variation, and Point Processes. Springer, New York.
[25] Resnick, S. (2007). Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, New York.
[26] Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris, 8:229–231.
[27] Takahasi, K. (1965). Note on the multivariate Burr's distribution. Ann. Inst. Statist. Math., 17:257–260.
