NEARLY ROUND SPHERES LOOK CONVEX

A. FIGALLI, L. RIFFORD, AND C. VILLANI

Abstract. We prove that a Riemannian manifold $(M, g)$, close enough to the round sphere in the $C^4$ topology, has uniformly convex injectivity domains, so that $M$ appears uniformly convex in any exponential chart. The proof is based on the Ma–Trudinger–Wang nonlocal curvature tensor, which originates from the regularity theory of optimal transport.

Contents
1. Introduction
2. Discussion
3. Variations of Jacobi fields
4. Convexity of nonfocal domains
5. Stability of MTW condition
6. Convexity of injectivity domains
References


1. Introduction

The main result of this paper states, in short, that the round sphere is “robustly intrinsically convex”, or “robustly log convex”, in a sense which we shall now explain.

Let $(M, g)$ be a $C^\infty$ compact Riemannian manifold of dimension $n \ge 2$, and $\exp$ the associated Riemannian exponential. For any $x \in M$, $v \in T_xM \setminus \{0\}$, we define
$$t_C(x,v) = \text{cut time of } (x,v) = \max\bigl\{ t \ge 0;\ (\exp_x(sv))_{0 \le s \le t} \text{ is a minimizing geodesic} \bigr\}.$$


Then for any $x \in M$ we let
$$\mathrm{TCL}(x) = \text{tangent cut locus of } x = \bigl\{ t_C(x,v)\,v;\ v \in T_xM \setminus \{0\} \bigr\};$$
$$I(x) = \text{injectivity domain of the exponential map at } x = \bigl\{ tv;\ 0 \le t < t_C(x,v),\ v \in T_xM \setminus \{0\} \bigr\}.$$
We write $\exp^{-1}$ or $\log$ for the inverse of the exponential map: by convention $\log_x(y) = \exp_x^{-1}(y)$ is the set of minimizing velocities $v$ such that $\exp_x v = y$. In particular $\mathrm{TCL}(x) = \log_x(\mathrm{cut}(x))$, and $I(x) = \log_x(M \setminus \mathrm{cut}(x))$. So the injectivity domain at $x$ is the parameterization of $M$ (after “cutting out the cut locus”) in the maximal exponential chart centered at $x$.

Ever since its introduction by Poincaré [31], the cut locus has retained some mystery; see Berger [2, Subsection 6.5.4] for a review. Around 1960 it played a key role in Klingenberg’s proof of the topological sphere theorem, as exposed e.g. in [10, Chapter 13]. Since then not so much has been found, except for explicit computations in particular geometries,$^1$ and local properties, such as the Lipschitz continuity and $(n-1)$-dimensional rectifiability [6, 21, 25, 30]. The second-order behavior is still open: for instance it is not known whether $I(x)$ is a semiconvex set [28, Appendix B], or if $\mathrm{TCL}(x)$ is an Alexandrov space [21, Problem 3.4]. A genuinely nonsmooth object, the cut locus is in general not triangulable [32] and in high enough dimension does not depend smoothly on the metric, even for generic manifolds [3, 4].

In the present paper we prove a global, perturbative geometric result of a new nature on the injectivity domain, or equivalently on the tangent cut locus:

Theorem 1.1. Let $(M, g)$ be a $C^4$ perturbation of the round sphere $S^n$. Then all injectivity domains of $M$ are uniformly convex.

An informal way to paraphrase the conclusion is to say that $M$ appears convex from any of its points. Here are some first comments.

Remarks 1.2.
(1) In Theorem 1.1, “$C^4$ perturbation” means that $M$ is $S^n$ equipped with a metric $g$, such that $\|g - g^0\|_{C^4} \le \delta$, where $g^0$ is the round metric, $\delta = \delta(n)$ is small enough, and the $C^4$ norm is computed in a choice of local charts. This implies that the exponential map is a $C^3$ perturbation of the “round” exponential (recall that the geodesic equations involve the Christoffel symbols, which are derivatives of the metric).
(2) It will be clear from the proof that the result holds under the more intrinsic assumption that the Riemannian curvature is $C^2$-close to the (constant) Riemannian curvature of the round sphere.
(3) Theorem 1.1 was established for $n = 2$ in [14], except for the fact that only strict convexity was proven.

$^1$ Shockingly, the cut locus of the ellipsoid was rigorously described only a few years ago [20].

In the next section we shall provide many more comments, discuss the main difficulties and ingredients behind Theorem 1.1, and set up some preliminaries for the proof.

2. Discussion

2.1. Focalization. The major difficulty in the proof of Theorem 1.1, as in the vast majority of studies on the cut locus, is the focalization phenomenon. Let us introduce some more notation:
$$t_F(x,v) = \text{focalization time of } (x,v) = \inf\bigl\{ t \ge 0;\ \det(d_{tv}\exp_x) = 0 \bigr\};$$
$$\mathrm{TFL}(x) = \text{tangent focal locus of } x = \bigl\{ t_F(x,v)\,v;\ v \in T_xM \setminus \{0\} \bigr\};$$
$$\mathrm{NF}(x) = \text{nonfocal domain at } x = \bigl\{ tv;\ 0 \le t < t_F(x,v),\ v \in T_xM \setminus \{0\} \bigr\}.$$
It is a classical result that $I(x) \subset \mathrm{NF}(x)$, see e.g. [18, Corollary 3.77] or [33, Problem 8.8]. In negative sectional curvature, there is no focalization and TFL is empty; conversely, in positive curvature there is focalization in all directions, and the tangent cut locus is “surrounded” by the tangent focal locus (remember that $M$ is assumed to be compact).

The tangent focal locus is much better understood than the tangent cut locus. For instance, for any complete Riemannian manifold $(M, g)$ the tangent focal locus is included in a countable union of smooth hypersurfaces; and nonfocal domains are semiconvex [6]. However, the focal locus is also the source for most of the major difficulties in the analysis of the cut locus. In fact the “bad set” is the tangent

focal cut locus, defined by
$$\mathrm{TFCL}(x) = \mathrm{TCL}(x) \cap \mathrm{TFL}(x). \tag{2.1}$$

To illustrate this, let us consider $M = \mathbb{RP}^n = S^n/\{\pm \mathrm{Id}\}$; then $\mathrm{TFCL} = \emptyset$. This nonfocality property makes it possible to locally describe $\mathrm{TCL}(x)$, for a perturbation of $\mathbb{RP}^n$, by the Implicit Function Theorem [28, Appendix C]. Then it is easy to prove the convexity of injectivity domains as soon as $(M, g)$ is just a $C^3$ perturbation of $\mathbb{RP}^n$. In other words, if in Theorem 1.1 we replace $S^n$ by $\mathbb{RP}^n$ the result becomes just an exercise, and the conclusion can be improved.

More generally, the nonfocality property is also true for any manifold with sufficiently pinched positive curvature and nontrivial topology. Indeed, if all sectional curvatures are close to 1 and $M$ is not homeomorphic to the sphere, then by the Grove–Shiohama theorem [19] [1, Theorem 1.9], the diameter cannot be much larger than $\pi/2$, so there is $\varepsilon > 0$ small such that $t_C \le \pi/2 + \varepsilon$ throughout the whole unit tangent bundle; while by classical comparison theorems [10, Chapter 10] $t_F \simeq \pi$. We are not aware of any general result of convexity of injectivity domains on quotients of the round sphere, but if such a property holds then it will survive $C^3$ perturbations.

But on $S^n$ things are made much more tricky by the focalization (in this case $\mathrm{TFCL} = \mathrm{TCL}$, i.e. the whole tangent cut locus is focal). If $M$ is a perturbation of $S^n$, then the tangent focal locus of $M$ is locally defined by the equation $\det(d_v \exp_x) = 0$, so the convexity of the nonfocal domains is guaranteed only if $d\exp$ is a $C^2$ perturbation of the “round” $d\exp$; this means that $g$ should be $C^4$-close to the round metric. (The sufficiency of the $C^2$-perturbation of $d\exp$ is easy on $S^2$ because we can apply the Implicit Function Theorem. In dimension $n$ a more subtle reasoning is required, see Section 4 below.)
And focalization is not a rare event for simply connected manifolds: according to a classical result by Klingenberg [24], amplified by Weinstein [35], if $M$ is a simply connected Riemannian manifold with strictly 1/4-pinched positive sectional curvatures, then the injectivity radius coincides with the conjugate radius, therefore the tangent focal locus and tangent cut locus intersect ($\mathrm{TFCL} \neq \emptyset$). In even dimension, the pinching condition is not necessary and positive curvatures are sufficient. (For surfaces, such results go back to Poincaré himself.)

Conclusion: What makes Theorem 1.1 nontrivial is the fact that the sphere is simply connected, allowing focalization at the cut locus; and it is for the same reason that the exponent 4 (rather than 3) is natural, and possibly optimal.


2.2. The Ma–Trudinger–Wang tensor. Although the conclusion of Theorem 1.1 is natural and simple, our proof is quite indirect, since it is based on the Ma–Trudinger–Wang curvature tensor (MTW tensor), introduced in [29] and further studied in [8, 14, 16, 22, 23, 26, 28, 33, 34], in relation with the regularity theory of optimal transport; see [13] or [33, Chapter 12] for a presentation and survey. Below is a precise definition, with the same conventions as in [33]; we write $\nabla^2_x F$ for the Hessian of $F$ at $x$, and $d$ for the geodesic distance.

Definition 2.1 (MTW tensor). Let $(x, y) \in (M \times M) \setminus \mathrm{cut}(M)$, and $(\xi, \zeta) \in T_xM \times T_yM$. Then
• the pseudo-scalar product of $\xi$ and $\zeta$ is defined by
$$\langle \xi, \zeta \rangle_{x,y} = \langle \xi, \eta \rangle_x, \qquad v = (\exp_x)^{-1}(y), \qquad \eta = (d_v \exp_x)^{-1}(\zeta);$$
• the MTW tensor evaluated on $(\xi, \zeta)$ is
$$S_{(x,y)}(\xi,\zeta) = -\frac{3}{2}\,\frac{d^2}{ds^2}\Big|_{s=0}\,\frac{d^2}{dt^2}\Big|_{t=0}\,\frac{d^2\bigl(\exp_x(t\xi),\,\exp_x(v+s\eta)\bigr)}{2} = -\frac{3}{2}\,\frac{d^2}{ds^2}\Big|_{s=0}\,\Bigl\langle \nabla^2_x\,\frac{d^2\bigl(\,\cdot\,,\exp_x(v+s\eta)\bigr)}{2}\cdot\xi,\ \xi \Bigr\rangle.$$

The MTW tensor is a nonlocal generalization of sectional curvature. Indeed, if $P$ is a tangent plane included in $T_xM$, with orthonormal basis $\{\xi, \eta\}$, then $S_{(x,x)}(\xi, \eta)$ is the sectional curvature at $x$ along $P$ [33, Particular Case 12.30]. Another geometric interpretation of this tensor can be found in [22].

The Ma–Trudinger–Wang condition requires that for all $(x, y) \in (M \times M) \setminus \mathrm{cut}(M)$, and $(\xi, \zeta) \in T_xM \times T_yM$,
$$\langle \xi, \zeta \rangle_{x,y} = 0 \ \Longrightarrow\ S_{(x,y)}(\xi, \zeta) \ge 0. \tag{2.2}$$

This condition implies nonnegative sectional curvature. It may or may not be satisfied by M ; and if it is not, this has dramatic consequences on the regularity theory of optimal transport [33, Theorem 12.44]. Various conditions reinforcing the MTW condition (extended or not) have been introduced and studied; they can be thought of as nonlocal variants of the condition of positive lower bound on the sectional curvature. Away from focalization, all these conditions are equivalent, but it is not so in presence of focalization. To state the condition used in this paper, we need first an extended notion of the MTW tensor, and secondly a bit of background in Jacobi fields analysis.


2.3. The extended Ma–Trudinger–Wang tensor. In Definition 2.1 the exponential map induces a one-to-one correspondence between $v \in I(x)$ and $y \in M \setminus \mathrm{cut}(x)$, and its differential induces a one-to-one correspondence between $\eta \in T_xM$ and $\zeta \in T_yM$; so it makes sense to abuse notation by writing, say, $S_{(x,y)}(\xi, \zeta) = S_{(x,v)}(\xi, \eta)$. Then the latter object may be extended by letting $v$ vary in the whole nonfocal domain rather than in the injectivity domain.

To define this extension, we let $x \in M$, $v \in \mathrm{NF}(x)$, and $(\xi, \eta) \in T_xM \times T_xM$. Since $y := \exp_x v$ is not conjugate to $x$, by the Inverse Function Theorem there are an open neighborhood $V$ of $(x, v)$ in $TM$, and an open neighborhood $W$ of $(x, y)$ in $M \times M$, such that
$$\Psi_{(x,v)} : V \subset TM \longrightarrow W \subset M \times M, \qquad (x', v') \longmapsto \bigl(x', \exp_{x'}(v')\bigr)$$
is a smooth diffeomorphism from $V$ to $W$. Then we may define $\hat c_{(x,v)} : W \to \mathbb{R}$ by
$$\hat c_{(x,v)}(x', y') := \frac{1}{2}\,\bigl|\Psi^{-1}_{(x,v)}(x', y')\bigr|^2_{x'} \qquad \forall (x', y') \in W. \tag{2.3}$$
If $v \in I(x)$ then for $y'$ close to $\exp_x v$ and $x'$ close to $x$ we have $\hat c_{(x,v)}(x', y') = c(x', y') := d(x', y')^2/2$.

Definition 2.2 (extended MTW tensor). Let $x \in M$, $v \in \mathrm{NF}(x)$, and $(\xi, \eta) \in T_xM \times T_xM$. Then the extended MTW tensor at $(x, v)$, evaluated on $(\xi, \eta)$, is
$$\begin{aligned} S_{(x,v)}(\xi,\eta) &= -\frac{3}{2}\,\frac{d^2}{ds^2}\Big|_{s=0}\,\frac{d^2}{dt^2}\Big|_{t=0}\ \hat c_{(x,v)}\bigl(\exp_x(t\xi),\,\exp_x(v+s\eta)\bigr) \\ &= -\frac{3}{2}\,\frac{d^2}{ds^2}\Big|_{s=0}\,\Bigl\langle \nabla^2_x\,\hat c_{(x,v)}\bigl(\,\cdot\,,\exp_x(v+s\eta)\bigr)\cdot\xi,\ \xi \Bigr\rangle_x \\ &= -\frac{3}{2}\,\frac{d^2}{ds^2}\Big|_{s=0}\,\Bigl\langle \nabla^2_x\,\hat c_{(x,v+s\eta)}\bigl(\,\cdot\,,\exp_x(v+s\eta)\bigr)\cdot\xi,\ \xi \Bigr\rangle_x. \end{aligned}$$

We note that $\nabla^2_x \hat c_{(x,v)}(x, \exp_x v)$ blows up as $v$ approaches $\mathrm{TFL}(x)$.$^2$ In contrast, all the $x$-derivatives of $c(x, \exp_x v)$ remain bounded (but not continuous) if $v$ approaches a nonfocal cut velocity.

$^2$ Beware of confusions: $\nabla^2_x \hat c_{(x,v)}(x, \exp_x v)$ means $\nabla^2_{x'} \hat c_{(x,v)}(x', y')$ evaluated at $x' = x$, $y' = \exp_x v$.


2.4. Jacobi fields. Jacobi fields are variations of geodesics [10, Chapter 5]. Given a geodesic $\gamma$ and a moving reference frame along $\gamma$, all Jacobi fields along $\gamma$ can be reconstructed from two “elementary” matrix-valued functions, which we denote by $J_0$ and $J_1$. In the next statement we use dots for derivatives with respect to the $t$ variable, and write $I_n$ for the $n \times n$ identity matrix, $0_n$ for the $n \times n$ zero matrix.

Definition 2.3 (elementary Jacobi fields). Let $(x, v) \in TM$, $v \neq 0$. Let $(e_1, \dots, e_n)$ be an orthonormal basis of $T_xM$ with $e_1 = v/|v|_x$. For $t \ge 0$ let $\gamma(t) = \exp_x(tv)$, and let $(e_1(t), \dots, e_n(t))$ be the orthonormal basis of $T_{\gamma(t)}M$ obtained by parallel transport of $(e_1, \dots, e_n)$ along $\gamma$. Let further, for $t \ge 0$,
$$R_{ij}(t) = \Bigl\langle \mathrm{Riem}_{\gamma(t)}\bigl(\dot\gamma(t), e_i(t)\bigr)\,\dot\gamma(t),\ e_j(t) \Bigr\rangle_{\gamma(t)}, \qquad 1 \le i, j \le n, \tag{2.4}$$
where Riem stands for the Riemann curvature tensor. We define $J_0(t)$, $J_1(t)$, implicitly depending on $x$ and $v$, as the matrix-valued solutions of
$$\begin{cases} \ddot J_i(t) + R(t)\,J_i(t) = 0, & i = 0, 1, \\ J_0(0) = 0_n, \quad \dot J_0(0) = I_n, \\ J_1(0) = I_n, \quad \dot J_1(0) = 0_n. \end{cases} \tag{2.5}$$

The Hessian of the squared distance can be expressed in terms of $J_0$ and $J_1$:

Proposition 2.4. With the same notation as in Definition 2.3, for $0 \le t < t_F(x, v)$, let $S(x, v, t)$ be the linear operator $T_xM \to T_xM$ whose matrix in the orthonormal basis $(e_1, \dots, e_n)$ is given by $t\,J_0(t)^{-1} J_1(t)$; then this operator is symmetric. If $v \in I(x)$, then for any $\xi \in T_xM$,
$$\bigl\langle S(x,v,1)\,\xi,\ \xi \bigr\rangle_x = \Bigl\langle \nabla^2_x\,\frac{d(\,\cdot\,, y)^2}{2}\cdot\xi,\ \xi \Bigr\rangle_x = \frac{d^2}{d\tau^2}\Big|_{\tau=0}\,\frac{d\bigl(\exp_x(\tau\xi),\,y\bigr)^2}{2}, \qquad y = \exp_x v.$$

The proof can be found in [33, Chapter 14], see in particular p. 414. This statement is also implicit in [7, Section 2] or [14, Section 2].

For any $x \in M$, $v \in \mathrm{NF}(x) \setminus \{0\}$, we write
$$S_{(x,v)} = S(x, v, 1) = S\Bigl(x,\ \frac{v}{|v|_x},\ |v|_x\Bigr), \tag{2.6}$$
so that if $v \in I(x)$, then $S_{(x,v)}$ coincides with $\nabla^2_x\, d(\,\cdot\,, \exp_x v)^2/2$. By mimicking the proof of Proposition 2.4, one easily obtains the following result: if $v \in \mathrm{NF}(x)$, then for any $\xi \in T_xM$,
$$\bigl\langle S_{(x,v)}\,\xi,\ \xi \bigr\rangle_x = \bigl\langle \nabla^2_x\,\hat c_{(x,v)}(\,\cdot\,, y)\cdot\xi,\ \xi \bigr\rangle_x = \frac{d^2}{d\tau^2}\Big|_{\tau=0}\,\hat c_{(x,v)}\bigl(\exp_x(\tau\xi),\,y\bigr), \qquad y = \exp_x v.$$

In particular, the extended MTW tensor can be computed as follows:
$$S_{(x,v)}(\xi, \eta) = -\frac{3}{2}\,\frac{d^2}{ds^2}\Big|_{s=0}\,\bigl\langle S_{(x,v+s\eta)}\,\xi,\ \xi \bigr\rangle_x. \tag{2.7}$$
In the sequel, we will always use this formula to compute the extended MTW tensor on perturbations of the sphere, see Section 5. Let us further observe that, since the above formula involves only derivatives of Jacobi fields, the (extended) MTW tensor depends on the metric only through its Riemannian curvature. (Compare with Remark 1.2(2).)

Modulo identification, $J_1(t)$ sends $T_xM$ to $T_{\gamma(t)}M$, then $J_0(t)^{-1}$ does the reverse; so $S$ is an endomorphism of $T_xM$. Accordingly, we shall never need to consider the moving basis $(e_1(t), \dots, e_n(t))$, but only the fixed basis $(e_1, \dots, e_n)$ which we identify with the canonical basis of $\mathbb{R}^n$.

The symmetric matrix $S$ has an eigenvalue 1 on $\mathbb{R}e_1$ (the extended squared distance grows quadratically along the geodesic). For the round sphere, all other eigenvalues of $S^{-1}$ vanish at focalization ($t = \pi$). If the metric $g$ is close to the round metric $g^0$ and $t$ is close to, but strictly less than the focalization time $t_F(v/|v|_x)$, we may define $\Lambda$ by
$$S = \begin{pmatrix} 1 & 0 \\ 0 & -\Lambda^{-1} \end{pmatrix}. \tag{2.8}$$
(More intrinsically, $-\Lambda$ is the inverse of the restriction of $S$ to $(\mathbb{R}e_1)^\perp$.) Then, for any $\varepsilon > 0$, if $\|g - g^0\|_{C^2} \le \delta$ and $t_F - \delta \le t < t_F$, $\delta = \delta(\varepsilon)$ small enough, we have
$$0 < \Lambda \le \varepsilon\, I_{n-1}, \tag{2.9}$$
where $I_{n-1}$ can be thought of as the identity on $(\mathbb{R}e_1)^\perp$. The operator $\Lambda$ is smooth even at focalization, where its determinant vanishes.

In the sequel we shall abuse notation by writing $\Lambda^{-1}$ for the operator $\xi \longmapsto \Lambda^{-1}\tilde\xi$, where $\tilde\xi$ is the orthogonal projection of $\xi$ on $e_1^\perp$. Note carefully that while $\Lambda$ is defined only near the focalization time, $\Lambda^{-1}$ is defined for any $v \in \mathrm{NF}(x)$ as minus the restriction of $S$ to $(\mathbb{R}e_1)^\perp$.
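As a sanity check on Definition 2.3 and (2.8), here is a minimal Python sketch for the unit round sphere. It assumes that, in a direction orthogonal to $e_1$, the curvature coefficient in (2.5) is $R = 1$ (this is specific to the round metric of curvature 1), integrates the scalar Jacobi equation with a hand-written RK4 scheme, and compares $t\,J_0^{-1}J_1$ with the closed form $t\cos t/\sin t$. The function names are illustrative and not taken from the paper.

```python
import math

def jacobi_perp(t, steps=4000):
    """Integrate j'' + j = 0 on [0, t] with classical RK4 for the two
    elementary solutions j0(0)=0, j0'(0)=1 and j1(0)=1, j1'(0)=0.
    Assumption: unit round sphere, direction orthogonal to the geodesic,
    so the curvature coefficient R in (2.5) equals 1."""
    def f(y):
        j0, dj0, j1, dj1 = y
        return [dj0, -j0, dj1, -j1]
    y = [0.0, 1.0, 1.0, 0.0]
    h = t / steps
    for _ in range(steps):
        k1 = f(y)
        k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = f([a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y[0], y[2]  # (j0(t), j1(t))

def s_perp(t):
    """Eigenvalue of S = t * J0^{-1} J1 on the orthogonal complement of e1."""
    j0, j1 = jacobi_perp(t)
    return t * j1 / j0

def lam(t):
    """Lambda from (2.8): minus the inverse of S restricted to e1-perp."""
    return -1.0 / s_perp(t)
```

Under these assumptions `lam(t)` should agree with $-\tan t/t$, which is positive and small for $t$ slightly below $\pi$, in line with (2.9).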


2.5. Scheme of proof. Theorem 1.1 is obtained by concatenating three results of independent interest:

Theorem 2.5. If $(M, g)$ is a $C^4$ perturbation of the round sphere, then all its nonfocal domains are uniformly convex.

Theorem 2.6. If $(M, g)$ is a $C^4$ perturbation of the round sphere, then it satisfies an extended uniform Ma–Trudinger–Wang condition of the form
$$\forall x \in M,\ \forall v \in \mathrm{NF}(x) \setminus \{0\}, \qquad S_{(x,v)}(\xi, \eta) \ge \kappa\,\bigl(|\xi|_x^2 + |\Lambda^{-1}\xi|_x^2\bigr)\,|\eta|_x^2 - c\,\langle \xi, \eta \rangle_x^2, \tag{2.10}$$
where $\kappa$, $c$ are positive constants, and we used the notation defined in Subsection 2.4.

Theorem 2.7. If $(M, g)$ is a $C^\infty$ compact Riemannian manifold satisfying (2.10), and all its nonfocal domains are uniformly convex, then all its injectivity domains are also uniformly convex.

Theorem 2.7 may look surprising and calls for comments. The convexity of the nonfocal domains is a “pseudo-local” property, in the sense that it only depends on the behavior of the metric in the neighborhood of an arbitrary geodesic (before focalization time). The same is true for the positivity of the extended MTW tensor, for instance in the form of (2.10). However, the combination of these two properties will imply a property about injectivity domains, which is of “completely global” nature. So our results can be compared to other theorems relating local curvature conditions to global properties, the most classical being the Bonnet–Myers theorem. We also note that the positivity of the MTW tensor carries more information than just the convexity of tangent injectivity domains, since it implies continuity results for the solution of optimal transport problems (see [14] for the two-dimensional case, and [16] for the general case).

Theorems 2.5, 2.6 and 2.7 will be proven respectively in Sections 4, 5 and 6. As a preliminary step, in Section 3 we shall establish useful integral representations for variations of Jacobi fields. A somewhat mysterious step in the proof will be an explicit computation, performed in Subsection 5.4, in which all the potentially dangerous terms in a certain inequality will combine for no apparent reason to form an exact square. This might be the indication of some deeper unexplored structure.

2.6. Bibliographical notes. For the convenience of the reader, we shall present self-contained proofs; but our work builds on a number of earlier conceptual contributions. Here is a short account.


The uniform convexity of nonfocal domains on a perturbation of the sphere was already proven by Castelpietra and Rifford [14] with a “symplectic” approach. In this paper we shall provide a more direct Riemannian approach.

After the works of Ma, Trudinger and Wang [29] and Loeper [26], it was known that the positivity of the MTW tensor, together with the convexity of tangent injectivity domains, were sufficient conditions for the regularity of optimal transport on Riemannian manifolds, when the cost function is the squared distance, and cut locus issues are avoided (we refer to the above-mentioned references for precise statements). Then Loeper [27] discovered the relation with sectional curvature. He further showed that the round sphere satisfies a strict form of the MTW condition; this result was improved by Kim and McCann [22], and Figalli and Rifford [14, Appendix].

Loeper and Villani [28] conjectured a general relation between the MTW condition and the shape of the tangent cut locus, and proved that the positivity of the MTW curvature implies the convexity of injectivity domains, under a stringent technical restriction of nonfocality.

The focalization problem motivated several advances: a “probabilistic” perturbation lemma for paths crossing the tangent cut locus [17], and more importantly the introduction of the extended Ma–Trudinger–Wang tensor by Figalli and Rifford [14]. Then in [14] the stability of the extended MTW condition around the round two-dimensional sphere was established, and from there the strict convexity of tangent injectivity domains was deduced. In the present work we shall work in higher dimension, and improve the conclusion from strict to uniform convexity.

Many of the above-mentioned works use an inequality of maximum principle type, introduced by Loeper [26], simplified by Kim and McCann [22], later simplified again and modified in [33, Theorem 12.36] [14, Lemma 3.3] [28, Theorem 3.1].
Another variant of this inequality will be used in Section 6 below.

3. Variations of Jacobi fields

It is well-known that if the matrix-valued function $J(t)$ satisfies the Jacobi equation $\ddot J + RJ = 0$, then $U = \dot J J^{-1}$ solves the Riccati equation $\dot U + U^2 + R = 0$. In this section, we shall establish equations of a related spirit bearing on first-order variations of Jacobi fields.

Lemma 3.1. With the notation from Definition 2.3,
(a) $J_0 J_1^* = J_1 J_0^*$.
(b) $\dot J_0 \dot J_1^* = \dot J_1 \dot J_0^*$.
(c) $\dot J_0 J_1^* - \dot J_1 J_0^* = I_n$.

Lemma 3.2. The general solution of the matrix-valued inhomogeneous Jacobi equation
$$\ddot{\mathcal J}(t) + R(t)\,\mathcal J(t) = M(t) \tag{3.1}$$
is given by the formula
$$\mathcal J(t) = J_1(t)\,\mathcal J(0) + J_0(t)\,\dot{\mathcal J}(0) + J_0(t)\int_0^t J_1(s)^*\,M(s)\,ds - J_1(t)\int_0^t J_0(s)^*\,M(s)\,ds. \tag{3.2}$$
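The two lemmas can be checked numerically in the scalar case $n = 1$, where (a) and (b) are trivial, (c) is the Wronskian identity, and (3.2) is the variation-of-constants formula. The sketch below is only an illustration: the coefficient `r`, the source term `m`, and the initial data `a`, `b` are made-up choices, and the integrals in (3.2) are carried along as extra components of the ODE system.

```python
import math

def r(t):
    # Hypothetical curvature coefficient R(t) (scalar case n = 1).
    return 1.0 + 0.3 * math.sin(t)

def m(t):
    # Hypothetical inhomogeneous term M(t).
    return math.cos(2.0 * t)

def rhs(t, y):
    # State: j0, j0', j1, j1', u, u', int_0^t j1*m, int_0^t j0*m
    j0, dj0, j1, dj1, u, du, i1, i0 = y
    return [dj0, -r(t) * j0,
            dj1, -r(t) * j1,
            du, -r(t) * u + m(t),
            j1 * m(t),
            j0 * m(t)]

def check_lemmas(T=2.0, a=0.7, b=-0.4, steps=4000):
    """Return (Wronskian at T, direct solution u(T), value of formula (3.2) at T)."""
    y = [0.0, 1.0, 1.0, 0.0, a, b, 0.0, 0.0]
    h = T / steps
    t = 0.0
    for _ in range(steps):  # classical RK4
        k1 = rhs(t, y)
        k2 = rhs(t + 0.5 * h, [p + 0.5 * h * q for p, q in zip(y, k1)])
        k3 = rhs(t + 0.5 * h, [p + 0.5 * h * q for p, q in zip(y, k2)])
        k4 = rhs(t + h, [p + h * q for p, q in zip(y, k3)])
        y = [p + h / 6.0 * (q1 + 2 * q2 + 2 * q3 + q4)
             for p, q1, q2, q3, q4 in zip(y, k1, k2, k3, k4)]
        t += h
    j0, dj0, j1, dj1, u, du, i1, i0 = y
    wronskian = dj0 * j1 - dj1 * j0                   # Lemma 3.1(c), expected 1
    formula = j1 * a + j0 * b + j0 * i1 - j1 * i0     # right-hand side of (3.2)
    return wronskian, u, formula
```

With these toy inputs, the Wronskian should stay equal to 1 and the directly integrated solution should match the right-hand side of (3.2) up to discretization error.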

Proof of Lemma 3.1. Property (a) is equivalent to the symmetry of $J_1^{-1} J_0$, which follows from Proposition 2.4. Then by time-differentiation of (a) we get
$$\dot J_0 J_1^* + J_0 \dot J_1^* = \dot J_1 J_0^* + J_1 \dot J_0^*,$$
which means that $\dot J_0 J_1^* - \dot J_1 J_0^*$ is symmetric. By time-differentiation again, and use of the Jacobi equation and of (a), we obtain
$$\frac{d}{dt}\bigl(\dot J_0 J_1^* - \dot J_1 J_0^*\bigr) = \dot J_0 \dot J_1^* - \dot J_1 \dot J_0^* + R\,\bigl(J_1 J_0^* - J_0 J_1^*\bigr) = \dot J_0 \dot J_1^* - \dot J_1 \dot J_0^*.$$
This matrix is obviously antisymmetric, but it is also symmetric as the time-derivative of a symmetric matrix; so it vanishes identically, which proves (b). Then $\dot J_0 J_1^* - \dot J_1 J_0^*$ is time-independent, and therefore constantly equal to its value at $t = 0$, which yields (c). □

Proof of Lemma 3.2. Both sides of (3.2) have the same conditions at $t = 0$, so it is sufficient to check that $\ddot G + RG = M$, where $G$ is the right-hand side of (3.2). Thanks to the Jacobi equation and Lemma 3.1(a)(c), we have
$$\begin{aligned} \ddot G + RG &= 2\,\bigl(\dot J_0(t)\,J_1(t)^* - \dot J_1(t)\,J_0(t)^*\bigr)\,M(t) \\ &\quad + \bigl(J_0(t)\,\dot J_1(t)^* - J_1(t)\,\dot J_0(t)^*\bigr)\,M(t) \\ &\quad + \bigl(J_1(t)\,J_0(t)^* - J_0(t)\,J_1(t)^*\bigr)\,\dot M(t) \\ &= M(t) \end{aligned}$$
(observe that $J_0 \dot J_1^* - J_1 \dot J_0^* = -\bigl(\dot J_0 J_1^* - \dot J_1 J_0^*\bigr)^* = -I_n$), and the proof is complete. □

Now comes the main result of this section:


Proposition 3.3. If $J_0(\alpha, t)$, $J_1(\alpha, t)$ are a family of Jacobi fields, defined by (2.5), and depending smoothly on an extra parameter $\alpha$, then $J_i' = \partial J_i/\partial\alpha$ and $\dot J_i = \partial J_i/\partial t$ satisfy, whenever $J_1$ is invertible,
$$J_1^{-1} J_0' = A_0 - KC, \qquad J_1^{-1} J_1' = C^* - K A_1, \tag{3.3}$$
$$J_1^{-1} \dot J_0 = I + T_0 - KD, \qquad J_1^{-1} \dot J_1 = D^* - K T_1 - K R_0, \tag{3.4}$$
where
$$K(t) = J_1(t)^{-1} J_0(t), \qquad I = I_n, \qquad R_0 = R(0), \tag{3.5}$$
$$A_0(t) = \int_0^t J_0(s)^*\,R'(s)\,J_0(s)\,ds, \qquad A_1(t) = \int_0^t J_1(s)^*\,R'(s)\,J_1(s)\,ds, \tag{3.6}$$
$$T_0(t) = \int_0^t J_0(s)^*\,\dot R(s)\,J_0(s)\,ds, \qquad T_1(t) = \int_0^t J_1(s)^*\,\dot R(s)\,J_1(s)\,ds, \tag{3.7}$$
$$C(t) = \int_0^t J_1(s)^*\,R'(s)\,J_0(s)\,ds, \qquad D(t) = \int_0^t J_1(s)^*\,\dot R(s)\,J_0(s)\,ds. \tag{3.8}$$
In particular, $A_0$, $A_1$, $T_0$, $T_1$, $C$, $D$ have vanishing first row and first column; moreover $K$, $R_0$, $A_0$, $A_1$, $T_0$, and $T_1$ are symmetric.

Proof of Proposition 3.3. The fields $\mathcal J_i = J_i'$ and $G_i = \dot J_i$ satisfy
$$\ddot{\mathcal J}_i + R\,\mathcal J_i = -R' J_i, \qquad \ddot G_i + R\,G_i = -\dot R\,J_i.$$
Then the conclusion follows in a straightforward way from Lemma 3.2. □



4. Convexity of nonfocal domains

In this section we prove Theorem 2.5, referring to [28, Appendix B] for some basic properties of uniformly convex sets.

We first note that for the round metric $g^0$, we have $t_F(x, \sigma) = \pi$ for all $\sigma \in U_xM \simeq S^{n-1}$ (the space of unit tangent vectors at $x$). The goal is to prove that for any given $\varepsilon > 0$, if $g$ is close enough to $g^0$ then for all $\sigma \in U_xM$,
$$|t_F(x, \sigma) - \pi| \le \varepsilon, \qquad |\nabla_\sigma t_F(x, \sigma)| \le \varepsilon, \qquad \nabla^2_\sigma t_F(x, \sigma) \le \varepsilon\,\mathrm{Id}. \tag{4.1}$$

The last condition should be interpreted in the weak sense of support functions: to prove (4.1) it is sufficient to show that for every $\bar\sigma$ there is a $C^2$ function $\tau = \tau(\sigma)$, defined in a neighborhood of $\bar\sigma$, such that
$$\tau(\bar\sigma) = t_F(x, \bar\sigma), \qquad \tau(\sigma) \ge t_F(x, \sigma),$$
and
$$|\tau(\bar\sigma) - \pi| \le \varepsilon, \qquad |\nabla_\sigma \tau(\bar\sigma)| \le \varepsilon, \qquad |\nabla^2_\sigma \tau(\bar\sigma)| \le \varepsilon.$$

The inequalities (4.1) will imply that $\mathrm{II}_{\mathrm{TFL}}(v) \ge 1 - \gamma$ (the second fundamental form of the tangent focal locus, evaluated at $v$, defined in weak sense, is bounded below by $1 - \gamma$), where $\gamma = \gamma(\varepsilon) \to 0$. In other words, the tangent focal locus is uniformly convex, and the uniform convexity constant approaches the constant of the round sphere.

For any unit tangent vector $\sigma$, we can define $S(x, \sigma, t)$ as in Proposition 2.4, and $\Lambda = \Lambda(x, \sigma, t)$ as minus the restriction of $S^{-1}(x, \sigma, t)$ to $\sigma^\perp$ (as in (2.8)). The operator $\Lambda$ is intrinsically defined, independently of any choice of orthonormal basis $(e_1 = \sigma, e_2, \dots, e_n)$, and makes sense near focalization. For the round sphere, by explicit calculation it is equal to $\Lambda^0 = -(\tan t/t)\,\mathrm{Id}_{\sigma^\perp}$. Then it follows from the definition of $t_F$ that
$$t_F(x, \sigma) = \inf\bigl\{ t_F(x, \sigma; h);\ h \in U_xM,\ \langle h, \sigma \rangle_x = 0 \bigr\}, \tag{4.2}$$
where
$$t_F(x, \sigma; h) = \inf\bigl\{ t \ge 0;\ \langle \Lambda(x, \sigma, t)\,h,\ h \rangle_x = 0 \bigr\}. \tag{4.3}$$
By what we said above, to prove (4.1) it is sufficient to establish that, for any fixed $h$ in $T_xM$, $h \perp \sigma$,
$$|t_F(x, \sigma; h) - \pi| \le \varepsilon, \qquad |\nabla_\sigma t_F(x, \sigma; h)| \le \varepsilon, \qquad |\nabla^2_\sigma t_F(x, \sigma; h)| \le \varepsilon. \tag{4.4}$$

So let us fix $\sigma$ and $h$, two unit orthogonal tangent vectors, and let
$$q(\sigma, t) = \langle \Lambda(x, \sigma, t)\,h,\ h \rangle_x.$$
For the round sphere, this is equal to $q^0(\sigma, t) = -(\tan t)/t$, so
$$\frac{\partial q^0}{\partial t}\Big|_{t=\pi} = -\frac{1}{\pi} \neq 0.$$
It follows by the Implicit Function Theorem that $\theta(\sigma) = t_F(x, \sigma; h)$ is well-defined by the implicit equation
$$q(\sigma, \theta(\sigma)) = 0 \tag{4.5}$$
in a neighborhood of $\theta = \pi$ and for $g$ close to $g^0$ in $C^3$ topology. (Since $q$ depends on second derivatives of $g$, this assumption implies that $q$ is close to $q^0$ in $C^1$ topology.)

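Moreover, differentiating (4.5) we find
$$\frac{\partial\theta}{\partial\sigma} = -\Bigl(\frac{\partial q}{\partial t}\Bigr)^{-1}\frac{\partial q}{\partial\sigma},$$
$$\frac{\partial^2\theta}{\partial\sigma^2} = -\Bigl(\frac{\partial q}{\partial t}\Bigr)^{-1}\frac{\partial^2 q}{\partial\sigma^2} + 2\,\Bigl(\frac{\partial q}{\partial t}\Bigr)^{-2}\Bigl(\frac{\partial^2 q}{\partial\sigma\,\partial t} \otimes \frac{\partial q}{\partial\sigma}\Bigr) - \Bigl(\frac{\partial q}{\partial t}\Bigr)^{-3}\frac{\partial^2 q}{\partial t^2}\,\Bigl(\frac{\partial q}{\partial\sigma} \otimes \frac{\partial q}{\partial\sigma}\Bigr).$$
Since $\partial q/\partial\sigma$, $\partial^2 q/\partial\sigma^2$ and $\partial^2 q/\partial t\,\partial\sigma$ vanish for the round metric, we conclude that $|\partial\theta/\partial\sigma|$ and $|\partial^2\theta/\partial\sigma^2|$ are bounded above by $\varepsilon$ for $g$ close enough to $g^0$ in $C^4$ topology (so $q$ is close to $q^0$ in $C^2$ topology). This concludes the proof.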
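The nondegeneracy used in the Implicit Function Theorem argument can be illustrated numerically. The sketch below checks that $\partial q^0/\partial t = -1/\pi$ at $t = \pi$ for $q^0(t) = -\tan t/t$, and then solves a perturbed equation by Newton iteration. The perturbation $\delta\cos t$ is a made-up toy term used only to show that the root moves by $O(\delta)$; it does not come from any actual metric.

```python
import math

def q0(t):
    """Round-sphere value of q: q0(t) = -tan(t)/t."""
    return -math.tan(t) / t

def dq0_dt(t, h=1e-5):
    """Central finite difference approximation of dq0/dt."""
    return (q0(t + h) - q0(t - h)) / (2.0 * h)

def focal_time(delta, tol=1e-12, max_iter=50):
    """Newton iteration for q0(theta) + delta*cos(theta) = 0, started at pi.
    The term delta*cos(theta) is a hypothetical perturbation."""
    def q(t):
        return q0(t) + delta * math.cos(t)
    theta = math.pi
    for _ in range(max_iter):
        dq = (q(theta + 1e-6) - q(theta - 1e-6)) / 2e-6
        step = q(theta) / dq
        theta -= step
        if abs(step) < tol:
            break
    return theta
```

For small `delta`, the root returned by `focal_time` should stay within roughly $\pi\delta$ of $\pi$, as expected from a first-order expansion of the implicit equation.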

5. Stability of MTW condition

In this section we use Equation (2.7) to compute the extended MTW tensor for the sphere and its perturbations, in order to prove our stability result. Let us remark that, since $S_{(x,v)}(\xi, \eta)$ is quadratic in each of $\xi$ and $\eta$, it suffices to prove the estimate for $\xi$, $\eta$ unit tangent vectors at $x$. So in this whole section we will always assume that $|\xi|_x = |\eta|_x = 1$.

5.1. Computations. Let us fix a geodesic $(\gamma(t))_{0 \le t \le T}$ with $\gamma(0) = x$, $\gamma(1) = y$, $\dot\gamma(0) = \sigma$, $|\sigma| = 1$. We assume that $T \le t_F(\sigma)$, so that the geodesic is nonfocal except maybe at its final point. We pick an orthonormal basis $(e_1 = \sigma, e_2, \dots, e_n)$ of $T_xM$, and identify tangent vectors at $x$ with their coordinates in this basis. Under this identification the metric $g_x$ is given by the canonical scalar product of $\mathbb{R}^n$.

Next, we let $(\gamma_\alpha(\tau))_{\tau \ge 0}$ be the geodesic starting at $x$ with initial velocity $\sigma_\alpha = (\cos\alpha, \sin\alpha, 0, \dots, 0)$. We further define $\sigma_\alpha^\perp = (-\sin\alpha, \cos\alpha, 0, \dots, 0)$. For any $\alpha \in \mathbb{R}$, $|\alpha|$ small, and $\tau \ge 0$, we solve the Jacobi equation (2.5) with $e_1(\alpha, 0) = \sigma_\alpha$, $e_2(\alpha, 0) = \sigma_\alpha^\perp$, $e_i(\alpha, 0) = e_i$ for $i \ge 3$, and $R(\alpha, \tau)$ defined by (2.4) evaluated along the geodesic $(\gamma_\alpha(\tau))_{\tau \ge 0}$. If $w = \tau\sigma_\alpha$ with $\tau < t_F(\sigma_\alpha)$, then the matrix of $S_{(x,w)}$ in the orthonormal basis $(\sigma_\alpha, \sigma_\alpha^\perp, e_3, \dots, e_n)$ is
$$S(\alpha, \tau) = \tau\,J_0(\alpha, \tau)^{-1} J_1(\alpha, \tau).$$

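Let
$$Q(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 & \cdots & 0 \\ -\sin\alpha & \cos\alpha & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}.$$
Then the matrix of $S_{(x,w)}$ in the original basis of $\mathbb{R}^n$ is $Q(-\alpha)\,S(\alpha, \tau)\,Q(\alpha)$; in other words,
$$\bigl\langle S_{(x,\tau\sigma_\alpha)}\,\xi,\ \xi \bigr\rangle = \bigl\langle S(\alpha, \tau)\,Q(\alpha)\xi,\ Q(\alpha)\xi \bigr\rangle. \tag{5.1}$$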

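Let now $v = (t, 0, \dots, 0)$, $\eta = (\eta_1, \eta_2, 0, \dots, 0) \in \mathbb{R}^n \simeq T_xM$. For any $s \in \mathbb{R}$, we can write $v + s\eta = \tau\sigma_\alpha$, where
$$\tau = |v + s\eta|_x = \sqrt{(t + s\eta_1)^2 + (s\eta_2)^2}, \qquad \alpha = \tan^{-1}\Bigl(\frac{s\eta_2}{t + s\eta_1}\Bigr). \tag{5.2}$$
(Here we used the orthonormality of the basis.) Let us differentiate (5.1) twice with respect to $s$:
$$\frac{d}{ds}\bigl\langle S_{(x,\tau\sigma_\alpha)}\,\xi,\ \xi \bigr\rangle = \Bigl[\Bigl\langle \frac{\partial S}{\partial\alpha}\,Q\xi,\ Q\xi \Bigr\rangle + 2\,\Bigl\langle S Q\xi,\ \frac{\partial Q}{\partial\alpha}\,\xi \Bigr\rangle\Bigr]\,\frac{\partial\alpha}{\partial s} + \Bigl\langle \frac{\partial S}{\partial\tau}\,Q\xi,\ Q\xi \Bigr\rangle\,\frac{\partial\tau}{\partial s};$$
$$\begin{aligned} \frac{d^2}{ds^2}\bigl\langle S_{(x,\tau\sigma_\alpha)}\,\xi,\ \xi \bigr\rangle &= \Bigl[\Bigl\langle \frac{\partial^2 S}{\partial\alpha^2}\,Q\xi,\ Q\xi \Bigr\rangle + 4\,\Bigl\langle \frac{\partial S}{\partial\alpha}\,\frac{\partial Q}{\partial\alpha}\,\xi,\ Q\xi \Bigr\rangle + 2\,\Bigl\langle S\,\frac{\partial Q}{\partial\alpha}\,\xi,\ \frac{\partial Q}{\partial\alpha}\,\xi \Bigr\rangle + 2\,\Bigl\langle S Q\xi,\ \frac{\partial^2 Q}{\partial\alpha^2}\,\xi \Bigr\rangle\Bigr]\Bigl(\frac{\partial\alpha}{\partial s}\Bigr)^2 \\ &\quad + \Bigl[2\,\Bigl\langle \frac{\partial^2 S}{\partial\alpha\,\partial\tau}\,Q\xi,\ Q\xi \Bigr\rangle + 4\,\Bigl\langle \frac{\partial S}{\partial\tau}\,Q\xi,\ \frac{\partial Q}{\partial\alpha}\,\xi \Bigr\rangle\Bigr]\Bigl(\frac{\partial\tau}{\partial s}\Bigr)\Bigl(\frac{\partial\alpha}{\partial s}\Bigr) \\ &\quad + \Bigl\langle \frac{\partial^2 S}{\partial\tau^2}\,Q\xi,\ Q\xi \Bigr\rangle\Bigl(\frac{\partial\tau}{\partial s}\Bigr)^2 \\ &\quad + \Bigl[\Bigl\langle \frac{\partial S}{\partial\alpha}\,Q\xi,\ Q\xi \Bigr\rangle + 2\,\Bigl\langle S Q\xi,\ \frac{\partial Q}{\partial\alpha}\,\xi \Bigr\rangle\Bigr]\Bigl(\frac{\partial^2\alpha}{\partial s^2}\Bigr) \\ &\quad + \Bigl\langle \frac{\partial S}{\partial\tau}\,Q\xi,\ Q\xi \Bigr\rangle\Bigl(\frac{\partial^2\tau}{\partial s^2}\Bigr). \end{aligned} \tag{5.3}$$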

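Moreover, by direct computation, at $s = 0$ we have
$$\tau = t, \qquad \frac{\partial\tau}{\partial s} = \eta_1, \qquad \frac{\partial^2\tau}{\partial s^2} = \frac{\eta_2^2}{t}, \qquad \alpha = 0, \qquad \frac{\partial\alpha}{\partial s} = \frac{\eta_2}{t}, \qquad \frac{\partial^2\alpha}{\partial s^2} = -\frac{2\,\eta_1\eta_2}{t^2}. \tag{5.4}$$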
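The values in (5.4) are easy to confirm by finite differences of the change of variables (5.2). The following Python sketch does this; the step size `h` and the sample values of $t$, $\eta_1$, $\eta_2$ used afterwards are arbitrary illustrative choices, and `math.atan2` is used in place of $\tan^{-1}$, which is equivalent as long as $t + s\eta_1 > 0$.

```python
import math

def tau_alpha(s, t, eta1, eta2):
    """Polar coordinates (tau, alpha) of v + s*eta in the (e1, e2) plane, as in (5.2)."""
    tau = math.hypot(t + s * eta1, s * eta2)
    alpha = math.atan2(s * eta2, t + s * eta1)  # equals atan(ratio) when t + s*eta1 > 0
    return tau, alpha

def derivatives_at_zero(t, eta1, eta2, h=1e-4):
    """Central finite differences of tau(s), alpha(s) at s = 0."""
    tp, ap = tau_alpha(h, t, eta1, eta2)
    t0, a0 = tau_alpha(0.0, t, eta1, eta2)
    tm, am = tau_alpha(-h, t, eta1, eta2)
    return {
        "dtau": (tp - tm) / (2 * h),
        "d2tau": (tp - 2 * t0 + tm) / (h * h),
        "dalpha": (ap - am) / (2 * h),
        "d2alpha": (ap - 2 * a0 + am) / (h * h),
    }
```

For instance, with $t = 1.3$, $\eta_1 = 0.4$, $\eta_2 = -0.7$ the four entries should match $\eta_1$, $\eta_2^2/t$, $\eta_2/t$ and $-2\eta_1\eta_2/t^2$ up to discretization error.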

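Combining (5.3) with (5.4) we arrive at our final expression for the MTW tensor: writing
$$P\xi = (\xi_1, \xi_2, 0, \dots, 0), \qquad (P\xi)^\perp = (\xi_2, -\xi_1, 0, \dots, 0), \tag{5.5}$$
we have (note the minus sign)
$$\begin{aligned} -\frac{2}{3}\,S_{(x,v)}(\xi, \eta) &= \Bigl\langle \frac{\partial^2 S}{\partial\tau^2}\,\xi,\ \xi \Bigr\rangle\,\eta_1^2 \\ &\quad + \Bigl[\frac{1}{t^2}\Bigl\langle \frac{\partial^2 S}{\partial\alpha^2}\,\xi,\ \xi \Bigr\rangle + \frac{1}{t}\Bigl\langle \frac{\partial S}{\partial\tau}\,\xi,\ \xi \Bigr\rangle + \frac{4}{t^2}\Bigl\langle \frac{\partial S}{\partial\alpha}\,(P\xi)^\perp,\ \xi \Bigr\rangle + \frac{2}{t^2}\bigl\langle S\,(P\xi)^\perp,\ (P\xi)^\perp \bigr\rangle - \frac{2}{t^2}\bigl\langle S\xi,\ P\xi \bigr\rangle\Bigr]\,\eta_2^2 \\ &\quad + \Bigl[\frac{2}{t}\Bigl\langle \frac{\partial^2 S}{\partial\alpha\,\partial\tau}\,\xi,\ \xi \Bigr\rangle - \frac{2}{t^2}\Bigl\langle \frac{\partial S}{\partial\alpha}\,\xi,\ \xi \Bigr\rangle + \frac{4}{t}\Bigl\langle \frac{\partial S}{\partial\tau}\,\xi,\ (P\xi)^\perp \Bigr\rangle - \frac{4}{t^2}\bigl\langle S\xi,\ (P\xi)^\perp \bigr\rangle\Bigr]\,\eta_1\eta_2. \end{aligned} \tag{5.6}$$
In the rest of this paper, we shall systematically use a dot to designate a derivative with respect to $\tau$ (“time”), and a prime to designate a derivative with respect to $\alpha$: $\dot S = \partial S/\partial\tau$, $S' = \partial S/\partial\alpha$, etc.

5.2. The round sphere. In this subsection we establish a strict form of the MTW condition for the round sphere $S^n$. We shall not try to get the best possible estimates near focalization; this will be examined in more detail in the next subsection.

If the metric is the round metric, then $S(\tau, \alpha)$ does not depend on $\alpha$, and is equal to
$$S(\tau) = \begin{pmatrix} 1 & 0 \\ 0 & \dfrac{\tau\cos\tau}{\sin\tau}\,I_{n-1} \end{pmatrix}.$$
Without loss of generality, we may choose the orthonormal basis $(e_1, \dots, e_n)$ in such a way that
$$v = t\,e_1, \qquad \eta = \eta_1 e_1 + \eta_2 e_2, \qquad \xi = \xi_1 e_1 + \xi_2 e_2 + \xi_3 e_3.$$
Then from (5.6),
$$\begin{aligned} \frac{2}{3}\,S_{(x,v)}(\xi, \eta) &= 2\,\Bigl(\frac{1}{t^2} - \frac{\cos t}{t\sin t}\Bigr)\,\xi_1^2\eta_2^2 + 4\,\Bigl(\frac{1}{t^2} - \frac{1}{\sin^2 t}\Bigr)\,\xi_1\xi_2\eta_1\eta_2 \\ &\quad + 2\,\Bigl(\frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t}\Bigr)\,\xi_2^2\eta_1^2 + \Bigl(\frac{1}{\sin^2 t} + \frac{\cos t}{t\sin t} - \frac{2}{t^2}\Bigr)\,\xi_2^2\eta_2^2 \\ &\quad + 2\,\Bigl(\frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t}\Bigr)\,\xi_3^2\eta_1^2 + \Bigl(\frac{1}{\sin^2 t} - \frac{\cos t}{t\sin t}\Bigr)\,\xi_3^2\eta_2^2. \end{aligned} \tag{5.7}$$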

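The following elementary inequalities are established in [14, Appendix]: for all $t \in [0, \pi]$,
$$\frac{1}{\sin^2 t} - \frac{\cos t}{t\sin t} \ge \frac{2}{3}, \qquad \frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t} \ge \frac{1}{3}, \qquad \frac{1}{\sin^2 t} + \frac{\cos t}{t\sin t} - \frac{2}{t^2} \ge 0, \tag{5.8}$$
$$\sqrt{\Bigl(\frac{1}{t^2} - \frac{\cos t}{t\sin t} - \alpha\Bigr)\Bigl(\frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t} - \alpha\Bigr)} \ \ge\ \frac{1}{\sin^2 t} - \frac{1}{t^2} - \alpha \ \ge\ 0, \tag{5.9}$$
where $\alpha > 0$ is independent of $t \in (0, \pi)$. Moreover, a slightly more refined analysis than the one in [14, Proof of Lemma A.3] shows that the third inequality in (5.8) can be improved: there exists $\beta > 0$ such that
$$\frac{1}{\sin^2 t} + \frac{\cos t}{t\sin t} - \frac{2}{t^2} \ge \beta\,t^2. \tag{5.10}$$
Hence, combining (5.7)–(5.10), we deduce
$$\begin{aligned} \frac{2}{3}\,S_{(x,v)}(\xi, \eta) &\ge 2\,\Bigl(\frac{1}{t^2} - \frac{\cos t}{t\sin t} - \alpha\Bigr)\,\xi_1^2\eta_2^2 + 4\,\Bigl(\frac{1}{t^2} - \frac{1}{\sin^2 t}\Bigr)\,\xi_1\xi_2\eta_1\eta_2 + 2\,\Bigl(\frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t} - \alpha\Bigr)\,\xi_2^2\eta_1^2 \\ &\quad + 2\alpha\,\bigl(\xi_1^2\eta_2^2 + \xi_2^2\eta_1^2\bigr) + \beta\,t^2\,\xi_2^2\eta_2^2 + \frac{2}{3}\,\xi_3^2\,\bigl(\eta_1^2 + \eta_2^2\bigr) \\ &\ge 2\,\Bigl(\sqrt{\frac{1}{t^2} - \frac{\cos t}{t\sin t} - \alpha}\ |\xi_1\eta_2| - \sqrt{\frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t} - \alpha}\ |\xi_2\eta_1|\Bigr)^2 \\ &\quad + 4\,\Bigl[\sqrt{\Bigl(\frac{1}{t^2} - \frac{\cos t}{t\sin t} - \alpha\Bigr)\Bigl(\frac{1}{\sin^2 t} - \frac{t\cos t}{\sin^3 t} - \alpha\Bigr)} - \Bigl(\frac{1}{\sin^2 t} - \frac{1}{t^2} - \alpha\Bigr)\Bigr]\,|\xi_1\xi_2\eta_1\eta_2| \\ &\quad + 2\alpha\,\bigl(\xi_1^2\eta_2^2 + \xi_2^2\eta_1^2 - 2\,\xi_1\xi_2\eta_1\eta_2\bigr) + \beta\,t^2\,\xi_2^2\eta_2^2 + \frac{2}{3}\,\xi_3^2\,\bigl(\eta_1^2 + \eta_2^2\bigr) \\ &\ge \kappa\,\bigl(\xi_1^2\eta_2^2 + \xi_2^2\eta_1^2 + \xi_3^2\eta_1^2 + \xi_3^2\eta_2^2 - 2\,\xi_1\xi_2\eta_1\eta_2\bigr) + \kappa\,\bigl(\xi_2^2\eta_1^2 + t^2\,\xi_2^2\eta_2^2 + \xi_3^2\eta_1^2 + \xi_3^2\eta_2^2\bigr) \\ &\ge \kappa\,\bigl(|\xi|^2|\eta|^2 - \langle \xi, \eta \rangle^2\bigr) + \kappa\,t^2\,|\tilde\xi|^2\,|\eta|^2, \end{aligned}$$
where $\tilde\xi = \xi_2 e_2 + \xi_3 e_3$, and $\kappa > 0$ is a small constant (in the last step we used $t \le \pi$ and decreased $\kappa$ if necessary).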
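The closed form (5.7) and the inequalities (5.8) lend themselves to a quick numerical check. The sketch below assumes the unit round sphere, where $S_{(x,w)}$ has eigenvalue 1 along $w$ and $\tau\cos\tau/\sin\tau$ on $w^\perp$ with $\tau = |w|$; it evaluates the second derivative in (2.7) by a central difference and compares it with the right-hand side of (5.7) for $\tfrac{2}{3}S_{(x,v)}(\xi,\eta)$. Function names, grid size and tolerances are illustrative choices.

```python
import math

def f(tau):
    """Eigenvalue of S on the orthogonal complement of the velocity (round sphere)."""
    return tau * math.cos(tau) / math.sin(tau)

def hessian_form(w, xi):
    """<S_(x,w) xi, xi>: eigenvalue 1 along w, f(|w|) orthogonally to w."""
    norm = math.sqrt(sum(c * c for c in w))
    along = sum(a * b for a, b in zip(w, xi)) / norm
    total = sum(c * c for c in xi)
    return along ** 2 + f(norm) * (total - along ** 2)

def mtw_numeric(t, xi, eta, h=1e-4):
    """(2/3) S_(x,v)(xi, eta) with v = t*e1, via a central second difference of (2.7)."""
    def g(s):
        return hessian_form([t + s * eta[0], s * eta[1], s * eta[2]], xi)
    return -(g(h) - 2.0 * g(0.0) + g(-h)) / (h * h)

def mtw_closed_form(t, xi, eta):
    """Right-hand side of (5.7); eta is assumed to have no e3 component."""
    s, c = math.sin(t), math.cos(t)
    x1, x2, x3 = xi
    e1, e2 = eta[0], eta[1]
    return (2 * (1 / t**2 - c / (t * s)) * x1**2 * e2**2
            + 4 * (1 / t**2 - 1 / s**2) * x1 * x2 * e1 * e2
            + 2 * (1 / s**2 - t * c / s**3) * x2**2 * e1**2
            + (1 / s**2 + c / (t * s) - 2 / t**2) * x2**2 * e2**2
            + 2 * (1 / s**2 - t * c / s**3) * x3**2 * e1**2
            + (1 / s**2 - c / (t * s)) * x3**2 * e2**2)

def check_58(n=500, tol=1e-9):
    """Check the three inequalities (5.8) on a uniform grid inside (0, pi)."""
    for k in range(1, n + 1):
        t = math.pi * k / (n + 1)
        s, c = math.sin(t), math.cos(t)
        if 1 / s**2 - c / (t * s) < 2.0 / 3.0 - tol:
            return False
        if 1 / s**2 - t * c / s**3 < 1.0 / 3.0 - tol:
            return False
        if 1 / s**2 + c / (t * s) - 2 / t**2 < -tol:
            return False
    return True
```

A grid check of this kind is of course no proof of (5.8); it only guards against transcription errors in the coefficients.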

5.3. Computations again. Now we go back to (5.6), assume the metric to be close to the round metric, and we work near focalization. Before studying the asymptotic ˙ S, ¨ S 0 , S 00 and S˙ 0 in a suitable way, by means of behavior of (5.6) we shall rewrite S, Section 3. ˙ As a first illustration, let us first take care of S: (5.11)

S˙ = −τ J0−1 J˙0 J0−1 J1 + τ J0−1 J˙1 + J0−1 J1 = −τ (J0−1 J1 )(J1−1 J˙0 )(J0−1 J1 ) + τ (J0−1 J1 )(J1−1 J˙1 ) + J0−1 J1 = −τ K −1 (I + T0 − KD)K −1 + τ K −1 (D∗ − KT1 − KR0 ) + K −1 = −τ K −1 (I + T0 )K −1 + τ (DK −1 + K −1 D∗ ) − τ (T1 + R0 ) + K −1 = −τ K −1 (I + T0 )K −1 + τ (DK −1 + K −1 D∗ ) − τ (R0 + T1 ) + K −1 .

Of course this expression is symmetric. A similar computation yields

\[
S' = -\tau J_0^{-1} J_0' J_0^{-1} J_1 + \tau J_0^{-1} J_1' = -\tau K^{-1} A_0 K^{-1} + \tau (CK^{-1} + K^{-1}C^*) - \tau A_1.
\tag{5.12}
\]

Then we can iterate the process and derive expressions for second-order variations: thus, using the Jacobi equation (2.5), we get after a bit of algebra:

\[
\begin{aligned}
\ddot S &= -2J_0^{-1}\dot J_0 J_0^{-1}J_1 + 2J_0^{-1}\dot J_1 + 2\tau \Bigl( J_0^{-1}\dot J_0 J_0^{-1}\dot J_0 J_0^{-1}J_1 - J_0^{-1}\dot J_0 J_0^{-1}\dot J_1 \Bigr) \\
&= 2\tau K^{-1}(I + T_0)K^{-1}(I + T_0)K^{-1} - 2\tau \Bigl( DK^{-1}(I + T_0)K^{-1} + K^{-1}(I + T_0)K^{-1}D^* \Bigr) \\
&\quad - 2K^{-1}(I + T_0)K^{-1} - 2\tau K^{-1}(I + T_0 - KD)(D - T_1K - R_0K)K^{-1} \\
&\quad + 2(DK^{-1} + K^{-1}D^*) + 2\tau DK^{-1}D^* - 2(T_1 + R_0).
\end{aligned}
\tag{5.13}
\]

Now the symmetry is not obvious, but comes from the identity

\[
(I + T_0 - KD)(D - T_1K - R_0K) = J_1^{-1}\dot J_0 \dot J_1^* (J_1^{-1})^*
\]


and Lemma 3.1(b). The other second-order variations will not be “obviously” symmetric either: (5.14) S 00 = 2τ J0−1 J00 J0−1 J00 J0−1 J1 − τ J0−1 J000 J0−1 J1 − 2τ J0−1 J00 J0−1 J10 + τ J0−1 J100  = 2τ K −1 A0 K −1 A0 K −1 − 2τ CK −1 A0 K −1 + K −1 A0 K −1 C ∗ + 2τ CK −1 C ∗   −1 −1 00 2 −1 00 + τK J1 J1 K + 2A0 A1 K + 2KC − 2A0 C − J1 J0 − 2KCA1 K K −1 . (5.15) S˙ 0 = −J0−1 J00 J0−1 J1 + J0−1 J10 + τ J0−1 J˙0 J0−1 J00 J0−1 J1 − τ J0−1 J˙00 J0−1 J1 + τ J0−1 J00 J0−1 J˙0 J0−1 J1 − τ J0−1 J00 J0−1 J˙1 − τ J0−1 J˙0 J0−1 J10 + τ J0−1 J˙10   −1 −1 −1 −1 −1 −1 = τ K (I + T0 )K A0 K + K A0 K (I + T0 )K  − τ DK −1 A0 K −1 + K −1 A0 K −1 D∗    − τ CK −1 (I + T0 )K −1 + K −1 (I + T0 )K −1 C ∗ + τ DK −1 C ∗ + CK −1 D∗ h − K −1 τ (I + T0 )C + τ J1−1 J˙00 + τ A0 D + A0 − KC − C ∗ K  − τ K(CD + DC) − τ A0 (T1 + R0 ) + (I + T0 )A1 + J1−1 J˙10 K  i + K A1 + τ C(T1 + R0 ) + τ DA1 K K −1 . 5.4. Behavior near focalization. Let us rewrite (5.6) in the form 2 (5.16) S(x,v) (ξ, η) = a11 η12 + a22 η22 + a12 η1 η2 , 3 and compute the coefficients aij . After some computation, we find D E

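\[
\begin{aligned}
a_{11} &= -2\tau \bigl\langle K^{-1}(I + T_0)K^{-1}\xi, (I + T_0)K^{-1}\xi \bigr\rangle - 2\tau \bigl\langle K^{-1}D^*\xi, D^*\xi \bigr\rangle \\
&\quad + 4\tau \bigl\langle K^{-1}(I + T_0)K^{-1}\xi, D^*\xi \bigr\rangle + 2\langle K^{-1}\xi, K^{-1}\xi\rangle + \langle ZK^{-1}\xi, K^{-1}\xi\rangle,
\end{aligned}
\tag{5.17}
\]

where

\[
Z = 2T_0 + 2\tau (I + T_0 - KD)(D - T_1K - R_0K) - 4D^*K + 2K(T_1 + R_0)K.
\]

Recall from Proposition 3.3 that $K$, $A_0$, $A_1$, $T_0$, $T_1$, $C$, $D$, $C^*$, $D^*$, $R_0$ all admit $e_1$ as an eigenvector, and apart from $K$ the associated eigenvalue is 0. It follows that $Z$ has vanishing first row and first column. Moreover, $Z(\pi) = 0$ for the round metric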


g 0 , so Z(α, τ ) is very small (say |Z| ≤ ε) if τ ' π and g ' g 0 . In the sequel, we shall use Z as a generic symbol for a matrix-valued function satisfying these two properties (vanishing of the first row and column, and smallness as τ ' π, g ' g 0 ). Similarly, 4 2

a22 = hK −1 ξ, K −1 ξi − K −1 A0 K −1 ξ, A0 K −1 ξ + hK −1 C ∗ ξ, A0 K −1 ξi τ τ 2 2 2 −1 ∗ − hK C ξ, C ∗ ξi − hK −1 (P ξ)⊥ , (P ξ)⊥ i + hK −1 ξ, P ξi τ τ τ (5.18) 4 −1 4 1 −1 − hK ξ, ξi + hK A0 K −1 (P ξ)⊥ , ξi − hC ∗ (P ξ)⊥ , K −1 ξi τ τ τ 4 4 − h(P ξ)⊥ , K −1 C ∗ ξi + hA1 (P ξ)⊥ , ξi + hZK −1 ξ, K −1 ξi, τ τ where Z = −2D∗ K + K(R0 + T1 )K + T0  1  −1 00 2 −1 00 J J K + 2A0 A1 K + 2KC − 2A0 C − J1 J0 − 2KCA1 K ; − τ 1 1 and (5.19)





a12 = 4 (I + T0 )K −1 ξ, K −1 (P ξ)⊥ − 4 K −1 D∗ ξ, (P ξ)⊥ − 4 DK −1 ξ, (P ξ)⊥



+ 4 (R0 + T1 )ξ, (P ξ)⊥ − 4 K −1 A0 K −1 ξ, (I + T0 )K −1 ξ



+ 4 K −1 A0 K −1 ξ, D∗ ξ + 4 K −1 (I + T0 )K −1 ξ, C ∗ ξ



− 4 K −1 C ∗ ξ, D∗ ξ + ZK −1 ξ, K −1 ξ . where 2h Z= τ (I + T0 )C + τ J1−1 J˙00 + τ A0 D − KC + C ∗ K − τ K(CD + DC) τ     i − τ A0 (T1 + R0 ) + (I + T0 )A1 + J1−1 J˙10 K + τ K C(T1 + R0 ) + DA1 K . Fifteen (!) of the terms in (5.17)–(5.19) combine in (5.16) to form a “perfect square”: 2 − hK −1 w, wi, τ where   w = −(I + T0 )K −1 ξ + D∗ ξ τ η1 + −A0 K −1 ξ + C ∗ ξ + (P ξ)⊥ η2 .


Recalling (2.8), separating the first component from the rest, we write 2 2 2 4 − hK −1 w, wi = − 2 ξ12 η12 − 2 ξ22 η22 + 2 ξ1 ξ2 η1 η2 τ τ τ τ   1 2 + 2 Λ− 2 (I + T0 )Λ−1 ξ + τ D∗ ξ η1 τ    2  −1 −1 ∗ ⊥ + τ A0 Λ ξ + C ξ + (P ξ) η2 , and hK −1 ξ, K −1 ξi =

1 1 2 ξ1 + 2 |Λ−1 ξ|2 2 τ τ

. Thus, we deduce  2 − 1  2 S(x,v) (ξ, η) = 2 Λ 2 (I + T0 )Λ−1 ξ + τ D∗ ξ η1 3 τ    2  −1 −1 ∗ ⊥ + τ A0 Λ ξ + C ξ + (P ξ) η2 (5.20) 2 2 2 2 2 2 4 1 −1 2 2 2 |Λ ξ ξ ξ ξ η η ξ| (2η + η η η ) + − + 1 1 2 2 2 2 2 2 2 2 1 2 1 2 τ τ τ τ   e −1 ξ, (P ξ)⊥ i , + hZK −1 ξ, K −1 ξi + hZK

+

where

 4 Ze = (A1 K − C ∗ ) η22 + 4 (R0 + T1 )K − D η1 η2 . τ (Recall that |η|x = 1, so Ze is small.) Let us observe since Ze has that, vanishing first −1 ⊥ −1 e e row and first column, we have hZΛ ξ, (P ξ) i = hZΛ ξ, ξ1 e2 i . Furthermore, the two terms coming from Ze can be respectively bounded by  c ε |Λ−1 ξ| |ξ1 | η22 ≤ c ε |Λ−1 ξ|2 + ξ12 η22 and  c ε |Λ−1 ξ| |ξ1 | |η1 | |η2 | ≤ c ε |Λ−1 ξ|2 η12 + ξ12 η22 . Hence, we can control the “dangerous” terms in (5.20): hZK −1 ξ, K −1 ξi ≤ c ε |Λ−1 ξ|2 , e −1 ξ, (P ξ)⊥ i ≤ c ε|Λ−1 ξ|2 + ε ξ12 η22 , hZΛ ξ22 η22 ≤ c ε |Λ−1 ξ|2 η22 ,

\[
2\,|\xi_1\xi_2\eta_1\eta_2| \le \delta^{-1}\xi_2^2\eta_1^2 + \delta\,\xi_1^2\eta_2^2 \le c\,\delta^{-1}\varepsilon^2\,|\Lambda^{-1}\xi|^2\eta_1^2 + \delta\,\xi_1^2\eta_2^2,
\tag{5.21}
\]

where $\delta$ is small, $c$ is a positive constant, and we choose $\varepsilon$ much smaller than $\delta$ and $c^{-1}$. With these bounds we conclude that if $g$ is close enough to $g^0$ and $t$ close enough to $\pi$,

\[
\frac{2}{3}\, S_{(x,v)}(\xi,\eta) \ge \kappa \Bigl( |\Lambda^{-1}\xi|^2(\eta_1^2 + \eta_2^2) + \xi_1^2\eta_2^2 + |\Lambda^{-1/2}\omega|^2 \Bigr),
\tag{5.22}
\]

where $\kappa$ is a positive constant, and

\[
\omega = \bigl( (I + T_0)\Lambda^{-1}\xi + \tau D^*\xi \bigr)\eta_1 + \bigl( \tau^{-1}A_0\Lambda^{-1}\xi + C^*\xi + (P\xi)^\perp \bigr)\eta_2.
\]

Let us write $\hat\xi = (0, 0, \xi_3, \ldots, \xi_n)$. Since $|\Lambda^{-1}\xi|^2$ controls $\xi_2^2 + |\hat\xi|^2$, and thanks to (5.21), up to slightly reducing the value of $\kappa$ we deduce from (5.22)

\[
\begin{aligned}
S_{(x,v)}(\xi,\eta) &\ge \kappa \Bigl[ |\Lambda^{-1}\xi|^2(\eta_1^2 + \eta_2^2) + \xi_1^2\eta_2^2 + \xi_2^2\eta_1^2 - 2\xi_1\xi_2\eta_1\eta_2 + |\hat\xi|^2(\eta_1^2 + \eta_2^2) + |\Lambda^{-1/2}\omega|^2 \Bigr] \\
&= \kappa \Bigl( |\Lambda^{-1}\xi|^2|\eta|^2 + |\xi|^2|\eta|^2 - \langle\xi,\eta\rangle^2 + |\Lambda^{-1/2}\omega|^2 \Bigr).
\end{aligned}
\tag{5.23}
\]

Remark 5.1. These computations use the fact that all the eigenvalues of $K$ in $(\mathbb{R}e_1)^\perp$ vanish simultaneously for the round sphere, so these eigenvalues are still very large for the perturbed sphere. This simultaneous vanishing is of course very particular, but should also be the most degenerate situation. Apart from that, the above arguments do not really take advantage of the closeness to the round metric; we shall see in [15] that, in dimension 2, similar inequalities hold as soon as the nonfocal domains are uniformly convex near the tangent focal cut locus.

5.5. Improved inequality on the sphere. Before going on, let us notice that the results of Subsections 5.2 and 5.4 imply a very strong nonlocal curvature inequality on the round sphere:

\[
S_{(x,v)}(\xi,\eta) \ge \kappa \bigl( |\xi|^2|\eta|^2 - \langle\xi,\eta\rangle^2 \bigr) + \kappa \bigl( |v|^2|\tilde\xi|^2 + |\Lambda^{-1}\xi|^2 \bigr)|\eta|^2,
\tag{5.24}
\]

where $\tilde\xi$ denotes the orthogonal projection of $\xi$ on $v^\perp$.

Remark 5.2. The expression in (5.24) is strictly positive as soon as $v \neq 0$ and $\xi$, $\eta$ are not both parallel to $v$. In contrast, for $v = 0$ the right-hand side of (5.24) vanishes as soon as $\xi$ and $\eta$ are parallel; this was expected since in this case the MTW curvature reduces to sectional curvature.
An informal way to state this conclusion is that nonlocality improves the curvature of the sphere. This improvement is all the


more dramatic as we approach the cut locus ($|v| \to \pi$), since then all the eigenvalues of $\Lambda^{-1}$ diverge.

5.6. Stability. Back to the study of perturbations of the round metric, we can now prove Theorem 2.6. Let $\delta > 0$. We define

\[
\Theta_\delta = \Bigl\{ (x, v, \xi, \eta);\ x \in M,\ v \in \mathrm{NF}(x),\ \mathrm{dist}(v, \mathrm{TFL}(x)) \ge \delta,\ \xi, \eta \in T_xM;\ |\xi|_x = |\eta|_x = 1,\ \langle\xi,\eta\rangle_x = 0 \Bigr\}.
\]

From (5.6) we see that $S_{(x,v)} = S^g_{(x,v)}$ is a smooth function of the metric $g$ as $(x, v, \xi, \eta)$ varies in the compact set $\Theta_\delta$. It follows from (5.24) that this function is always positive for the round sphere; so it also has a positive lower bound for $g$ close enough to $g^0$ in $C^4$ topology. We now observe that, if a smooth function of $(z_1, z')$ is bounded below by $\kappa$ on $z_1 = 0$, then it is bounded below by $\kappa - c\,|z_1|$ on a compact set. As a consequence, there exist $\kappa > 0$ and $c > 0$ such that for all $x \in M$, $v \in \mathrm{NF}(x)$ with $\mathrm{dist}(v, \mathrm{TFL}(x)) \ge \delta$, and $\xi$, $\eta$ unit tangent vectors at $x$,

\[
S^g_{(x,v)}(\xi,\eta) \ge \kappa - c\,|\langle\xi,\eta\rangle_x|.
\]

Hence, since $|\langle\xi,\eta\rangle_x| \le 1$, we obtain

\[
S^g_{(x,v)}(\xi,\eta) \ge \frac{\kappa^2 - c^2\langle\xi,\eta\rangle_x^2}{\kappa + c\,|\langle\xi,\eta\rangle_x|} \ge \frac{\kappa^2}{\kappa + c} - \frac{c^2}{\kappa + c}\,\langle\xi,\eta\rangle_x^2.
\]

Observing that $\Lambda^{-1}$ is uniformly bounded away from $\mathrm{TFL}(x)$, we deduce that (2.10) is satisfied away from the focal locus for a perturbation of the round metric. On the other hand, by Subsection 5.4, the inequality is true also near the focal locus (again, for a perturbation of the round metric), and Theorem 2.6 follows.

Remark 5.3. One may ask whether the stronger inequality (5.23) is also stable under perturbation. Informally, this amounts to asking whether the MTW tensor on the perturbed sphere is positive even when evaluated on non-orthogonal tangent vectors $\xi$, $\eta$. According to Delanoë and Ge [9], the answer is positive in dimension 2.

6. Convexity of injectivity domains

In this section we prove Theorem 2.7. We shall need some preparations before we start the core of the proof.


6.1. Preliminaries. In this subsection we recall some facts from Riemannian geometry. The first one is the formula of first variation [18, Paragraph 3.31]: if $y \notin \mathrm{cut}(x)$ then

\[
d_x\!\left( \frac{d(x,y)^2}{2} \right) \cdot \zeta = -\bigl\langle (\exp_x)^{-1}(y), \zeta \bigr\rangle_x.
\tag{6.1}
\]

With the notation $c = d(x,y)^2/2$, this can be reformulated as

\[
-d_x c = g v, \qquad v = (\exp_x)^{-1}(y),
\tag{6.2}
\]

where $(gv)\zeta = g(v, \zeta)$. Next, the map

\[
\varphi : (x, v) \longmapsto \bigl( \exp_x v,\ -(d_v \exp_x) v \bigr)
\tag{6.3}
\]

is an involution between nonfocal tangent vectors. (If $\gamma(t) = \exp_x(tv)$, it maps $(\gamma(0), \dot\gamma(0))$ to $(\gamma(1), -\dot\gamma(1))$.) Then we have the useful formula

\[
\bigl\langle (d_v \exp_x)\xi, \eta \bigr\rangle_y = \bigl\langle \xi, (d_w \exp_y)\eta \bigr\rangle_x, \qquad (y, w) = \varphi(x, v).
\tag{6.4}
\]
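The first-variation formula (6.1) can be sanity-checked on the unit sphere $S^2 \subset \mathbb{R}^3$, where the distance and the inverse exponential map have closed forms. The sketch below is an illustration only: the points `x`, `y`, the direction `zeta` and all helper names are hypothetical choices, and the derivative on the left of (6.1) is approximated by a central finite difference along the geodesic leaving `x` with velocity `zeta`.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def dist(x, y):
    """Geodesic distance on the unit sphere (clamped for rounding safety)."""
    return math.acos(max(-1.0, min(1.0, dot(x, y))))


def log_map(x, y):
    """Inverse exponential map on the unit sphere, for y not in {x, -x}."""
    theta = dist(x, y)
    return tuple(theta * (yi - math.cos(theta) * xi) / math.sin(theta)
                 for xi, yi in zip(x, y))


def geodesic(x, zeta, s):
    """Unit-speed geodesic from x with unit tangent velocity zeta."""
    return tuple(math.cos(s) * xi + math.sin(s) * zi for xi, zi in zip(x, zeta))


def cost(x, y):
    """c(x, y) = d(x, y)^2 / 2."""
    return 0.5 * dist(x, y) ** 2


x = (0.0, 0.0, 1.0)
zeta = (0.6, 0.8, 0.0)  # unit vector orthogonal to x
a, b = 1.1, 0.7         # spherical angles of y; y is not antipodal to x
y = (math.sin(a) * math.cos(b), math.sin(a) * math.sin(b), math.cos(a))

h = 1e-5
lhs = (cost(geodesic(x, zeta, h), y) - cost(geodesic(x, zeta, -h), y)) / (2 * h)
rhs = -dot(log_map(x, y), zeta)
print(lhs, rhs)
```

The two printed numbers should agree up to discretization error, which matches the sign convention $d_x c \cdot \zeta = -\langle v, \zeta\rangle_x$ with $v = (\exp_x)^{-1}(y)$.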

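Let us briefly recall the proof of (6.4). Let $\gamma(t) = \exp_x(tv)$, and let $X$, $Y$ be Jacobi fields along $\gamma$ defined by $X(0) = 0$, $\dot X(0) = \xi$, $Y(1) = 0$, $\dot Y(1) = -\eta$. From the properties of Jacobi fields we have

\[
\frac{d}{dt}\langle X(t), \dot Y(t)\rangle_{\gamma(t)} = \langle X(t), \ddot Y(t)\rangle_{\gamma(t)} + \langle \dot X(t), \dot Y(t)\rangle_{\gamma(t)} = -\langle X, \mathrm{Riem}(Y, \dot\gamma)\dot\gamma\rangle + \langle \dot X, \dot Y\rangle,
\]

where Riem is the Riemann curvature tensor. This quantity being symmetric in $X$ and $Y$,

\[
\frac{d}{dt}\Bigl( \langle X(t), \dot Y(t)\rangle_{\gamma(t)} - \langle \dot X(t), Y(t)\rangle_{\gamma(t)} \Bigr) = 0,
\]

so $\langle X(t), \dot Y(t)\rangle_{\gamma(t)} - \langle \dot X(t), Y(t)\rangle_{\gamma(t)}$ is independent of $t$. Therefore

\[
\langle X(1), \dot Y(1)\rangle_{\gamma(1)} - \langle X(0), \dot Y(0)\rangle_{\gamma(0)} = \langle Y(1), \dot X(1)\rangle_{\gamma(1)} - \langle Y(0), \dot X(0)\rangle_{\gamma(0)},
\]

so that $\langle X(1), \dot Y(1)\rangle_{\gamma(1)} = -\langle Y(0), \dot X(0)\rangle_{\gamma(0)}$, which is the same as (6.4).

Now we recall a key result about the size of the cut locus. By [25, Corollary 1.3] (see also [21] and [6]), for any $x \in M$ we have

\[
\mathcal{H}^{n-1}[K \cap \mathrm{cut}(x)] < +\infty,
\tag{6.5}
\]

where $K \subset M$ is any compact set, and $\mathcal{H}^{n-1}$ is the $(n-1)$-dimensional Hausdorff measure.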


As a final preparation, we give a partially coordinate-wise expression for the MTW curvature: if we pick a coordinate system $(x^i)_{1\le i\le n}$ around $x$, and write $u_{ij} = \partial^2 u/\partial x^i \partial x^j$, then

\[
S_{(x,v)}(\xi,\eta) = -\frac{3}{2}\, \frac{d^2}{ds^2}\bigg|_{s=0} \hat c_{ij}\bigl( x, \exp_x(v + s\eta) \bigr)\, \xi^i \xi^j,
\tag{6.6}
\]

where b c(x, expx v) should be understood as b c(x,v) (x, expx v) = |v|2x /2. The key observation is that (6.6) is an intrinsic expression, independent of any choice of coordinates (e.g. geodesic), although cij ξi ξj in itself does not make sense unless we specify a choice of coordinates. To prove (6.6), it will be sufficient to prove this intrinsic property and recall Definition 2.2. But a change of coordinates in the right-hand side of (6.6) induces the replacement of b cij (x, expx (v + sη)) by b cij (x, expx (v + sη)) + Γ`ij (x)b c` (x, expx (v + sη)), ` ` where c` = ∂c/∂x and Γij are smooth functions. According to (6.2), the extra terms Γ`ij (x)b c` (x, expx (v + sη)) are linear in v + sη, and thus disappear under the action of d2 /ds2 in (6.6). 6.2. Main technical ingredients. The next Proposition is the key to the use of the MTW tensor. It is extracted from [14]; precursors appeared in [22] and [28]. Proposition 6.1. Let x, x ∈ M , and let (pt )t0 0 and rj > 0 such that on the interval [tj − εj , tj + εj ] the path ybt is entirely contained in the small ball Bj = B(b ytj , rj ), and the larger ball 2Bj = B(b ytj , 2rj ) does not meet cut(x). If we prove that the path (b yt ) can be approximated on each interval [tj−1 +εj−1 , tj − εj ] by a path (e yt ) meeting cut(x) at most finitely many times, then we can “patch together” these pieces by smooth paths defined on the intervals [tj − εj , tj + εj ] and staying within 2Bj . Obviously the resulting perturbation will meet cut(x) at most finitely many times. All this shows that we just have to treat the case when (yt ) takes values in a small open subset U of Rn and is a straight line. In these coordinates, Σ := cut(x) ∩ U has finite Hn−1 measure by (6.5). Without loss of generality, we can assume that U is the cylinder B(0, σ) × (−τ, τ ) for some σ, τ > 0, and yt = t en for t ∈ (−τ, τ ) (where (e1 , . . . , en ) is an orthonormal basis of Rn ). For any z ∈ B(0, σ) ⊂ Rn−1 , let ytz = (z, t). 
The goal is to show that, $\mathcal{H}^{n-1}(dz)$-almost surely, $y^z_t$ intersects $\Sigma$ in at most finitely many points. To do this one can apply the co-area formula in the following form (see [11, p. 109] and [12, Sections 2.10.25 and 2.10.26]): let $f : (z, t) \longmapsto z$ (defined on $U$), then

\[
\mathcal{H}^{n-1}[\Sigma] \ge \int_{f(\Sigma)} \mathcal{H}^0[\Sigma \cap f^{-1}(z)]\, \mathcal{H}^{n-1}(dz).
\]

Thanks to (6.5) the left-hand side is finite, and the right-hand side is exactly $\int \#\{t;\ y^z_t \in \Sigma\}\, \mathcal{H}^{n-1}(dz)$; so the integrand is finite for almost all $z$, and in particular there is a sequence $z_k \to 0$ such that each $(y^{z_k}_t)$ intersects $\Sigma$ only finitely many times. $\square$

Now comes a maximum principle type lemma, borrowed from [33].

Lemma 6.4. Let $f : [0,1] \to \mathbb{R}$ be a semiconcave function. Assume that there are $0 = t_0 < t_1 < \ldots < t_N = 1$ such that for any $j \in \{0, \ldots, N-1\}$, $f$ is twice continuously differentiable on $(t_j, t_{j+1})$ and satisfies

\[
\ddot f(t) \le C\, |\dot f(t)|
\tag{6.10}
\]

for some constant $C \ge 0$. Then

\[
\forall t \in [0,1], \qquad f(t) \ge \min(f(0), f(1)).
\tag{6.11}
\]
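To see what Lemma 6.4 does and does not allow, the following sketch samples the hypothesis (6.10) and the conclusion (6.11) on a grid for two hypothetical test functions. The function names, the grid size and the tolerance are assumptions made for illustration, and a grid check is of course not a proof. The first example, $f(t) = \min(e^{2t}, e^{2(1-t)})$, is a minimum of smooth functions (hence semiconcave, with a concave kink at $t = 1/2$) and satisfies (6.10) with $C = 2$; the second, $f(t) = (t - 1/2)^2$, violates (6.10) near its interior minimum and also violates (6.11).

```python
import math


def check_lemma(f, df, d2f, breakpoints, C, n=400, tol=1e-12):
    """Sample (6.10) and (6.11) on the open intervals between breakpoints.

    Returns (hypothesis_ok, conclusion_ok). Sample points are interval
    midpoints of a uniform subdivision, so the breakpoints themselves
    (where f may fail to be differentiable) are never evaluated.
    """
    floor = min(f(0.0), f(1.0))
    hypothesis_ok = True
    conclusion_ok = True
    for a, b in zip(breakpoints, breakpoints[1:]):
        for i in range(n):
            t = a + (b - a) * (i + 0.5) / n
            if d2f(t) > C * abs(df(t)) + tol:
                hypothesis_ok = False
            if f(t) < floor - tol:
                conclusion_ok = False
    return hypothesis_ok, conclusion_ok


# Example 1: semiconcave with a kink at t = 1/2, and f'' = 2 |f'| on each piece.
def f1(t):
    return min(math.exp(2 * t), math.exp(2 * (1 - t)))


def df1(t):
    return 2 * math.exp(2 * t) if t < 0.5 else -2 * math.exp(2 * (1 - t))


def d2f1(t):
    return 4 * math.exp(2 * t) if t < 0.5 else 4 * math.exp(2 * (1 - t))


# Example 2: smooth and convex, with f'' = 2 > C |f'| near t = 1/2.
def f2(t):
    return (t - 0.5) ** 2


def df2(t):
    return 2 * (t - 0.5)


def d2f2(t):
    return 2.0


result1 = check_lemma(f1, df1, d2f1, [0.0, 0.5, 1.0], C=2.0)
result2 = check_lemma(f2, df2, d2f2, [0.0, 1.0], C=2.0)
print(result1, result2)
```

The second example shows why a bound of the form (6.10) is needed: without it, an interior strict minimum is possible.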

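Proof of Lemma 6.4. Let $\varepsilon > 0$ and $f_\varepsilon(t) = f(t) - \varepsilon\, t^k$, where $k \in \mathbb{N}$ is such that $k > C + 2$. Then

\[
\begin{aligned}
\ddot f_\varepsilon(t) &= -\varepsilon\, k(k-1)\, t^{k-2} + \ddot f(t) \\
&\le -\varepsilon\, k(k-1)\, t^{k-2} + C\, |\dot f(t)| \\
&\le -\varepsilon\, k(k-1)\, t^{k-2} + C\, |\dot f_\varepsilon(t)| + C\,\varepsilon\, k\, t^{k-1} \\
&= -\varepsilon\, k(k - 1 - Ct)\, t^{k-2} + C\, |\dot f_\varepsilon(t)|.
\end{aligned}
\]

So

\[
\ddot f_\varepsilon(t) \le -\varepsilon\, k\, t^{k-2} + C\, |\dot f_\varepsilon(t)|.
\tag{6.12}
\]

Let now $t_* \in [0,1]$ be such that $f_\varepsilon$ is minimum at $t_*$. If $t_* \in (t_j, t_{j+1})$ for some $j \in \{0, \ldots, N-1\}$ then $\dot f_\varepsilon(t_*) = 0$, so by (6.12) $\ddot f_\varepsilon(t_*) < 0$, which is impossible. Thus $t_* = t_j$ for some $j \in \{0, \ldots, N\}$.

Let us assume that $j \in \{1, \ldots, N-1\}$. If $\dot f_\varepsilon$ is discontinuous at $t_j$, then by semiconcavity $\dot f_\varepsilon(t_j^-) > \dot f_\varepsilon(t_j^+)$, which is incompatible with $t_j$ being a minimum of $f_\varepsilon$. If on the contrary $\dot f_\varepsilon$ is continuous at $t_j$, then by semiconcavity again $f_\varepsilon$ is differentiable at $t_j$. Because $t_j$ is a minimum, $\dot f_\varepsilon(t_j) = 0$, and by (6.12), $\ddot f_\varepsilon < 0$ in a neighborhood of $t_j$, so $\dot f_\varepsilon$ is positive on the left of $t_j$ and negative on the right of $t_j$, which again is impossible.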


We conclude that $j \in \{0, N\}$, i.e. $f_\varepsilon \ge \min(f_\varepsilon(0), f_\varepsilon(1))$, and the claim follows by letting $\varepsilon \to 0$. $\square$

6.3. Proof of Theorem 2.7. Let $M$ satisfy the assumptions of Theorem 2.7. Let $x \in M$ and $p_0, p_1 \in I(x)$. Fix $\delta > 0$ to be chosen later, and let $(p_t)_{0\le t\le 1}$ be a path valued in $T_xM$, joining $p_0$ to $p_1$, such that $|\ddot p_t|_x \le \delta\, |p_0 - p_1|_x^2$. If we can show that $p_t \in I(x)$ for all $t \in [0,1]$, then this will imply the uniform convexity of $I(x)$.

Since $I(x) \subset \mathrm{NF}(x)$ and the latter set is assumed uniformly convex, we know that for $\delta$ small enough $(p_t)$ is valued in $\mathrm{NF}(x)$. By Lemma 6.3, for any $\delta' > \delta$ and any $\varepsilon > 0$ we may find a path $(\tilde p_t)$, also valued in $\mathrm{NF}(x)$, such that $|p_t - \tilde p_t|_x \le \varepsilon$, $|\ddot{\tilde p}_t|_x \le \delta'\, |\tilde p_0 - \tilde p_1|_x^2$, $\tilde p_0, \tilde p_1 \in I(x)$, and $\exp_x \tilde p_t$ meets $\mathrm{cut}(x)$ only for finitely many times $t$. If we can prove that $(\tilde p_t)$ is valued in $I(x)$ then we are done. In the sequel, for simplicity we shall write $\delta$ for $\delta'$ and $p_t$ for $\tilde p_t$. Let

\[
\ell(t) = \frac{d(x, y_t)^2}{2} - \frac{|p_t|_x^2}{2}, \qquad y_t = \exp_x p_t.
\tag{6.13}
\]

Let $j \in \{0, \ldots, N-1\}$. When $t$ varies in $(t_j, t_{j+1})$ we may define

\[
\bar q_t = -(d_{p_t} \exp_x)\, p_t, \qquad q_t = (\exp_{y_t})^{-1}(x).
\]

Then by Proposition 6.1 (using the convexity of NF(yt ) for all t) ˙ = −hy˙ t , qt − q t iyt , `(t)

(6.14) ¨ = −2 (6.15) `(t) 3

Z 0

1

(1 − s) S(yt ,(1−s)qt +sqt ) (y˙ t , qt − q t ) ds − (dqt expyt (qt − q t ), p¨t x .

So our curvature assumptions imply  ¨ ≤ − κ |y˙ t |2 + |Λ−1 y˙ t |2 |qt − q t |2 (6.16) `(t) yt yt yt 3 c + hy˙ t , qt − q t i2yt + (dqt expyt )(qt − q t ) x |¨ pt |x . 3 At this point, we note that Z 1 Z 1 |¨ ps |x ds |p0 − p1 |x ≤ |p˙s |x ds ≤ |p˙t |x + 0

0

≤ |p˙t |x + δ |p0 − p1 |2x ≤ |p˙t |x + 2δ diam(M ) |p0 − p1 |x ;


so if δ ≤ (4 diam(M ))−1 then  |p0 − p1 |x . |p˙t |x ≥ 1 − 2δ diam(M ) |p0 − p1 |x ≥ 2 Also, recalling the definition of Λ from (2.8), it is easily seen that

(6.17)

(6.18)

|y˙ t |2yt + |Λ−1 y˙ t |2yt ≥ ν |p˙t |2x

for some constant ν > 0. Next, by Taylor’s formula, for |qt − q t |yt ≤ α small enough, the equality expyt qt = expyt q t implies 2 (d q ) (6.19) exp )(q − qt t t ≤ B |qt − q t |yt yt x

for some constant B > 0; this inequality is also obviously true for |qt − q t |yt ≥ α. Combining (6.17), (6.18) and (6.19) with (6.14) and (6.15), we deduce   κν c ˙ 2 ¨ − Bδ |qt − q t |2yt |p0 − p1 |2x + |`(t)| `(t) ≤ − 3  12  (6.20) κν 2 2 ˙ − − Bδ |qt − q t |yt |p0 − p1 |x + C |`(t)|, 12 where the constant C is an upper bound for (c/3) supt |y˙ t | |qt − q t |yt (which depends only on M if δ is sufficiently small, and is of order c diam(M )2 ). ¨ ≤ C |`(t)| ˙ If δ is small enough then (6.20) implies `(t) for t ∈ (tj , tj+1 ). Hence, 2 since y 7→ d(x, y) is semiconcave, we may apply Lemma 6.4 to deduce ∀t ∈ (0, 1),

`(t) ≥ min(`(0), `(1)).

But since p0 , p1 ∈ I(x) we have d(x, expx pi ) = |pi |x for i = 0, 1; that is, `(0) = `(1) = 0. It follows that `(t) ≥ 0 for all t, i.e. |pt |2x ≤ d(x, expx pt )2 . The reverse inequality is obviously true, so pt is a minimizing velocity, that is pt ∈ I(x), and the proof is complete. Remark 6.5. In this section we have shown that, if all NF(x) are uniformly convex, and the strong version of the extended MTW condition given in (2.10) holds, then all I(x) are uniformly convex. It is actually possible to prove also some “weaker” versions of this result, which are important for applications to the regularity theory of optimal transport: • If all NF(x) are convex, and S(x,v) (ξ, η) ≥ 0 for all ξ ⊥ η, then all I(x) are convex.


• If all NF(x) are strictly convex, and $S_{(x,v)}(\xi,\eta) \ge 0$ for all $\xi \perp \eta$ with strict inequality unless $\xi = 0$ or $\eta = 0$, then all I(x) are strictly convex.

We refer to [16] for a proof of these results.

References

[1] U. Abresch and W.T. Meyer. Injectivity radius estimates and sphere theorems. Comparison geometry. Math. Sci. Res. Inst. Publ. Vol. 30, Cambridge Univ. Press, 1997, pp. 1–47.
[2] M. Berger. A panoramic view of Riemannian geometry. Springer-Verlag, Berlin, 2003.
[3] M.A. Buchner. Stability of the cut locus in dimensions less than or equal to 6. Invent. Math. 43, 3 (1977), 199–231.
[4] M.A. Buchner. The structure of the cut locus in dimension less than or equal to six. Compositio Math. 37, 1 (1978), 103–119.
[5] P. Cannarsa and C. Sinestrari. Semiconcave functions, Hamilton–Jacobi equations, and optimal control. Progress in Nonlinear Differential Equations and their Applications, 58. Birkhäuser Boston Inc., Boston, 2004.
[6] M. Castelpietra and L. Rifford. Regularity properties of the distance function to conjugate and cut loci for viscosity solutions of Hamilton–Jacobi equations and applications in Riemannian geometry. ESAIM Control Optim. Calc. Var., to appear.
[7] D. Cordero-Erausquin, R.J. McCann and M. Schmuckenschläger. A Riemannian interpolation inequality à la Borell, Brascamp and Lieb. Invent. Math. 146, 2 (2001), 219–257.
[8] Ph. Delanoë and Y. Ge. Regularity of optimal transportation maps on compact, locally nearly spherical, manifolds. J. Reine Angew. Math., to appear.
[9] Ph. Delanoë and Y. Ge. Work in progress.
[10] M.P. Do Carmo. Riemannian geometry. Birkhäuser, Boston, MA, 1992.
[11] L.C. Evans and R. Gariepy. Measure theory and fine properties of functions. CRC Press, Boca Raton, FL, 1992.
[12] H. Federer. Geometric measure theory. Die Grundlehren der mathematischen Wissenschaften, Band 153. Springer-Verlag New York Inc., New York, 1969.
[13] A. Figalli. Regularity of optimal transport maps (after Ma–Trudinger–Wang and Loeper). Séminaire Bourbaki. Vol. 2008/2009, Exp. No. 1009.
[14] A. Figalli and L. Rifford. Continuity of optimal transport maps and convexity of injectivity domains on small deformations of $S^2$. Comm. Pure Appl. Math., to appear.
[15] A. Figalli, L. Rifford and C. Villani. On the Ma–Trudinger–Wang curvature tensor on surfaces. Preprint, 2009.
[16] A. Figalli, L. Rifford and C. Villani. Continuity of optimal transport on Riemannian manifolds in presence of focalization. Preprint, 2009.
[17] A. Figalli and C. Villani. An approximation lemma about the cut locus, with applications in optimal transport theory. Meth. Appl. Anal. 15, 2 (2008), 149–154.
[18] S. Gallot, D. Hulin and J. Lafontaine. Riemannian geometry, second ed. Universitext. Springer-Verlag, Berlin, 1990.
[19] K. Grove and K. Shiohama. A generalized sphere theorem. Ann. of Math. 106, 2 (1977), 201–211.
[20] J.-I. Itoh and K. Kiyohara. The cut loci and the conjugate loci on ellipsoids. Manuscripta Math. 114, 2 (2004), 247–264.
[21] J.-I. Itoh and M. Tanaka. The Lipschitz continuity of the distance function to the cut locus. Trans. Amer. Math. Soc. 353, 1 (2001), 21–40.
[22] Y.H. Kim and R.J. McCann. Continuity, curvature, and the general covariance of optimal transportation. J. Eur. Math. Soc., to appear.
[23] Y.H. Kim and R.J. McCann. Towards the smoothness of optimal maps on Riemannian submersions and Riemannian products (of round spheres in particular). Preprint, 2008.
[24] W. Klingenberg. Über Riemannsche Mannigfaltigkeiten mit positiver Krümmung. Comment. Math. Helv. 35 (1961), 47–54.
[25] Y. Li and L. Nirenberg. The distance function to the boundary, Finsler geometry, and the singular set of viscosity solutions of some Hamilton–Jacobi equations. Comm. Pure Appl. Math. 58, 1 (2005), 85–146.
[26] G. Loeper. On the regularity of solutions of optimal transportation problems. Acta Math., to appear.
[27] G. Loeper. Regularity of optimal maps on the sphere: The quadratic cost and the reflector antenna. Arch. Ration. Mech. Anal., to appear.
[28] G. Loeper and C. Villani. Regularity of optimal transport in curved geometry: the nonfocal case. Preprint, 2008.
[29] X.N. Ma, N.S. Trudinger and X.J. Wang. Regularity of potential functions of the optimal transportation problem. Arch. Ration. Mech. Anal. 177, 2 (2005), 151–183.
[30] C. Mantegazza and A.C. Mennucci. Hamilton–Jacobi equations and distance functions on Riemannian manifolds. Appl. Math. Optim. 47, 1 (2003), 1–25.
[31] H. Poincaré. Sur les lignes géodésiques des surfaces convexes. American M. S. Trans. 6 (1905), 237–274.
[32] D. Singer and H. Gluck. The existence of nontriangulable cut loci. Bull. Amer. Math. Soc. 82, 4 (1976), 599–602.
[33] C. Villani. Optimal transport, old and new. Grundlehren der mathematischen Wissenschaften, Vol. 338, Springer-Verlag, Berlin-New York, 2009.
[34] C. Villani. Stability of a 4th-order curvature condition arising in optimal transport theory. J. Funct. Anal. 255, 9 (2008), 2683–2708.
[35] A. Weinstein. The cut locus and conjugate locus of a Riemannian manifold. Ann. of Math. 87 (1968), 29–41.

Alessio Figalli
Laboratoire de Mathématiques Laurent Schwartz, UMR 7640
École Polytechnique
91128 Palaiseau Cedex
FRANCE
email: [email protected]

Ludovic Rifford
Université de Nice–Sophia Antipolis
Labo. J.-A. Dieudonné, UMR 6621
Parc Valrose, 06108 Nice Cedex 02
FRANCE
email: [email protected]

Cédric Villani
ENS Lyon & Institut Universitaire de France
UMPA, UMR CNRS 5669
46 allée d'Italie, 69364 Lyon Cedex 07
FRANCE
e-mail: [email protected]