Annals of Mathematics, 161 (2005), 1609–1636

Roth’s theorem in the primes By Ben Green*

*The author is supported by a Fellowship of Trinity College, and for some of the period during which this work was carried out enjoyed the hospitality of Microsoft Research, Redmond WA and the Alfréd Rényi Institute of the Hungarian Academy of Sciences, Budapest. He was supported by the Mathematics in Information Society project carried out by Rényi Institute, in the framework of the European Community's Confirming the International Rôle of Community Research programme.

Abstract

We show that any set containing a positive proportion of the primes contains a 3-term arithmetic progression. An important ingredient is a proof that the primes enjoy the so-called Hardy-Littlewood majorant property. We derive this by giving a new proof of a rather more general result of Bourgain which, because of a close analogy with a classical argument of Tomas and Stein from Euclidean harmonic analysis, might be called a restriction theorem for the primes.

1. Introduction

Arguably the second most famous result of Klaus Roth is his 1953 upper bound [21] on $r_3(N)$, defined 17 years previously by Erdős and Turán to be the cardinality of the largest set $A \subseteq [N]$ containing no nontrivial 3-term arithmetic progression (3AP). Roth was the first person to show that $r_3(N) = o(N)$. In fact, he proved the following quantitative version of this statement.

Proposition 1.1 (Roth). $r_3(N) \ll N/\log\log N$.

There was no improvement on this bound for nearly 40 years, until Heath-Brown [15] and Szemerédi [22] proved that $r_3(N) \ll N(\log N)^{-c}$ for some small positive constant $c$. Recently Bourgain [6] provided the best bound currently known.

Proposition 1.2 (Bourgain). $r_3(N) \ll N(\log\log N/\log N)^{1/2}$.
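To make the definition of $r_3(N)$ concrete, here is a minimal brute-force sketch that computes it for very small $N$ by backtracking. The function name `r3` and the pruning rule are illustrative choices, not anything taken from the paper, and the search is exponential, so it is only meant for $N$ of size twenty or so.

```python
def r3(N):
    """Size of the largest subset of {1, ..., N} with no nontrivial 3-term AP."""
    best = 0
    chosen = []                    # current 3AP-free set, in increasing order
    in_set = [False] * (N + 1)

    def extend(start):
        nonlocal best
        best = max(best, len(chosen))
        for c in range(start, N + 1):
            # Prune: even taking every remaining element cannot beat `best`.
            if len(chosen) + (N - c + 1) <= best:
                break
            # Elements are added in increasing order, so c can only be the
            # largest term of a progression a < b < c, with a = 2b - c.
            if any(2 * b - c >= 1 and in_set[2 * b - c] for b in chosen):
                continue
            chosen.append(c)
            in_set[c] = True
            extend(c + 1)
            chosen.pop()
            in_set[c] = False

    extend(1)
    return best


if __name__ == "__main__":
    print([r3(N) for N in range(1, 15)])
```

For instance the set {1, 2, 4, 5, 10, 11, 13, 14} has no 3AP, which is consistent with the value the search returns for $N = 14$.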


The methods of Heath-Brown, Szemerédi and Bourgain may be regarded as (highly nontrivial) refinements of Roth's technique. There is a feeling that Proposition 1.2 is close to the natural limit of this method. This is irritating, because the sequence of primes is not covered by these results. However it is known that the primes contain infinitely many 3APs.¹

Proposition 1.3 (Van der Corput). The primes contain infinitely many 3APs.

Van der Corput's method is very similar to that used by Vinogradov to show that every large odd number is the sum of three primes. Let us also mention a paper of Balog [1] in which it is shown that for any $n$ there are $n$ primes $p_1, \dots, p_n$ such that all of the averages $\frac{1}{2}(p_i + p_j)$ are prime.

In this paper we propose to prove a common generalization of the results of Roth and Van der Corput. Write $P$ for the set of primes.

Theorem 1.4. Every subset of $P$ of positive upper density contains a 3AP.

In fact, we get an explicit upper bound on the density of a 3AP-free subset of the primes, but it is ridiculously weak. Observe that as an immediate consequence of Theorem 1.4 we obtain what might be termed a van der Waerden theorem in the primes, at least for progressions of length 3. That is, if one colours the primes using finitely many colours then one may find a monochromatic 3AP.

We have not found a written reference for the question answered by Theorem 1.4, but M. N. Huxley has discussed it with several people [16].

To prove Theorem 1.4 we will use a variant of the following result. This says that the primes enjoy what is known as the Hardy-Littlewood majorant property.

Theorem 1.5. Suppose that $p \ge 2$ is a real number, and let $P_N = P \cap [1, N]$. Let $\{a_n\}_{n \in P_N}$ be any sequence of complex numbers with $|a_n| \le 1$ for all $n$. Then

$$\Big\| \sum_{n \in P_N} a_n e(n\theta) \Big\|_{L^p(\mathbb{T})} \le C(p) \Big\| \sum_{n \in P_N} e(n\theta) \Big\|_{L^p(\mathbb{T})}, \tag{1.1}$$

where the constant $C(p)$ depends only on $p$.
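For the even exponent $p = 4$ the inequality (1.1) can be checked by hand, because the fourth power of the $L^4$ norm is a sum over additive quadruples. The sketch below is illustrative only: it assumes a small cutoff $N = 200$ and random unimodular coefficients $a_n$ (both are choices made here, not in the paper) and compares the two sides via Parseval applied to the square of the exponential sum.

```python
import cmath
import random


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def fourth_power_of_l4_norm(coeffs):
    """Integral over T of |sum_n a_n e(n theta)|^4, for coeffs = {n: a_n}.

    The square of the sum has coefficients c(s) = sum_{n+m=s} a_n a_m,
    and Parseval gives the integral as sum_s |c(s)|^2.
    """
    square = {}
    for n, a_n in coeffs.items():
        for m, a_m in coeffs.items():
            square[n + m] = square.get(n + m, 0) + a_n * a_m
    return sum(abs(c) ** 2 for c in square.values())


def compare_p4(N=200, seed=0):
    """Return (left side, right side) of (1.1) raised to the 4th power, p = 4."""
    rng = random.Random(seed)
    primes = primes_up_to(N)
    twisted = {p: cmath.exp(2j * cmath.pi * rng.random()) for p in primes}
    plain = {p: 1.0 for p in primes}
    return fourth_power_of_l4_norm(twisted), fourth_power_of_l4_norm(plain)


if __name__ == "__main__":
    lhs, rhs = compare_p4()
    print(lhs, "<=", rhs)
```

The comparison holds for any support set, not just the primes, since $|c(s)|$ is at most the number of representations of $s$; the content of Theorem 1.5 lies in exponents $p$ that are not even integers.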

¹In April 2004 the author and T. Tao published a preprint showing that the primes contain arbitrarily long arithmetic progressions.

It is perhaps surprising to learn that such a property does not hold with any set $\Lambda \subseteq [N]$ in place of $P_N$. Indeed, when $p$ is an even integer it is rather straightforward to check that any set does satisfy (1.1) (with $C(p) = 1$). However, there are sets for which (1.1) fails badly when $p$ is not an even integer. For a discussion of this see [10] and for related matters including connections with the Kakeya problem, see [18], [20].

We will apply a variant of Theorem 1.5 for $p = 5/2$, when it certainly does not seem to be trivial. To prove it, we will establish a somewhat stronger result which we call a restriction theorem for primes. The reason for this is that our argument is very closely analogous to an argument of Tomas and Stein [24] concerning Fourier transforms of measures supported on spheres.

A proof of the restriction theorem for primes was described, in a different context, by Bourgain [4]. Our argument, being visibly analogous to the approach of Tomas, is different and has more in common with Section 3 of [5]. This more recent paper of Bourgain deals with restriction phenomena of certain sets of lattice points.

To deduce Theorem 1.4 from (a variant of) Theorem 1.5 we use a variant of the technique of granularization as developed by I. Z. Ruzsa and the author in a series of papers beginning with [9], as well as a "statistical" version of Roth's theorem due to Varnavides. We will also require an argument of Marcinkiewicz and Zygmund which allows us to pass from the continuous setting in results such as (1.1), that is to say $\mathbb{T}$, to the discrete, namely $\mathbb{Z}/N\mathbb{Z}$.

Finally, we would like to remark that it is possible, indeed probable, that Roth's theorem in the primes is true on grounds of density alone. The best known lower bound on $r_3(N)$ comes from a result of Behrend [3] from 1946.

Proposition 1.6 (Behrend). $r_3(N) \gg N e^{-C\sqrt{\log N}}$ for some absolute constant $C$.

This may well give the correct order of magnitude for $r_3(N)$, and if anything like this could be proved Theorem 1.4 would of course follow trivially.

2. Preliminaries and an outline of the argument

Although the main results of this paper concern the primes in $[N]$, it turns out to be necessary to consider slightly more general sets. Let $m \le \log N$ be a positive integer and let $b$, $0 \le b \le m - 1$, be coprime to $m$. We may then define a set

$$\Lambda_{b,m,N} = \{ n \le N \mid nm + b \text{ is prime} \}.$$

We expect $\Lambda_{b,m,N}$ to have size about $mN/\phi(m)\log N$, and so it is natural to define a function $\lambda_{b,m,N}$ supported on $\Lambda_{b,m,N}$ by setting

$$\lambda_{b,m,N}(n) = \begin{cases} \phi(m)\log(nm + b)/mN & \text{if } n \in \Lambda_{b,m,N} \\ 0 & \text{otherwise.} \end{cases}$$
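The normalisation of $\lambda_{b,m,N}$ is chosen so that its total mass is close to 1. A quick numerical sketch of this, under assumptions made here for illustration (the helper names `prime_flags` and `lambda_mass`, and the sample parameters $b = 1$, $m = 6$, $N = 20000$), is the following.

```python
from math import gcd, log


def prime_flags(limit):
    """flags[k] is 1 exactly when k is prime, for 0 <= k <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags


def lambda_mass(b, m, N):
    """Total mass sum_n lambda_{b,m,N}(n) over 1 <= n <= N."""
    if gcd(b, m) != 1:
        raise ValueError("b must be coprime to m")
    flags = prime_flags(N * m + b)
    phi_m = sum(1 for r in range(1, m + 1) if gcd(r, m) == 1)
    return sum(
        phi_m * log(n * m + b) / (m * N)
        for n in range(1, N + 1)
        if flags[n * m + b]
    )


if __name__ == "__main__":
    print(lambda_mass(1, 6, 20000))   # close to 1
    print(lambda_mass(0, 1, 20000))   # the primes themselves, also close to 1
```

The deviation from 1 reflects the error term in the prime number theorem for arithmetic progressions at this small scale.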


For simplicity we write $X = \Lambda_{b,m,N}$ for the next few pages. We will abuse notation and consider $\lambda_{b,m,N}$ as a measure on $X$. Thus for example $\lambda_{b,m,N}(X)$, which is defined to be $\sum_n \lambda_{b,m,N}(n)$, is roughly 1 by the prime number theorem in arithmetic progressions. We use $L^p(d\lambda_{b,m,N})$ norms and also the inner product $\langle f, g\rangle_X = \sum_n f(n)\overline{g(n)}\lambda_{b,m,N}(n)$ without further comment.

It is convenient to use the wedge symbol for the Fourier transforms on both $\mathbb{T}$ and $\mathbb{Z}$, which we define by $f^\wedge(n) = \int f(\theta)e(-n\theta)\,d\theta$ and $g^\wedge(\theta) = \sum_n g(n)e(n\theta)$ respectively. Here, of course, $e(\alpha) = e^{2\pi i\alpha}$.

For any measure space $Y$ let $B(Y)$ denote the space of continuous functions on $Y$ and define a map $T : B(X) \to B(\mathbb{T})$ via

$$T : f \mapsto (f\lambda_{b,m,N})^\wedge. \tag{2.1}$$

The object of this section is to give a new proof of the following result, which may be called a restriction theorem for primes.

Theorem 2.1 (Bourgain). Suppose that $p > 2$ is a real number. Then there is a constant $C(p)$ such that for all functions $f : X \to \mathbb{C}$,

$$\|Tf\|_p \le C(p)N^{-1/p}\|f\|_2. \tag{2.2}$$

Remember that the $L^2$ norm is taken with respect to the measure $\lambda_{b,m,N}$. Theorem 2.1 probably has most appeal when $b = m = 1$, in which case we may derive consequences for the primes themselves. Later on, however, we will take $m$ to be a product of small primes, and so it is necessary to have the more general form of the theorem.

We turn now to an outline of the proof of Theorem 2.1. The analogy between our proof and an argument by Tomas [24], giving results of a similar nature for spheres in high-dimensional Euclidean spaces, is rather striking. In fact, the reader may care to look at the presentation of Tomas's proof in [23], whereupon she will see that there is an almost exact correspondence between the two arguments.

To begin with, the proof proceeds by the method of $T$ and $T^*$, a basic technique in functional analysis. One can check that the operator $T^* : B(\mathbb{T}) \to B(X)$ is given by

$$T^* : g \mapsto g^\wedge|_X, \tag{2.3}$$

by verifying the relation

$$\langle Tf, g\rangle_{\mathbb{T}} = \int (f\lambda_{b,m,N})^\wedge(\theta)\overline{g(\theta)}\,d\theta = \sum_n f(n)\overline{g^\wedge(n)}\lambda_{b,m,N}(n) = \langle f, T^*g\rangle_X.$$

The equation (2.3) explains the term restriction. Using (2.3) we see that the operator $TT^*$ is the map from $B(\mathbb{T})$ to itself given by

$$TT^* : f \mapsto f * \lambda_{b,m,N}^\wedge. \tag{2.4}$$


Now Theorem 2.1 may be written, in obvious notation, as

$$\|T\|_{2 \to p} \le C(p)N^{-1/p}. \tag{2.5}$$

The principle of $T$ and $T^*$, as we will use it, states that

$$\|T\|_{2 \to p}^2 = \|TT^*\|_{p' \to p} = \|T^*\|_{p' \to 2}^2. \tag{2.6}$$

We would like to emphasise that there is nothing mysterious going on here: this result is just an elegant and convenient way of bundling together some applications of Hölder's inequality. The proof of the part that we will need, that is to say the inequality $\|T\|_{2 \to p}^2 \le \|TT^*\|_{p' \to p}$, is simply

$$\begin{aligned}
\|Tf\|_p &= \sup_{\|g\|_{p'} = 1} \langle Tf, g\rangle \\
&= \sup_{\|g\|_{p'} = 1} \langle f, T^*g\rangle \\
&\le \|f\|_2 \sup_{\|g\|_{p'} = 1} \|T^*g\|_2 \\
&= \|f\|_2 \sup_{\|g\|_{p'} = 1} \langle g, TT^*g\rangle^{1/2} \\
&\le \|f\|_2 \|TT^*\|_{p' \to p}^{1/2}.
\end{aligned}$$

Thus we will, for much of the paper, be concerned with showing that the operator $TT^*$ as given by (2.4) satisfies the bound

$$\|TT^*\|_{p' \to p} \le C'(p)N^{-2/p}. \tag{2.7}$$

The preceding remarks show that a proof of this will imply Theorem 2.1. To get such a bound one splits $\lambda$ into certain dyadic pieces, that is, a sum

$$\lambda_{b,m,N} = \sum_{j=1}^{K} \psi_j + \psi_{K+1}. \tag{2.8}$$

The slightly curious way of writing this indicates that the definition of $\psi_{K+1}$ will be a little different from that of the other $\psi_j$. We will define these pieces so that they satisfy the $L^1$-$L^\infty$ estimates

$$\|f * \psi_j^\wedge\|_\infty \ll_\varepsilon 2^{-(1-\varepsilon)j}\|f\|_1 \tag{2.9}$$

for some $\varepsilon < (p-2)/2$, and also the $L^2$-$L^2$ estimates

$$\|f * \psi_j^\wedge\|_2 \ll_\varepsilon \frac{2^{\varepsilon j}}{N}\|f\|_2. \tag{2.10}$$

Applying the Riesz-Thorin interpolation theorem (see [11, Ch. 7]) will then give

$$\|f * \psi_j^\wedge\|_p \ll 2^{-\delta j}N^{-2/p}\|f\|_{p'}$$

for some positive $\delta$ (depending on $\varepsilon$). Summing these estimates from $j = 1$ to $K + 1$ will establish (2.7) and hence Theorem 2.1.

To define the decomposition (2.8) we need yet more notation. From the outset we will suppose that we are trying to prove Theorem 2.1 for a particular value of $p$; the argument is highly and essentially nonuniform in $p$. Write $A = 4/(p-2)$. Let $1 < Q \le (\log N)^A$. If $b, m, N$ are as before (recall that $m \le \log N$) then we define a measure $\lambda^{(Q)}_{b,m,N}$ on $\mathbb{Z}$ by setting

$$\lambda^{(Q)}_{b,m,N}(n) = \begin{cases} N^{-1}\prod_{p \le Q,\ p \nmid m}\left(1 - \frac{1}{p}\right)^{-1} & \text{if } n \le N \text{ and } p \mid (nm + b) \Rightarrow p > Q \\ 0 & \text{otherwise.} \end{cases}$$

Define $\lambda^{(1)}_{b,m,N}(n) = 0$ for all $n$.
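The product in the definition of $\lambda^{(Q)}_{b,m,N}$ is exactly the factor needed to give the measure total mass about 1: a proportion $\prod_{p \le Q,\, p \nmid m}(1 - 1/p)$ of the integers $n$ have $nm + b$ free of prime factors up to $Q$. The sketch below checks this in a case where the count is exact; the function names and the sample values $b = 1$, $m = 6$, $Q = 16$, $N = 4 \cdot 5005$ are assumptions made for the illustration.

```python
from math import gcd


def primes_up_to(n):
    """All primes p <= n by trial division (n is small here)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def rough_mass(b, m, N, Q):
    """Total mass of lambda^{(Q)}_{b,m,N} over 1 <= n <= N."""
    if gcd(b, m) != 1:
        raise ValueError("b must be coprime to m")
    small = primes_up_to(Q)
    # Weight N^{-1} * prod_{p <= Q, p not dividing m} (1 - 1/p)^{-1}.
    weight = 1.0 / N
    for p in small:
        if m % p != 0:
            weight /= 1 - 1 / p
    # n is in the support when nm + b has no prime factor p <= Q.
    support_size = sum(
        1 for n in range(1, N + 1)
        if all((n * m + b) % p != 0 for p in small)
    )
    return weight * support_size


if __name__ == "__main__":
    # 5005 = 5 * 7 * 11 * 13 is the product of the primes <= 16 not dividing 6.
    print(rough_mass(1, 6, 4 * 5005, 16))
```

Because $N$ is a multiple of $5 \cdot 7 \cdot 11 \cdot 13$ in this example, the values $6n + 1$ run over each residue class modulo 5005 equally often and the mass comes out as 1 up to rounding; for general $N$ it is only approximately 1.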

As $Q$ becomes large the measures $\lambda^{(Q)}_{b,m,N}$ look more and more like $\lambda_{b,m,N}$. Much of Section 4 will be devoted to making this principle precise. We will sometimes refer to the support of $\lambda^{(Q)}_{b,m,N}$ as the set of $Q$-rough numbers. Now let $K$ be the smallest integer with

$$2^K > \tfrac{1}{10}(\log N)^A \tag{2.11}$$

and define

$$\psi_j = \lambda^{(2^j)}_{b,m,N} - \lambda^{(2^{j-1})}_{b,m,N} \tag{2.12}$$

for $j = 1, \dots, K$ and define

$$\psi_{K+1} = \lambda_{b,m,N} - \lambda^{(2^K)}_{b,m,N}, \tag{2.13}$$

so that (2.8) holds. In the next two sections we prove the two required estimates, (2.9) and (2.10).

Let us note here that the main novelty in our proof of Theorem 2.1 lies in the definition of the dyadic decomposition (2.8). By contrast, the analogous dyadic decompositions in [5] take place on the Fourier side, requiring the introduction of various smooth cutoff functions not specifically related to the underlying arithmetic structure.

3. An $L^2$-$L^2$ estimate

It turns out that the proof of (2.10), the $L^2$-$L^2$ estimate, is by far the easier of the two estimates required. We have

$$\|f * \psi_j^\wedge\|_2 = \|f^\wedge \psi_j\|_2 \le \|\psi_j\|_\infty \|f^\wedge\|_2 = \|\psi_j\|_\infty \|f\|_2.$$


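Suppose first of all that $1 \le j \le K$. Then

$$\|\psi_j\|_\infty \le \|\lambda^{(2^j)}_{b,m,N}\|_\infty + \|\lambda^{(2^{j-1})}_{b,m,N}\|_\infty = N^{-1}\prod_{p \le 2^j,\ p \nmid m}\left(1 - \frac{1}{p}\right)^{-1} + N^{-1}\prod_{p \le 2^{j-1},\ p \nmid m}\left(1 - \frac{1}{p}\right)^{-1}.$$

The two products here may be estimated using Mertens' formula [14, Ch. 22]:

$$\prod_{p \le Q}(1 - p^{-1}) \sim \frac{e^{-\gamma}}{\log Q}.$$

This gives

$$\|\psi_j\|_\infty \ll j/N, \tag{3.1}$$

and hence

$$\|f * \psi_j^\wedge\|_2 \ll \frac{j}{N}\|f\|_2, \tag{3.2}$$

which is certainly of the requisite form (2.10). For $j = K + 1$ we have

$$\|\psi_{K+1}\|_\infty \le \|\lambda_{b,m,N}\|_\infty + \|\lambda^{(2^K)}_{b,m,N}\|_\infty \ll \log N/N,$$

so that

$$\|f * \psi_{K+1}^\wedge\|_2 \ll \frac{\log N}{N}\|f\|_2. \tag{3.3}$$

This also constitutes an estimate of the type (2.10) for some $\varepsilon < (p-2)/2$. Indeed, recalling our choice of $A$ and $K$ (viz. (2.11)) one can check that $2^K \gg (\log N)^{1/\varepsilon}$ for some such $\varepsilon$.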
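Mertens' formula is what turns the products over $p \le 2^j$ into the bound $j/N$ of (3.1): taking $Q = 2^j$ gives $\prod_{p \le Q}(1 - 1/p)^{-1} \approx e^{\gamma} j \log 2$. As a rough numerical sanity check, the following sketch compares the product with $e^{-\gamma}/\log Q$; the function name `mertens_ratio` and the hard-coded value of Euler's constant are choices made here.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def mertens_ratio(Q):
    """Return prod_{p <= Q} (1 - 1/p) divided by e^{-gamma} / log Q."""
    sieve = bytearray([1]) * (Q + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(Q ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, Q + 1, i)))
    product = 1.0
    for p in range(2, Q + 1):
        if sieve[p]:
            product *= 1 - 1 / p
    return product * log(Q) / exp(-EULER_GAMMA)


if __name__ == "__main__":
    for Q in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(Q, mertens_ratio(Q))
```

The ratio tends to 1 as $Q$ grows, which is all that the asymptotic sign in Mertens' formula asserts.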

4. An $L^1$-$L^\infty$ estimate

This section is devoted to the rather lengthy task of proving estimates of the form (2.9).

Introduction. The first step towards obtaining an estimate of the form (2.9) is to observe that

$$\|f * \psi_j^\wedge\|_\infty \le \|\psi_j^\wedge\|_\infty \|f\|_1. \tag{4.1}$$

We will prove that $\|\psi_j^\wedge\|_\infty$ is not too large by proving

Proposition 4.1. Suppose that $Q \le (\log N)^A$. Then we have the estimate

$$\|\lambda^\wedge_{b,m,N} - \lambda^{(Q)\wedge}_{b,m,N}\|_\infty \ll \log\log Q/Q.$$


The detailed proof of this fact will occupy us for several pages. Let us begin, however, by using (4.1) to see how it implies an estimate of the form (2.9). If $1 \le j \le K$ then

$$\begin{aligned}
\|\psi_j^\wedge\|_\infty &= \|\lambda^{(2^j)\wedge}_{b,m,N} - \lambda^{(2^{j-1})\wedge}_{b,m,N}\|_\infty \\
&\le \|\lambda^\wedge_{b,m,N} - \lambda^{(2^j)\wedge}_{b,m,N}\|_\infty + \|\lambda^\wedge_{b,m,N} - \lambda^{(2^{j-1})\wedge}_{b,m,N}\|_\infty \\
&\ll \log j/2^j.
\end{aligned} \tag{4.2}$$

This is certainly of the form (2.9). The estimate for $j = K + 1$ is even easier, being immediate from Proposition 4.1.

To prove Proposition 4.1 we will use the Hardy-Littlewood circle method. Thus we divide $\mathbb{T}$ into two sets, traditionally referred to as the major and minor arcs. It is perhaps best if we define these explicitly at the outset. Thus let $p$ be the exponent for which we are trying to prove Theorem 2.1. Recall that $A = 4/(p-2)$, and set $B = 2A + 20$. These numbers will be fixed throughout the proof. By Dirichlet's theorem on approximation, every $\theta \in \mathbb{T}$ satisfies

$$\left|\theta - \frac{a}{q}\right| \le \frac{(\log N)^B}{qN} \tag{4.3}$$

for some $q \le N(\log N)^{-B}$ and some $a$, $(a, q) = 1$. The major arcs consist of those $\theta$ for which $q$ can be taken to be at most $(\log N)^B$. We will write this collection using the notation

$$\mathfrak{M} = \bigcup_{\substack{q \le (\log N)^B \\ (a,q) = 1}} \mathfrak{M}_{a,q}.$$

For these $\theta$, the Fourier transforms $\lambda^{(Q)\wedge}_{b,m,N}$ and $\lambda^\wedge_{b,m,N}$ depend on the distribution of the almost-primes and primes along arithmetic progressions with common difference at most $(\log N)^B$. The minor arcs $\mathfrak{m}$ consist of all other $\theta$. Here different techniques apply, and one can conclude that both $\lambda^{(Q)\wedge}_{b,m,N}$ and $\lambda^\wedge_{b,m,N}$ are small. The triangle inequality then applies.

The ingredients are as follows. The almost-primes are eminently suited to applications of sieve techniques. To keep the paper as self-contained as possible, we will follow Gowers [8] and use the arguably simplest sieve, that due to Brun, on both the major and minor arcs.

The genuine primes, on the other hand, are harder to deal with. Here we will quote two well-known results from the literature. The information concerning distribution along arithmetic progressions to small moduli comes from the prime number theorem of Siegel and Walfisz.


Proposition 4.2 (Siegel-Walfisz). Suppose that $q \le (\log N)^B$, that $(a, q) = 1$ and that $1 \le N_1 \le N_2 \le N$. Then

$$\sum_{N_1 \cdots} \log p = \cdots \tag{4.4}$$

… $p > 2$,

$$\int \Big|\sum_n f(n)\log n\, e(n\theta)\Big|^p d\theta \ll_p N^{p/2-1}\Big(\sum_n |f(n)|^2 \log n\Big)^{p/2}.$$

Therefore

$$\int \Big|\sum_{n \in P_N} a_n e(n\theta)\Big|^p d\theta \ll_p N^{p/2-1}\Big(\sum_{n \in P_N} \frac{|a_n|^2}{\log n}\Big)^{p/2} \ll_p N^{p-1}(\log N)^{-p}.$$

However it is an easy matter to check that

$$\int \Big|\sum_{n \in P_N} e(n\theta)\Big|^p d\theta \ge \int_{|\theta| \le 1/2N} \Big|\sum_{n \in P_N} e(n\theta)\Big|^p d\theta \gg N^{p-1}(\log N)^{-p}.$$

This proves Theorem 1.5 for $p > 2$. For $p = 2$ it is trivial by Parseval's identity.


6. Roth's theorem in the primes

Let $A_0$ be a subset of the primes with positive relative upper density. By this we mean that there is a positive constant $\alpha_0$ such that, for infinitely many integers $n$, we have

$$|A_0 \cap P_n| \ge \alpha_0 n/\log n. \tag{6.1}$$

This is not a particularly convenient statement to work with, and our first lemma derives something more useful from it.

Lemma 6.1. Suppose that there is a set $A_0 \subseteq P$ with positive relative density, but which contains no 3APs. Then there are a positive real number $\alpha$ and infinitely many primes $N$ for which the following is true. There are a set $A \subseteq \{1, \dots, \lfloor N/2 \rfloor\}$, and an integer $W \in [\frac{1}{8}\log\log N, \frac{1}{4}\log\log N]$ such that
• $A$ contains no 3APs,
• $\lambda_{b,m,N}(A) \ge \alpha$ for some $b$ with $(b, m) = 1$, where $m = \prod_{p \le W} p$.

Proof. Take any $n \ge \alpha_0^{-3}$ for which (6.1) holds. Let $W = \lfloor \frac{1}{4}\log\log n \rfloor$, and set $m = \prod_{p \le W} p$. Choose $N$ to be any prime in the range $(2n/m, 4n/m]$. Now there are certainly no more than $m$ elements of $A_0$ which share a factor with $m$, and no more than $n^{3/4}$ elements $x \in A_0$ with $x \le n^{3/4}$. Thus

$$\sum_{b : (b,m) = 1}\ \sum_{\substack{x \le n \\ x \equiv b \ (\mathrm{mod}\ m)}} A_0(x)\log x \ge \alpha_0 n/2,$$

and for some choice of $b$ we have

$$\sum_{\substack{x \le n \\ x \equiv b \ (\mathrm{mod}\ m)}} A_0(x)\log x \ge \alpha_0 n/2\phi(m). \tag{6.2}$$

Write $A = m^{-1}((A_0 \cap [n]) - b)$. This set, being a part of $A_0$ subjected to a linear transformation, contains no 3-term AP. It is also clear that $A \subseteq \{1, \dots, \lfloor N/2 \rfloor\}$. Furthermore (6.2) is equivalent to

$$\sum_{\substack{x \le N \\ mx + b \text{ is prime}}} A(x)\log(mx + b) \ge \alpha_0 n/2\phi(m),$$

which implies that $\lambda_{b,m,N}(A) \ge \alpha_0 n/2mN \ge \alpha_0/8$. The lemma follows, with $\alpha = \alpha_0/8$.

The reason we stipulate that $A$ be contained in $\{1, \dots, \lfloor N/2 \rfloor\}$ is that $A$ does not contain any 3APs when considered as a subset of $\mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$. This allows us to make use of Fourier analysis on $\mathbb{Z}_N$. If $f : \mathbb{Z}_N \to \mathbb{C}$ is a function we will write, for any $r \in \mathbb{Z}_N$,

$$\widehat{f}(r) = \sum_{x \in \mathbb{Z}_N} f(x)e(-rx/N).$$

Observe that $f$ may also be considered as a function on $\mathbb{Z}$ via the embedding $\mathbb{Z}_N \hookrightarrow [N]$, and then $\widehat{f}(r) = f^\wedge(r/N)$.

For notational simplicity write $\mu = \lambda_{b,m,N}$. We will consider $A$ and $\mu$ as functions on $\mathbb{Z}_N$. Write $a = A\mu$. We will continue to abuse notation by using $\mu$ and $a$ as measures. Thus, for example, $a(\mathbb{Z}_N) \ge \alpha$. Now if $A$ contains no (nontrivial) 3APs then

$$\sum_{x,d} a(x)a(x+d)a(x+2d) = \sum_x a(x)^3 \le \sum_x \mu(x)^3 \ll (\log N)^3/N^2. \tag{6.3}$$

We are going to show that this forces $\alpha$ to be small. We will do this by constructing a new measure $a_1$ on $\mathbb{Z}_N$ which is set-like, which means that $a_1$ behaves a bit like $N^{-1}$ times the characteristic function of a set of size $\sim \alpha N$. The new measure $a_1$ will be fairly closely related to $a$, and in fact we will be able to show that

$$\sum_{x,d} a_1(x)a_1(x+d)a_1(x+2d) \text{ is small.} \tag{6.4}$$

This, it turns out, is impossible; an argument of Varnavides based on Roth's theorem tells us that a dense subset of $\mathbb{Z}_N$ contains lots of 3APs. We will adapt his argument in a trivial way to show that the same is true of set-like measures.

The arguments of this section, then, fall into two parts. First of all we must define $a_1$, define the notion of "set-like" and then show that $a_1$ is indeed set-like. The key ingredient here is Lemma 6.2, which says that $\widehat{\mu}$ is small away from zero. Secondly, we must formulate and prove a result of the form (6.4). For this we need Theorem 2.1, the restriction theorem for primes.

The idea of constructing $a_1$, and the technique for constructing it, has its origins in the notions of granularization as used in a paper of I. Z. Ruzsa and the author [9]. In the present context things look rather different however and, in the absence of anything which might be called a "grain", we think the terminology of [9] no longer appropriate.

Let us proceed to the definition of $a_1$. Let $\delta \in (0, 1)$ be a real number to be chosen later, and set

$$R = \{r \in \mathbb{Z}_N : |\widehat{a}(r)| \ge \delta\}.$$


Let $k = |R|$, and write $R = \{r_1, \dots, r_k\}$. Let $\varepsilon \in (0, 1)$ be another real number to be chosen later, and write $B(R, \varepsilon)$ for the Bohr neighbourhood

$$\left\{x \in \mathbb{Z}_N : \left\|\frac{x r_i}{N}\right\| \le \varepsilon \ \ \forall i \in [k]\right\}.$$

Write $B = B(R, \varepsilon)$ and set $\beta(x) = B(x)/|B|$. Define

$$a_1 = a * \beta * \beta. \tag{6.5}$$

It is easy to see that

$$a_1(\mathbb{Z}_N) \ge \alpha. \tag{6.6}$$

In Lemma 6.3 below we will show that $\|a_1\|_\infty \le 2/N$, provided that a certain inequality between $\varepsilon$, $k$ and $W$ is satisfied. This is what we mean by the statement that $a_1$ is set-like.

Lemma 6.2. Suppose that $N$, and hence $W$, is sufficiently large. Then

$$\sup_{r \ne 0} |\widehat{\mu}(r)| \le 2\log\log W/W.$$

Proof. Recall that $\widehat{\mu}(r) = \mu^\wedge(r/N)$. There are three different cases to consider.

Case 1. $r/N \in \mathfrak{M}_{0,1}$; that is to say $|r/N| \le (\log N)^B/N$. Then by Lemma 4.8 we have the asymptotic

$$\widehat{\mu}(r) = \tau(r/N) + O((\log N)^{-A}).$$

Observe, however, that $\tau(r/N) = 0$ provided that $r \ne 0$.

Case 2. $r/N \in \mathfrak{M}_{a,q}$. Then Lemma 4.8 gives

$$\widehat{\mu}(r) = \frac{\chi_q\,\mu(q)}{\phi(q)}\, e\!\left(-\frac{ab\overline{m}}{q}\right)\tau\!\left(\frac{r}{N} - \frac{a}{q}\right) + O((\log N)^{-A}),$$

where

$$\chi_q = \begin{cases} 1 & (q, m) = 1 \\ 0 & \text{otherwise.} \end{cases}$$

Since $m = \prod_{p \le W} p$, we certainly have $\chi_q = 0$ for $q \le W$. Thus indeed

$$|\widehat{\mu}(r)| \le \sup_{n \ge W} \phi(n)^{-1} + O((\log N)^{-A}) \le 2\log\log W/W.$$

Case 3. $r/N \in \mathfrak{m}$. Then Lemma 4.9 gives

$$\widehat{\mu}(r) = \mu^\wedge(r/N) = O((\log N)^{-A}).$$
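The effect described by Lemma 6.2 can be seen numerically at a small scale: passing from the primes themselves to the progression $nm + b$ with $m$ a product of small primes removes the large Fourier coefficients at frequencies near rationals with small denominator. The sketch below is a rough illustration only. The parameters $N = 2003$, $m = 30$, $b = 1$ and the function names are assumptions made here, and $m = 30$ is far larger than the paper's constraint $m \le \log N$ would permit at this $N$.

```python
import cmath
from math import gcd, log


def prime_flags(limit):
    """flags[k] is 1 exactly when k is prime, for 0 <= k <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags


def sup_nonzero_fourier(b, m, N):
    """Largest |mu^(r)| over r != 0, where mu = lambda_{b,m,N} on Z_N and
    mu^(r) = sum_x mu(x) e(-rx/N)."""
    flags = prime_flags(N * m + b)
    phi_m = sum(1 for r in range(1, m + 1) if gcd(r, m) == 1)
    support = [
        (n, phi_m * log(n * m + b) / (m * N))
        for n in range(1, N + 1)
        if flags[n * m + b]
    ]
    # Table of e(-t/N) so that e(-rn/N) is a lookup at index (r*n) mod N.
    roots = [cmath.exp(-2j * cmath.pi * t / N) for t in range(N)]
    best = 0.0
    for r in range(1, N):
        coefficient = sum(w * roots[(r * n) % N] for n, w in support)
        best = max(best, abs(coefficient))
    return best


if __name__ == "__main__":
    print("m = 1 :", sup_nonzero_fourier(0, 1, 2003))
    print("m = 30:", sup_nonzero_fourier(1, 30, 2003))
```

With $m = 1$ the supremum is dominated by frequencies near $1/2$, reflecting the fact that almost all primes are odd; with $m = 30$ the biases modulo 2, 3 and 5 are removed and the largest remaining ones come from moduli such as 7.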


Lemma 6.3. Suppose that $\varepsilon^k \ge 2\log\log W/W$. Then the measure $a_1$ is set-like, in the sense that $\|a_1\|_\infty \le 2/N$.

Proof. Indeed

$$\begin{aligned}
a_1(x) &= a * \beta * \beta(x) \\
&\le \mu * \beta * \beta(x) \\
&= N^{-1}\sum_r \widehat{\mu}(r)\widehat{\beta}(r)^2 e(rx/N) \\
&\le N^{-1}\widehat{\mu}(0)\widehat{\beta}(0)^2 + N^{-1}\sum_{r \ne 0} |\widehat{\mu}(r)||\widehat{\beta}(r)|^2 \\
&\le N^{-1} + N^{-1}\sup_{r \ne 0}|\widehat{\mu}(r)| \sum_r |\widehat{\beta}(r)|^2 \\
&= N^{-1} + |B|^{-1}\sup_{r \ne 0}|\widehat{\mu}(r)| \\
&\le N^{-1} + \frac{2\log\log W}{W|B|}.
\end{aligned}$$

Now by a well-known application of the pigeonhole principle we have $|B| \ge \varepsilon^k N$, from which the lemma follows immediately.

We move on now to the second part of our programme, which includes a statement and proof of a result of the form (6.4).

Proposition 6.4. There is an inequality

$$\sum_{x,d} a_1(x)a_1(x+d)a_1(x+2d) \le C' N^{-3/2} + \frac{1}{N}\left(2^{12}\varepsilon^2\delta^{-5/2} + C\delta^{1/2}\right).$$

We will require several lemmas. The most important is a "discrete majorant property". Before we state and prove this, we give an elegant argument of Marcinkiewicz and Zygmund [27]. We outline the argument here since we like it and, possibly, it is not particularly well-known.

Lemma 6.5 (Marcinkiewicz-Zygmund). Let $N$ be a positive integer, and let $f : [N] \to \mathbb{C}$ be any function. Consider $f$ also as a function on $\mathbb{Z}_N$. Let $p > 1$ be a real number. Then

$$\sum_{r \in \mathbb{Z}_N} |\widehat{f}(r)|^p = \sum_{r=0}^{N-1} |f^\wedge(r/N)|^p \le C(p)N\int |f^\wedge(\theta)|^p\,d\theta.$$

Proof. Consider the function

$$g(n) = 2\left(1 - \frac{|n|}{2N}\right)\chi_{|n| \le 2N} - \left(1 - \frac{|n|}{N}\right)\chi_{|n| \le N}.$$


This function is equal to 1 for all $n$ with $|n| \le N$. Its Fourier transform, $g^\wedge(\theta)$, is equal to $2K_{2N}(\theta) - K_N(\theta)$, a difference of two Fejér kernels. Thus we have $f^\wedge = f^\wedge * (2K_{2N} - K_N)$, and so

$$\begin{aligned}
|\widehat{f}(r)|^p &= |f^\wedge(r/N)|^p \\
&= \left|\int f^\wedge(\theta)\big(2K_{2N}(r/N - \theta) - K_N(r/N - \theta)\big)\,d\theta\right|^p \\
&\le 3^{p-1}\left(2\left|\int f^\wedge(\theta)K_{2N}(r/N - \theta)\,d\theta\right|^p + \left|\int f^\wedge(\theta)K_N(r/N - \theta)\,d\theta\right|^p\right) \\
&\le 3^{p-1}\left(2\int |f^\wedge(\theta)|^p K_{2N}(r/N - \theta)\,d\theta + \int |f^\wedge(\theta)|^p K_N(r/N - \theta)\,d\theta\right)
\end{aligned}$$

by two applications of Jensen's inequality. It is necessary, of course, to use the fact that the Fejér kernels are nonnegative. To conclude the proof, one only has to show that

$$\sum_{r=0}^{N-1} K_N(r/N - \theta) \le CN,$$

together with a similar inequality for $K_{2N}$. But this is a straightforward matter using the bound

$$\sum_{r=0}^{N-1} K_N(r/N - \theta) \le \sum_{j=0}^{N-1}\ \sup_{\phi \in [\frac{j}{N}, \frac{j+1}{N}]} K_N(\phi)$$

together with the estimate $K_N(\phi) \ll \min(N, N^{-1}|\phi|^{-2})$, valid for $|\phi| \le 1/2$.

Lemma 6.6 (Discrete majorant property). Suppose that $p > 2$. Then there is an absolute constant $C(p)$ (not depending on $a$) such that

$$\sum_r |\widehat{a}(r)|^p \le C(p).$$

Proof. A direct application of Theorem 2.1 gives

$$\int |a^\wedge(\theta)|^p\,d\theta \le C'(p)N^{-1}.$$

The lemma is immediate from this and Lemma 6.5.


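Lemma 6.7. Suppose that $r \in R$. Then $\left|1 - \widehat{\beta}(r)^4\widehat{\beta}(-2r)^2\right| \le 2^{12}\varepsilon^2$.

Proof. We have

$$\begin{aligned}
\left|1 - \widehat{\beta}(r)\right| &= \frac{1}{|B|}\left|\sum_{x \in B}(1 - e(rx/N))\right| \\
&= \frac{1}{|B|}\left|\sum_{x \in B}(1 - \cos(2\pi rx/N))\right| \\
&\le 4\pi \sup_{x \in B}\|rx/N\|^2 \\
&\le 16\varepsilon^2.
\end{aligned}$$

A very similar calculation shows that

$$\left|1 - \widehat{\beta}(-2r)\right| \le 64\varepsilon^2,$$

and the lemma follows quickly.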
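A small experiment makes the two facts about Bohr neighbourhoods used in this section tangible: the pigeonhole lower bound $|B| \ge \varepsilon^k N$, and the fact that $\widehat{\beta}(r)$ is within $O(\varepsilon^2)$ of 1 for $r \in R$. The sketch below uses hypothetical sample data ($N = 1009$, $R = \{3, 57, 200\}$, $\varepsilon = 0.2$) and checks the second fact against the elementary inequality $1 - \cos(2\pi t) \le 2\pi^2 t^2$, which is an assumption of this sketch and not necessarily the constant used in the proof above.

```python
import cmath
from math import pi


def bohr_set(N, R, eps):
    """All x in Z_N with ||x r / N|| <= eps for every r in R."""
    def dist(r, x):
        k = (r * x) % N
        return min(k, N - k) / N      # distance from rx/N to the nearest integer
    return [x for x in range(N) if all(dist(r, x) <= eps for r in R)]


def beta_hat(N, B, r):
    """Fourier coefficient at r of the normalised indicator beta = 1_B / |B|."""
    return sum(cmath.exp(-2j * pi * r * x / N) for x in B) / len(B)


if __name__ == "__main__":
    N, R, eps = 1009, [3, 57, 200], 0.2
    B = bohr_set(N, R, eps)
    print("|B| =", len(B), " eps^k N =", eps ** len(R) * N)
    for r in R:
        print(r, abs(1 - beta_hat(N, B, r)), "<=", 2 * pi ** 2 * eps ** 2)
```

Since $B$ is symmetric about 0, the imaginary parts in $\widehat{\beta}(r)$ cancel and only the cosine terms survive, which is why the deviation from 1 is quadratic in $\varepsilon$ and not linear.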

Proof of Proposition 6.4. By (6.3) we have, observing that $\widehat{a_1} = \widehat{a}\,\widehat{\beta}^2$,

$$\begin{aligned}
\sum_{x,d} a_1(x)a_1(x+d)a_1(x+2d) &\le \sum_{x,d} a_1(x)a_1(x+d)a_1(x+2d) - \sum_{x,d} a(x)a(x+d)a(x+2d) + (\log N)^3N^{-2} \\
&= O(N^{-3/2}) - N^{-1}\sum_r \widehat{a}(r)^2\widehat{a}(-2r)\left(1 - \widehat{\beta}(r)^4\widehat{\beta}(-2r)^2\right).
\end{aligned} \tag{6.7}$$

Split the sum in (6.7) into two parts, one over $r \in R$ and the other over $r \notin R$. When $r \in R$ we use Lemma 6.7 to get

$$\sum_{r \in R} \widehat{a}(r)^2\widehat{a}(-2r)\left(1 - \widehat{\beta}(r)^4\widehat{\beta}(-2r)^2\right) \le 2^{12}\varepsilon^2|R| \le C\varepsilon^2\delta^{-5/2},$$

this last inequality following from Lemma 6.6 with $p = 5/2$. To estimate the sum over $r \notin R$, we again use Lemma 6.6 with $p = 5/2$. Indeed using Hölder's inequality we have

$$\left|\sum_{r \notin R} \widehat{a}(r)^2\widehat{a}(-2r)\left(1 - \widehat{\beta}(r)^4\widehat{\beta}(-2r)^2\right)\right| \le 2\sup_{r \notin R}|\widehat{a}(r)|^{1/2}\sum_r |\widehat{a}(r)|^{5/2} \le C\delta^{1/2}.$$

This concludes the proof of Proposition 6.4.
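The step from 3AP counts to Fourier coefficients in the proof of Proposition 6.4 rests on the identity $\sum_{x,d} f(x)f(x+d)f(x+2d) = N^{-1}\sum_r \widehat{f}(r)^2\widehat{f}(-2r)$ on $\mathbb{Z}_N$, with $\widehat{f}(r) = \sum_x f(x)e(-rx/N)$. The following sketch checks it numerically on a small example; the function names and the test function are illustrative assumptions, and a naive quadratic-time transform is used to keep the code self-contained.

```python
import cmath
from math import pi


def dft(f):
    """f^(r) = sum_x f(x) e(-rx/N) for a list f indexed by Z_N."""
    N = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * pi * r * x / N) for x in range(N))
        for r in range(N)
    ]


def three_ap_sum_direct(f):
    """sum over x, d in Z_N of f(x) f(x+d) f(x+2d), including d = 0."""
    N = len(f)
    return sum(
        f[x] * f[(x + d) % N] * f[(x + 2 * d) % N]
        for x in range(N)
        for d in range(N)
    )


def three_ap_sum_fourier(f):
    """The same quantity computed as N^{-1} sum_r f^(r)^2 f^(-2r)."""
    N = len(f)
    fh = dft(f)
    return sum(fh[r] ** 2 * fh[(-2 * r) % N] for r in range(N)) / N


if __name__ == "__main__":
    N = 31
    f = [((x * x) % 7) / 7 for x in range(N)]
    print(three_ap_sum_direct(f), three_ap_sum_fourier(f))
```

For the indicator function of a set with no nontrivial 3AP in $\mathbb{Z}_N$ ($N$ odd), only the terms with $d = 0$ survive and the sum equals the size of the set, which is the situation exploited in (6.3).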


By (6.6) and Lemma 6.3, $a_1$ behaves a bit like a measure associated to a set of size $\alpha N$. As promised, we use this information together with an argument originally due to Varnavides [25] to get a lower bound on $\sum a_1(x)a_1(x+d)a_1(x+2d)$.

Lemma 6.8. For some absolute constant $C_2$,

$$\sum_{x,d \in \mathbb{Z}_N} a_1(x)a_1(x+d)a_1(x+2d) \ge \exp\left(-C_2\alpha^{-2}\log(1/\alpha)\right)N^{-1}.$$

Proof. Let $A' = \{x \in \mathbb{Z}_N : a_1(x) \ge \alpha/2N\}$. By Lemma 6.3 we have

$$\alpha \le \sum_x a_1(x) \le \frac{2|A'|}{N} + \frac{\alpha}{2N}|A'^c|,$$

which implies that $|A'| \ge \alpha N/4$. We will give a lower bound for $Z$, the number of 3APs in $A'$. It is clear that $\sum a_1(x)a_1(x+d)a_1(x+2d)$ is at least $\alpha^3 Z/8N^3$. Now by Bourgain's theorem [6]³ there is a constant $C_1$ such that if

$$M \ge \exp\left(C_1\alpha^{-2}\log(1/\alpha)\right)$$

then any subset of $\{1, \dots, M\}$ of density at least $\alpha/8$ contains a 3AP with nonzero common difference. Now there are exactly $N(N-1)$ nontrivial arithmetic progressions of length $M$ in $\mathbb{Z}_N$, and $A'$ will have density at least $\alpha/8$ on many of them. To estimate exactly how many, fix a common difference $d \ne 0$, and let $I = \{0, d, 2d, \dots, (M-1)d\}$. We have $\sum_x A' * I(x) \ge \alpha NM/4$, but $A' * I(x) \le M$ for every $x$. Thus another simple averaging argument shows that $A' * I(x) \ge \alpha M/8$ for at least $\alpha N/8$ values of $x$.

In total, then, there are at least $\alpha N^2/8$ progressions of length $M$ on which $A'$ has density at least $\alpha/8$. Each of them contains a 3AP consisting of elements of $A'$. No 3AP thus counted can arise from more than $M^2$ progressions of length $M$. Thus we have two different ways of bounding $Z$, and putting them together gives $Z \ge \alpha N^2/8M^2$. The lemma follows.

³We could equally well use Roth's original theorem here, at the expense of making any bounds for the relative density in Theorem 1.4 even worse.

Combining this with Proposition 6.4, we get

$$C' N^{-1/2} + 2^{12}\varepsilon^2\delta^{-5/2} + C\delta^{1/2} \ge \exp\left(-C_2\alpha^{-2}\log(1/\alpha)\right). \tag{6.8}$$

There are constants $C_3$, $C_4$ so that if we choose

$$\delta = \exp\left(-C_3\alpha^{-2}\log(1/\alpha)\right)$$

and

$$\varepsilon = \exp\left(-C_4\alpha^{-2}\log(1/\alpha)\right)$$

then (6.8) cannot hold, and we will have derived a contradiction to the assumption that $A$ contains no 3APs. We are permitted to choose any values of $\varepsilon$ and $\delta$ so that the condition of Lemma 6.3 is satisfied. Recalling that $k \le C\delta^{-5/2}$ (a consequence of Lemma 6.6) and that $W \ge \log\log N/8$, we see that (6.8) can indeed be contradicted provided that

$$\alpha \ge C\sqrt{\frac{\log_5 N}{\log_4 N}}. \tag{6.9}$$

The subscripts indicate the number of iterated logarithms, not the base to which those logarithms are taken!

Let us remind the reader of what it is that we have contradicted. We assumed that there was a subset $A_0 \subseteq P$ of positive relative upper density, containing no 3AP. The number $\alpha$ was related to the relative upper density of $A_0$, via the slightly technical reductions made in Lemma 6.1. A bound of the form (6.9) also holds for $\alpha_0$. That is, any subset of $P_n$ with cardinality at least $Cn(\log_5 n)^{1/2}/\log n(\log_4 n)^{1/2}$ contains a 3AP.

By far the most important reason for our getting such a poor bound was the need to prove Lemma 6.2, which says that by passing to a subprogression of common difference $m = \prod_{p \le W} p$ one can make the primes look somewhat uniform. This is a rather crude trick but we have not been able to get around it. Even if we could, the resultant bounds would surely be many miles from the probable truth, which is that any subset of $[N]$ of cardinality $N(\log N)^{-1000}$ contains 3APs.

Let us conclude by remarking that the methods of this section use rather little about the primes. In fact by the same argument one could establish a Roth-type theorem relative to any measure $\mu : \mathbb{Z}_N \to \mathbb{R}^+$ for which one had good control on $\sup_{r \ne 0}|\widehat{\mu}(r)|$ together with bounds for $\|\widehat{f}\|_p$, for some $p \in (2, 3)$ and any $f$ satisfying $0 \le f(x) \le \mu(x)$ pointwise. In practice bounds of this latter type will come by restriction theory arguments of the type given in Section 5. A more general setting for our arguments, along the lines just described, is given in [13].

7. Acknowledgements

The author would like to thank Tim Gowers for his insights into Vinogradov's three-primes theorem, which played a substantial part in the development of this paper. He would also like to thank Imre Ruzsa for helpful conversations, Jean Bourgain for drawing his attention to the references [4], [5] and the students who attended the course [11] for their enthusiasm.


Trinity College, University of Cambridge, Cambridge, United Kingdom E-mail address: [email protected]

References

[1] A. Balog, Linear equations in primes, Mathematika 39 (1992), 367–378.
[2] A. Balog and A. Perelli, Exponential sums over primes in an arithmetic progression, Proc. Amer. Math. Soc. 93 (1985), 578–582.
[3] F. A. Behrend, On sets of integers which contain no three terms in arithmetical progression, Proc. Nat. Acad. Sci. U.S.A. 32 (1946), 331–332.
[4] J. Bourgain, On Λ(p)-subsets of squares, Israel J. Math. 67 (1989), 291–311.
[5] J. Bourgain, Fourier transform restriction phenomena for certain lattice subsets and applications to nonlinear evolution equations. I. Schrödinger equations, Geom. Funct. Anal. 3 (1993), 107–156.
[6] J. Bourgain, On triples in arithmetic progression, Geom. Funct. Anal. 9 (1999), 968–984.
[7] H. Davenport, Multiplicative Number Theory, Third edition, Grad. Texts Math. 74, Springer-Verlag, New York, 2000.
[8] W. T. Gowers, Vinogradov's three-primes theorem; notes available at http://www.dpmms.cam.ac.uk/~wtg10/3primes.dvi .
[9] B. J. Green and I. Z. Ruzsa, Counting sumsets and sum-free sets modulo a prime, Studia Sci. Math. Hungar. 41 (2004), 285–293.
[10] B. J. Green and I. Z. Ruzsa, On the Hardy-Littlewood majorant problem, Math. Proc. Camb. Phil. Soc. 137 (2004), 511–517.
[11] B. J. Green, Restriction and Kakeya phenomena, notes from a course given in Part III of the Cambridge Mathematical Tripos (2002); available at http://www.dpmms.cam.ac.uk/~bjg23/rkp.html .
[12] B. J. Green, Some minor arcs estimates relevant to the paper "Roth's theorem in the primes"; preprint, available at http://www.dpmms.cam.ac.uk/~bjg23/papers/BG 11 minorarcs.pdf
[13] B. J. Green and T. C. Tao, Restriction theory of the Selberg sieve, with applications, preprint, available at http://www.arxiv.org/abs/math.NT/0405581
[14] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, Fifth edition, The Clarendon Press, Oxford University Press, New York, 1979.
[15] D. R. Heath-Brown, Integer sets containing no arithmetic progressions, J. London Math. Soc. 35 (1987), 385–394.
[16] M. N. Huxley, Personal communication.
[17] A. F. Lavrik, Analytic method of estimates of trigonometric sums by the primes of an arithmetic progression (Russian), Dokl. Akad. Nauk SSSR 248 (1979), 1059–1063.
[18] G. Mockenhaupt, Bounds in Lebesgue spaces of oscillatory integrals, Habilitation thesis, Universität Siegen, 1996; available at http://www.math.gatech.edu/~gerdm/research/ .
[19] G. Mockenhaupt and T. Tao, Restriction and Kakeya phenomena in finite fields, Duke Math. J. 121 (2004), 35–74.
[20] G. Mockenhaupt and W. Schlag, The Hardy-Littlewood majorant property for random sets, preprint.
[21] K. F. Roth, On certain sets of integers, J. London Math. Soc. 28 (1953), 104–109.
[22] E. Szemerédi, Integer sets containing no arithmetic progressions, Acta Math. Hungar. 56 (1990), 155–158.
[23] T. Tao, Notes from a course given at UCLA, available at http://www.math.ucla.edu/~tao/254b.1.99s/notes1.dvi .
[24] P. Tomas, A restriction theorem for the Fourier transform, Bull. Amer. Math. Soc. 81 (1975), 477–478.
[25] P. Varnavides, On certain sets of positive density, J. London Math. Soc. 34 (1959), 358–360.
[26] R. C. Vaughan, Sommes trigonométriques sur les nombres premiers, C. R. Acad. Sci. Paris Sér. A-B 285 (1977), A981–A983.
[27] A. Zygmund, Trigonometric Series, 2nd ed., Vols. I, II, Cambridge Univ. Press, New York, 1959.

(Received February 27, 2003)