Lambda calculus. Inge Bethke

10 Lambda calculus Inge Bethke In this chapter, we present the paradigm higher-order rewriting system – λ-calculus. We thoroughly work through the fundamentals, discussing such topics as confluence, finiteness of developments, parallel moves, standardization and normalization, and definability. We explore the ramifications of extensions of the basic theory and study their confluence properties. We round off the chapter with a short discussion of a typed version of the calculus which exhibits a rather different character as compared with the untyped version.

In Section 3.3, combinatory logic was introduced. Here we will study a variant of CL based on a function notation invented by Alonzo Church in the 1930s. Like CL, this formal system, called λ-calculus, has been designed to capture the most basic aspects of the ways that operators or functions can be combined to form other operators. However, unlike CL, λ-calculus makes use of bound variables and is therefore only an ARS which does not comply with the first-order TRS format given in Sections 2.1, 2.2. But λ-calculus is the prime example of a higher-order rewrite system (to be introduced in Chapter 11) and for this reason we offer the present introduction.

10.1. Basics

First we present a sequence of simple definitions.

10.1.1. Definition. The alphabet of the λ-calculus consists of a countably infinite set of variables x0, x1, ..., also denoted by x, y, z, x′, y′, ..., an abstractor λ and parentheses (, ). From this alphabet the set Ter(λ) of λ-terms is defined inductively as follows:
(i) x, y, z, ... ∈ Ter(λ),
(ii) if M, N ∈ Ter(λ), then (MN) ∈ Ter(λ), and
(iii) if M ∈ Ter(λ) and x is a variable, then (λx.M) ∈ Ter(λ).

10.1.2. Example. The following are examples of λ-terms: x, (xy), (λx.(xy)), (λz.(xy)), ((λx.(xy))z), ((λz.((λx.(xy))z))u).
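The inductive definition above can be mirrored directly in a small program. The following sketch (ours, not part of the chapter) encodes λ-terms as nested Python tuples and prints them with the full parenthesisation of Example 10.1.2:

```python
# Lambda terms as nested tuples, following Definition 10.1.1:
#   ('var', x)     -- a variable
#   ('app', M, N)  -- an application term (MN)
#   ('lam', x, M)  -- an abstraction term (λx.M)

def show(t):
    """Fully parenthesised printing, as in Example 10.1.2."""
    tag = t[0]
    if tag == 'var':
        return t[1]
    if tag == 'app':
        return '(' + show(t[1]) + show(t[2]) + ')'
    return '(\u03bb' + t[1] + '.' + show(t[2]) + ')'

# The term ((λx.(xy))z) from Example 10.1.2:
example = ('app',
           ('lam', 'x', ('app', ('var', 'x'), ('var', 'y'))),
           ('var', 'z'))
print(show(example))  # ((λx.(xy))z)
```

Later sketches in this chapter reuse this tuple encoding; each repeats the helpers it needs so that it can be run on its own.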

Sometimes we will consider λ-terms over a set of constants Γ. In that case we change Ter(λ) into Ter(λΓ) in the above clauses and add


(iv) Γ ⊆ Ter(λΓ).
If Γ is a singleton, say Γ = {γ}, we shall also write Ter(λγ) instead of Ter(λ{γ}). A term of the form (MN) will be called an application term, and a term of the form (λx.M) will be called an abstraction term. Capital italic letters M, N, L, ... will denote arbitrary λ-terms. When no confusion can occur, parentheses will be omitted in such a way that, for example, MNPQ denotes (((MN)P)Q). This convention is called association to the left. Other abbreviations will be λx.PQ for (λx.(PQ)) and λx0...xn.M for (λx0.(λx1.(...(λxn.M)...))).

10.1.3. Exercise. Insert the full amount of parentheses and abstractors in the abbreviated terms ux(yz)(λv.vy), (λxyz.xz(yz))uvw, w(λxyz.xz(yz))uv.

10.1.4. Definition. Let M be a λ-term.
(i) The set of free variables occurring in M, notation FV(M), is defined inductively by (1) FV(x) = {x} for each variable x, (2) FV(MN) = FV(M) ∪ FV(N), and (3) FV(λx.M) = FV(M) − {x}. A variable x occurs free in M if x ∈ FV(M). Terms not containing any free variable are called closed terms or combinators, and Ter⁰(λ) is the set of closed terms.
(ii) The set of subterms of M, notation Sub(M), is defined inductively by (1) Sub(x) = {x} for each variable x, (2) Sub(MN) = Sub(M) ∪ Sub(N) ∪ {MN}, and (3) Sub(λx.M) = Sub(M) ∪ {x} ∪ {λx.M}. A term N occurs in M, notation N ⊆ M, if N ∈ Sub(M). A variable x occurs bound in M if λx.N ⊆ M for some term N.
At least two different definitions of the notion of 'subterm' of a given λ-term, and of what it means that a variable 'occurs bound' in it, can be found in the literature. E.g. in Barendregt [1984], a variable x is a subterm of a term λx.M only if it is a subterm of M, and occurs bound in λx.M only if it occurs in M. We here adopt the definition given by e.g. Hindley and Seldin [1986], which we find more convenient for our purposes.

10.1.5. Example. Note that a variable can occur both bound and free in a term. For example, in the term (λx.y(λx.x))x the variable x occurs bound and free; y occurs free in it. The terms λx.x, λxy.x and λxyz.xz(yz) do not contain any free variable; these terms are combinators.

10.1.6. Exercise. (i) Determine FV((λx.(λy.yx)z)(λx.yv)x).


(ii) Does the term x(yz) occur in ux(yz)? Look at the terms in Exercise 10.1.3; in which of these does (λxyz.xz(yz))u occur?
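Clause (i) of Definition 10.1.4 translates almost verbatim into code. A sketch (ours), on the tuple encoding ('var', x), ('app', M, N), ('lam', x, M):

```python
# FV(M) as in Definition 10.1.4(i).

def fv(t):
    """The set of variables occurring free in t."""
    tag = t[0]
    if tag == 'var':
        return {t[1]}
    if tag == 'app':
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}     # the abstractor binds t[1] in the body

# (λx.y(λx.x))x from Example 10.1.5: x occurs both bound and free, y free.
m = ('app',
     ('lam', 'x', ('app', ('var', 'y'), ('lam', 'x', ('var', 'x')))),
     ('var', 'x'))
print(fv(m))  # {'x', 'y'} (in some order)
```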

10.1.7. Notation. Deviating slightly from Section 1.1, M ≡ N denotes throughout this chapter that M and N are identical terms or can be obtained from each other by renaming bound variables in a way such that capture of free variables is avoided. E.g. λx.xz ≡ λy.yz ≢ λz.zz.

10.1.8. Definition. The result of substituting N for the free occurrences of x in M, notation M[x := N], is defined inductively by
(i) x[x := N] ≡ N,
(ii) y[x := N] ≡ y for x ≢ y,
(iii) (M1 M2)[x := N] ≡ (M1[x := N])(M2[x := N]), and
(iv) (λy.M)[x := N] ≡ λy.(M[x := N]).

10.1.9. Remark. Note that this definition of substitution yields 'dishonest' substitutions like (λy.yx)[x := y] ≡ λy.yy, where the free variable y becomes bound after substitution in the term λy.yx, and (λy.y)[y := x] ≡ λy.x, where the variable x is substituted for the bound variable y. We shall avoid such problems throughout this chapter. That is, in writing expressions like M[x := N] we shall tacitly assume
– that variables occurring free in N do not become bound after substitution in the term M, and
– that the substitution variable x does not occur bound in M.
This can of course always be achieved by a suitable renaming of the bound variables in M. There are several other ways to get around substitution problems: in Barendregt [1984] a similar method is used; another way, as in e.g. Curry and Feys [1958] and Hindley and Seldin [1986], is to sharpen Definition 10.1.8 with a proviso on bound and free variables; a third method is to work with a notation for λ-terms in which bound variables do not occur at all (see e.g. de Bruijn [1972]).
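Read operationally, Definition 10.1.8 together with the renaming convention of Remark 10.1.9 gives a capture-avoiding substitution. In the sketch below (ours, on the tuple encoding of terms) the renaming is done on the fly with fresh names v0, v1, ...; that naming scheme is our own choice, not the chapter's:

```python
# M[x := N] per Definition 10.1.8, with bound variables renamed on the fly
# so that free variables of N are never captured (Remark 10.1.9).

def fv(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'app':
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}

def fresh(avoid):
    """A variable name v0, v1, ... not occurring in `avoid`."""
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"

def subst(t, x, n):
    tag = t[0]
    if tag == 'var':
        return n if t[1] == x else t            # clauses (i) and (ii)
    if tag == 'app':                            # clause (iii)
        return ('app', subst(t[1], x, n), subst(t[2], x, n))
    y, body = t[1], t[2]                        # clause (iv), made 'honest':
    if y == x:                                  # x is bound here, nothing to do
        return t
    if y in fv(n):                              # rename y to avoid capture
        z = fresh(fv(body) | fv(n) | {x})
        body, y = subst(body, y, ('var', z)), z
    return ('lam', y, subst(body, x, n))

# The 'dishonest' (λy.yx)[x := y] of Remark 10.1.9 now renames y first:
# the result is λv0.v0 y, not λy.yy.
```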

10.1.10. Lemma (Substitution Lemma). Let M, N, L ∈ Ter(λ). If x ≢ y and x ∉ FV(L), then

M[x := N][y := L] ≡ M[y := L][x := N[y := L]].

Proof. By induction on the structure of M. We shall only consider the case where M ≡ λz.M′. Using the induction hypothesis we obtain

λz.(M′[x := N][y := L]) ≡ λz.(M′[y := L][x := N[y := L]]).

Thus (λz.M′)[x := N][y := L] ≡ λz.(M′[x := N][y := L]) ≡ λz.(M′[y := L][x := N[y := L]]) ≡ (λz.M′)[y := L][x := N[y := L]].


10.1.11. Exercise. (i) Evaluate the substitutions (λy.x(λz.z))[x := λy.xy] and (y(λv.xv))[x := λu.zu]. (ii) Show that M[x := N] ≡ M if x ∉ FV(M). (iii) Complete the proof of the Substitution Lemma.

Contexts are λ-terms in Ter(λ□) containing some 'empty places', that is, occurrences of the constant □, also called holes. An analogous notion of context was introduced in Section 2.1.1 for first-order TRSs.

10.1.12. Notation. A λ-context is generally denoted by C. If C is a λ-context containing n holes, and M1, ..., Mn ∈ Ter(λ), then C[M1, ..., Mn] denotes the result of replacing the holes in C from left to right by M1, ..., Mn. In this act, variables occurring free in M1, ..., Mn may become bound in C[M1, ..., Mn]. Almost always we will only need contexts containing precisely one hole; multihole contexts will be used exclusively in Section 10.4.

10.1.13. Example. C ≡ λx.x(λy.□) is a λ-context. If M ≡ xy, then C[M] ≡ λx.x(λy.xy).

Lambda calculus comes along with two reduction relations: β- and η-reduction. These Greek-letter names come from Curry and Feys [1958] and are now more or less standard terminology. The first reduction, β-reduction, is the very soul of the enterprise. It provides an interpretation of application terms of the form (λx.M)N, called β-redexes. An abstraction term of the form λx.M represents an operator whose value at a certain object N is calculated by substituting N for x in M. Hence a redex (λx.M)N can be evaluated to its contractum M[x := N]. This is called β-reduction; it may be performed within an arbitrary context.

10.1.14. Definition. The relation of β-reduction, →β ⊆ Ter(λ) × Ter(λ), is defined by

→β = {(C[(λx.M)N], C[M[x := N]]) | M, N ∈ Ter(λ), C ∈ Ctx(λ)}

10.1.15. Remark. We have identified terms that differ only in the names of bound variables. An alternative is to add to the λ-calculus the following α-reduction relation:

→α = {(C[λx.M], C[λy.M[x := y]]) | y ∉ FV(λx.M), C ∈ Ctx(λ)}

The equivalence relation generated by →α is called α-equivalence, or α-conversion. Originally →β was introduced, after →α, as the second reduction relation, whence its name.
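Definition 10.1.14 can be animated by a small interpreter. The sketch below (ours) contracts the leftmost-outermost β-redex; the strategy choice and the fuel bound guarding against terms like Ω are implementation decisions of ours, not part of the definition:

```python
# Leftmost-outermost (normal-order) beta-reduction on the tuple encoding.
# fv/fresh/subst are repeated from the substitution sketch so that this
# snippet is self-contained.

def fv(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'app':
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}

def fresh(avoid):
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"

def subst(t, x, n):
    if t[0] == 'var':
        return n if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, n), subst(t[2], x, n))
    y, body = t[1], t[2]
    if y == x:
        return t
    if y in fv(n):
        z = fresh(fv(body) | fv(n) | {x})
        body, y = subst(body, y, ('var', z)), z
    return ('lam', y, subst(body, x, n))

def step(t):
    """One leftmost-outermost beta-step, or None if t is a beta-normal form."""
    if t[0] == 'app':
        if t[1][0] == 'lam':                    # (λx.M)N is a beta-redex
            return subst(t[1][2], t[1][1], t[2])
        s = step(t[1])
        if s is not None:
            return ('app', s, t[2])
        s = step(t[2])
        return None if s is None else ('app', t[1], s)
    if t[0] == 'lam':
        s = step(t[2])
        return None if s is None else ('lam', t[1], s)
    return None

def normalize(t, fuel=1000):
    """Iterate step; the fuel bound guards against terms like Omega."""
    for _ in range(fuel):
        s = step(t)
        if s is None:
            return t
        t = s
    raise RuntimeError('no beta-normal form found within the fuel limit')

K = ('lam', 'x', ('lam', 'y', ('var', 'x')))
I = ('lam', 'x', ('var', 'x'))
w = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
omega = ('app', w, w)
```

With these definitions, normalize applied to the encoding of KIΩ returns I although Ω itself has no normal form, while normalize(omega) exhausts its fuel; this matches the behaviour described in Examples 10.1.17(ii) and (iii) below.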


Beta reduction uses a concept of function equality that is called intensional: it does not include the assumption, used in most of mathematics, that functions with identical graphs are necessarily equal. This difference is overcome by the second relation, η-reduction, which identifies terms having the same applicative behaviour.

10.1.16. Definition. The relation →η ⊆ Ter(λ) × Ter(λ), called η-reduction, is defined by

→η = {(C[λx.Mx], C[M]) | x ∉ FV(M), C ∈ Ctx(λ)}

We will henceforth refer to the ARS (Ter(λ), →β) as λβ-calculus or simply λ-calculus, and to the ARS (Ter(λ), →β ∪ →η) as λβη-calculus. Moreover, we will apply all ARS notions considered in Chapter 1. In particular, we will write →βη instead of →β ∪ →η.

10.1.17. Examples. (i) Let M ≡ (λuz.u(uz))yx. Then we have

M →β (λz.u(uz))[u := y]x ≡ (λz.y(yz))x →β y(yz)[z := x] ≡ y(yx).

Here y(yx) is a β-normal form of M.
(ii) Let Ω ≡ (λx.xx)(λx.xx). Then we have

Ω →β (xx)[x := λx.xx] ≡ Ω →β (xx)[x := λx.xx] ≡ Ω →β · · ·

This β-reduction is the only possible one for Ω. So Ω does not reduce to a β-normal form.
(iii) Let K ≡ λxy.x, I ≡ λx.x and N ≡ KIΩ for the above Ω. Then N can be reduced in at least two different ways, thus (1) N →→β I, (2) N →β N (by contracting Ω). So N reduces to the β-normal form I, but also has an infinite β-reduction sequence.
(iv) The term λx.x(λz.xz) is in β-normal form. However, λx.x(λz.xz) →η λx.xx. So λx.x(λz.xz) is not a βη-normal form, but reduces to the βη-normal form λx.xx.

10.1.18. Exercise. Prove that adding η-reduction to λ-calculus yields an extensional system. That is, prove that for terms M, N and x ∉ FV(MN), Mx =βη Nx ⇒ M =βη N.

10.1.19. Exercise (Lercher [1976]). A term M is called minimal with respect to β-reduction if and only if for every term N we have M →→β N ⇒ M ≡ N. Of course, every β-normal form is minimal. But there are minimal terms that are not normal forms. Find one such.
In fact, if we require that the term be also a β-redex, then there is only one such term.
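On the tuple encoding, the η-rule of Definition 10.1.16 is a one-line pattern check. A sketch (ours) of a single top-level η-step:

```python
# A top-level eta-step: λx.Mx contracts to M provided x ∉ FV(M)
# (Definition 10.1.16). Terms: ('var', x), ('app', M, N), ('lam', x, M).

def fv(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'app':
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}

def eta_top(t):
    """The eta-contractum of t, or None if t is not an eta-redex."""
    if (t[0] == 'lam'
            and t[2][0] == 'app'
            and t[2][2] == ('var', t[1])
            and t[1] not in fv(t[2][1])):
        return t[2][1]
    return None

# λz.xz contracts to x, but λz.zz is left alone: there z is free in the
# function part, so the side condition x ∉ FV(M) fails.
```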


There is a close correspondence, in motivation and results, between λ-calculus and combinatory logic introduced in Section 3.3. We shall now make this correspondence more precise. In what follows, we shall write Ter(CL) for the set of CL-terms, →CL for the one-step reduction relation generated by the rewrite rules given in Table 3.3 and →→CL for its reflexive–transitive closure; =CL will denote convertibility in CL. The basis of the correspondence between λ-calculus and CL is given by the following, very natural, mapping.

10.1.20. Definition. The mapping –λ : Ter(CL) → Ter(λ) is defined inductively by
(i) xλ ≡ x for each variable x,
(ii) Sλ ≡ λxyz.xz(yz),
(iii) Kλ ≡ λxy.x,
(iv) Iλ ≡ λx.x, and
(v) (tt′)λ ≡ tλ t′λ.
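The mapping –λ of Definition 10.1.20 is easy to program. In the sketch below (ours), CL-terms are encoded as the strings 'S', 'K', 'I', variable names (assumed distinct from those three), or 2-tuples for application; this encoding is an assumption of ours, not the chapter's:

```python
# The translation (-)_lambda of Definition 10.1.20.
# Lambda terms use the tuple encoding ('var', x)/('app', M, N)/('lam', x, M);
# CL-terms are 'S', 'K', 'I', variable strings, or pairs (t, t') for tt'.

S_LAM = ('lam', 'x', ('lam', 'y', ('lam', 'z',
         ('app', ('app', ('var', 'x'), ('var', 'z')),
                 ('app', ('var', 'y'), ('var', 'z'))))))   # λxyz.xz(yz)
K_LAM = ('lam', 'x', ('lam', 'y', ('var', 'x')))           # λxy.x
I_LAM = ('lam', 'x', ('var', 'x'))                         # λx.x

def cl_to_lam(t):
    if t == 'S':
        return S_LAM
    if t == 'K':
        return K_LAM
    if t == 'I':
        return I_LAM
    if isinstance(t, str):          # a variable (not named S, K or I)
        return ('var', t)
    return ('app', cl_to_lam(t[0]), cl_to_lam(t[1]))       # clause (v)

ski = (('S', 'K'), 'I')             # the CL-term SKI, associated to the left
```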

10.1.21. Exercise. Show that (SKI)λ →→β Iλ.

10.1.22. Proposition. For t, t′ ∈ Ter(CL), t →→CL t′ ⇒ tλ →→β t′λ.

10.1.23. Exercise. Prove the above proposition.

The converse implication, however, is false. In particular, as in Exercise 10.1.21, t may be a CL-normal form while tλ is not a normal form for β-reduction. Note also that abstraction in λ- and λβη-calculus satisfies the so-called ξ-rule: for all λ-terms M, N and all variables x,

(ξ) M →χ N ⇒ λx.M →χ λx.N for χ ∈ {β, βη}.

Abstraction in CL, however, fails to observe this rule, i.e. in general it is not the case that [x]t →→CL [x]t′ whenever t →→CL t′: e.g. SKxy →→CL y; however, [y]SKxy ≡ S(K(SKx))I does not CL-reduce to I ≡ [y]y. A better correspondence exists between λβη-calculus and the extension of CL obtained by defining the so-called 'strong reduction' in CL. See e.g. Curry and Feys [1958], Curry, Hindley and Seldin [1972], Stenlund [1972] and Barendregt [1984].

We end this section with an indication of the expressive power of λ-calculus. One concrete piece of evidence that λ-calculus is far from trivial is the following theorem. It says that every operator has a fixed point. That is, for each term M there is a term N such that MN →→β N. Furthermore, there is a combinator Y which finds these fixed points, i.e. such that YM is a fixed point of M for all terms M. (On the topic of fixed-point combinators see also Section 3.3.3.)


10.1.24. Theorem. There is a combinator Y such that YM →→β M(YM) for all terms M.

Proof. Y is not unique; there are many others. This one is due to Turing [1937]. Define Z ≡ λzx.x(zzx) and Y ≡ ZZ. Then we have

YM ≡ (λzx.x(zzx))ZM →β (λx.x(zzx))[z := Z]M ≡ (λx.x(ZZx))M →β (x(ZZx))[x := M] ≡ M(ZZM) ≡ M(YM).

10.1.25. Exercise. Prove that Y in the proof of the above theorem does not reduce to β-normal form.

10.1.26. Exercise (Böhm [1966]). Let Y be a fixed-point combinator. (i) Prove that Y′ ≡ Y(SI) is again a fixed-point combinator. (ii) Verify that if one takes for Y Curry's fixed-point combinator, YC ≡ λf.(λx.f(xx))(λx.f(xx)), then one obtains as Y′ Turing's fixed-point combinator YT ≡ (λxf.f(xxf))(λxf.f(xxf)).

10.1.27. Remark. Starting from YC one obtains, according to the preceding exercise, an infinite sequence of fixed-point combinators Yn: Y0 ≡ YC, Y1 ≡ YT ≡ Y0(SI), Y2 ≡ Y1(SI), ..., Yn+1 ≡ Yn(SI), ... In Böhm [1966] it is shown, using advanced techniques, that for n ≠ m, Yn and Ym are not β-convertible.

The main application of fixed-point combinators is that one can 'define' a term in an implicit way in terms of itself. In particular, for every context C, there is a term M such that M →→β C[M], namely M ≡ Y(λx.C[x]).

10.1.28. Exercise. A combinator M is said to be an omnivore if MN →→β M for every N ∈ Ter(λ). Construct such a term.
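Fixed points can even be demonstrated inside Python itself, with one caveat we should flag: a literal transliteration of Turing's Y from Theorem 10.1.24 loops forever under Python's strict (call-by-value) evaluation, but an η-expanded call-by-value variant, often written Z, works. The sketch is ours:

```python
# A call-by-value fixed-point combinator:
#   Z f = (λx.f(λv.xxv)) (λx.f(λv.xxv))
# The inner eta-expansion λv.xxv delays evaluation of xx, which a direct
# transliteration of Y = ZZ with Z = λzx.x(zzx) lacks under strict semantics.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# 'Defining a term in terms of itself': factorial without explicit recursion.
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))  # 120
```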

10.2. Fundamental theorems

In this section, we will prove important results on β(η)-reduction such as confluence, finiteness of developments, the Lemma of Parallel Moves and standardization.

10.2.1. Confluence of λβ- and λβη-calculus

There are many ways to prove the Church–Rosser property for λ-calculus. The first proof was of course given by Church and Rosser [1936]. The proof we shall present here follows the proof in Takahashi [1995] and is based on


the inductive proof method of Tait and Martin-Löf. It is perhaps the shortest among all the proofs known. For the sake of the proof, we define a reduction relation −→◦ on Ter(λ) such that →β ⊆ −→◦ ⊆ →→β and such that −→◦ satisfies the diamond property (DP, see Definition 1.1.8(v)). By an appeal to Proposition 1.1.11 we then have confluence of →β. The proof falls in the category of inductive Church–Rosser proofs, discussed in Section 4.7 for first-order TRSs. As a matter of fact, −→◦ is the λ-calculus version of multi-step reduction as defined in Definition 4.7.11. In Section 11.6.2 it is explained why in an inductive proof of CR for a higher-order system, such as the λ-calculus, we really need a notion of simultaneous reduction (−→◦) instead of the weaker notion of parallel reduction that suffices in the first-order case. See especially Example 11.6.20(ii).

10.2.1. Definition. −→◦ ⊆ Ter(λ)² is defined inductively by
(i) x −→◦ x for every variable x,
(ii) if M −→◦ M′ and N −→◦ N′, then MN −→◦ M′N′,
(iii) if M −→◦ M′ and N −→◦ N′, then (λx.M)N −→◦ M′[x := N′], and
(iv) if M −→◦ N, then λx.M −→◦ λx.N.

Note that, since −→◦ is closed under term formation, it includes the identity on λ-terms, i.e. M −→◦ M holds for each M, and is closed under contexts, i.e. we have C[M] −→◦ C[N] whenever M −→◦ N. Note also that if λx.M −→◦ N, then N ≡ λx.M′ with M −→◦ M′, since (iv) is the only clause that applies. Moreover, −→◦ is substitutive in the following sense.

10.2.2. Lemma. For M, M′, N, N′ ∈ Ter(λ) and every variable x, if M −→◦ M′ and N −→◦ N′, then M[x := N] −→◦ M′[x := N′].

Proof. We proceed by induction on the generation of M −→◦ M′. The base case is trivial. For the induction step we have to distinguish three cases, only one of which is not entirely straightforward. Suppose that M ≡ (λy.L)O −→◦ L′[y := O′] ≡ M′ is a direct consequence of L −→◦ L′ and O −→◦ O′. Then M[x := N] ≡ (λy.L[x := N])(O[x := N]) −→◦ L′[x := N′][y := O′[x := N′]] by the induction hypothesis. Now observe that L′[x := N′][y := O′[x := N′]] ≡ L′[y := O′][x := N′] ≡ M′[x := N′] by the Substitution Lemma (10.1.10).

10.2.3. Exercise. Complete the proof of the above lemma.

Instead of proving the diamond property for −→◦, we can prove the following stronger statement more easily:

M −→◦ N ⇒ N −→◦ M*


where M* is a term determined by M independently of N. This is the triangle property (TP) for −→◦. Intuitively, M* will be obtained from M by contracting all the β-redexes occurring in M simultaneously.¹ This is the content of the following proposition, originating with Barendregt et al. [1976]; see also Barendregt [1984]. It is the key proposition in the proof of CR in Takahashi [1995].

10.2.4. Proposition. There exists a function * : Ter(λ) → Ter(λ) such that for all M, N ∈ Ter(λ), N −→◦ M* whenever M −→◦ N.

Proof. Define M* by induction on the complexity of M by
(i) x* ≡ x for every variable x,
(ii) (MN)* ≡ L*[x := N*] if M ≡ λx.L, and (MN)* ≡ M*N* otherwise,
(iii) (λx.M)* ≡ λx.M*.
The statement is now proved by induction on the generation of M −→◦ N. The base case is trivial. For the induction step we have to distinguish three cases.
Case 1. M ≡ M1M2 −→◦ N1N2 ≡ N is a direct consequence of M1 −→◦ N1 and M2 −→◦ N2. Then N1 −→◦ M1* and N2 −→◦ M2* by the induction hypothesis. We distinguish two subcases.
(a) M1 ≡ λx.L: then N1 ≡ λx.L′ for some L′ with L −→◦ L′. Since L′ −→◦ L* by the induction hypothesis, N ≡ (λx.L′)N2 −→◦ L*[x := M2*] ≡ M*.
(b) M is not a β-redex: then N ≡ N1N2 −→◦ M1*M2* ≡ M*.
Case 2. M ≡ (λx.M1)M2 −→◦ N1[x := N2] ≡ N is a direct consequence of M1 −→◦ N1 and M2 −→◦ N2. Then N1 −→◦ M1* and N2 −→◦ M2* by the induction hypothesis, and hence N ≡ N1[x := N2] −→◦ M1*[x := M2*] ≡ M* by Lemma 10.2.2.
Case 3. M ≡ λx.M′ −→◦ λx.N′ ≡ N is a direct consequence of M′ −→◦ N′. Then N′ −→◦ M′* by the induction hypothesis, and hence N ≡ λx.N′ −→◦ λx.M′* ≡ M*.

10.2.5. Theorem. The reduction relation →β is confluent.

Proof. By Proposition 10.2.4 we have that −→◦ satisfies the triangle property, and hence the diamond property. By Proposition 1.1.11 it therefore suffices to prove that →β ⊆ −→◦ ⊆ →→β. Since −→◦ is reflexive, →β ⊆ −→◦ follows by clause (iii) of Definition 10.2.1.
Based on the inductive definition of −→◦ one also shows easily that −→◦ ⊆ →→β.

The above proof of CR can be extended easily to λβη-calculus by adding

(v) if M −→◦ N and x ∉ FV(M), then λx.Mx −→◦ N

to the inductive definition given in Definition 10.2.1, and by altering clause (iii) in the proof of Proposition 10.2.4 in the following way: (λx.M)* ≡ M′* if M ≡ M′x and x ∉ FV(M′), and (λx.M)* ≡ λx.M* otherwise.

10.2.6. Theorem. The reduction relation →βη is confluent.

10.2.7. Exercise. Prove the above theorem along the aforementioned lines.

¹ As a matter of fact we have M* ≡ F_GK(M), that is, M* is the result of performing one step according to the Gross–Knuth reduction strategy as defined in Definition 4.9.5(v) – in the version for the λ-calculus.
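The map M ↦ M* from the proof of Proposition 10.2.4, which contracts all β-redexes of M simultaneously (one Gross–Knuth step, per the footnote), can be sketched as follows (ours; fv/fresh/subst are repeated so the snippet is self-contained):

```python
# M* per the proof of Proposition 10.2.4, on the tuple encoding.

def fv(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'app':
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}

def fresh(avoid):
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"

def subst(t, x, n):
    if t[0] == 'var':
        return n if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, n), subst(t[2], x, n))
    y, body = t[1], t[2]
    if y == x:
        return t
    if y in fv(n):
        z = fresh(fv(body) | fv(n) | {x})
        body, y = subst(body, y, ('var', z)), z
    return ('lam', y, subst(body, x, n))

def star(t):
    """Contract every beta-redex of t simultaneously."""
    if t[0] == 'var':
        return t                                   # clause (i)
    if t[0] == 'lam':
        return ('lam', t[1], star(t[2]))           # clause (iii)
    m, n = t[1], t[2]
    if m[0] == 'lam':                              # clause (ii), redex case
        return subst(star(m[2]), m[1], star(n))
    return ('app', star(m), star(n))               # clause (ii), otherwise

# (λx.xx)((λy.y)z) contains two redexes; one simultaneous step yields zz.
```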

10.2.2. Finiteness of developments

The Finite Developments Theorem says that if we mark a set R of redex occurrences in a given term M and reduce only marked redex occurrences and redex occurrences which descend from marked redex occurrences, the reduction process always terminates. If we reduce every marked redex occurrence, then the order in which such reduction is performed does not matter: R uniquely determines a term N to which M reduces under any complete reduction of marked redex occurrences. If we, in addition, mark another set R′ of redex occurrences in M and follow this set throughout a complete R-reduction, the redex occurrences from R′ may be transformed by substitution or copied. However, no matter in what way we perform a complete R-reduction, the set of redex occurrences in N which descend from R′ is again uniquely determined. This theorem has important consequences, among them the Parallel Moves Lemma and the Standardization Theorem.

For the purpose of the theorem, we introduce a coloured version of λ-calculus with the two colours red and green. Green redexes correspond to redexes which will be reduced, red redexes to redexes which will be followed, and colourless redexes to redexes which are of no interest.

10.2.8. Definition. We extend the alphabet of λ-calculus by adding two coloured abstractors λg, λr. The set Ter(λc) of coloured λ-terms is defined similarly to Ter(λ), but with the following clause added to Definition 10.1.1:
(iv) if M, N ∈ Ter(λc) and x is a variable, then (λg x.M)N, (λr x.M)N ∈ Ter(λc).
The βc-reduction relation, notation →c, on Ter(λc) consists of β-reduction of green redexes:

→c = {(C[(λg x.M)N], C[M[x := N]]) | C ∈ Ctx(λc), M, N ∈ Ter(λc)}

where Ctx(λc), the set of coloured λ-contexts, is defined similarly to Ctx(λ), and substitution on coloured terms is defined as for Ter(λ) and assumed to be 'safe' as in Remark 10.1.9.

10.2.9. Exercise. Complete the above definition in the following way.


(i) Provide a rigorous definition of Ctx(λc).
(ii) Show that Ter(λc) is closed under βc-reduction, i.e. show that if M ∈ Ter(λc) and M →→c N, then N ∈ Ter(λc).

10.2.10. Example. An example of a βc-reduction is

(λg x.xx)((λg y.y)((λr z.u)v)) →c (λg x.xx)((λr z.u)v) →c ((λr z.u)v)((λr z.u)v)

with ((λr z.u)v)((λr z.u)v) a βc-normal form.

Observe that coloured abstraction terms, i.e. expressions of the form λg x.M, λr x.M, are not elements of Ter(λc). Colours can only occur as colours of 'head'-lambdas in expressions of the forms (λg x.M)N and (λr x.M)N. It follows that βc-normal forms never contain any green. Note also that an essential feature of βc-reductions is that there is no creation of (sub)expressions of the form (λg x.M)N and (λr x.M)N. That is, in a βc-reduction sequence M0 →c M1 →c · · · every subterm (λg x.M)N, (λr x.M)N stems from some subterm (λg x.M′)N′, (λr x.M′)N′, respectively, in M0. There can be β-redexes created, though.

10.2.11. Definition. Let L ∈ Ter(λ).
(i) The set of β-redex occurrences in L, notation RO(L), is defined by RO(L) = {⟨(λx.M)N | C⟩ | C ∈ Ctx(λ), C[(λx.M)N] ≡ L}. We let r, r′, r0, r1, ... denote arbitrary redex occurrences.
(ii) For R ⊆ RO(L), put
– R1 = {⟨N | C⟩ | ∃M ∈ Ter(λ) ⟨N | CM⟩ ∈ R},
– R2 = {⟨N | C⟩ | ∃M ∈ Ter(λ) ⟨N | MC⟩ ∈ R},
– Rλ = {⟨N | C⟩ | ∃z ⟨N | λz.C⟩ ∈ R}.
Roughly speaking, if L is an application term and R is some set of β-redex occurrences in L, then R1 and R2 are those β-redex occurrences from R which occur in the first and second parts of the term, respectively; likewise, if L is an abstraction term, then Rλ contains those β-redex occurrences from R which occur in the body of L.

10.2.12. Exercise. Let L ∈ Ter(λ) and R ⊆ RO(L). Show that
(i) if L ≡ MN, then R1 ⊆ RO(M),
(ii) if L ≡ MN, then R2 ⊆ RO(N),
(iii) if L ≡ λx.L′, then Rλ ⊆ RO(L′).

10.2.13. Definition. (i) Let M ∈ Ter(λ) and G, R ⊆ RO(M). The coloured λ-term MG,R ∈ Ter(λc), obtained from M by colouring in M the redex occurrences from G green and the redex occurrences from R − G red, is defined inductively by
(1) xG,R ≡ x for every variable x,
(2) (MN)G,R ≡ (λg x.L′)NG2,R2 if M ≡ λx.L and ⟨MN | □⟩ ∈ G; (MN)G,R ≡ (λr x.L′)NG2,R2 if M ≡ λx.L and ⟨MN | □⟩ ∈ R − G; (MN)G,R ≡ MG1,R1 NG2,R2 otherwise, where L′ ≡ L(G1)λ,(R1)λ,
(3) (λx.M)G,R ≡ λx.MGλ,Rλ.
(ii) For M ∈ Ter(λc), C ∈ Ctx(λc) and σ a βc-reduction sequence, we write |M|, |C| and |σ| for the corresponding λ-term, λ-context and β-reduction sequence obtained by erasing the colours, respectively.

10.2.14. Definition. Let M ∈ Ter(λ), G, R ⊆ RO(M) and σ : MG,R ≡ M0 →c M1 →c · · · be a βc-reduction sequence. Then |σ| is called a development of M with respect to G, R. If σ terminates in a βc-normal form Mn, then |σ| is said to be a complete development of M with respect to G, R. In that case, there exists a unique R′ ⊆ RO(|Mn|) such that |Mn|∅,R′ ≡ Mn.

10.2.15. Example. Let N ≡ (λy.y)((λz.u)v), G = {⟨(λx.xx)N | □⟩, ⟨N | (λx.xx)□⟩}, R = {⟨(λz.u)v | (λx.xx)((λy.y)□)⟩} and M ≡ (λx.xx)N. Then MG,R ≡ (λg x.xx)((λg y.y)((λr z.u)v)) and

σ : (λx.xx)((λy.y)((λz.u)v)) →β (λx.xx)((λz.u)v) →β ((λz.u)v)((λz.u)v)

is a development of (λx.xx)((λy.y)((λz.u)v)) with respect to G, R, obtained from the βc-reduction sequence in Example 10.2.10. In fact, σ is complete. For other examples, note that every single β-reduction step results from some complete development. For, if σ : C[(λx.M)N] →β C[M[x := N]], then σ is a complete development with respect to G = {⟨(λx.M)N | C⟩} and R = ∅, obtained from the βc-reduction step C[(λg x.M)N] →c C[M[x := N]] which results in a βc-normal form.
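Developments can be animated too. The sketch below (ours) adds the coloured abstractors of Definition 10.2.8 as extra tags 'glam' (λg) and 'rlam' (λr) and contracts green redexes only; substitution here assumes the 'safe' naming convention of Remark 10.1.9, so no renaming is performed:

```python
# beta_c-reduction (Definition 10.2.8): contract green redexes only.
# Tags: 'var', 'app', 'lam', plus 'glam' (λg) and 'rlam' (λr).

def subst(t, x, n):
    """Substitution, assuming bound names already avoid capture ('safe')."""
    tag = t[0]
    if tag == 'var':
        return n if t[1] == x else t
    if tag == 'app':
        return ('app', subst(t[1], x, n), subst(t[2], x, n))
    if t[1] == x:                       # 'lam'/'glam'/'rlam' all bind t[1]
        return t
    return (tag, t[1], subst(t[2], x, n))

def step_c(t):
    """One beta_c-step on the leftmost green redex, or None."""
    if t[0] == 'app':
        if t[1][0] == 'glam':
            return subst(t[1][2], t[1][1], t[2])
        s = step_c(t[1])
        if s is not None:
            return ('app', s, t[2])
        s = step_c(t[2])
        return None if s is None else ('app', t[1], s)
    if t[0] in ('lam', 'glam', 'rlam'):
        s = step_c(t[2])
        return None if s is None else (t[0], t[1], s)
    return None

def develop(t):
    """Run beta_c to a normal form: a complete development of |t|."""
    while (s := step_c(t)) is not None:
        t = s
    return t

# Example 10.2.10: (λg x.xx)((λg y.y)((λr z.u)v)) develops to
# ((λr z.u)v)((λr z.u)v). Our strategy contracts the outer green redex
# first, yet the end term agrees with the example's, as the Finite
# Developments Theorem predicts.
```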

In order to prove that every development is finite, we shall prove that →c is strongly normalizing. The method that is going to be used in the proof below is taken from van Raamsdonk and Severi [1995]. Different proofs can be found in e.g. Schroer [1965], Hyland [1973], Barendregt et al. [1976], Hindley [1976] and de Vrijer [1985]. In the present book there are alternative proofs of the finiteness of developments in Sections 4.4, 4.5 and in Subsection 11.5.1, in the respective settings of first- and higher-order term rewriting. 10.2.16. Definition. The set FD is the smallest set of coloured λ-terms satisfying (i) x ∈ FD for every variable x, (ii) if M ∈ FD, then λx.M ∈ FD, (iii) if M, N ∈ FD, then MN ∈ FD, and


(iv) if M[x := N], N ∈ FD, then (λg x.M)N, (λr x.M)N ∈ FD. 10.2.17. Exercise. Show that (i) FD is closed under substitution, i.e. show that if M, N ∈ FD, then M [x := N ] ∈ FD for every variable x, and (ii) if M [x := N ] ∈ Ter (λc ), then M ∈ Ter (λc ) for every variable x.

10.2.18. Proposition. Ter(λc ) = FD Proof. Using Exercise 10.2.17(i), one shows by induction on M ∈ Ter(λc ) that M ∈ FD. This yields Ter(λc ) ⊆ FD. Likewise, one shows by induction on M ∈ FD that M ∈ Ter(λc ) employing Exercise 10.2.17(ii). Combining both inclusions, one has Ter(λc ) = FD. 10.2.19. Exercise. Complete the proof of the above proposition.

So finiteness of developments is equivalent to the fact that each term in FD is strongly βc-normalizing.

10.2.20. Theorem. Every M ∈ FD is strongly βc-normalizing.

Proof. We proceed by induction on M ∈ FD. The base case and the abstraction case are trivial. It remains to prove the last two induction steps with respect to Definition 10.2.16.
(iii) Suppose that M ≡ NO with N, O ∈ FD. Since green abstraction terms are not elements of FD, M is not a βc-redex. This means that every βc-reduct of M is of the form N′O′ with N →→c N′ and O →→c O′. By the induction hypothesis there are no infinite βc-reduction sequences of N or O. Hence also M must be strongly βc-normalizing.
(iv) Suppose that M ≡ (λg x.N)O with N[x := O], O ∈ FD and let σ : M ≡ M0 →c M1 →c M2 →c · · · be a βc-reduction sequence of M. We distinguish two subcases:
(a) For some i, Mi ≡ (λg x.N′)O′ and Mi+1 ≡ N′[x := O′] with N →→c N′ and O →→c O′. Then N[x := O] is strongly βc-normalizing by the induction hypothesis and N[x := O] →→c N′[x := O′]. Hence σ must be finite.
(b) All terms in σ are of the form (λg x.N′)O′ with N →→c N′ and O →→c O′. Since N[x := O] is strongly βc-normalizing by the induction hypothesis, N is so. Moreover, O is strongly βc-normalizing by the induction hypothesis. So σ must be finite.
The remaining case, M ≡ (λr x.N)O, is proved as in (b).

We will now prove that all complete developments of a term M with respect to some sets G, R ⊆ RO(M) terminate with the same result. In terms of →c this means that βc-normal forms are unique.

10.2.21. Proposition. The reduction relation →c is confluent.


Proof. Since →c is strongly normalizing, it suffices by Newman's Lemma, Theorem 1.2.1, to prove that →c is weakly confluent. Thus suppose that M1 ←c M →c M2, in order to construct an M3 such that M1 →→c M3 ←←c M2. For i = 1, 2, let Pi ≡ (λg x.Ni)Oi be the βc-redexes contracted and Pi′ their respective contracta. One has to check all the possible relative positions of P1 and P2 in M. E.g. case P1 ⊆ P2: this case splits into the following two subcases.
(i) P1 ⊆ N2: Then M ≡ C[(λg x.C′[P1])O2], M1 ≡ C[(λg x.C′[P1′])O2] and M2 ≡ C[C′[P1][x := O2]] for some C, C′ ∈ Ctx(λc). Now, since P1[x := O2] →c P1′[x := O2], it follows that M1 →c M3 ≡ C[C′[P1′][x := O2]] ←c M2.
(ii) P1 ⊆ O2: Then M ≡ C[(λg x.N2)C′[P1]], M1 ≡ C[(λg x.N2)C′[P1′]] and M2 ≡ C[N2[x := C′[P1]]] for some C, C′ ∈ Ctx(λc). Thus, since N2[x := C′[P1]] →→c N2[x := C′[P1′]], we can take M3 ≡ C[N2[x := C′[P1′]]] and have M1 →→c M3 ←←c M2.

10.2.22. Definition. Let σ : M →→β N and R ⊆ RO(M). Define the set of descendants of R in N relative to σ, denoted by R/σ, by induction on the length of σ as follows.
(i) If σ is the empty reduction sequence, then R/σ = R.
(ii) Suppose σ : M →→β N′ ≡ C[(λx.O)P] →β C[O[x := P]] ≡ N and let σ′ : M →→β N′ be σ minus the last reduction step. Colour in N′ the redex occurrence ⟨(λx.O)P | C⟩ green and the (remaining) redex occurrences from R/σ′ red. Then the coloured term N′ obtained in this way is of the form C′[(λg x.O′)P′] for certain unique C′ ∈ Ctx(λc) and O′, P′ ∈ Ter(λc) such that |C′| ≡ C, |O′| ≡ O and |P′| ≡ P. Now let R/σ be the unique set R′ ⊆ RO(N) such that N∅,R′ ≡ C′[O′[x := P′]].
If R is a singleton, say R = {r}, we also write r/σ instead of {r}/σ, and if σ consists of a single reduction step in which e.g. the redex occurrence r is contracted, we write R/r instead of R/σ.

10.2.23. Example. Let P1, P2 be two β-redexes, let M ≡ (λxy.x)P1P2 and N1 ≡ (λy.P1)P2, and let P1′ be the contractum of P1.
Furthermore, let R ⊆ RO(M) be the set {⟨P1 | (λxy.x)□P2⟩, ⟨P2 | (λxy.x)P1□⟩}, σ1 : M →β N1, σ2 : M →β N1 →β P1 and σ3 : M →β N1 →β P1 →β P1′. Then
(i) R/σ1 = {⟨P1 | (λy.□)P2⟩, ⟨P2 | (λy.P1)□⟩},
(ii) R/σ2 = {⟨P1 | □⟩}, and
(iii) R/σ3 = ∅.

10.2.24. Exercise. Let σ : M →→β N be a development of M with respect to some sets G, R ⊆ RO(M), obtained from the βc-reduction sequence σ′ : M′ →→c N′ by erasing the colours. Show that for all G′, R′ ⊆ RO(N), if NG′,R′ ≡ N′, then
(i) R/σ = R′ − G′, and
(ii) R/σ = R′ provided σ is complete.


10.2.25. Theorem (Finite Developments Theorem). Let M ∈ Ter(λ) and G, R, R′ ⊆ RO(M). (i) Every development of M with respect to G, R is finite and can be extended to a complete development of M with respect to G, R. (ii) If σ, σ ′ are complete developments of M with respect to G, R and G, R′ , respectively, then σ, σ ′ end with the same term. (iii) If σ, σ ′ are complete developments of M with respect to G, R, then R/σ = R/σ ′ . 10.2.26. Exercise. Prove the above theorem.

10.2.27. Notation. In the context of complete developments, we shall in the sequel omit the mention of any red redex occurrences. That is, we shall speak of complete developments of some term M with respect to some G ⊆ G

RO(M) rather, and denote any such development simply by M −→ → N. Two developments of that kind may proceed differently; their respective end terms, however, coincide by the Finite Developments Theorem 10.2.25(ii). Moreover, for any R ⊆ RO(M) and any complete development σ of M with R/G

respect to G, R, we let N −−−→ → O denote any complete development of N with respect to R/σ. Note that this notation is licenced by the Finite Developments Theorem 10.2.25(iii). If G is a singleton, say G = {r}, we also r

R/r

G

write M → N instead of M −→ → N. Likewise, we write N −−−→ → O. If r = hN | Ci and we may assume that no confusion can arise as to which R/N

occurrence of N in M is contracted, we even write N −−−→ → O. 10.2.28. Exercise. Confluence of →β can also be obtained as a corollary to the Finite Developments Theorem 10.2.25 in the following way: define a binary relation G

−→ ⋄ on Ter (λ) by M → N if and only if for some G ⊆ RO(M ), M −→ → N . Now apply Exercise 1.3.3. That is, prove (i) that −→ ⋄ is subcommutative, and (ii) that −→ −→ ⋄ = → →β .
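The relation of Exercise 10.2.28 contracts a whole set of redex occurrences at once. For the special case G = RO(M), a complete development (the so-called Gross-Knuth step) can be sketched in executable form. The following Python fragment is an illustration of ours, not part of the text: terms are nested tuples, and we assume all bound variables are pairwise distinct and disjoint from the free variables, so that naive substitution is capture-free.

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', fun, arg)
def subst(t, x, s):
    """Substitute s for the free variable x in t (naive: assumes the
    variable convention above, so no capture can occur)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def full_development(t):
    """Contract every redex occurrence of t simultaneously:
    a complete development of RO(t), the Gross-Knuth step."""
    tag = t[0]
    if tag == 'var':
        return t
    if tag == 'lam':
        return ('lam', t[1], full_development(t[2]))
    f, a = t[1], t[2]
    if f[0] == 'lam':                 # (lam x. b) a is a redex: contract it
        return subst(full_development(f[2]), f[1], full_development(a))
    return ('app', full_development(f), full_development(a))

I = ('lam', 'y', ('var', 'y'))
M = ('app', ('lam', 'x', ('app', ('var', 'x'), ('var', 'x'))), I)
# One Gross-Knuth step contracts the only redex of (lam x. x x) I;
# the redex *created* by the step is not contracted in the same step.
print(full_development(M))
# -> ('app', ('lam', 'y', ('var', 'y')), ('lam', 'y', ('var', 'y')))
print(full_development(full_development(M)))
# -> ('lam', 'y', ('var', 'y'))
```

A second step then contracts the newly created redex, which is exactly the behaviour that makes the induced relation subcommutative.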

The concepts of marking and marked reduction in the λβ-calculus generalize directly to the λβη-calculus and can be used in a similar way to show that the Finite Developments Theorem holds for the λβη-calculus as well (see e.g. Barendregt, Bergstra, Klop and Volken [1976]). 10.2.3. Parallel moves In this section, we will show that there exists a strategy for constructing a common reduct of the end terms of two co-initial finite reduction sequences.


This strategy consists of filling up a diagram, viz. by stepwise adjoining elementary diagrams whose right and lower sides are determined by sets of descendants relative to the upper and left sides. (See also Section 14.1 about reduction diagrams.)

10.2.29. Lemma. Let M ∈ Ter(λ), M →r N0 and M →→G N1. Then there exists N2 ∈ Ter(λ) such that

        M  ───r───→ N0
     G ││           ││ G/r
        ↓↓          ↓↓
        N1 ──r/G──→→ N2

i.e. N0 →→G/r N2 and N1 →→r/G N2.

Proof. Observe that both M →r N0 and M →→G N1 may be regarded as (partial) developments of M with respect to G ∪ {r}. Hence, by the Finite Developments Theorem 10.2.25, they can be extended to complete developments of M with respect to G ∪ {r} such that the end terms of these extensions coincide. Clearly, the common end term N2 results from a complete development of N0 with respect to (G ∪ {r})/r and a complete development of N1 with respect to (G ∪ {r})/G. Now note that (G ∪ {r})/r = G/r and (G ∪ {r})/G = r/G.

10.2.30. Exercise. Prove that the set of descendants is determined by the descendants of a single redex occurrence relative to a single reduction step. That is, let σ : M →→β N →r L and σ′ : M →→β N be σ minus the last reduction step. Then for every R ⊆ RO(M),
(i) R/σ = (R/σ′)/r, and
(ii) R/σ = ⋃_{r∈R} (r/σ).

10.2.31. Corollary (Parallel Moves Lemma). Let M →r N and σ : M ≡ M0 →r0 M1 →r1 ··· →rn−1 Mn. Let, moreover, for 0 ≤ i ≤ n, σi be the initial part of σ of length i. Then the diagram of parallel moves (see Figure 10.1) commutes.

10.2.32. Exercise. Prove the Parallel Moves Lemma employing Lemma 10.2.29 and Exercise 10.2.30(i).


    σ :    M ≡ M0 ──→ ··· ──→ Mi ───ri───→ Mi+1 ──→ ··· ──→ Mn
           │r           ││r/σi         ││r/σi+1          ││r/σ
           ↓            ↓↓             ↓↓                ↓↓
    σ/r :  N ≡ N0 ─→→ ··· ─→→ Ni ──ri/(r/σi)──→→ Ni+1 ─→→ ··· ─→→ Nn

    Figure 10.1: Diagram of parallel moves

10.2.33. Notation. Given σ : M ≡ M0 →r0 M1 →r1 ··· →rn−1 Mn and M →r N, we shall henceforth denote the lower side of a diagram of parallel moves as in Figure 10.1, i.e. any reduction sequence of the form

    N ≡ N0 →→r0/r N1 →→ ··· →→ Ni →→ri/(r/σi) Ni+1 →→ ··· →→rn−1/(r/σn−1) Nn

by σ/r. We call it the projection of σ over →r. If r = ⟨N | C⟩ and we may assume that no confusion can arise as to which occurrence of N in M is contracted, we write σ/N.

10.2.34. Example. Let ω ≡ λx.xx, L ≡ ωM, M ≡ ωN and N ≡ (λx.x)O. Then we have reductions and projections as indicated in Figure 10.2.

    N :              L →N ω(ωO)
    N/L :            MM →N ωOM →N ωO(ωO)
    (N/L)/M :        NNM →N ONM →N OOM →N OO(ωO)
    ((N/L)/M)/N :    ONM →N OOM →N OO(ωO)

    Figure 10.2: Projections (the vertical steps connecting successive rows
    are L : L → MM, M : MM → NNM and N : NNM → ONM)

10.2.35. Exercise. Let σ : M →→G M′ and M →r N. Show that σ/r is a complete development of N with respect to G/r.

The Parallel Moves Lemma does not generalize directly to the λβη-calculus, but carries over for a modified concept of descendants (see Klop [1980] and Bethke, Klop and de Vrijer [2000]).

10.2.4. Standardization and normalization

The Standardization Theorem is a useful result stating that if M →→β N, then there is a reduction sequence from M to N which is 'standard' in the sense that contractions are made from left to right, possibly with some jumps in between. One of its consequences is that normal forms, if they exist, can be found by leftmost reduction sequences. We will present a proof due to Klop [1980]. Other proofs can be found in e.g. Curry and Feys [1958] and Mitschke [1979].

10.2.36. Definition. Let C[N] ∈ Ter(λ). L(N, C) ⊆ RO(C[N]) is the set of redex occurrences from RO(C[N]) which properly contain the occurrence of N or lie entirely to the left of it in C[N]. More formally:
(i) L(N, □) = ∅,


(ii) L(N, MC) = {⟨N′ | C′(C[N])⟩ | ⟨N′ | C′⟩ ∈ RO(M)} ∪ {⟨N′ | MC′⟩ | ⟨N′ | C′⟩ ∈ L(N, C)} if M is not an abstraction term,
(iii) L(N, MC) = {⟨M(C[N]) | □⟩} ∪ {⟨N′ | C′(C[N])⟩ | ⟨N′ | C′⟩ ∈ RO(M)} ∪ {⟨N′ | MC′⟩ | ⟨N′ | C′⟩ ∈ L(N, C)} if M is an abstraction term,
(iv) L(N, CM) = {⟨N′ | C′M⟩ | ⟨N′ | C′⟩ ∈ L(N, C)} if C is not of the form λx.C′ for some C′, x,
(v) L(N, CM) = {⟨(C[N])M | □⟩} ∪ {⟨N′ | C′M⟩ | ⟨N′ | C′⟩ ∈ L(N, C)} if C ≡ λx.C′ for some C′, x,
(vi) L(N, λx.C) = {⟨N′ | λx.C′⟩ | ⟨N′ | C′⟩ ∈ L(N, C)}.

10.2.37. Exercise. Let r, r′ ∈ RO(M). Show that either r = r′, r ∈ L(r′) or r′ ∈ L(r).

10.2.38. Definition. Let σ : M0 →r0 M1 →r1 ··· be a finite or infinite reduction sequence and, for i, j ∈ {0, 1, . . .} with i < j, let σi,j be the subsequence of σ from Mi to Mj. Then σ is called a standard reduction sequence if for each i and any j > i, rj is not a descendant of a redex in Mi that is to the left of ri, relative to the given reduction sequence from Mi to Mj. That is, σ is a standard reduction sequence if ∀i, j (i < j ⇒ rj ∉ L(ri)/σi,j).

10.2.39. Example. Consider the following reduction sequences.


(i) λx.(λy.(λz.z)yy)u →β λx.(λy.yy)u →β λx.uu,
(ii) λx.(λy.(λz.z)yy)u →β λx.(λz.z)uu →β λx.uu.
The two reduction sequences start, and end, with the same term. The first one is not standard: the occurrence of the redex contracted in the second reduction step, ⟨(λy.yy)u | λx.□⟩, is a descendant of ⟨(λy.(λz.z)yy)u | λx.□⟩ ∈ L((λz.z)y, λx.(λy.□y)u) relative to the first reduction step; the second reduction sequence is standard.

10.2.40. Definition. Let σ : M0 →r0 M1 →r1 ··· be a finite or infinite reduction sequence and for i ∈ {0, 1, . . .} let σi be the initial subsequence of σ of length i.

(i) r ∈ RO(M) is contracted in σ if for some i, ri ∈ r/σi. The set of contracted redex occurrences from RO(M) is denoted by CO(σ).
(ii) If σ is non-empty, then
- lmc(σ) is the leftmost redex in M that is contracted in σ, i.e. lmc(σ) is the unique r ∈ CO(σ) with L(r) ∩ CO(σ) = ∅;
- π(σ) = σ/lmc(σ).

10.2.41. Exercise. Let σ0 : M0 →⁺β M1, σ1 : M1 →⁺β M2 and σ be the concatenation of σ0 and σ1. Prove that
(i) if lmc(σ) = lmc(σ0), then lmc(σ)/σ0 = ∅,
(ii) if lmc(σ) ≠ lmc(σ0), then lmc(σ)/σ0 = {lmc(σ1)},
(iii) lmc(σ)/σ = ∅.

10.2.42. Lemma. If σ : M →⁺β M′ and M →lmc(σ) N, then π(σ) : N →→β M′.

Proof. Immediately by the Parallel Moves Lemma 10.2.31, since the set of descendants of lmc(σ) in M′ relative to σ is empty.

10.2.43. Definition. Let σ : M →→β M′ be a reduction sequence. Put π⁰(σ) = σ and, for n ∈ N, πⁿ⁺¹(σ) = π(πⁿ(σ)). The standardization procedure for σ is the reduction sequence

    σs : M ≡ M0 →lmc(σ) M1 →lmc(π(σ)) M2 →lmc(π²(σ)) M3 →lmc(π³(σ)) ···

(see the figure below). It stops at Mn if n is the least natural number such that πⁿ(σ) is the empty reduction sequence.

    σ :      M0 ≡ M ──→→β── M′
             │lmc(σ)
    π(σ) :   M1 ──→→β── M′
             │lmc(π(σ))
    π²(σ) :  M2 ──→→β── M′
             │lmc(π²(σ))
    π³(σ) :  M3 ──→→β── M′
             ⋮

The first thing we want to show now is that σs is finite.

10.2.44. Lemma. Let σ : M →→G M′. Then σs is finite.

Proof. By Exercise 10.2.41, every πⁿ(σ) is a complete development of Mn with respect to (. . . (G/lmc(σ)) . . .)/lmc(πⁿ⁻¹(σ)). Hence every redex contracted in σs is a descendant of some redex in G relative to an initial subsequence of σs. Thus σs is a development of M with respect to G and hence is finite by the Finite Developments Theorem 10.2.25.

10.2.45. Exercise. Let σ : M →r N, σ′ : N →→β M′ and let ρ be the concatenation of σ and σ′. Define, as depicted below, R0 = {r}, Rn+1 = Rn/lmc(πⁿ(ρ)) and R′n = lmc(πⁿ(ρ))/Rn.

    ρ :      M0 ≡ M ───r───→ N0 ≡ N ──→→β── M′
             │lmc(ρ)          ││R′0
             ↓                ↓↓
    π(ρ) :   M1 ──R1──→→ N1 ──→→β── M′
             │lmc(π(ρ))       ││R′1
             ↓                ↓↓
    π²(ρ) :  M2 ──R2──→→ N2 ──→→β── M′
             ⋮

Employ Exercise 10.2.41 in order to show that for all n ∈ N, either
(i) R′n = ∅, and then lmc(πⁿ(ρ)) = lmc(πⁿ(σ)), or
(ii) R′n = {lmc(πⁿ(σ′))}.

10.2.46. Proposition. Let ρ : M →→β M′. Then ρs is finite.

Proof. We prove the proposition by induction on the length of ρ. The base case is trivial. Suppose ρ has length n + 1. Then ρ is the concatenation of some reduction step σ : M →r N and a reduction sequence σ′ : N →→β M′ of length n. By the induction hypothesis, (σ′)s is finite and thus πᵏ(σ′) is the empty reduction sequence for some k ∈ N. Assuming its notation, it follows from Exercise 10.2.45 that for all m ≥ k, R′m = ∅ and hence lmc(πᵐ(ρ)) = lmc(πᵐ(σ)). As σ is a complete development, it follows moreover from Lemma 10.2.44 that σs is finite. Thus for some l ≥ k, πˡ(σ) is the empty reduction sequence. So πˡ(ρ) must be empty, i.e. ρs is finite.

Now we will show that the standardization procedure indeed yields a standard reduction sequence.

10.2.47. Exercise. Let σ : M →→β M′, M →lmc(σ) M1 and R ⊆ L(lmc(σ)). Show that R/lmc(σ) ⊆ L(lmc(π(σ))).

10.2.48. Theorem (Standardization Theorem). Let σ be a finite β-reduction sequence. Then σs is a standard reduction sequence for σ, i.e.
(i) σ and σs have the same first, and last, term, and
(ii) σs is a standard reduction sequence.

Proof. (i) By its very definition, σ and σs start with the same term. Moreover, since σs is finite, πⁿ(σ) must be empty for some n ∈ N. Hence Mn coincides with the last term of σ.


Part (ii) follows from Exercise 10.2.47, since the descendants of any redex to the left of some lmc(πⁱ(σ)) stay to the left of any lmc(πʲ(σ)) with i < j (relative to the reduction sequence from Mi to Mj).

One of the uses of the Standardization Theorem 10.2.48 is in proving statements of the form 'M does not reduce to N' or 'every reduct of M has property P'.

10.2.49. Exercise. Let N ≡ λxyz.z(xxy). Show that if NNy →→β M, then y ∈ FV(M).

But it has many others, for example it is needed in the proof of the so-called Normalization Theorem.

10.2.50. Definition. (i) r ∈ RO(M) is the leftmost redex occurrence in M iff L(r) = ∅.
(ii) Let σ : M0 →r0 M1 →r1 ··· be a finite or infinite reduction sequence. Then σ is called a leftmost reduction sequence if for each i, ri is the leftmost redex occurrence in Mi.

10.2.51. Theorem (Normalization Theorem). All leftmost reduction sequences are normalizing. That is, for each M ∈ Ter(λ), if M reduces to the normal form M′, then there is a leftmost reduction sequence σ : M →→β M′.

10.2.52. Exercise. (i) Let σ : M →→β N where N is a normal form. Show that σs is a leftmost reduction sequence.
(ii) Prove Theorem 10.2.51.
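A minimal interpreter makes the content of the Normalization Theorem tangible: the leftmost strategy below reaches the normal form of (λx.y)Ω, although the subterm Ω ≡ (λx.xx)(λx.xx) itself has no normal form and any strategy contracting inside the argument first would loop. This is an illustrative sketch of ours, not part of the text, using the same nested-tuple term representation and naive, convention-dependent substitution as before.

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', fun, arg)
def subst(t, x, s):  # naive substitution; assumes all bound variables distinct
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def leftmost_step(t):
    """Contract the leftmost redex occurrence; return None for a normal form."""
    tag = t[0]
    if tag == 'var':
        return None
    if tag == 'lam':
        b = leftmost_step(t[2])
        return None if b is None else ('lam', t[1], b)
    f, a = t[1], t[2]
    if f[0] == 'lam':          # the head redex properly contains (hence
        return subst(f[2], f[1], a)   # precedes) every redex inside it
    r = leftmost_step(f)
    if r is not None:
        return ('app', r, a)
    r = leftmost_step(a)
    return None if r is None else ('app', f, r)

def normalize(t, fuel=1000):
    while fuel:
        s = leftmost_step(t)
        if s is None:
            return t
        t, fuel = s, fuel - 1
    raise RuntimeError('no normal form found within the step budget')

omega = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
Omega = ('app', omega, omega)        # no normal form
K_y = ('lam', 'x', ('var', 'y'))     # lam x. y, with x not free in the body
print(normalize(('app', K_y, Omega)))   # -> ('var', 'y')
```

The single leftmost step erases the diverging argument, exactly as the theorem promises for any term possessing a normal form.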

The Standardization and Normalization Theorems hold also for the λβη-calculus, on the understanding that the concept of 'descendant' has to be modified in such a way that the Parallel Moves Lemma holds (see Klop [1980]).

By a method very similar to the proof of the Standardization Theorem 10.2.48, one can also prove the completeness of inside-out reductions, as it is called in Welch [1975] and Lévy [1976]. Here the definition of an inside-out reduction sequence (not to be confused with innermost reduction) is analogous to Definition 10.2.36(ii) but with 'to the left of' replaced by 'proper subterm of'.

10.2.53. Definition. (i) Let C[N] ∈ Ter(λ). S(N, C) ⊆ RO(C[N]) is the set of redex occurrences from RO(C[N]) which are properly contained in the occurrence of N. More formally:
(1) S(N, □) = RO(N) − {⟨N | □⟩},
(2) S(N, MC) = {⟨N′ | MC′⟩ | ⟨N′ | C′⟩ ∈ S(N, C)},
(3) S(N, CM) = {⟨N′ | C′M⟩ | ⟨N′ | C′⟩ ∈ S(N, C)},
(4) S(N, λx.C) = {⟨N′ | λx.C′⟩ | ⟨N′ | C′⟩ ∈ S(N, C)}.

(ii) Let σ : M0 →r0 M1 →r1 ··· be a finite or infinite reduction sequence and, for i, j ∈ {0, 1, . . .} with i < j, let σi,j be the subsequence of σ from Mi to Mj. Then σ is called an inside-out reduction sequence if for each i and any j > i, rj ∉ S(ri)/σi,j.

10.2.54. Examples. Consider the following reduction sequences. (i) (λx.xx)((λz.y)u) →β ((λz.y)u)((λz.y)u) →β y((λz.y)u) →β yy, (ii) (λx.xx)((λz.y)u) →β (λx.xx)y →β yy. The first reduction sequence is not inside-out, the second is.

10.2.55. Definition. Let σ be a finite or infinite reduction sequence of M. r ∈ CO(σ) is an innermost contracted redex iff S(r) ∩ CO(σ) = ∅. imc(σ) ⊆ RO(M) is the set of innermost contracted redex occurrences in M.

We now define, analogously to the definition of the standardization procedure for a reduction sequence σ, inside-out reduction sequences for σ by repeated contraction of an innermost contracted redex instead of the leftmost contracted redex.

10.2.56. Definition. Let σ : M →→β M′ be a reduction sequence. An inside-out reduction sequence for σ is a reduction sequence M ≡ M0 →r0 M1 →r1 ··· with ri ∈ imc(π̃ⁱ(σ)), where π̃⁰(σ) = σ and, for n ∈ N, π̃ⁿ⁺¹(σ) = π̃ⁿ(σ)/rn. It stops at Mn if n is the least natural number such that π̃ⁿ(σ) is the empty reduction sequence. We let IO(σ) be the set of inside-out reduction sequences for σ.

The proof that every ρ ∈ IO(σ) terminates and is indeed an inside-out reduction sequence is entirely analogous to the corresponding proofs for σs. Now let

    σ :  M ──→→β── N
         ││β          ││β
         ↓↓           ↓↓
         L ────────── L

be the reduction diagram corresponding to the construction of some M →→β L ∈ IO(σ). Then the bottom side is, as before, the empty reduction sequence; the right side, however, will in general not be empty. So we have

10.2.57. Proposition. Let σ : M →→β N be a reduction sequence. Then
(i) IO(σ) ≠ ∅, and
(ii) for all ρ ∈ IO(σ), if ρ : M →→β L, then N →→β L.

10.2.58. Exercise. Prove the above proposition along the aforementioned lines.


10.3. Lambda definability

In λ-calculus, one can define several data types, amongst which are the Booleans, the natural numbers and numeric functions.

10.3.1. Definition (Booleans).
(i) true ≡ λxy.x, false ≡ λxy.y,
(ii) not ≡ λx.x false true, and ≡ λxy.xy false.

10.3.2. Example. Observe that negation and conjunction behave appropriately on the truth values, i.e.
(i) not true →β true false true →→β false,
(ii) not false →β false false true →→β true,
(iii) and false x →→β false x false →→β false, and
(iv) and true x →→β true x false →→β x.
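Because λ-terms denote functions, Definition 10.3.1 can be transcribed into any language with first-class functions. The following Python sketch is ours, not part of the text; Python's eager evaluation is harmless here since the Boolean combinators are strongly normalizing.

```python
TRUE  = lambda x: lambda y: x            # true  = lam x y. x
FALSE = lambda x: lambda y: y            # false = lam x y. y
NOT   = lambda x: x(FALSE)(TRUE)         # not   = lam x. x false true
AND   = lambda x: lambda y: x(y)(FALSE)  # and   = lam x y. x y false

def as_bool(b):
    """Decode a Church Boolean by applying it to Python's True/False."""
    return b(True)(False)

print(as_bool(NOT(TRUE)))          # -> False
print(as_bool(AND(TRUE)(TRUE)))    # -> True
print(as_bool(AND(TRUE)(FALSE)))   # -> False
```

The reductions of Example 10.3.2 become ordinary function applications: NOT(TRUE) evaluates to TRUE(FALSE)(TRUE), which selects FALSE.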

10.3.3. Exercise. Define the conditional combinator con ∈ Ter (λ) by con ≡ λxyz.xyz. (i) Show that con true xy → →β x and con false xy → →β y. (ii) Construct a combinator iff ∈ Ter (λ) such that iff true x → →β x and iff false x → →β not x.

There are several ways to define the numerals within λ-calculus. The following definition is due to Church.

10.3.4. Definition (Church numerals).
(i) For M, N ∈ Ter(λ) and n ∈ N, define MⁿN inductively by M⁰N ≡ N and Mⁿ⁺¹N ≡ M(MⁿN).
(ii) The Church numerals ⌜0⌝, ⌜1⌝, ⌜2⌝, . . . are defined by ⌜n⌝ ≡ λxy.xⁿy.

10.3.5. Proposition. Define add, times ∈ Ter(λ) by add ≡ λxyuv.xu(yuv) and times ≡ λxyz.x(yz). Then one has for all n, m ∈ N,
(i) add ⌜n⌝ ⌜m⌝ →→β ⌜n + m⌝, and
(ii) times ⌜n⌝ ⌜m⌝ →→β ⌜n × m⌝.

10.3.6. Exercise. (i) Prove that for n, m ∈ N, (⌜n⌝x)ᵐy →→β x^{n×m}y.
(ii) Prove the above proposition.
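The numerals and the combinators add and times of Proposition 10.3.5 transcribe just as directly. In this sketch (the Python names are ours) a numeral is decoded by applying it to the successor function on machine integers.

```python
def church(n):
    """The Church numeral of n, i.e. lam x y. x^n y, as a Python function."""
    return lambda x: lambda y: y if n == 0 else x(church(n - 1)(x)(y))

ADD   = lambda m: lambda n: lambda u: lambda v: m(u)(n(u)(v))  # lam x y u v. x u (y u v)
TIMES = lambda m: lambda n: lambda z: m(n(z))                  # lam x y z. x (y z)

def to_int(c):
    """Decode a Church numeral: apply it to successor and zero."""
    return c(lambda k: k + 1)(0)

print(to_int(ADD(church(2))(church(3))))    # -> 5
print(to_int(TIMES(church(2))(church(3))))  # -> 6
```

Decoding add |2| |3| applies the successor 2 + 3 times to 0, mirroring clause (i) of the proposition.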


This is further evidence that λ-calculus is not the trivial notation game that it might seem to be. In fact, λ-calculus is deep enough to give an alternative definition of the concept of computable function. For an extensive survey of the representation of recursive functions inside λ-calculus, the reader is referred to e.g. Barendregt [1984].

There are of course limits to what is definable in λ-calculus. For example, Berry [1978a] showed that the computation of a λ-definable function is of a sequential nature rather than of a parallel one.

10.3.7. Exercise. Show that there is no combinator δ ∈ Ter(λ) such that for all M, N ∈ Ter(λ), we have δMN =β true if M =β N, and δMN =β false otherwise.

10.3.8. Exercise (Selinger [1996]). A function fm : Ter(λ)³ → Ter(λ) such that fm(N, N, M) = M and fm(M, N, N) = M for all M, N ∈ Ter(λ) is called a Mal'cev operator. Mal'cev operators are not definable within λ-calculus. (This is Exercise 16.5.8 in Barendregt [1984].) Show this non-definability result by proving the stronger fact that adding a combinator m to λ-calculus such that for all M, N ∈ Ter(λ),
(i) mNNM =β M,
(ii) mMNN =β M
is inconsistent (i.e. all terms are then β-convertible).

10.4. Extensions of λ-calculus

In this section, we shall study the effect of extensions of λ-calculus on confluence.

10.4.1. Definition. Let Γ be a set of constants not containing □.
(i) The set of λΓ-contexts, notation Ctx(λΓ), is the set Ter(λ(Γ ∪ {□})).
(ii) Define the reduction relation →β,Γ ⊆ Ter(λΓ) × Ter(λΓ) by →β,Γ = {(C[(λx.M)N], C[M[x := N]]) | M, N ∈ Ter(λΓ), C ∈ Ctx(λΓ)}.
(iii) Let R ⊆ Ter(λΓ) × Ter(λΓ). The extension of λ-calculus with Γ and R, notation λ ∪ R, is the ARS (Ter(λΓ), →β,Γ ∪ R).

10.4.2. Exercise (Mitschke: cf. Theorem 15.3.3 in Barendregt [1984]). Extend λ-calculus by a constant D and rules

    ∆ :  DZ1 . . . Zn → M1   if P1(Z1, . . . , Zn)
             ⋮
         DZ1 . . . Zn → Mk   if Pk(Z1, . . . , Zn)


Here the Mi are closed λD-terms, the Zi are meta-variables, and the Pi are n-ary predicates on Ter(λD). Prove that λ ∪ ∆ is CR provided that the predicates Pi are
(i) pairwise disjoint,
(ii) closed under β,D ∪ ∆-reduction, i.e. Pi(Z1, . . . , Zn) and Z1 · · · Zn →β,D∪∆ Z1′ · · · Zn′ ⇒ Pi(Z1′, . . . , Zn′),
(iii) closed under substitution, i.e. Pi(Z1, . . . , Zn) ⇒ Pi(Z1^σ, . . . , Zn^σ).

10.4.3. Exercise (Jacopini [1975]). Let M, N be λ-terms. The equation M = N is consistent with λ-calculus if λ + M = N is consistent, that is, not every equation P = Q is derivable from λ + M = N. Here we employ the usual notion of derivability; more explicitly, we say that the equation P = Q is derivable in λ-calculus extended by the equation M = N, notation λ + M = N ⊢ P = Q, if either

P ≡ M and Q ≡ N , or P =β Q, or P ≡ C[P ′ ], Q ≡ C[Q′ ] and λ + M = N ⊢ P ′ = Q′ , or λ + M = N ⊢ Q = P , or λ + M = N ⊢ P = L, λ + M = N ⊢ L = Q for some λ-term L.

Note that λ + M = N is consistent iff λ + M = N ⊬ S = K. A λ-term M is called easy if it can consistently be equated to an arbitrary λ-term N.
(i) Prove that Ω3 (≡ (λx.xxx)(λx.xxx)) is not easy.
(ii) Prove that Ω (≡ (λx.xx)(λx.xx)) is easy.

10.4.4. Exercise. (i) Let M, N be λ-terms. Prove M easy ⇒ MN easy.
(ii) Use (i) to prove that Y Ω is easy.

10.4.5. Exercise (Berarducci and Intrigila [1993]). Define:
(i) A zero term is a closed λ-term that is not convertible (or, equivalently, not reducible) to an abstraction term λx.P.
(ii) C ⊆ Ter⁰(λ) is a confining class for M ∈ Ter(λ) if
(a) X ∈ C ⇒ X is a zero term,
(b) C is closed under β-reduction,
(c) ('the critical pair condition') C[ ] ≢ □ and X ∈ C and C[X] ∈ C ⇒ C[M] ∈ C.
Now:
(i) Prove the following theorem. Let C be a confining class for M. Let →µ be the reduction relation generated by B → M for every B ∈ C. Then λ ∪ →µ is CR.
(ii) Prove the corollary: if there is a confining class C for M, then λ + M = N is consistent for every N ∈ C.
(iii) Using the corollary, prove that Ω, Ω3 I and Y Ω are easy.


Clearly, not every extension of λ-calculus is confluent. The absence of common reducts, however, is only due to either R or possible interferences of →β,Γ and R. For, as is easy to see, additional constants do not spoil the CR property for β-reduction, i.e. for every set Γ of constants, (Ter(λΓ), →β,Γ) is confluent.

Let (Σ, R) be an applicative TRS, i.e. let Σ consist of a binary application operator · and a set Γ of constants, and let R be a set of reduction rules. If we, as is usual, omit · when writing terms over Σ, we may regard the set of applicative terms over Σ, Ter(Σ), as a proper subset of Ter(λΓ), and in doing so we can consider extensions of λ-calculus with reduction relations generated by the reduction rules in R.

10.4.6. Definition. Let R = ({·} ∪ Γ, R) be an applicative TRS.
(i) A substitution is a mapping σ : Ter(λΓ) → Ter(λΓ) such that
- σ(c) = c for each c ∈ Γ,
- σ(MN) = σ(M)σ(N), and
- σ(λx.M) = λx.σ(M).
We also write M^σ instead of σ(M) and assume substitution to be safe as in Remark 10.1.9.
(ii) Define the reduction relation →R,λ ⊆ Ter(λΓ) × Ter(λΓ) by →R,λ = {(C[l^σ], C[r^σ]) | l → r ∈ R, C ∈ Ctx(λΓ), σ is a substitution}.
(iii) The extension of λ-calculus by R, denoted by λ ∪ R, is the ARS λ ∪ →R,λ.

10.4.7. Exercise. Let (Σ, R) be an applicative TRS. Show that (Σ, R), considered as an ARS, and the λβ-calculus are sub-ARSs of λ ∪ R.

10.4.8. Examples.
(i) λ ∪ {J0xy → y, J(Sz)xy → x(Jzxy)}, λ-calculus with iterator,
(ii) λ ∪ {Rxy0 → x, Rxy(Sz) → yz(Rxyz)}, λ-calculus with recursor,
(iii) λ ∪ {P0(Pxy) → x, P1(Pxy) → y}, λ-calculus with pairing,
(iv) λ ∪ {P0(Pxy) → x, P1(Pxy) → y, P(P0x)(P1x) → x}, λ-calculus with surjective pairing,
(v) λ ∪ {Dxx → E}, λ-calculus with test for syntactic equality.
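For comparison with these extensions, note that plain pairing is already λ-definable: with the Church encoding P ≡ λxyz.zxy, P0 ≡ λp.p(λxy.x), P1 ≡ λp.p(λxy.y), the first two rules of (iii) hold up to β-conversion, whereas surjective pairing (iv) is known not to be λ-definable and is thus a genuine extension. A Python sketch of the definable part (names and representation ours):

```python
PAIR = lambda x: lambda y: lambda z: z(x)(y)   # P  = lam x y z. z x y
FST  = lambda p: p(lambda x: lambda y: x)      # P0 = lam p. p (lam x y. x)
SND  = lambda p: p(lambda x: lambda y: y)      # P1 = lam p. p (lam x y. y)

p = PAIR('a')('b')
print(FST(p), SND(p))   # -> a b
```

The projections behave as the rules P0(Pxy) → x and P1(Pxy) → y prescribe; the surjectivity rule P(P0x)(P1x) → x has no counterpart for this encoding.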

As we will see in the sequel, only the extensions of λ-calculus with iterator, recursor and pairing are confluent; the other two extensions are not. This is due to possible interference of →β,Γ and →R,λ . For, as we will show below, if the TRS ({·} ∪ Γ, R) is confluent, then so is the ARS (Ter(λΓ), →R,λ ). 10.4.9. Proposition. Let ({·} ∪ Γ, R) be a confluent applicative TRS. Then (Ter(λΓ), →R,λ ) is confluent.


Proof. Choose a fresh constant L 6∈ Γ and let ι : Ter(λΓ) →Ter({·, L} ∪ Γ) be the mapping inductively defined by (i) ι(x) = x for every variable x, (ii) ι(c) = c for every constant c ∈ Γ, (iii) ι(MN) = ι(M)ι(N), and (iv) ι(λx.M) = Lxι(M). Observe that (ι(Ter(λΓ)), R′ ), where (t, t′ ) ∈ R′ iff t →R t′ and t, t′ ∈ ι(Ter(λΓ)), is a sub-ARS of ({·, L} ∪ Γ, R) (considered as an ARS). Thus, since ({·, L} ∪ Γ, R) is confluent (see Exercise 5.8.12), (ι(Ter (λΓ)), R′) is confluent too (see Exercise 1.3.21). Now confluence of (Ter(λΓ), →R,λ ) follows from the fact that for all M, N ∈ Ter(λΓ), M →R,λ N if and only if ι(M) →R′ ι(N) 10.4.10. Exercise. Complete the proof of the above proposition.

There are two syntactic conditions on confluent applicative TRSs which ensure confluence of the corresponding λ-calculus extensions. They are inspired by the following counterexamples.

10.4.11. Example. Consider the confluent TRSs R1 = ({·}, {xy → x}) and R2 = ({·, 0, 1, M, S}, {Mxx → 0, M(Sx)x → 1}).
(i) (Müller [1992]) R1 is what will be called variable-applying. This causes λ ∪ {xy → x} to be not even weakly confluent: λx.x ←R,λ (λx.x)y →β y, but λx.x and y do not have any common reduct.
(ii) (Huet [1980], Breazu-Tannen [1988]) R2 is not left-linear and, as a result of this, λ ∪ {Mxx → 0, M(Sx)x → 1} is not confluent: let Y be Turing's fixed-point combinator as introduced in the proof of Theorem 10.1.24. Recall that Y N →→β N(Y N) for every term N. Then 0 ←R,λ M(Y S)(Y S) →→β,Γ M(S(Y S))(Y S) →R,λ 1, but 0, 1 have no common reduct.

10.4.12. Definition. Let R = ({·} ∪ Γ, R) be an applicative TRS. R is called variable-applying if there is a reduction rule C[xt] → r ∈ R for some context C, variable x and term t.

Following Müller [1992], we shall show that λ-calculus extensions obtained from non-variable-applying, left-linear applicative TRSs are confluent.

10.4.13. Exercise. Let ({·} ∪ Γ, R) be a left-linear, non-variable-applying applicative TRS and let C[(λx.M)N], P ∈ Ter(λΓ) be such that C[(λx.M)N] →→R,λ P. Show that β-redexes behave like variables under →→R,λ, i.e. show that there are some context C′ ∈ Ctx(λΓ) and M1, . . . , Mn, N1, . . . , Nn ∈ Ter(λΓ) such that


(i) for 1 ≤ i ≤ n, M →→R,λ Mi and N →→R,λ Ni,
(ii) P ≡ C′[(λx.M1)N1, . . . , (λx.Mn)Nn], and
(iii) for all O, O1, . . . , On ∈ Ter(λΓ), if O →→R,λ Oi for 1 ≤ i ≤ n, then C[O] →→R,λ C′[O1, . . . , On].

10.4.14. Proposition. Let R = ({·} ∪ Γ, R) be a confluent applicative TRS. If R is left-linear and non-variable-applying, then →β,Γ commutes with →R,λ.

Proof. Suppose that O′′ ←←β,Γ O →→R,λ O′. We employ induction on the length of the reduction sequence O →→β,Γ O′′. The base case is trivial. Now suppose that O →→β,Γ Ô →β,Γ O′′. From the induction hypothesis it follows that for some P ∈ Ter(λΓ), Ô →→R,λ P ←←β,Γ O′. Since Ô ≡ C[(λx.M)N] for some context C and β-redex (λx.M)N, it follows from Exercise 10.4.13 that P ≡ C′[(λx.M1)N1, . . . , (λx.Mn)Nn] for some context C′ and M1, . . . , Mn, N1, . . . , Nn ∈ Ter(λΓ) such that for 1 ≤ i ≤ n, M →→R,λ Mi and N →→R,λ Ni. Observe that we then also have that M[x := N] →→R,λ Mi[x := Ni] for all 1 ≤ i ≤ n. Hence, using Exercise 10.4.13(iii), a common reduct of O′′ and P can be found as suggested in the diagram below.

    O ──────────────→→R,λ────────────── O′
    ││β,Γ                                 ││β,Γ
    ↓↓                                    ↓↓
    Ô ≡ C[(λx.M)N] ──→→R,λ── P ≡ C′[(λx.M1)N1, . . . , (λx.Mn)Nn]
    │β,Γ                                  ││β,Γ
    ↓                                     ↓↓
    O′′ ≡ C[M[x := N]] ──→→R,λ── C′[M1[x := N1], . . . , Mn[x := Nn]]

10.4.15. Theorem. Let R = ({·} ∪ Γ, R) be a confluent applicative TRS. If R is left-linear and not variable-applying, then λ ∪ R is confluent. Proof. By the above proposition →β,Γ commutes with →R,λ . Then, since →β,Γ and →R,λ are confluent, it follows that →β,Γ ∪ →R,λ is confluent too (see Exercise 1.3.4). 10.4.16. Corollary. The following λ-calculus extensions are confluent: (i) λ-calculus with iterator, (ii) λ-calculus with recursor, and


(iii) λ-calculus with pairing.

Proof. Observe that the added TRSs are orthogonal and hence confluent by Theorem 4.3.4. Thus, as they are also not variable-applying, the above theorem is applicable.

As a second corollary to Theorem 10.4.15 we obtain that a much larger class of TRSs qualifies for extension of λ-calculus. In fact, every so-called curried TRS (see also Subsection 3.3.5), constructed from a left-linear confluent TRS by omitting arities and treating function symbols as constants, can be added to λ-calculus without loss of confluence. This is of particular interest for the design and implementation of functional programming languages.

10.4.17. Definition. Let R = (Σ, R) be a TRS.
(i) The function cur : Ter(Σ) → Ter({·} ∪ Σ) is defined inductively as follows.
(1) cur(x) = x for every variable x, and
(2) cur(F(t1, . . . , tn)) = F cur(t1) · · · cur(tn) for every n ∈ N, every n-ary function symbol F ∈ Σ and all t1, . . . , tn ∈ Ter(Σ).
(ii) The curried TRS R^cur is the applicative TRS ({·} ∪ Σ, R^cur) where R^cur = {cur(l) → cur(r) | l → r ∈ R}.
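The translation cur simply flattens a first-order term into left-nested applications. The following sketch is ours, not part of the text: first-order terms are (symbol, argument-list) tuples, variables and constants are plain strings, and the binary application operator · is rendered by an explicit 'app' node.

```python
def cur(t):
    """cur(F(t1, ..., tn)) = F cur(t1) ... cur(tn), built as a left-nested
    chain of binary applications; cur(x) = x for a variable or constant."""
    if isinstance(t, str):          # variable or constant
        return t
    head, args = t
    result = head
    for a in args:                  # left-nested: ((F t1) t2) ... tn
        result = ('app', result, cur(a))
    return result

# A(x, S(y)) curries to A x (S y), i.e. ((A x)(S y)):
print(cur(('A', ['x', ('S', ['y'])])))
# -> ('app', ('app', 'A', 'x'), ('app', 'S', 'y'))
```

Applying cur to both sides of each rule of Example 10.4.18 yields exactly the curried rule set shown there.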

10.4.18. Example. Consider the TRS of Example 2.3.1 which specifies the natural numbers with addition, multiplication, successor and zero. Its reduction rules are

    r1:  A(x, 0)    → x
    r2:  A(x, S(y)) → S(A(x, y))
    r3:  M(x, 0)    → 0
    r4:  M(x, S(y)) → A(M(x, y), x)

Currying produces an applicative TRS with the rules

    r1:  Ax0    → x
    r2:  Ax(Sy) → S(Axy)
    r3:  Mx0    → 0
    r4:  Mx(Sy) → A(Mxy)x

Note that cur is never surjective: Ter({·} ∪ Σ) is much bigger than Ter(Σ), containing terms such as xx or, as in the example above, xA and Mxyz. This makes the proof that currying preserves confluence very complicated. Exceptionally, we will skip the preservation proof and refer the interested reader to Kahrs [1995a].

10.4.19. Corollary. Let R = (Σ, R) be a confluent left-linear TRS. Then λ ∪ R^cur is confluent.


Proof. By a result of Kahrs [1995a], Theorem 5.2, R^cur is confluent. Moreover, R^cur is left-linear and not variable-applying. Thus Theorem 10.4.15 applies.

In the remainder of this section, we shall show, in this order, that neither λ-calculus with test for syntactic equality nor λ-calculus with surjective pairing is confluent (both systems, however, do have the unique normal form property: see Klop [1980], Klop and de Vrijer [1989]). In both cases we will invoke the Standardization Theorem 10.2.48 which, as is easy to see, also holds in the case of additional constants. That is, for every set Γ, M, N ∈ Ter(λΓ) and σ : M →→β,Γ N, there is a reduction sequence σ′ : M →→β,Γ N in which the successive redex contractions take place from left to right.

10.4.20. Notation. On Ter(λ{D, E}), we shall write →β, →→β instead of →β,{D,E} and →→β,{D,E}, respectively, and denote the reduction relation generated by {Dxx → E} by →D. Moreover, we put →βD = →β ∪ →D.

10.4.21. Definition. For λ ∪ {Dxx → E}, let CD ≡ Y(λxy.Dy(xy)) and AD ≡ Y CD, where Y ≡ (λxy.y(xxy))(λxy.y(xxy)) is Turing's fixed-point combinator as introduced in the proof of Theorem 10.1.24.

10.4.22. Exercise. Apply Theorem 10.1.24 in order to prove
(i) AD →→β CD AD,
(ii) CD AD →→β DAD(CD AD).
Conclude that CD E ←←βD CD AD →→βD E.

We will now prove that the above AD is a counterexample to confluence in λ ∪ {Dxx → E}.

10.4.23. Proposition. λ ∪ {Dxx → E} has the property PP(β, D), i.e. for all M, N ∈ Ter(λ{D, E}), if M →→βD N then M →→β N′ →→D N for some N′ ∈ Ter(λ{D, E}).

Proof. According to Exercise 1.3.5(ii), it suffices to show that →β commutes with ←D. Now it is easily checked that for all M, N, N′ ∈ Ter(λ{D, E}), if N →D M →β N′ then there exists N′′ ∈ Ter(λ{D, E}) such that N →β N′′ →→D N′:

    M ──β──→ N′
    ↑          ↑↑
    │D         ││D
    N ──β──→ N′′

Hence →β and ←D are indeed commuting by Exercise 1.3.6.


10.4.24. Theorem. Lambda calculus with test for syntactic equality is not confluent.

Proof. Consider the co-initial reductions CD E ←←βD CD AD →→βD E from Exercise 10.4.22. We claim that CD E and E lack a common reduct, or equivalently (since E is a normal form), that CD E does not βD-reduce to E. For suppose that CD E →→βD E; then by the above proposition and the Standardization Theorem 10.2.48, there are reduction sequences σ : CD E →→β N, σ′ : N →→D E for some N, where σ is a standard β-reduction sequence. Now let τ be a shortest reduction sequence from CD E to E obtained by concatenating reduction sequences σ, σ′ as above. Since the first part of τ is standard, it is easy to see that τ must be of the form

    CD E ≡ Y(λxy.Dy(xy))E
      →β   (λy.y(Y y))(λxy.Dy(xy))E
      →β   (λxy.Dy(xy))CD E
      →→β  DE(CD E)
      →→β  N
      →→D  DEE
      →D   E

where τ′ denotes the final segment from DE(CD E) to E. However, the reduction sequence τ′ contains in an evident sense a reduction sequence τ′′ from CD E to E (inside the second argument of D), which is shorter than τ and which is again the concatenation of a standard reduction sequence CD E →→β N′ and an N′ →→D E for some N′. This contradicts the minimality of τ. Hence CD E does not βD-reduce to E.

10.4.25. Exercise. Let R be the applicative TRS with constants 0, +, − and the set R = {0 + x → x, (x + y) + z → x + (y + z), (−x) + x → 0} of rewrite rules. (Instead of +MN we use the infix notation M + N.) Show that λ ∪ R is not confluent. (Hint: copy the preceding proof, replacing CD by Y(λxy.(−y) + (xy)).)

For λ-calculus with surjective pairing, we argue similarly.

10.4.26. Notation. On Ter(λ{P, P0, P1}), we shall write →β, → →β instead of →β,{P,P0,P1} and → →β,{P,P0,P1}, respectively, and denote the reduction relation generated by {P0(P xy) → x, P1(P xy) → y, P(P0 x)(P1 x) → x} by →sp. Moreover, we put →βsp = →β ∪ →sp.

10.4.27. Definition. For λ-calculus with surjective pairing, let
(i) Csp ≡ Y(λxy.P(P0(Ωy))(P1(Ω(xy)))) and
(ii) Bsp ≡ Ω(Csp Asp)


where Asp ≡ Y Csp and Y is again Turing's fixed-point combinator.

10.4.28. Exercise. Prove that Bsp ← ←βsp Csp Asp → →βsp Csp Bsp.
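The three sp-rules of Notation 10.4.26 are themselves first-order and unproblematic in isolation; non-confluence arises only in combination with β. A minimal sketch of the rules (the tuple encoding is my own, not the chapter's):

```python
# sp-terms as tuples: ('P', s, t), ('P0', s), ('P1', s); plain strings
# stand for arbitrary irreducible terms.

def sp_step(t):
    """One sp-rewrite step at the outermost possible position, or None."""
    if not isinstance(t, tuple):
        return None
    if t[0] == 'P0' and isinstance(t[1], tuple) and t[1][0] == 'P':
        return t[1][1]                      # P0(P x y) -> x
    if t[0] == 'P1' and isinstance(t[1], tuple) and t[1][0] == 'P':
        return t[1][2]                      # P1(P x y) -> y
    if (t[0] == 'P'
            and isinstance(t[1], tuple) and t[1][0] == 'P0'
            and isinstance(t[2], tuple) and t[2][0] == 'P1'
            and t[1][1] == t[2][1]):
        return t[1][1]                      # P(P0 x)(P1 x) -> x  (surjectivity)
    for i, sub in enumerate(t[1:], start=1):
        r = sp_step(sub)
        if r is not None:
            return t[:i] + (r,) + t[i + 1:]
    return None

def sp_normalize(t):
    while (r := sp_step(t)) is not None:
        t = r
    return t

print(sp_normalize(('P0', ('P', 'a', 'b'))))            # first projection
print(sp_normalize(('P', ('P0', 'c'), ('P1', 'c'))))    # surjectivity
```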

For λ-calculus with surjective pairing we do not have postponement of sp-reduction steps as before. (E.g. consider P(P0 I)(P1 I)I →sp II →β I.) Locally, however, the situation is the same; to be more precise: the reduction graph of Csp Asp, G(Csp Asp), has the property PP(β, sp). The proof of this property requires more complex arguments, though.

10.4.29. Exercise. Prove that for all C[P0 M], C[P1 M] ∈ G(Csp Asp), M ≡ ΩM′ for some M′ ∈ Ter(λ{P, P0, P1}). Conclude that G(Csp Asp) is a sub-ARS of λ ∪ {(C[P(P0(ΩM))(P1(ΩM))], C[ΩM]) | M ∈ Ter(λ{P, P0, P1}), C ∈ Ctx(λ{P, P0, P1})}.

10.4.30. Proposition. G(Csp Asp) has the property PP(β, sp).

Proof. We make use of the auxiliary TRS R = ({·} ∪ Γ, R) where Γ = {E, P, P0, P1} and R = {Ex → P(P0(Ex))(P1(Ex))}. Observe that R is confluent, since it is orthogonal (see Theorem 4.3.4). Moreover, since R is not variable-applying, we may invoke Proposition 10.4.14. Thus →β,Γ commutes with →R,λ. Furthermore, by the above exercise, sp-reduction in G(Csp Asp) is restricted to Ω-prefixed P-reduction. This means that sp-reduction in G(Csp Asp) can be thought of as the converse of the reduction given by R. To be more precise, define ι : Ter(λ{P, P0, P1}) → Ter(λ{E, P, P0, P1}) inductively by
– ι(x) = x for every variable x,
– ι(c) = c for every constant c ∈ {P, P0, P1},
– ι(MN) ≡ E if MN ≡ Ω, and ι(MN) ≡ ι(M)ι(N) otherwise,
– ι(λx.M) = λx.ι(M).
Then one easily verifies that ι is an embedding of G(Csp Asp) into R satisfying for all M, M′ ∈ G(Csp Asp), N ∈ Ter(λΓ),
– ι(M) →β,Γ N iff ∃N′ ∈ G(Csp Asp) (N = ι(N′) and M →β N′),
– M →sp M′ iff ι(M′) →R,λ ι(M).
We now show that →β commutes with ←sp in G(Csp Asp). It then follows from Exercise 1.3.5(ii) that PP(β, sp) holds on G(Csp Asp). Thus let M, N, O ∈ G(Csp Asp) be such that M → →sp O → →β N. Then ι(M) ← ←R,λ ι(O) → →β,Γ ι(N). Hence, since →β,Γ commutes with →R,λ, ι(M) → →β,Γ O′ ← ←R,λ ι(N) for some O′ ∈ Ter(λΓ). Now let O″ ∈ G(Csp Asp) be such that ι(O″) = O′. Then we have M → →β O″ → →sp N.

10.4.31. Theorem. Lambda calculus with surjective pairing is not confluent.
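The map ι in the proof above simply replaces every occurrence of Ω ≡ (λx.xx)(λx.xx) by the fresh constant E and leaves everything else unchanged. A sketch of ι on a small term datatype (the dataclass encoding is my own, not the chapter's):

```python
# The embedding ι from the proof of 10.4.30: Ω is detected structurally
# and mapped to the constant E; all other term formers are traversed.

from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    name: str                     # 'P', 'P0', 'P1' (and 'E' in the image)

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

DELTA = Lam('x', App(Var('x'), Var('x')))
OMEGA = App(DELTA, DELTA)         # Ω; frozen dataclasses compare structurally

def iota(t):
    if isinstance(t, (Var, Const)):
        return t
    if isinstance(t, App):
        if t == OMEGA:            # ι(MN) ≡ E if MN ≡ Ω
            return Const('E')
        return App(iota(t.fun), iota(t.arg))
    return Lam(t.var, iota(t.body))

# ι(P0(Ω y)) = P0(E y)
print(iota(App(Const('P0'), App(OMEGA, Var('y')))))
```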


Proof. Consider the reductions Bsp ← ←βsp Csp Asp → →βsp Csp Bsp as in Exercise 10.4.28. We claim that Bsp and Csp Bsp lack a common reduct. For suppose Bsp → →βsp L ← ←βsp Csp Bsp for some L. By the above proposition and the Standardization Theorem 10.2.48 we may assume that Csp Bsp → →βsp L is special in the sense that it consists of a standard β-reduction sequence followed by an sp-reduction sequence. Now let τ : Csp Bsp → →βsp L be a shortest special reduction sequence for which a reduction sequence Bsp → →βsp L exists. Observe that, since Bsp → →βsp L, L ≡ ΩL′ for some L′. Analogously to the proof of Theorem 10.4.24, τ then must be of the form

    Csp Bsp ≡ Y(λxy.P(P0(Ωy))(P1(Ω(xy))))Bsp
      ↓β
    (λy.y(Y y))(λxy.P(P0(Ωy))(P1(Ω(xy))))Bsp
      ↓β
    (λxy.P(P0(Ωy))(P1(Ω(xy))))Csp Bsp
      ↓↓β
    P(P0(ΩBsp))(P1(Ω(Csp Bsp)))   ⎫
      ↓↓β                          ⎪
    P(P0(ΩM))(P1(ΩM′))             ⎬ τ′
      ↓↓sp                         ⎪
    P(P0(ΩL′))(P1(ΩL′))            ⎭
      ↓↓sp
    ΩL′

But then the above indicated reduction sequence τ ′ clearly contains a shorter special reduction sequence Csp Bsp → →βsp L′ for which a reduction sequence Bsp → →βsp L′ exists. This contradicts the minimality of τ . Hence Bsp and Csp Bsp lack a common reduct. 10.4.32. Exercise. Let R be the applicative TRS with constants ⊤, ⊥, if then else and the set R = {if ⊤ then x else y → x, if ⊥ then x else y → y, if x then y else y → y} of rewrite rules. (Instead of if then else xyz we use the infix notation if x then y else z.) Show that λ ∪ R is not confluent. (Hint: Copy the preceding proof replacing Csp by Y (λxy.if Ω then Ωy else Ω(xy)).)

10.5. Simply typed λ-calculus In this section we consider simply typed λ-calculus which is derived from the (type free) λ-calculus but has a rather different character as a result of the typing. For example, self-application, which has been at the root of many of the problems we have encountered, is outlawed, there are no fixed-point combinators and all terms are strongly normalizing. An extensive survey of typed λ-calculi can be found in Barendregt [1992]. There are two approaches that can be taken in defining a typed calculus. The first, originated by Curry [1934]; see also Curry and Feys [1958] and Curry et al. [1972], is called implicit typing: the terms are the same as in the


type free calculus and each term has a set of possible types assigned to it. The second approach, originated by Church [1940], is called explicit typing: terms are annotated with type information which uniquely determines a type for the term. In the following, we will follow Church's approach.

10.5.1. Definition. The set of types, Typ, is defined inductively by
(i) 0 ∈ Typ,
(ii) if ϑ, ϑ′ ∈ Typ, then ϑ → ϑ′ ∈ Typ,
(iii) if ϑ, ϑ′ ∈ Typ, then ϑ × ϑ′ ∈ Typ.

10.5.2. Definition. The alphabet of the simply typed λ-calculus consists of distinct, countably infinite sets of variables Var ϑ = {xϑ0, xϑ1, . . .} for each ϑ ∈ Typ, an abstractor λ and parentheses (, ). From this alphabet the sets Ter(λϑ) of λ-terms of type ϑ are defined inductively as follows:
(i) xϑ0, xϑ1, . . . ∈ Ter(λϑ),
(ii) if M ∈ Ter(λϑ→ϑ′) and N ∈ Ter(λϑ), then (MN) ∈ Ter(λϑ′), and
(iii) if M ∈ Ter(λϑ′), then (λxϑ.M) ∈ Ter(λϑ→ϑ′).
The set Ter(λτ) of typed λ-terms is ⋃ϑ∈Typ Ter(λϑ).
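In the explicit (Church-style) discipline of Definition 10.5.2, every variable carries its type, so each well-formed term has a unique type computable by one traversal. A minimal sketch (my own encoding, omitting the product types ϑ × ϑ′ for brevity):

```python
# Church-style typing per Definition 10.5.2: the type of a term is
# determined by the type annotations on its variables.

from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object                   # ϑ → ϑ′

BASE = 0                          # the base type 0

@dataclass(frozen=True)
class Var:
    name: str
    type: object                  # x^ϑ

@dataclass(frozen=True)
class App:
    fun: object
    arg: object                   # (M N)

@dataclass(frozen=True)
class Lam:
    var: 'Var'
    body: object                  # (λx^ϑ. M)

def type_of(t):
    """Return the unique type of a typed term, or raise if ill-formed."""
    if isinstance(t, Var):
        return t.type
    if isinstance(t, Lam):
        return Arrow(t.var.type, type_of(t.body))
    # application: the function part must have an arrow type whose
    # domain is exactly the argument's type (clause (ii))
    f, a = type_of(t.fun), type_of(t.arg)
    if isinstance(f, Arrow) and f.dom == a:
        return f.cod
    raise TypeError('ill-typed application')

x = Var('x', BASE)
print(type_of(Lam(x, x)))         # λx^0.x has type 0 → 0

# self-application is outlawed: no ϑ satisfies ϑ = ϑ → ϑ′
try:
    type_of(App(x, x))
except TypeError as e:
    print(e)
```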

Free and bound variables, closed terms, subterms, substitution and the set of contexts Ctx(λτ) are defined in the obvious way by analogy to the type free calculus. But the types of terms have to fit: that is, M[xϑ := N] is defined only if N ∈ Ter(λϑ); likewise C[M] is defined only if C[M] ∈ Ter(λτ).

10.5.3. Notation. The type superscripts of the variables will be suppressed where possible, viz. whenever the type is either clear from the context or not essential. We shall adopt the same convention tacitly for the type-labelled relations, operations and constants yet to be defined.

The notions of reduction in the typed λ-calculus are the obvious analogues of the notions that we introduced in the type free case.

10.5.4. Definition. The relations →βτ, →ητ ⊆ Ter(λτ) × Ter(λτ) are defined by
→βτ = {(C[(λx.M)N], C[M[x := N]]) | (λx.M)N ∈ Ter(λτ), C ∈ Ctx(λτ)}
and
→ητ = {(C[λx.Mx], C[M]) | x ∉ FV(M), C ∈ Ctx(λτ)}.
We will refer to the ARS (Ter(λτ), →βτ) as λτ-calculus, and to the ARS (Ter(λτ), →βτ ∪ →ητ) as λβητ-calculus and denote its reduction relation by →βητ.
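The β- and η-rules of Definition 10.5.4 can be sketched executably. Following Notation 10.5.3, the type superscripts are suppressed here, and the nameful encoding and leftmost-outermost strategy are my own choices, not the chapter's:

```python
# One-step βη-reduction with capture-avoiding substitution.

from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

fresh = (f'v{i}' for i in count())

def fv(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return fv(t.fun) | fv(t.arg)
    return fv(t.body) - {t.var}

def subst(t, x, n):
    """Capture-avoiding M[x := N]."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fun, x, n), subst(t.arg, x, n))
    if t.var == x:
        return t
    if t.var in fv(n):                      # rename to avoid capture
        z = next(fresh)
        return Lam(z, subst(subst(t.body, t.var, Var(z)), x, n))
    return Lam(t.var, subst(t.body, x, n))

def step(t):
    """One βη-step, leftmost-outermost; None if t is βη-normal."""
    if isinstance(t, App) and isinstance(t.fun, Lam):          # β
        return subst(t.fun.body, t.fun.var, t.arg)
    if (isinstance(t, Lam) and isinstance(t.body, App)
            and t.body.arg == Var(t.var)
            and t.var not in fv(t.body.fun)):                  # η
        return t.body.fun
    if isinstance(t, App):
        r = step(t.fun)
        if r is not None:
            return App(r, t.arg)
        r = step(t.arg)
        return App(t.fun, r) if r is not None else None
    if isinstance(t, Lam):
        r = step(t.body)
        return Lam(t.var, r) if r is not None else None
    return None

def normalize(t):
    # terminates on simply typable terms (strong normalization, Section
    # 10.5); may diverge on untypable ones such as Ω
    while (r := step(t)) is not None:
        t = r
    return t

I = Lam('x', Var('x'))
print(normalize(App(I, Var('y'))))                    # a β-step
print(normalize(Lam('z', App(Var('f'), Var('z')))))   # an η-step
```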


Strong normalization of λβητ-calculus was proved independently, within the scope of a stronger system, by Dragalin [1968], Gandy [1980], Hinata [1967], Hanatani [1966], Tait et al. [1967]. Here we will present a proof based on de Vrijer [1987a] which, in addition to termination, also determines upper bounds on the lengths of reduction sequences. The proof proceeds by associating with every typed λ-term M an increasing functional. This is a typical application of the semantical proof methods introduced in Section 6.2 to a higher-order rewriting system (see also van de Pol [1994]).

10.5.5. Definition. For ϑ ∈ Typ, the class of increasing functionals of type ϑ, notation IF ϑ, and the well-founded order
