Lecture Notes. Formal Languages

Lecture Notes: Formal Languages, Winter Term 2011/2012. Prof. Dr. Heribert Vollmer, [email protected]

Institut für Theoretische Informatik, Universität Hannover

Version of 13th April 2012

Contents

1 Regular Languages                                                  1
  1.1 Finite Automata: Definitions and Examples                      1
  1.2 Theorem of Myhill-Nerode                                       3
  1.3 Minimal Automata                                               6
  1.4 Automata and Semigroups                                        9
      1.4.1 Computation of syntactic monoids                        11
  1.5 Finite Automata with Output                                   14
  1.6 Two-way automata                                              18

2 Context-free Languages                                            21
  2.1 Chomsky Normal Form and CYK-Algorithm                         23
  2.2 Greibach Normal Form and Pushdown Automata                    27
  2.3 Deterministic context-free Languages                          35
  2.4 Decidability Questions                                        37

3 Context-sensitive Languages and Type-0-Languages                  46
  3.1 Machine characterizations for languages of type 0 and type 1  46
  3.2 Decidability and Closure Properties                           48

Symbols list                                                        53


1 Regular Languages

1.1 Finite Automata: Definitions and Examples

Definition: An alphabet is a finite set of symbols or letters. Notation: Σ, Γ, ∆, …; for letters: a, b, c, ….
A word (or a string) over the alphabet Σ is a finite sequence of symbols from Σ. Notation: u, v, w, x, y, z. We write w = a1 a2 a3 … ak, where a1, …, ak are, in this order, the elements of the sequence. Thus one can view w as a sequence with finite domain, w : {1, …, k} → Σ, such that w(i) = wi = ai and w = w(1) ⋯ w(k).
The length |w| of a word w is the number of elements of the sequence w. Let Σ be an alphabet; then Σ∗ is the set of all words over Σ. If w ∈ Σ∗ and a ∈ Σ, then |w|a is the number of occurrences of a in w. The empty word is denoted by ε, and |ε| = 0.
For u, v ∈ Σ∗, let u ◦ v (or uv) be the concatenation of the words u and v. Thus u ◦ v : {1, …, |u| + |v|} → Σ is defined as

    u ◦ v(i) = u(i)           if 1 ≤ i ≤ |u|,
    u ◦ v(i) = v(i − |u|)     if |u| < i ≤ |u| + |v|.

The free monoid over Σ is then (Σ∗, ◦) (◦ is associative and ε is the identity).

Definition: A (deterministic) finite automaton (short: DFA) is a 5-tuple M = (Z, Σ, δ, z0, E), where
– Z is a finite set of states,
– Σ is an alphabet,
– δ : Z × Σ → Z is the transition function,
– z0 ∈ Z is the initial state and
– E ⊆ Z is the set of accepting states.

Define δ̂ : Z × Σ∗ → Z, the extended transition function, inductively by

    δ̂(z, ε) = z
    δ̂(z, ax) = δ̂(δ(z, a), x)

for all z ∈ Z, a ∈ Σ and x ∈ Σ∗. Notation: zx or z · x for δ̂(z, x).
The language accepted by M is L(M) = {x ∈ Σ∗ | δ̂(z0, x) ∈ E}.
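The extended transition function δ̂ is simply a fold of δ over the word. A minimal sketch in Python; the concrete automaton (an even number of b's) is our own illustrative example, not taken from the notes:

```python
# A DFA M = (Z, Sigma, delta, z0, E) with the extended transition
# function: delta_hat(z, eps) = z, delta_hat(z, ax) = delta_hat(delta(z, a), x).

def delta_hat(delta, z, w):
    """Extended transition function: fold delta over the word w."""
    for a in w:
        z = delta[(z, a)]
    return z

def accepts(delta, z0, E, w):
    """w is in L(M) iff delta_hat(z0, w) lies in E."""
    return delta_hat(delta, z0, w) in E

# DFA over {a, b} accepting all words with an even number of b's.
delta = {('even', 'a'): 'even', ('even', 'b'): 'odd',
         ('odd', 'a'): 'odd', ('odd', 'b'): 'even'}

print(accepts(delta, 'even', {'even'}, 'abba'))  # True: two b's
print(accepts(delta, 'even', {'even'}, 'ab'))    # False: one b
```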


Definition: A nondeterministic finite automaton (short: NFA) is a 5-tuple M = (Z, Σ, δ, z0, E), where
– Z, Σ, z0 and E are defined as for DFAs,
– δ : Z × Σ → P(Z) is the transition function.

Define δ̂ : P(Z) × Σ∗ → P(Z) by

    δ̂(Y, ε) = Y
    δ̂(Y, ax) = ⋃_{z∈Y} δ̂(δ(z, a), x)

for all Y ⊆ Z, a ∈ Σ and x ∈ Σ∗.
The language accepted by M is L(M) = {x ∈ Σ∗ | δ̂({z0}, x) ∩ E ≠ ∅}.

Nondeterministic and deterministic finite automata have the same expressivity: every DFA can be read as an NFA, and for every NFA M = (Z, Σ, δ, z0, E) there exists a DFA M′ with L(M) = L(M′), namely the powerset automaton

    M′ = (P(Z), Σ, δ′, {z0}, {Y ⊆ Z | Y ∩ E ≠ ∅})   with   δ′(Y, a) = ⋃_{z∈Y} δ(z, a).

Then it holds that L(M) = L(M′).

The syntax of regular expressions over Σ is defined as follows:
– ∅ is a regular expression.
– For every a ∈ Σ, a is a regular expression.
– If α and β are regular expressions, then so are αβ, (α + β) and (α)∗.

The language described by the regular expression α, in symbols L(α), is defined as follows:
– L(∅) = ∅.
– L(a) = {a} for all a ∈ Σ.
– L(α + β) = L(α) ∪ L(β) for regular expressions α, β.
– L(αβ) = L(α) ◦ L(β) for regular expressions α, β, where L0 ◦ L1 = {w ∈ Σ∗ | there are u ∈ L0 and v ∈ L1 with w = uv}.
– L(α∗) = L(α)∗ for a regular expression α, where L∗ = ⋃_{k≥0} L^k with L^0 = {ε} and L^{k+1} = L^k ◦ L for k ≥ 0.

Often one writes L⁺ = ⋃_{k≥1} L^k.
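The powerset construction above can be sketched directly in code, building only the subsets of Z reachable from {z0}. The example NFA is our own: it nondeterministically guesses where the suffix "ab" of a word over {a, b} begins.

```python
# Powerset construction: DFA states are sets of NFA states,
# delta'(Y, a) is the union of delta(z, a) over all z in Y.

def nfa_to_dfa(alphabet, delta, z0, E):
    start = frozenset({z0})
    trans, todo, seen = {}, [start], {start}
    while todo:
        Y = todo.pop()
        for a in alphabet:
            Ya = frozenset().union(*(delta.get((z, a), set()) for z in Y))
            trans[(Y, a)] = Ya
            if Ya not in seen:
                seen.add(Ya)
                todo.append(Ya)
    accepting = {Y for Y in seen if Y & E}  # Y intersects E
    return trans, start, accepting

# NFA for (a+b)*ab: state 1 = "a just read", state 2 = "ab just read".
delta = {(0, 'a'): {0, 1}, (0, 'b'): {0}, (1, 'b'): {2}}
trans, start, acc = nfa_to_dfa('ab', delta, 0, {2})

def run(w):
    Y = start
    for a in w:
        Y = trans[(Y, a)]
    return Y in acc

print(run('aab'))  # True: ends with ab
print(run('aba'))  # False
```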

Theorem: Let L be a language. The following statements are equivalent:
– L is regular, i.e., there is a type-3 grammar which generates L.
– There exists a DFA M with L = L(M).
– There exists an NFA M with L = L(M).
– There exists a regular expression α with L = L(α).

Theorem (Pumping Lemma for regular languages): Let L be regular. Then there exists a natural number n ∈ N such that every word x ∈ L with |x| ≥ n can be written as x = uvw, where
(i) |v| ≥ 1,
(ii) |uv| ≤ n,
(iii) for all i ∈ N it holds that uv^i w ∈ L.

Example: Let Σ = {a, b}.
(i) (aa + ab + ba + bb)∗ describes the language of all words of even length.
(ii) (a∗ba∗b)∗a∗ describes the language of all words containing an even number of b's.
(iii) L = {a^n b^n | n ≥ 0} is not regular. Assume L is regular: choose n as in the Pumping Lemma and consider x = a^n b^n. It holds that |x| ≥ n. Choose u, v, w with x = uvw, |v| ≥ 1 and |uv| ≤ n. Since |uv| ≤ n, the word v consists of a's only, so uv²w contains more a's than b's. Hence uv²w ∉ L, a contradiction to the assumption that L is regular.
(iv) L = {w | |w|a = |w|b} is not regular. Assume L is regular: a∗b∗ is regular, so the language L′ = L ∩ L(a∗b∗) = {a^n b^n | n ≥ 0} would also be regular. But L′ is not regular by (iii).

1.2 Theorem of Myhill-Nerode Definition: A relation ∼ ⊆ M × M on a set M is an equivalence relation, if the following holds: (i) ∼ is reflexive, i.e., for all x ∈ M it holds that x ∼ x, (ii) ∼ is symmetric, i.e. it holds that x ∼ y ⇒ y ∼ x for all x, y ∈ M , (iii) ∼ is transitive, i.e. it holds that x ∼ y and y ∼ z ⇒ x ∼ z for all x, y, z ∈ M . [x]∼ = {y ∈ M | x ∼ y} is the equivalence class of x ∈ M . M/∼ is the set of all equivalence classes of ∼. The index of ∼ is the cardinality of M/∼. ∼ is right-invariant, if x ∼ y implies that for all u ∈ Σ∗ it holds: xu ∼ yu.


Example: Σ = {0, 1}, and let x ∼ y iff |x| ≡ |y| (mod 2). The index of ∼ is 2, and ∼ is right-invariant.

Definition: Let L ⊆ Σ∗. The natural equivalence relation of L is defined via x ∼L y iff for all u ∈ Σ∗ it holds that xu ∈ L ⇔ yu ∈ L.

Theorem: Let L ⊆ Σ∗. The following statements are equivalent:
(i) L is regular.
(ii) There exists a right-invariant equivalence relation ≈ on Σ∗ with finite index such that L can be written as a union of equivalence classes of ≈.
(iii) ∼L has finite index.

Proof: (i) ⇒ (ii): Let L be regular, say L = L(M) for a DFA M = (Z, Σ, δ, z0, E). W.l.o.g. assume that all states in Z are reachable from the initial state z0. Define a relation ≈ on Σ∗ as follows:

    x ≈ y ⇔ z0 x = z0 y,

i.e., x ≈ y ⇔ δ̂(z0, x) = δ̂(z0, y) for x, y ∈ Σ∗. We prove that ≈ has the desired properties:
– ≈ is reflexive, symmetric and transitive. For transitivity, let x ≈ y and y ≈ u for an arbitrary u ∈ Σ∗; then z0 x = z0 y and z0 y = z0 u, hence z0 x = z0 u and thus x ≈ u.
– ≈ is right-invariant: let x ≈ y, i.e., z0 x = z0 y, and let u ∈ Σ∗ be arbitrary. Then z0 (xu) = (z0 x)u = (z0 y)u = z0 (yu), thus xu ≈ yu.
– Let x ∈ Σ∗. Then [x]≈ = {y ∈ Σ∗ | z0 x = z0 y}. For each z ∈ Z define Lz = {x ∈ Σ∗ | z0 x = z}. The sets Lz are equivalence classes of ≈, and for every equivalence class [x]≈ there exists a z with [x]≈ = Lz. The number of equivalence classes of ≈ is therefore at most |Z|, hence finite.
– L can be written as L = ⋃_{z∈E} Lz.
Notation for the equivalence relation constructed above: ≈M.

(ii) ⇒ (iii): Let ≈ be an equivalence relation such that (ii) holds. We show that ≈ is a refinement of ∼L, i.e., for every word x ∈ Σ∗ it holds that [x]≈ ⊆ [x]∼L. Then the index of ∼L is not larger than the index of ≈, hence finite.
Consider some [x]≈ and let y ∈ [x]≈, i.e., x ≈ y. By right-invariance it holds for all u ∈ Σ∗ that xu ≈ yu, and since L is a union of classes of ≈, xu ∈ L iff yu ∈ L.
Hence x ∼L y and therefore y ∈ [x]∼L. Consequently [x]≈ ⊆ [x]∼L.
(iii) ⇒ (i): First we show that ∼L is right-invariant.


Figure 1: ≈ is a refinement of ∼L (each class of ∼L is a union of classes of ≈; diagram omitted).

Let x ∼L y and v ∈ Σ∗. We have to show that xv ∼L yv, i.e., for every w ∈ Σ∗ it holds that xvw ∈ L ⇔ yvw ∈ L. As x ∼L y, for all u ∈ Σ∗ it holds that xu ∈ L ⇔ yu ∈ L. The claimed equivalence follows immediately with u = vw.

Now we define a DFA Mmin = (Zmin, Σ, δmin, z0min, Emin) as follows:
– Zmin = Σ∗/∼L, the set of equivalence classes of ∼L.
– δmin([x]∼L, a) = [xa]∼L for all [x]∼L ∈ Zmin, a ∈ Σ. δmin is well defined as ∼L is right-invariant: y ∈ [x]∼L implies xu ∼L yu for all u ∈ Σ∗, so in particular xa ∼L ya and [xa]∼L = [ya]∼L for a ∈ Σ.
– z0min = [ε]∼L.
– Emin = {[x]∼L | x ∈ L}.

Mmin accepts L because

    x ∈ L(Mmin) ⇔ z0min · x ∈ Emin ⇔ [ε]∼L · x = [x]∼L ∈ Emin ⇔ x ∈ L.

Corollary (Theorem of Myhill-Nerode, 1957/58): A language L ⊆ Σ∗ is regular iff ∼L has finite index.

Example: L = {a^n b^n | n ≥ 0}. For i ≠ j it holds that a^i ≁L a^j, because a^i b^i ∈ L but a^j b^i ∉ L. Thus there exist infinitely many different equivalence classes [ε], [a], [aa], [aaa], …, and L is not regular.

Example: L = {x ∈ {0, 1}∗ | x ends with 00}
– [ε]∼L = {x | x does not end with 0}:
  Let x = y1 for some y ∈ {0, 1}∗. Then x ∼L ε, because for w ∈ Σ∗ the following holds: xw ∈ L ⇔ w = w′00 for an appropriate w′ ⇔ ε · w = w ∈ L.
  Let x = y0 for some y ∈ {0, 1}∗. Then x ≁L ε, because x0 ∈ L and ε · 0 = 0 ∉ L.


– [0]∼L = {x | x ends with 0, but does not end with 00}
– [00]∼L = {x | x ends with 00}

As [ε]∼L ∪ [0]∼L ∪ [00]∼L = {0, 1}∗, these are all equivalence classes for L. Thus the index of ∼L equals 3 and therefore L is regular. The (minimal) finite automaton Mmin for L has the states [ε], [0], [00], with [00] accepting, and the transition function

    δ          input symbol 0             input symbol 1
    [ε]∼L      [0]∼L                      [ε1]∼L = [1]∼L = [ε]∼L
    [0]∼L      [00]∼L                     [01]∼L = [ε]∼L
    [00]∼L     [000]∼L = [00]∼L           [001]∼L = [ε]∼L
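The three Nerode classes above can be found experimentally: distinguish words by their membership pattern under short extensions. This is a heuristic sketch (the suffix bound 3 is our own choice; for this particular L it happens to suffice):

```python
# Approximate the Nerode classes of L = { x in {0,1}* : x ends with 00 }
# by comparing membership of x*u for all extensions u up to length 3.

from itertools import product

def in_L(x):
    return x.endswith('00')

def signature(x, max_suffix=3):
    """Membership pattern of x under all short extensions u."""
    return tuple(in_L(x + ''.join(u))
                 for k in range(max_suffix + 1)
                 for u in product('01', repeat=k))

reps = {}  # signature -> shortest representative word found
for k in range(5):
    for p in product('01', repeat=k):
        reps.setdefault(signature(''.join(p)), ''.join(p))

print(sorted(reps.values()))  # ['', '0', '00']: index 3, so L is regular
```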

1.3 Minimal Automata

Theorem: Let L be a regular language. Then there exists a unique (up to isomorphism) minimal deterministic finite automaton (i.e., a DFA with a minimal number of states) which accepts L.

Proof: We prove that the automaton Mmin constructed in the last proof is in fact the unique minimal automaton.
(i) Mmin has a minimal number of states: From an arbitrary DFA M = (Z, Σ, δ, z0, E) accepting L, the previous proof constructs ≈M with index ≤ |Z|. Furthermore it holds that |Zmin| = index of ∼L ≤ index of ≈M ≤ |Z|. Thus Mmin does not contain more states than M.
(ii) Mmin is unique: We prove that if M = (Z, Σ, δ, z0, E) is a DFA with L(M) = L(Mmin) and |Z| = |Zmin|, then M and Mmin are isomorphic. For z ∈ Z choose x ∈ Σ∗ such that z0 · x = z (such an x exists, as M contains no unreachable states due to its minimality). Then z corresponds in Mmin to the state zmin = z0min · x = [ε]∼L · x.


The isomorphism z ↦ zmin is well defined: if y ≠ x with z0 x = z0 y = z, then x ≈M y, hence x ∼L y, and therefore z0min · x = [ε]∼L · x = [x]∼L = [y]∼L = z0min · y. One can easily show that the defined mapping is bijective (exercise).

Algorithm to construct the minimal automaton: Given a DFA M = (Z, Σ, δ, z0, E), assume w.l.o.g. that all states in M are reachable from the initial state.

Input: DFA M = (Z, Σ, δ, z0, E)
Output: minimal automaton Mmin with L(Mmin) = L(M)
Method:
  Remove every state that is not reachable from z0.
  Construct a table of all sets {z, z′} with z ≠ z′ and z, z′ ∈ Z.
  Mark every set {z, z′} with z ∈ E and z′ ∉ E.
  while in the last pass a new set has been marked do
    if {z, z′} is unmarked and there is an a ∈ Σ such that {δ(z, a), δ(z′, a)} is marked
    then mark {z, z′} fi
  od
  Merge all states z, z′ such that {z, z′} is not marked.
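The marking loop above can be sketched directly. The example DFA is our own: it accepts words over {0, 1} of even length, built wastefully with four states so that two pairs of states can be merged.

```python
# Pair-marking (table-filling) minimization: mark distinguishable pairs,
# starting from (accepting, non-accepting) pairs, until a fixed point.

from itertools import combinations

def equivalent_pairs(states, alphabet, delta, accepting):
    """Return the pairs {z, z'} that remain unmarked, i.e., are mergeable."""
    marked = {frozenset(p) for p in combinations(states, 2)
              if (p[0] in accepting) != (p[1] in accepting)}
    changed = True
    while changed:  # repeat while a new pair has been marked
        changed = False
        for p in combinations(states, 2):
            pair = frozenset(p)
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset({delta[(p[0], a)], delta[(p[1], a)]})
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return [set(p) for p in combinations(states, 2)
            if frozenset(p) not in marked]

delta = {(0, '0'): 1, (0, '1'): 1, (1, '0'): 2, (1, '1'): 2,
         (2, '0'): 3, (2, '1'): 3, (3, '0'): 0, (3, '1'): 0}
print(equivalent_pairs([0, 1, 2, 3], '01', delta, {0, 2}))
# {0, 2} and {1, 3} remain unmarked: merging yields a two-state DFA
```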

The algorithm has a runtime of O(|Z|⁴ · |Σ|), hence polynomial runtime w.r.t. the length of the encoding of the input automaton. Through a smart implementation with priority queues one can reach a runtime of O(|Z|² · |Σ|).

Example: Consider a deterministic finite automaton over {0, 1} with states z0, …, z4 (z0 initial, z4 accepting). [Figure: transition diagram omitted.]


This leads to the following table, where × marks a distinguishable pair:

          z0   z1   z2   z3
    z1    ×
    z2         ×
    z3    ×         ×
    z4    ×    ×    ×    ×

The pairs {z0, z2} and {z1, z3} remain unmarked.

Consequently one merges the unmarked pairs and obtains the minimal automaton with states {z0, z2}, {z1, z3} and {z4}: reading 0 leads from {z0, z2} to {z1, z3} and from {z1, z3} to the accepting state {z4}, which loops on 0 and 1. [Figure: transition diagram with the remaining 1-edges omitted.]

Corollary: The equivalence problem of regular languages, i.e., the problem

    Given:     DFAs M1, M2
    Question:  Does L(M1) = L(M2) hold?

is decidable in polynomial runtime.

Proof: The following algorithm solves the problem within the desired runtime:
(i) M1min := minimal automaton of M1.
(ii) M2min := minimal automaton of M2.
(iii) Output "yes" iff M1min is isomorphic to M2min.

In the case of nondeterministic finite automata the following holds:
– The uniqueness of minimal nondeterministic automata does not hold.
– The equivalence problem of NFAs is PSPACE-complete.
– The following problem is PSPACE-complete, too:

    Given:     NFA M, k ∈ N
    Question:  Does there exist an NFA Mmin with at most k states such that L(Mmin) = L(M) holds?


1.4 Automata and Semigroups

Definition: Let M be a set and ◦ : M × M → M a binary operation on M. Then (M, ◦) is called a monoid if the following two properties hold:
(i) There is an element e ∈ M s.t. m ◦ e = e ◦ m = m for all m ∈ M (identity element).
(ii) (m1 ◦ m2) ◦ m3 = m1 ◦ (m2 ◦ m3) holds for all m1, m2, m3 ∈ M (associativity).
If ≈ is an equivalence relation on M, then ≈ is a congruence if the following additional property holds:
(iii) m1 ≈ m2 and m1′ ≈ m2′ implies m1 ◦ m1′ ≈ m2 ◦ m2′.
In that case define [m1]≈ ∗ [m2]≈ := [m1 ◦ m2]≈. Then (M/≈, ∗) is a monoid.

Observation: For every alphabet Σ, the set Σ∗ together with concatenation forms a monoid, and Σ⁺ together with concatenation forms a semigroup.

Definition: Let L ⊆ Σ∗. Define x ≡L y iff for all u, v ∈ Σ∗ it holds that uxv ∈ L ⇔ uyv ∈ L.

Observation: ≡L is a congruence on Σ∗ (i.e., x ≡L y ⇒ ∀u, v ∈ Σ∗ : uxv ≡L uyv).

Definition: Let L ⊆ Σ∗.
– ≡L is the syntactic congruence of L.
– Mon(L) = (Σ∗/≡L, ∗) is the syntactic monoid of L, with the operation [x]≡L ∗ [y]≡L = [x · y]≡L. The corresponding natural homomorphism ηL : Σ∗ → Mon(L), x ↦ [x]≡L, is referred to as the syntactic homomorphism.

Example: L = {w ∈ {0, 1}∗ | |w| ≡ 0 (mod 2)}. It holds that x ≡L y ⇔ |x| ≡ |y| (mod 2). Thus there are two equivalence classes, [ε] and [0], with multiplication table

    ∗     [ε]   [0]
    [ε]   [ε]   [0]
    [0]   [0]   [ε]


Example: L = {w ∈ {0, 1}∗ | |w|1 ≡ 0 (mod 2)}. It holds that x ≡L y ⇔ |x|1 ≡ |y|1 (mod 2). Thus there are the equivalence classes [ε] and [1], with multiplication table

    ∗     [ε]   [1]
    [ε]   [ε]   [1]
    [1]   [1]   [ε]

Mon(L) is again isomorphic to the group of order 2, and

    ηL(x) = [ε] if |x|1 is even, [1] otherwise.

Theorem: Let L ⊆ Σ∗. L is regular iff Mon(L) is finite.

Proof: "⇒": Let L = L(M) via the DFA M = (Z, Σ, δ, z0, E). Define the equivalence relation ≅ on Σ∗ as follows:

    x ≅ y iff for all z ∈ Z it holds that zx = zy.

The number of equivalence classes of ≅ is at most the number of functions f : Z → Z, which equals |Z|^|Z| (since every equivalence class is determined by a mapping Z → Z). We show that ≅ is a refinement of ≡L, i.e., for every x ∈ Σ∗ it holds that [x]≅ ⊆ [x]≡L. Consequently the index of ≡L is not greater than the index of ≅, hence Mon(L) is finite.
Let y ∈ [x]≅, i.e., x ≅ y, and let u, v ∈ Σ∗. If uxv ∈ L then it holds that

    z0 · uxv = ((z0 u)x)v = ((z0 u)y)v = z0 (uyv) ∈ E,

hence uyv ∈ L. In the same way one can show that uyv ∈ L ⇒ uxv ∈ L. Thus x ≡L y.

"⇐": Assume that Mon(L) is finite. Now define the DFA

    M = (Σ∗/≡L, Σ, δ, [ε]≡L, {[w]≡L | w ∈ L}),

where the initial state [ε]≡L = id is the identity element of Mon(L), E = {[w]≡L | w ∈ L}, and δ([w]≡L, a) = [wa]≡L. Thus it holds that w ∈ L(M) ⇔ [ε]≡L · w = [w]≡L ∈ E ⇔ w ∈ L. Hence L is regular.

Definition: Let M = (Z, Σ, δ, z0, E) be a DFA. For a string w ∈ Σ∗ define fw : Z → Z via fw(z) = z · w. The set of all such mappings {fw | w ∈ Σ∗} together with the operation fx · fy := fxy is called the transition monoid of M, and is denoted Mon(M).


Theorem: Let Σ be an alphabet and let L ⊆ Σ∗ be a language. If M is a minimal automaton of L, then Mon(M) and Mon(L) are isomorphic.

Proof: Let M = (Z, Σ, δ, z0, E) be a minimal automaton for L. It suffices to prove: fx = fy iff x ≡L y. Consequently Mon(M) and Mon(L) are isomorphic via the isomorphism π : Mon(M) → Mon(L), π(fx) = [x]≡L.

"⇒": Assume fx = fy. Let u and v be arbitrarily chosen strings over Σ. Then it holds that

    uxv ∈ L ⇔ z0 · uxv ∈ E ⇔ ((z0 u)x)v ∈ E ⇔ ((z0 u)y)v ∈ E ⇔ z0 (uyv) ∈ E ⇔ uyv ∈ L,

hence x ≡L y. (Please note that the minimality of M is not needed for this direction.)

"⇐": Assume x ≡L y. Choose an arbitrary z ∈ Z and let u ∈ Σ∗ be a string such that z0 · u = z. (If such a u did not exist, z would be unreachable and M not minimal.) Set z1 = fx(z) = zx = z0 ux and z2 = fy(z) = zy = z0 uy. From uxv ∈ L ⇔ uyv ∈ L (note that uxv leads to z1 v and uyv to z2 v) one deduces z1 v ∈ E ⇔ z2 v ∈ E for all v ∈ Σ∗. As M is minimal, it follows that z1 = z2. As z ∈ Z was arbitrarily chosen, fx = fy.

Note: The converse implication of the previous theorem does not hold: the examples above are different languages with different minimal automata but the same syntactic monoid.

1.4.1 Computation of syntactic monoids

Example: Let A = {a, b, c} be an alphabet and L = a∗ba∗ca∗ a regular language over A. The minimal automaton has states 1, 2, 3, 4 (state 1 initial, state 3 accepting): a loops on states 1, 2 and 3; b leads from 1 to 2; c leads from 2 to 3; all other transitions lead to the rejecting sink state 4, which loops on a, b, c. [Figure: transition diagram omitted.]

The transformations fw, grouped by the shape of w, are:

    If w ∈ a∗:        fw = (1↦1, 2↦2, 3↦3, 4↦4) = 1
    If w ∈ a∗ba∗:     fw = (1↦2, 2↦4, 3↦4, 4↦4) = α
    If w ∈ a∗ca∗:     fw = (1↦4, 2↦3, 3↦4, 4↦4) = β
    If w ∈ L:         fw = (1↦3, 2↦4, 3↦4, 4↦4) = γ
    Otherwise:        fw = (1↦4, 2↦4, 3↦4, 4↦4) = 0

Mon(L) emerges as

    ∗   1   α   β   γ   0
    1   1   α   β   γ   0
    α   α   0   γ   0   0
    β   β   0   0   0   0
    γ   γ   0   0   0   0
    0   0   0   0   0   0

Reason/motivation for the algebraic approach:
– regular expressions use the operations ∪, ·, ∗ (and can define all regular languages);
– star-free regular expressions use ∪, · and complement ¯·.

Question: Given L, is there a star-free regular expression that defines L?

Theorem: Let L be a regular language. L is star-free iff Mon(L) is aperiodic (group-free); this property is decidable.

Example: Let A = {a, b, c} be an alphabet and L = A∗ \ A∗bbA∗ (no two consecutive b's). The minimal automaton has states 1, 2, 3 (state 1 initial): a and c loop on state 1, b leads from 1 to 2, a and c lead from 2 back to 1, and a second b leads from 2 to the sink state 3 with an a, b, c loop. [Figure: transition diagram omitted.]

The elements of the transition monoid, with representative words and the corresponding classes, are:

    1 = [ε]   : (1↦1, 2↦2, 3↦3),   [ε]   = {ε}
    δ = [a]   : (1↦1, 2↦1, 3↦3),   [a]   = L \ ({ε} ∪ bA∗ ∪ A∗b)
    α = [ab]  : (1↦2, 2↦2, 3↦3),   [ab]  = L ∩ A∗b \ bA∗
    β = [ba]  : (1↦1, 2↦3, 3↦3),   [ba]  = L ∩ bA∗ \ A∗b
    γ = [bab] : (1↦2, 2↦3, 3↦3),   [bab] = L ∩ (bA∗b ∪ {b})
    0 = [bb]  : (1↦3, 2↦3, 3↦3),   [bb]  = A∗bbA∗


Mon(L) emerges as

    ∗   1   α   β   γ   δ   0
    1   1   α   β   γ   δ   0
    α   α   α   0   0   δ   0
    β   β   γ   β   γ   β   0
    γ   γ   γ   0   0   β   0
    δ   δ   α   δ   α   δ   0
    0   0   0   0   0   0   0

Example: A = {a, b}. Consider the DFA with states 1, …, n in which a acts as the cycle (1 2 3 ⋯ n) and b as the transposition (1 2), all other b-transitions being loops. [Figure: transition diagram omitted.]

    fa = (1 2 3 ⋯ n)
    fb = (1 2)

fa and fb generate the symmetric group Sn, and only elements of Sn can be generated: every fw in the transition monoid can be written as a product fx1 fx2 ⋯ fxn, where w = x1 ⋯ xn. Thus it holds that Mon(L) = Sn.

Systematic computation of syntactic monoids

Given: L ⊆ Σ∗
(i) Determine the minimal automaton N for L.
(ii) Order Σ. Start a table of transformations and a list of equations.
(iii) Compute successively all transformations of words of length 1, 2, 3, …, until no new transformation appears, as follows:
    a) For the step from length n to length n + 1, consider all words u·ai where ai ∈ Σ and u is a word of length n taken from the table.
    b) Let v = u·ai. If the transformation of v is not yet in the table, insert it. If it is already in the table, associated with a word w, insert the equation v = w into the list of equations instead.
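The systematic computation above can be sketched as a breadth-first search over words, keeping one shortest word per transformation. As input we use the minimal automaton for a∗ba∗ca∗ from the earlier example (states 1-4, with 4 the rejecting sink):

```python
# Step from words of length n to length n+1; a transformation g is
# new iff it is not yet in the table (otherwise an equation v = w holds).

def transition_monoid(states, alphabet, delta):
    idx = {z: i for i, z in enumerate(states)}
    letter = {a: tuple(delta[(z, a)] for z in states) for a in alphabet}
    table = {tuple(states): ''}  # transformation -> shortest word
    frontier = [tuple(states)]
    while frontier:
        nxt = []
        for f in frontier:
            for a in alphabet:
                # f_{wa}(z) = f_a(f_w(z))
                g = tuple(letter[a][idx[f[i]]] for i in range(len(states)))
                if g not in table:
                    table[g] = table[f] + a
                    nxt.append(g)
        frontier = nxt
    return table

delta = {(1, 'a'): 1, (1, 'b'): 2, (1, 'c'): 4,
         (2, 'a'): 2, (2, 'b'): 4, (2, 'c'): 3,
         (3, 'a'): 3, (3, 'b'): 4, (3, 'c'): 4,
         (4, 'a'): 4, (4, 'b'): 4, (4, 'c'): 4}
mon = transition_monoid([1, 2, 3, 4], 'abc', delta)
print(sorted(mon.values()))  # ['', 'b', 'bb', 'bc', 'c'], i.e. 1, α, 0, γ, β
```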

Example: L = A∗abaA∗, A = {a, b}. The minimal automaton N has states 1, 2, 3, 4 (state 1 initial, state 4 accepting): state 1 loops on b and moves to 2 on a; state 2 loops on a and moves to 3 on b; state 3 moves to 4 on a and back to 1 on b; state 4 loops on a and b. [Figure: transition diagram omitted.]

1.5 Finite Automata with Output

Claim A: For all w ∈ Σ∗ it holds that δ̂(fε, w) = fw.
Proof of the claim by induction on |w|:
|w| = 0: δ̂(fε, ε) = fε, by definition of δ̂.
|w| > 0, w = xa for x ∈ Σ∗ and a ∈ Σ:

    δ̂(fε, xa) = δ(δ̂(fε, x), a)     (definition of δ̂)
              = δ(fx, a)             (induction hypothesis)
              = fxa                  (definition of δ)

Claim B: For all w ∈ Σ∗ it holds that λ̂(fε, w) = f(w).
Proof of the claim by induction on |w|:
|w| = 0: λ̂(fε, ε) = ε = f(ε), by definition of λ̂ and because f is length preserving.
|w| > 0, w = xa for x ∈ Σ∗ and a ∈ Σ:

    λ̂(fε, xa) = λ̂(fε, x) ◦ λ(δ̂(fε, x), a)     (definition of λ̂)
              = f(x) ◦ λ(δ̂(fε, x), a)           (induction hypothesis)
              = f(x) ◦ λ(fx, a)                  (Claim A)
              = f(x) ◦ fx(a)                     (definition of Mf)
              = f(xa)                            (f sequential)

Altogether it hence follows that

    TM(w) = λ̂(z0, w) = λ̂(fε, w) = f(w)

for all w ∈ Σ∗.

1.6 Two-way automata

The computation model of finite automata allows the head to move in one direction only. A two-way automaton is a more general model of computation in which the head may move in both directions on the input tape. The head is not allowed to leave the input to the left. The computation ends when the head leaves the input to the right; the automaton then accepts if an accepting state has been reached. Now the question arises whether this generalization enlarges the overall computational power.

Definition: A deterministic two-way automaton (short: 2DFA) is a 5-tuple M = (Z, Σ, δ, z0, E), where
– Z, Σ, z0 and E are defined as for DFAs and
– δ : Z × Σ → Z × {L, R} is the transition function.

A configuration of M is an element of Σ∗ · Z · Σ∗. A configuration wzx ∈ Σ∗ · Z · Σ∗ represents the following situation of M:


– the input word is wx,
– the state is z and
– the head is on the first letter of x. If x = ε, then the head has left the input on the right side.

In the following we define how M, on input w = a1 ⋯ an, changes in a single step from configuration K into configuration K′, in symbols K ⊢M K′:

    a1 ⋯ ai−1 z ai ai+1 ⋯ an ⊢M a1 ⋯ ai−1 ai z′ ai+1 ⋯ an,   if δ(z, ai) = (z′, R),
    a1 ⋯ ai−2 ai−1 z ai ⋯ an ⊢M a1 ⋯ ai−2 z′ ai−1 ai ⋯ an,   if δ(z, ai) = (z′, L), i ≥ 2.

Note: In configuration z a1 ⋯ an it is not possible to move the head to the left; in configuration a1 ⋯ an z no move of the head is possible.

Definition: Let ⊢∗M be the reflexive and transitive closure of ⊢M. The language L(M) accepted by a 2DFA M = (Z, Σ, δ, z0, E) is defined as

    L(M) = {w ∈ Σ∗ | z0 w ⊢∗M wz for some z ∈ E}.

Example: Let M = ({z0, z1, z2}, {0, 1}, δ, z0, {z0, z1, z2}) be a 2DFA with δ as follows:

    δ      0          1
    z0     (z0, R)    (z1, R)
    z1     (z1, R)    (z2, L)
    z2     (z0, R)    (z2, L)

On input 101001, M produces the computation

    z0 101001 ⊢M 1 z1 01001
              ⊢M 10 z1 1001
              ⊢M 1 z2 01001
              ⊢M 10 z0 1001
              ⊢M 101 z1 001
              ⊢M 1010 z1 01
              ⊢M 10100 z1 1
              ⊢M 1010 z2 01
              ⊢M 10100 z0 1
              ⊢M 101001 z1

M accepts because z1 ∈ E. Usually computations are visualized via the path of the head on the input, where for every move from one symbol to the next the current state is noted. [Figure: head path on input 101001 omitted.]

If one considers the input string 101101, then M computes the following. [Figure: head path on input 101101 omitted.]
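Both computations can be reproduced with a small step simulator (our own sketch; the step bound used to flag the looping computation is an addition of ours, not part of the model):

```python
# Simulate a 2DFA configuration (state z, head position i); the run
# accepts iff the head leaves the input on the right in a state of E.

def run_2dfa(delta, z0, E, w, max_steps=1000):
    z, i = z0, 0  # head on w[i]
    for _ in range(max_steps):
        if i == len(w):  # head has left the input on the right
            return z in E
        z, move = delta[(z, w[i])]
        i += 1 if move == 'R' else -1
        if i < 0:
            raise ValueError("head may not leave the input on the left")
    return None  # step bound reached: the computation (probably) loops

delta = {('z0', '0'): ('z0', 'R'), ('z0', '1'): ('z1', 'R'),
         ('z1', '0'): ('z1', 'R'), ('z1', '1'): ('z2', 'L'),
         ('z2', '0'): ('z0', 'R'), ('z2', '1'): ('z2', 'L')}
E = {'z0', 'z1', 'z2'}

print(run_2dfa(delta, 'z0', E, '101001'))  # True: accepted, as traced above
print(run_2dfa(delta, 'z0', E, '101101'))  # None: infinite loop
```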

M enters an infinite loop, as below the second cell border from the left the same state and the same head move appear as four steps before.

Note: The automaton above consists only of accepting states, but it does not accept the language {0, 1}∗: it accepts the complement of (0 + 1)∗11(0 + 1)∗. So even 2DFAs with only accepting states do not necessarily accept Σ∗.

Observation: If L is regular, then there exists a 2DFA M with L(M) = L.

Example: For 0 < n ∈ N define Ln = {ak ak−1 ⋯ an ⋯ a1 | ai ∈ {0, 1} for 1 ≤ i ≤ k, and an = 1}, i.e., the words whose n-th symbol from the right is 1.
– Every DFA M with L(M) = Ln consists of at least 2^n states. (Myhill-Nerode: for u, v ∈ {0, 1}^n with u ≠ v it holds that u ≁Ln v.)
– There exists a 2DFA M with 2n − 1 states and L(M) = Ln. (Exercise)


Theorem: Let L ⊆ Σ∗. If there exists a 2DFA M with L(M) = L, then L is regular.

The following proof follows "J. C. Shepherdson, The Reduction of Two-Way Automata to One-Way Automata, IBM Journal of Research and Development, vol. 3(2), pp. 198–200, 1959".

Proof: Let M = (Z, Σ, δ, z0, E) be a 2DFA. For x ∈ Σ∗ define τx : {z̄0} ∪ Z → Z ∪ {0} as follows. For z ∈ Z,

    τx(z) = z′,  if M, after entering x from the right in state z, leaves x to the right in state z′,
    τx(z) = 0,   if M, after entering x from the right in state z, does not leave x to the right;

and

    τx(z̄0) = z′,  if M, after entering x from the left in state z0, leaves x to the right in state z′,
    τx(z̄0) = 0,   if M, after entering x from the left in state z0, does not leave x to the right.

The number of such functions τx is ≤ (1 + |Z|)^(1+|Z|). Now define x ≈M y iff τx = τy. The index of ≈M is finite. Now observe that x ≈M y ⇒ x ∼L(M) y, since τx records all the information about x that the remainder of a computation can depend on. Thus ≈M is a refinement of ∼L(M), hence ∼L(M) has finite index, which lets us deduce by the theorem of Myhill-Nerode that L(M) is regular.

2 Context-free Languages

Context-free languages are defined by context-free grammars. A context-free grammar (short: cf. grammar, CFG) is a 4-tuple G = (V, Σ, P, S), where
– V is a finite set of variables (non-terminal symbols),
– Σ is the terminal alphabet, V ∩ Σ = ∅,
– P ⊆ V × (V ∪ Σ)∗ is a set of productions/rules and
– S ∈ V is the starting symbol.

A context-free grammar is said to be strict if P ⊆ (V × ((V \ {S}) ∪ Σ)⁺) ∪ {S → ε}. For every context-free grammar G there exists a strict context-free grammar G′ such that L(G) = L(G′).

Notation:


For (α, β) ∈ P we write α → β ∈ P. For A → α1 ∈ P, …, A → αn ∈ P we use the shortcut A → α1 | ⋯ | αn ∈ P.
α ⇒G β: β is produced from α via the application of a rule in P, i.e., α ⇒G β if there are α1, α2 ∈ (V ∪ Σ)∗ and A → γ ∈ P such that α = α1 A α2 and β = α1 γ α2. ⇒∗G is the reflexive and transitive closure of ⇒G.
A derivation is a sequence of strings which is produced from the starting variable of a grammar and results in a string over the terminal alphabet, i.e., a sequence S ⇒G w1 ⇒G w2 ⇒G … ⇒G wn with wi ∈ (V ∪ Σ)∗ for 1 ≤ i < n and wn ∈ Σ∗.
The language produced by a context-free grammar G is L(G) = {w ∈ Σ∗ | S ⇒∗G w}. CFL is the class of context-free languages, i.e., the set of all L(G) for context-free grammars G.

Theorem (Pumping Lemma, uvwxy-Theorem): Let L be a context-free language. Then there exists a number n such that every word z ∈ L with |z| ≥ n can be written as z = uvwxy such that the following properties hold:
(i) |vx| ≥ 1
(ii) |vwx| ≤ n
(iii) for all i ≥ 0 it holds that uv^i w x^i y ∈ L.

Example:

– Dyck-language over the alphabet Σ:

Define Σ′ = {ā | a ∈ Σ} (a "copy" of Σ) and Σ̂ = Σ ∪ Σ′. The Dyck language D∗Σ is the language produced by the following grammar:

    G = ({S, T}, Σ̂, P, S)   where   P = {S → TS | ε} ∪ {T → aSā | a ∈ Σ}.

E.g., let Σ = {(, [, <}. Then D∗Σ is the language of all correct bracket expressions with brackets of Σ̂. A word of D∗Σ is

    (<[]()>[])[<()()>].

– L = {w ∈ Σ∗ | w = w^R, i.e., w is a palindrome} ∈ CFL: the context-free grammar G = ({S}, Σ, {S → aSa | a | ε : a ∈ Σ}, S) produces L.


– L = {a^k b^k c^k | k ≥ 0} is not context-free: Assume L is context-free. Then there exists an n ∈ N as in the Pumping Lemma. Choose z = a^n b^n c^n. It holds that |z| = 3n ≥ n and z ∈ L. Let z = uvwxy be an arbitrary decomposition of z with |vx| > 0 and |vwx| ≤ n. Since |vwx| ≤ n, the substring vwx contains at most two of the three letters a, b and c. Pumping down therefore removes occurrences of at most two of the three letters, so the letter counts in uv⁰wx⁰y = uwy are no longer all equal. Hence uwy ∉ L, a contradiction to the assumption that L is context-free.

Closure properties of the class CFL:
– L1, L2 ∈ CFL ⇒ L1 ∪ L2, L1 ◦ L2, L1∗ ∈ CFL. (Exercise)
– CFL is closed under homomorphisms and inverse homomorphisms. (Exercise)
– CFL is not closed under intersection: Let L1 = {a^n b^n c^m | n, m ≥ 0} and L2 = {a^n b^m c^m | n, m ≥ 0}. It holds that L1, L2 ∈ CFL; a context-free grammar for L1 is G = ({S, A, B}, {a, b, c}, P, S) with the productions S → AB, A → aAb | ε, B → cB | ε. But for the intersection it holds that L1 ∩ L2 = {a^k b^k c^k | k ≥ 0} ∉ CFL (see the example above).
– CFL is not closed under complement: otherwise CFL would also be closed under intersection (by De Morgan), a contradiction.

2.1 Chomsky Normal Form and CYK-Algorithm

Definition: A context-free grammar G = (V, Σ, P, S) with ε ∉ L(G) is said to be in Chomsky normal form (CNF) if every production is of the form A → BC or A → a for A, B, C ∈ V and a ∈ Σ.

Theorem: Let G be a context-free grammar with ε ∉ L(G). Then there exists a context-free grammar G′ in CNF with L(G) = L(G′).


Proof: Let G = (V, Σ, P, S) be a context-free grammar with ε ∉ L(G). We state an algorithm which modifies the productions of G such that only productions of the above kinds remain.

Step I: Elimination of chain rules (i.e., productions of the form A → B with A, B ∈ V).
(i) If there are variables B1, B2, …, Bk ∈ V with B1 → B2, B2 → B3, …, Bk−1 → Bk, Bk → B1, then replace in all productions the variables B1, B2, …, Bk by a fresh variable B and delete the productions above.
(ii) Order the variables in V such that V = {A1, …, An} and such that Ai → Aj ∈ P implies i < j.
(iii) For k = n − 1, n − 2, …, 1 in descending order, execute the following step (iv):
(iv) If there exists a production Ak → Ak′ ∈ P with k′ > k, then delete Ak → Ak′ and add for every production Ak′ → α ∈ P a production Ak → α to P.
Now all productions are of the form A → a (for A ∈ V and a ∈ Σ) or A → α (for A ∈ V and |α| ≥ 2).

Step II: Elimination of terminals. For every a ∈ Σ add a fresh variable Ba to V and a fresh production Ba → a to P. Now replace in every production A → α with |α| ≥ 2 every occurrence of the letter a on the right side by Ba.
Now all productions are of the form A → a (for A ∈ V, a ∈ Σ) or A → B1 B2 … Bk (for A, B1, B2, …, Bk ∈ V, k ≥ 2).

Step III: Elimination of long productions. For every production A → B1 … Bk with k ≥ 3, add fresh variables C2, …, Ck−1 to V and replace the production in P by

    A → B1 C2,   C2 → B2 C3,   …,   Ck−1 → Bk−1 Bk.
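Step III in isolation can be sketched as follows (the representation of productions as pairs and the names of the fresh variables are our own):

```python
# Split every long production A -> B1...Bk (k >= 3) into binary ones
# by introducing fresh chain variables C1, C2, ...

def split_long(productions):
    """productions: list of (head, body) with body a list of symbols."""
    out, fresh = [], 0
    for head, body in productions:
        while len(body) > 2:
            fresh += 1
            c = f'C{fresh}'  # fresh variable, assumed not to occur yet
            out.append((head, [body[0], c]))
            head, body = c, body[1:]
        out.append((head, body))
    return out

print(split_long([('A', ['B1', 'B2', 'B3', 'B4'])]))
# [('A', ['B1', 'C1']), ('C1', ['B2', 'C2']), ('C2', ['B3', 'B4'])]
```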


The CYK-Algorithm

The word problem for a language L ⊆ Σ∗ is defined by:

    Input:     a string w ∈ Σ∗
    Question:  Does w ∈ L hold?

Let G = (V, Σ, P, S) be a context-free grammar in CNF. If x = a with a ∈ Σ and A ⇒∗ x, then there must exist a production A → a in P. If x = a1 a2 … an with n ≥ 2 and A ⇒∗ x, then there must exist a production A → BC and a k with 1 ≤ k < n such that B ⇒∗ a1 … ak and C ⇒∗ ak+1 … an.

Notation: For x ∈ Σ∗ denote by xi,j the substring of x which starts at position i and has length j. In the paragraph above it thus holds that B ⇒∗ x1,k and C ⇒∗ xk+1,n−k.

This consideration is the main idea of the algorithm of Cocke, Younger and Kasami. It uses an array T[1…n, 1…n] with the meaning A ∈ T[i, j] iff A ⇒∗ xi,j.

CYK-Algorithm:
Input: x = a1 … an
Method:
    for i = 1 to n do
        T[i, 1] := {A ∈ V | A → ai ∈ P};
    for j = 2 to n do
        for i = 1 to n + 1 − j do      // aim: T[i, j] = {A | A ⇒∗ xi,j}
        begin
            T[i, j] := ∅;
            for k = 1 to j − 1 do
                T[i, j] := T[i, j] ∪ {A ∈ V | A → BC ∈ P, B ∈ T[i, k] ∧ C ∈ T[i + k, j − k]};
        end;
    if S ∈ T[1, n] then accept else reject;

Theorem: The word problem for context-free languages is decidable in time O(n³).

Note: There exist asymptotically faster algorithms based on fast matrix multiplication (Valiant's algorithm), which solve the problem in subcubic time but are considerably more sophisticated and harder to implement.

Example: Consider the context-free grammar in CNF with the productions

    S → AB | BC,   A → BA | a,   B → CC | b,   C → AB | a.

Input: x = baaba, n = |x| = 5.

The table T computed by the CYK-algorithm is (the entry in row j, column i is T[i, j]):

x =       b       a       a       b       a
i →       1       2       3       4       5
j = 1     B       A,C     A,C     B       A,C
j = 2     S,A     B       S,C     S,A
j = 3     ∅       B       B
j = 4     ∅       S,A,C
j = 5     S,A,C

It holds that S ∈ T[1, 5], thus it holds that x ∈ L(G). From the table one can construct the following derivation:

S ⇒ BC
  ⇒ bC
  ⇒ bAB
  ⇒ baB
  ⇒ baCC
  ⇒ baABC
  ⇒ baaBC
  ⇒ baabC
  ⇒ baaba

(The accompanying derivation tree, with root S branching into B and C, is omitted here.)

Another possible derivation obtainable from the matrix above is:

S ⇒ AB
  ⇒ BAB
  ⇒∗ baB
  ⇒ baCC
  ⇒∗ baaba

(Again, the corresponding derivation tree is omitted.)
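The CYK-algorithm and the example above can be replayed with a short Python sketch (the encoding of the grammar as a dictionary is my own choice, not from the notes):

```python
def cyk(grammar, start, x):
    """Decide x in L(G) for a grammar in CNF.
    grammar maps a variable to its bodies: a 1-character string is a
    terminal rule A -> a, a 2-character string is a rule A -> BC."""
    n = len(x)
    # T[i][j] = set of variables deriving the substring of length j
    # starting at position i (1-based, as in the notes)
    T = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        T[i][1] = {A for A, bodies in grammar.items()
                   for b in bodies if b == x[i - 1]}
    for j in range(2, n + 1):
        for i in range(1, n + 2 - j):
            for k in range(1, j):
                for A, bodies in grammar.items():
                    for b in bodies:
                        if (len(b) == 2 and b[0] in T[i][k]
                                and b[1] in T[i + k][j - k]):
                            T[i][j].add(A)
    return start in T[1][n]

# the example grammar from above
G = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}
```

Here cyk(G, "S", "baaba") yields True (S ∈ T[1, 5]), while cyk(G, "S", "baab") yields False, matching the empty entry T[1, 4] in the table.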


2.2 Greibach Normal Form and Pushdown Automata

(Figure: a pushdown automaton M, consisting of a finite control holding the state, a reading head that moves right over the input tape a1 a2 … an, and a stack, here with content D C B A #, whose reading head is always on the uppermost stack symbol.)

Definition: A (nondeterministic) pushdown automaton (short: PDA, NPDA) is a 7-tuple M = (Z, Σ, Γ, δ, z0, #, E), where
– Z is the finite set of states,
– Σ is the input alphabet,
– Γ is the stack alphabet,
– z0 ∈ Z is the initial state,
– # ∈ Γ is the bottommost stack symbol (on initialization),
– E ⊆ Z is the set of accepting states and
– δ : Z × (Σ ∪ {ε}) × Γ → Pe(Z × Γ∗) is the transition function, where Pe(M) = {M′ ⊆ M | M′ is finite}.

For z, z′ ∈ Z, a ∈ Σ, A, B1, …, Bk ∈ Γ the meaning of δ(z, a, A) ∋ (z′, B1 … Bk) is: if M is in state z, reads the input symbol a on the input tape and A is the uppermost stack symbol, then M can enter state z′ and substitute the symbol A by the symbols B1, …, Bk (B1 then being the uppermost stack symbol). The reading head moves one step to the right on the input tape.


If δ(z, ε, A) ∋ (z′, B1 … Bk) holds, then the reading head on the input tape does not move. The usual stack operations correspond to:

pop     : B1 … Bk = ε (k = 0)
push(B) : B1 … Bk = BA (the old top symbol A stays below the new symbol B)

Notation: zaA → z′B1 … Bk.

Definition: A configuration of a pushdown automaton is a triple K ∈ Z × Σ∗ × Γ∗. The meaning of the configuration K = (z, α, β) is that M is in state z, α is the remaining part of the input, and β is the content of the stack. Now we define the binary relation ⊢M (or simply ⊢) on the set of all configurations. Intuitively, K ⊢ K′ means that K′ emerges from K in one step of M. Formally:

(z, a1 … an, A1 … Am) ⊢M (z′, a2 … an, B1 … Bk A2 … Am),  if δ(z, a1, A1) ∋ (z′, B1 … Bk),
(z, a1 … an, A1 … Am) ⊢M (z′, a1 … an, B1 … Bk A2 … Am),  if δ(z, ε, A1) ∋ (z′, B1 … Bk),

where z, z′ ∈ Z, a1, …, an ∈ Σ and A1, …, Am, B1, …, Bk ∈ Γ. We write ⊢∗M for the reflexive and transitive closure of ⊢M.

Definition: The language accepted by final states of a pushdown automaton M is
L(M) = {x ∈ Σ∗ | (z0, x, #) ⊢∗M (z, ε, γ) for some z ∈ E, γ ∈ Γ∗}.
The language accepted by empty stack of a pushdown automaton M is
N(M) = {x ∈ Σ∗ | (z0, x, #) ⊢∗M (z, ε, ε) for some z ∈ Z}.

Example: Let L = {a1 a2 · · · an $ an … a1 | n ≥ 0, a1, …, an ∈ {a, b}}. Now construct a PDA M such that L = N(M). Set M = ({z0, z1}, {a, b, $}, {#, A, B}, δ, z0, #, ∅), where δ is defined as follows:

z0 a# → z0 A#   ,  z0 aA → z0 AA   ,  z0 aB → z0 AB
z0 b# → z0 B#   ,  z0 bA → z0 BA   ,  z0 bB → z0 BB
z0 $# → z1 #    ,  z0 $A → z1 A    ,  z0 $B → z1 B
z1 ε# → z1 ε    ,  z1 aA → z1 ε    ,  z1 bB → z1 ε


Now consider the words w = bab$bab and w′ = bab$babb. The following configurations are entered successively:

(z0, bab$bab, #) ⊢ (z0, ab$bab, B#)
                 ⊢ (z0, b$bab, AB#)
                 ⊢ (z0, $bab, BAB#)
                 ⊢ (z1, bab, BAB#)
                 ⊢ (z1, ab, AB#)
                 ⊢ (z1, b, B#)
                 ⊢ (z1, ε, #)
                 ⊢ (z1, ε, ε)

(z0, bab$babb, #) ⊢ (z0, ab$babb, B#)
                  ⊢ (z0, b$babb, AB#)
                  ⊢ (z0, $babb, BAB#)
                  ⊢ (z1, babb, BAB#)
                  ⊢ (z1, abb, AB#)
                  ⊢ (z1, bb, B#)
                  ⊢ (z1, b, #)
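These traces can be reproduced with a small simulator for acceptance by empty stack (a sketch; the dictionary encoding of δ is mine, and the stack top is the first character of the stack string):

```python
from collections import deque

def empty_stack_accepts(delta, z0, bottom, w):
    """Test w in N(M) by breadth-first search over configurations
    (state, input position, stack).  delta maps (state, input symbol or
    '' for an eps-move, top symbol) to a list of (state', replacement).
    Note: for PDAs whose eps-moves can grow the stack without bound
    this search need not terminate; the example machine is harmless."""
    queue, seen = deque([(z0, 0, bottom)]), set()
    while queue:
        conf = queue.popleft()
        if conf in seen:
            continue
        seen.add(conf)
        z, pos, stack = conf
        if pos == len(w) and stack == "":
            return True                  # input consumed, stack empty
        if stack == "":
            continue                     # stuck: stack emptied too early
        A, rest = stack[0], stack[1:]
        for z2, gamma in delta.get((z, '', A), []):        # eps-moves
            queue.append((z2, pos, gamma + rest))
        if pos < len(w):
            for z2, gamma in delta.get((z, w[pos], A), []):
                queue.append((z2, pos + 1, gamma + rest))
    return False

# the transition table of M from the example
d = {('z0', 'a', '#'): [('z0', 'A#')], ('z0', 'b', '#'): [('z0', 'B#')],
     ('z0', '$', '#'): [('z1', '#')],  ('z1', '',  '#'): [('z1', '')],
     ('z0', 'a', 'A'): [('z0', 'AA')], ('z0', 'b', 'A'): [('z0', 'BA')],
     ('z0', '$', 'A'): [('z1', 'A')],  ('z1', 'a', 'A'): [('z1', '')],
     ('z0', 'a', 'B'): [('z0', 'AB')], ('z0', 'b', 'B'): [('z0', 'BB')],
     ('z0', '$', 'B'): [('z1', 'B')],  ('z1', 'b', 'B'): [('z1', '')]}
```

Here empty_stack_accepts(d, 'z0', '#', 'bab$bab') yields True and the same call on 'bab$babb' yields False, as in the traces.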

Hence w ∈ N(M) and w′ ∉ N(M).

Theorem: Let L ⊆ Σ∗. Then the following claims are equivalent.
(i) There exists a pushdown automaton M with L = N(M).
(ii) There exists a pushdown automaton M with L = L(M).
Proof: (exercise)

Definition: A context-free grammar G = (V, Σ, P, S) with ε ∉ L(G) is said to be in Greibach normal form (GNF) if all rules are of the form

A → aB1 … Bk    (a ∈ Σ, A, B1, …, Bk ∈ V, k ≥ 0).

Theorem: Let G be context-free with ε ∉ L(G). Then there exists a cf. grammar G′ in GNF with L(G) = L(G′).

Proof: Preliminary considerations concerning left recursion: a rule of the form A → Aα with A ∈ V, α ∈ (V ∪ Σ)∗ is called left recursive. In GNF such rules are not allowed; in the following we describe how to remove them correctly.

Let G = (V, Σ, P, S) be a CFG and A ∈ V with left recursive rules A → Aα1 | … | Aαk and remaining non-left-recursive rules A → β1 | … | βl, for αi, βj ∈ (V ∪ Σ)∗, 1 ≤ i ≤ k, 1 ≤ j ≤ l, k, l ∈ N; these are all A-rules. Every derivation from A that eliminates the leading A has the form

A ⇒ Aαi1 ⇒ Aαi2αi1 ⇒ … ⇒ Aαim … αi1 ⇒ βjαim … αi1

with i1, i2, …, im ∈ {1, …, k} and 1 ≤ j ≤ l. Consequently the rules above can be replaced by the rules

A → β1 | … | βl | β1B | … | βlB
B → α1 | … | αk | α1B | … | αkB,

with a fresh variable B.

Now let G = (V, Σ, P, S); w.l.o.g. let G be in CNF and V = {A1, …, Am}.

Step I: Modify P such that Ai → Ajα ∈ P implies i < j.

Input: G = (V, Σ, P, S)
Method:
for i := 1, …, m do
begin
    for j := 1, …, i − 1 do
        for all Ai → Ajα ∈ P do
        begin
            Let Aj → β1 | … | βl be all rules with Aj on the left side;
            Delete Ai → Ajα;
            Add Ai → β1α | … | βlα to P;
        end;
    if there are left recursive rules Ai → Aiα, then remove them as described above, introducing the fresh variable Bi;
end;

Now every rule Ai → Ajα ∈ P satisfies i < j. In particular, for Am → α ∈ P the string α starts with a terminal symbol.

Step II: Modify all Ai-rules to match the desired form:


Input: G = (V, Σ, P, S)
Method:
for i := m − 1, …, 1 do
    for all Ai → Ajα ∈ P do
    begin
        Let Aj → β1 | … | βl be all Aj-rules;
        Delete Ai → Ajα;
        Add the rules Ai → β1α | … | βlα to P;
    end;

Step III: Modify all Bi-rules to match the desired form:

Input: G = (V, Σ, P, S)
Method:
for i := 1, …, m do
begin
    If Bi → Ajα ∈ P holds and Aj → β1 | … | βl are all Aj-rules, then delete Bi → Ajα and add Bi → β1α | … | βlα to P;
end;
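The removal of immediate left recursion used inside Step I can be sketched as follows (rule bodies are lists of symbols; calling the fresh variable A′ is my own convention):

```python
def remove_left_recursion(A, bodies):
    """Split the A-rules into left-recursive ones A -> A alpha and the
    remaining ones A -> beta, and return the replacement rules
    A -> beta | beta B and B -> alpha | alpha B for a fresh B."""
    alphas = [b[1:] for b in bodies if b and b[0] == A]
    betas = [b for b in bodies if not b or b[0] != A]
    if not alphas:                       # nothing to do
        return {A: bodies}
    B = A + "'"                          # fresh variable
    return {A: betas + [beta + [B] for beta in betas],
            B: alphas + [alpha + [B] for alpha in alphas]}
```

For the rules A → Aa | b this produces A → b | bA′ and A′ → a | aA′, exactly the replacement described above.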

Definition: A derivation S ⇒ α1 ⇒ α2 ⇒ … ⇒ αn is called a leftmost derivation if in every derivation step αi ⇒ αi+1 (i = 1, …, n − 1) always the first (leftmost) non-terminal in αi is replaced. Leftmost derivations for grammars in GNF are of the form

S ⇒ aB1 … Bk ⇒ abC1 … Cl B2 … Bk ⇒ abcD1 … Dm C2 … Cl B2 … Bk ⇒ …,

where a, b, c ∈ Σ, B1, …, Bk, C1, …, Cl, D1, …, Dm ∈ V and S is the initial non-terminal.

Theorem: A language L is context-free iff there exists a PDA M such that L = N(M) holds.

Proof: "⇒": Let L = L(G) for G = (V, Σ, P, S); w.l.o.g. assume G is in Greibach normal form. Define M = ({z}, Σ, V, δ, z, S, ∅), where δ is given by

δ(z, a, A) ∋ (z, γ)  iff  A → aγ ∈ P.

Claim: For all x ∈ Σ∗ and α ∈ V∗ it holds: there exists a leftmost derivation S ⇒∗G xα iff (z, x, S) ⊢∗M (z, ε, α) holds.

Proof of claim: "⇐": Induction on the length i of a computation of M.
i = 0: Then x = ε and α = S, and indeed S ⇒∗ S.


i ≥ 1: Let (z, x, S) ⊢∗ (z, ε, α) in i steps of M. Set x = ya with y ∈ Σ∗, a ∈ Σ (M does not have any ε-transitions, so each step consumes one input symbol). The computation above can be divided as

(z, ya, S) ⊢∗ (z, a, β) ⊢ (z, ε, α),

where the first part takes i − 1 steps. Thus it holds that (z, y, S) ⊢∗ (z, ε, β) in i − 1 steps. By the induction assumption it follows that S ⇒∗ yβ via a leftmost derivation. From (z, a, β) ⊢ (z, ε, α) it follows that β = Aγ for some A ∈ V, γ ∈ V∗ with A → aη ∈ P and α = ηγ. Together we hence have the following leftmost derivation:

S ⇒∗ yβ = yAγ ⇒ yaηγ = xα.

"⇒": Induction on the length i of the leftmost derivation S ⇒∗ xα.
i = 0: Then S ⇒∗ S and xα = S, and (z, ε, S) ⊢∗ (z, ε, S).
i ≥ 1: S ⇒∗ xα can be divided into S ⇒∗ yAγ ⇒ yaηγ (the first part taking i − 1 steps),

where A → aη ∈ P, x = ya and α = ηγ, with a ∈ Σ, y ∈ Σ∗, A ∈ V, γ, η ∈ V∗. By the induction hypothesis it follows that (z, y, S) ⊢∗ (z, ε, Aγ). From A → aη ∈ P we get δ(z, a, A) ∋ (z, η) by definition of δ. Together we hence get:

(z, x, S) = (z, ya, S) ⊢∗ (z, a, Aγ) ⊢ (z, ε, ηγ) = (z, ε, α).    (Claim)

Now choose α = ε in the claim: for all x ∈ Σ∗,
x ∈ L(G) iff S ⇒∗G x iff (z, x, S) ⊢∗M (z, ε, ε) iff x ∈ N(M).
Thus it holds that L(G) = N(M).

"⇐": Idea: A computation of M on input x is simulated by a leftmost derivation; the non-terminals of a derived sentential form correspond to the stack content of M at that computation step. To this end we use non-terminals [z, A, z′] with z, z′ ∈ Z, A ∈ Γ, such that

(z, x, A) ⊢∗M (z′, ε, ε)  iff  [z, A, z′] ⇒∗G x holds.

Let M = (Z, Σ, Γ, δ, z0, #, E) be a PDA (note: the accepting states E of M play no role, as only N(M) is considered). Define G = (V, Σ, P, S), where V = {S} ∪ (Z × Γ × Z) (triple construction) and P consists of the following rules:


(i) S → [z0, #, z] for all z ∈ Z.
(ii) [z, A, zm+1] → a[z1, B1, z2][z2, B2, z3] … [zm, Bm, zm+1] for all z, z1, …, zm+1 ∈ Z, a ∈ Σ ∪ {ε} and A, B1, …, Bm ∈ Γ, such that δ(z, a, A) ∋ (z1, B1 … Bm). If m = 0, the rule reads [z, A, z1] → a.

Claim: For all z, z′ ∈ Z, A ∈ Γ and x ∈ Σ∗ it holds that: [z, A, z′] ⇒∗ x iff (z, x, A) ⊢∗ (z′, ε, ε).

Proof of claim: "⇐": Induction on the length i of the computation of M.
i = 1: It holds that (z, x, A) ⊢ (z′, ε, ε), thus δ(z, x, A) ∋ (z′, ε), hence [z, A, z′] → x ∈ P, and consequently [z, A, z′] ⇒ x.
i > 1: Let x = ay with y ∈ Σ∗, a ∈ Σ ∪ {ε} and (z, ay, A) ⊢ (z1, y, B1 … Bn) ⊢∗ (z′, ε, ε), thus δ(z, a, A) ∋ (z1, B1 … Bn) for B1, …, Bn ∈ Γ. Divide y into y = y1y2 … yn (yi ∈ Σ∗) such that during the processing of yi the symbol Bi disappears from the stack; that is, y1 is the shortest prefix of y after whose processing the stack content is B2 … Bn, y1y2 is the shortest prefix of y after whose processing the stack content is B3 … Bn, and so on.

(Figure: the stack height drops from n to 0 while the input pieces y1, y2, …, yn are processed; while yj is read, the stack shrinks from Bj … Bn to Bj+1 … Bn.)


There exist z2, …, zn+1 = z′ such that (zj, yj, Bj) ⊢∗ (zj+1, ε, ε) in fewer than i steps for 1 ≤ j ≤ n. By the induction hypothesis we hence deduce [zj, Bj, zj+1] ⇒∗ yj for 1 ≤ j ≤ n.

Further it holds that δ(z, a, A) ∋ (z1, B1 … Bn) (the first step of the computation above), thus we have [z, A, zn+1] → a[z1, B1, z2][z2, B2, z3] … [zn, Bn, zn+1] ∈ P by definition of G. Together we get the following derivation:

[z, A, zn+1] ⇒ a[z1, B1, z2][z2, B2, z3] … [zn, Bn, zn+1]
             ⇒∗ ay1[z2, B2, z3] … [zn, Bn, zn+1]
             ⇒∗ ay1y2 … yn = x,

where zn+1 = z′.

"⇒": Induction on the length i of the derivation in G.
i = 1: [z, A, z′] ⇒ x, hence [z, A, z′] → x ∈ P, so δ(z, x, A) ∋ (z′, ε), thus (z, x, A) ⊢ (z′, ε, ε).
i > 1: Let [z, A, z′] ⇒ a[z1, B1, z2][z2, B2, z3] … [zn, Bn, zn+1] ⇒∗ x with z′ = zn+1. Write x as x = ay1 … yn with [zj, Bj, zj+1] ⇒∗ yj in fewer than i steps for 1 ≤ j ≤ n. By the induction hypothesis we then get (zj, yj, Bj) ⊢∗ (zj+1, ε, ε) for 1 ≤ j ≤ n. From the first derivation step it further follows that δ(z, a, A) ∋ (z1, B1 … Bn), thus (z, a, A) ⊢ (z1, ε, B1 … Bn). Together the following computation is deduced:

(z, ay1 … yn, A) ⊢ (z1, y1 … yn, B1 … Bn)
                 ⊢∗ (z2, y2 … yn, B2 … Bn)
                 ⊢∗ (zn+1, ε, ε),

where ay1 … yn = x and zn+1 = z′.

(Claim)

Choose z = z0 and A = # in the claim. Then it holds that [z0, #, z] ⇒∗ x iff (z0, x, #) ⊢∗ (z, ε, ε) for all z ∈ Z. As S → [z0, #, z′] ∈ P holds for all z′ ∈ Z, it then follows that:

x ∈ L(G) ⇔ S ⇒∗ x
         ⇔ S ⇒ [z0, #, z] ⇒∗ x for some z ∈ Z
         ⇔ (z0, x, #) ⊢∗ (z, ε, ε) for some z ∈ Z
         ⇔ x ∈ N(M).


Corollary: (i) Every context-free language is accepted by a pushdown automaton with only one state. (ii) Every context-free language L with ε ∈ / L is accepted by a pushdown automaton with only one state which does not make any ε-moves.
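The construction behind the corollary, a single-state PDA whose stack alphabet is the variable set of a GNF grammar, can be sketched in Python (the backtracking search stands in for nondeterminism; all names are mine):

```python
def gnf_to_pda(productions):
    """delta(z, a, A) contains (z, gamma) iff A -> a gamma is a rule;
    with a single state we simply map (a, A) to the possible gammas.
    productions: (A, body) pairs, body a tuple whose first entry is a
    terminal and whose remaining entries are variables (GNF)."""
    delta = {}
    for A, body in productions:
        a, gamma = body[0], body[1:]
        delta.setdefault((a, A), []).append(gamma)
    return delta

def gnf_accepts(delta, start, w):
    """Test w in N(M) by depth-first search; in GNF every move reads one
    input symbol, so the recursion depth is bounded by |w|."""
    def go(pos, stack):
        if pos == len(w):
            return stack == ()
        # prune: a successful run never carries more stack symbols
        # than there are remaining input symbols
        if not stack or len(stack) > len(w) - pos:
            return False
        for gamma in delta.get((w[pos], stack[0]), []):
            if go(pos + 1, gamma + stack[1:]):
                return True
        return False
    return go(0, (start,))

# a GNF grammar for {a^n b^n | n >= 1}: S -> aSB | aB, B -> b
d = gnf_to_pda([("S", ("a", "S", "B")), ("S", ("a", "B")), ("B", ("b",))])
```

Here gnf_accepts(d, "S", "aabb") yields True, while "aab" and "abb" are rejected.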

2.3 Deterministic context-free Languages

Definition: A pushdown automaton M = (Z, Σ, Γ, δ, z0, #, E) is a deterministic pushdown automaton (DPDA) if for every z ∈ Z, a ∈ Σ, A ∈ Γ it holds that

|δ(z, ε, A)| + |δ(z, a, A)| ≤ 1.

A context-free language L is deterministic context-free (L ∈ DCFL) if there exists a DPDA M such that L = L(M) holds.

It holds that {wwR | w ∈ {0, 1}∗} ∈ CFL \ DCFL. Thus DCFL ⊊ CFL.

Goal: DCFL is closed under complementation.
Idea: The machine type is deterministic, so one could simply swap final and non-final states.
Problems:
A: The DPDA does not accept its input because it gets stuck and does not read the input completely. Possible reasons: an undefined entry in the δ-function, an empty stack, an infinite loop of ε-transitions. ⇒ If we swap final and non-final states, the automaton is still stuck and does not accept.
B: The DPDA accepts x, but after completely reading the input it enters a sequence of ε-transitions which includes final and non-final states. ⇒ After swapping final and non-final states the input x is still accepted.

Theorem: The complement of every DCFL is a DCFL.

Proof: Let L ∈ DCFL, L = L(M) for a DPDA M = (Z, Σ, Γ, δ, z0, #, E). W.l.o.g. we assume that M always reads its input completely (this handles problem A; exercise). Now turn to problem B: define M′ = (Z′, Σ, Γ, δ′, z0′, #, E′) as follows:
– Z′ = Z × {1, 2, 3}.
  Idea: the second component records whether, since reading the last input symbol, a final state has been entered (state ∈ Z × {1}) or not (state ∈ Z × {2}). Before reading the next symbol we change from Z × {2} to Z × {3}.
– E′ = Z × {3}.

– z0′ = (z0, 1) if z0 ∈ E, and z0′ = (z0, 2) if z0 ∉ E.

The transition function δ′ is defined as follows:
(i) If δ(z, ε, A) = {(z′, γ)} holds, then δ′((z, k), ε, A) = {((z′, k′), γ)} holds, where k′ = 1 if z′ ∈ E or k = 1 holds, and otherwise k′ = 2.
(ii) If δ(z, a, A) = {(z′, γ)} holds for a ∈ Σ, then δ′((z, 2), ε, A) = {((z, 3), A)} and δ′((z, 1), a, A) = δ′((z, 3), a, A) = {((z′, k′), γ)} hold, where k′ = 1 if z′ ∈ E holds, and otherwise k′ = 2.

Claim: It holds that L(M′) is the complement of L(M).

Proof of claim: Let a1 … an ∈ L(M). Then M enters a state in E after reading an (possibly after a sequence of ε-transitions), so M′ changes to a state in Z × {1}. During further ε-transitions the state of M′ remains in Z × {1}. ⇒ M′ does not accept.
Now let a1 … an ∉ L(M). Then all states which M enters after reading an are not in E, so the corresponding states of M′ are in Z × {2}. After the simulation of M, the automaton M′ enters a state in Z × {3} = E′. ⇒ M′ accepts.

Note:

– DCFL is not closed under intersection (see p. 23: there it holds that L1, L2 ∈ DCFL, but L1 ∩ L2 is not even context-free).
– DCFL is not closed under union. (Otherwise, since DCFL is closed under complement, DCFL would also be closed under intersection by de Morgan: L1 ∩ L2 is the complement of the union of the complements of L1 and L2.)

Definition: A language L ⊆ Σ∗ has the prefix property (L is prefix-free) if no proper prefix of a word of L is in L, i.e.,

w ∈ L ⇒ ∀u, v ∈ Σ+ (w = uv ⇒ u ∉ L).

The following holds: the languages accepted by deterministic pushdown automata with empty stack are exactly the deterministic context-free languages having the prefix property; these are the LR(0) languages (e.g., PASCAL). DCFL is in this context also called LR(1). Further, for $ ∉ Σ it holds that L ∈ DCFL, L ⊆ Σ∗ ⇒ L′ = L · {$} ∈ DCFL and L′ is prefix-free.


2.4 Decidability Questions

Post's Correspondence Problem

Let Σ be an alphabet with |Σ| ≥ 2. Post's Correspondence Problem (PCP) over Σ is defined as follows:

Given: A finite sequence of pairs C = ((x1, y1), (x2, y2), …, (xk, yk)) with xi, yi ∈ Σ+ for 1 ≤ i ≤ k.
Question: Are there i1, i2, …, in with iµ ∈ {1, …, k}, 1 ≤ µ ≤ n, such that xi1 xi2 … xin = yi1 yi2 … yin holds?

In a positive case C is said to be solvable, and the sequence (i1, …, in) is a solution of C.

Example: Consider the following instance C = ((1, 101), (10, 00), (011, 11)) of Post's Correspondence Problem. Now write the parts of each pair one below the other:

x:   1     10    011
y:   101   00    11

We get:

     i1 = 1   i2 = 3   i3 = 2   i4 = 3
x:   1        011      10       011
y:   101      11       00       11

Thus a solution is (1, 3, 2, 3): both rows concatenate to 101110011. Hence C ∈ PCP.

Example: Is the following instance of the PCP solvable?

x:   001   01    01    10
y:   0     011   101   001

(Exercise)

Observation: PCP is recursively enumerable. Consider the following algorithm:

Input: C = ((x1, y1), …, (xk, yk))
Method:
for n := 1, 2, 3, …
    for all sequences i1, …, in ∈ {1, …, k}
        if xi1 xi2 … xin = yi1 yi2 … yin then return "C is solvable" and halt.
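A bounded version of this semi-decision procedure in Python (the bound max_len is my addition; the unbounded search need not terminate on unsolvable instances):

```python
from itertools import product

def pcp_solution(pairs, max_len=6):
    """Search all index sequences up to length max_len and return the
    first solution found (1-based indices, as in the notes), else None."""
    k = len(pairs)
    for n in range(1, max_len + 1):
        for seq in product(range(k), repeat=n):
            top = "".join(pairs[i][0] for i in seq)
            bottom = "".join(pairs[i][1] for i in seq)
            if top == bottom:
                return [i + 1 for i in seq]
    return None
```

For the first example instance, pcp_solution([("1", "101"), ("10", "00"), ("011", "11")]) returns [1, 3, 2, 3].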


The algorithm halts iff C is solvable. Thus PCP is semi-decidable, i.e., recursively enumerable.

Goal: PCP is not decidable.

Definition: The halting problem is the language
H = {⟨M, w⟩ | M is a deterministic one-tape TM which halts on input w}.
H is not decidable. (Notation: ⟨x⟩ is a computable encoding of x.)

Definition: Let A ⊆ Σ∗, B ⊆ Γ∗ be languages. A is said to be reducible to B (in symbols: A ≤ B) if there exists a total computable function f : Σ∗ → Γ∗ such that for all x ∈ Σ∗:

x ∈ A ⇔ f(x) ∈ B.

If A is not decidable, then B is not decidable; equivalently, if B is decidable, then A is decidable.

Goal: H ≤ PCP.
At first we consider the modified Post's Correspondence Problem (MPCP) and prove its undecidability. The problem MPCP is defined as follows:

Given: C = ((x1, y1), …, (xk, yk)) as for PCP.
Question: Does there exist a sequence i2, …, in ∈ {1, …, k} such that x1 xi2 … xin = y1 yi2 … yin holds?

Goal: H ≤ MPCP ≤ PCP.

Lemma: H ≤ MPCP.

Proof: We define a total computable function f which maps ⟨M, w⟩ to instances of MPCP such that the following holds:

⟨M, w⟩ ∈ H ⇔ f(⟨M, w⟩) ∈ MPCP.

Let w ∈ Σ∗ and M = (Z, Σ, Γ, δ, z0, □, E) be a TM, where
– Z is the set of states,
– Σ is the input alphabet,
– Γ with Σ ⊂ Γ is the tape alphabet,
– δ : Z × Γ → Z × Γ × {L, N, R} is the transition function,
– z0 ∈ Z is the initial state,
– □ ∈ Γ \ Σ is the empty tape symbol, and


– E ⊆ Z is the set of accepting states.

W.l.o.g. let δ(z, a) be undefined for all z ∈ E, a ∈ Γ, and let Z ∩ Γ = ∅. A configuration of M is a string from Γ∗ · Z · Γ∗, where uzv with u, v ∈ Γ∗ and z ∈ Z means:
– M is in state z,
– the tape content (the content of all tape cells visited so far) is uv, and
– the head of M is positioned on the first symbol of v.

M accepts w iff there exist configurations K0, K1, …, Kt with K0 = z0w, Kt ∈ Γ∗ · E · Γ∗, and for 0 ≤ i < t configuration Ki+1 is reached from Ki in one step of M.

We construct an instance C of MPCP by listing pairs (xi, yi). The alphabet of the MPCP instance is Z ∪ Γ ∪ {#}, where # ∉ Z ∪ Γ holds.

(i) "Initial rule"
(x1, y1) = (##, ##z0w#)

(ii) "Transition rules"
If δ(z, a) = (z′, b, N) holds, then add the pair (za, z′b).
If δ(z, a) = (z′, b, R) holds, then add the pair (za, bz′).
If δ(z, a) = (z′, b, L) holds, then add the pair (cza, z′cb) for all c ∈ Γ, and in addition the pair (#za, #z′b).
If δ(z, □) = (z′, b, N) holds, then also add the pair (z#, z′b#).
If δ(z, □) = (z′, b, R) holds, then also add the pair (z#, bz′#).
If δ(z, □) = (z′, b, L) holds, then also add the pair (cz#, z′cb#) for all c ∈ Γ.

(iii) "Copy rules"
For all a ∈ Γ ∪ {#} add the pair (a, a).

(iv) "Deletion rules"
For all z ∈ E and a ∈ Γ add the pairs (az, z) and (za, z).

(v) "Completion rules"
For all z ∈ E add the pair (z##, #).

Claim: M halts on input w ⇔ C is solvable.

Proof of claim: "⇒": M halts on input w. Then there exists a sequence K0, K1, …, Kt as above. A solution of C can be constructed as follows:

– Start with (x1, y1):

  top:     ##
  bottom:  ## z0w #

The string on the bottom is always longer than the one on top; whenever both strings end with #, the bottom string is exactly one configuration ahead of the top string.


– Now apply copy and transition rules until the following situation is reached:

  top:     ## K0 # K1 # ⋯ # Kt−1 #
  bottom:  ## K0 # K1 # K2 # ⋯ # Kt #

– Apply deletion and copy rules until the following situation is reached, where z ∈ E (each further copied configuration loses one tape symbol next to the state, until only z remains):

  top:     ## K0 # K1 # ⋯ # Kt # ⋯ #
  bottom:  ## K0 # K1 # K2 # ⋯ # Kt # ⋯ # z#

– Apply the matching completion rule:

  top:     ⋯ # z# #
  bottom:  ⋯ # z# #

Consequently it follows that C ∈ MPCP.

"⇐": If C is solvable by a sequence i1, …, in with i1 = 1, then a halting computation of M can be constructed with similar arguments.

Corollary: MPCP is undecidable.

Lemma: MPCP ≤ PCP.

Proof: Let C be an MPCP instance over Σ, and let #, $ ∉ Σ. For w = a1 … am ∈ Σ+, define

#w#  =  #a1#a2# … #am#
w#   =  a1#a2# … #am#
#w   =  #a1#a2# … #am

If now C = ((x1, y1), …, (xk, yk)) holds, then define f as

f(C) = ((#x1#, #y1), (x1#, #y1), …, (xk#, #yk), ($, #$)).

f is total and computable.

Claim: C ∈ MPCP ⇔ f(C) ∈ PCP.

Proof of claim: "⇒": Let (i1, i2, …, in) with i1 = 1 be a solution of C, i.e., xi1 xi2 … xin = yi1 yi2 … yin. Then (1, i2 + 1, …, in + 1, k + 2) is a solution of f(C).
"⇐": If (i1, …, in) is a minimal solution of f(C), then it must hold that i1 = 1, i2, …, in−1 ∈ {2, …, k + 1} and in = k + 2. Then (1, i2 − 1, …, in−1 − 1) is a solution of C.

Corollary: PCP is undecidable.
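The reduction f can be written down directly from the three interleaving notations (a sketch; the tuple encoding is mine):

```python
def mpcp_to_pcp(pairs, sep="#", end="$"):
    """f(C) = ((#x1#, #y1), (x1#, #y1), ..., (xk#, #yk), ($, #$)),
    where w# puts '#' after every symbol and #w puts '#' before it."""
    def after(w):
        return "".join(a + sep for a in w)        # w#
    def before(w):
        return "".join(sep + a for a in w)        # #w
    x1, y1 = pairs[0]
    out = [(sep + after(x1), before(y1))]         # start pair (#x1#, #y1)
    out += [(after(x), before(y)) for x, y in pairs]
    out.append((end, sep + end))                  # closing pair ($, #$)
    return out
```

For example, the MPCP instance C = ((ab, a), (b, bb)) has the solution (1, 2); correspondingly (1, 3, 4) solves f(C): both sides concatenate to #a#b#b#$.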


Theorem: PCP over Σ is already undecidable for |Σ| = 2.

Proof: Let Σ be an alphabet, w.l.o.g. Σ = {b1, …, bm}. For 1 ≤ j ≤ m define b̂j = 01^j. For w ∈ Σ+, w = a1 … an, set ŵ = â1 … ân. Then it holds:

((x1, y1), …, (xn, yn)) ∈ PCP (over Σ)  ⇔  ((x̂1, ŷ1), …, (x̂n, ŷn)) ∈ PCP (over {0, 1}).

Note:
– PCP over Σ with |Σ| = 1 is decidable.
– Let PCPk be PCP restricted to inputs with exactly k pairs. Then it holds that:
  – PCP1 and PCP2 are decidable.
  – PCPk is undecidable for k ≥ 7.
  – The decidability of PCPk for k ∈ {3, …, 6} is open.

Decidability Questions for Context-free Languages

Theorem: The intersection problem for DCFL, i.e., the problem

Given: Cf. grammars G1, G2 with L(G1), L(G2) ∈ DCFL.
Question: Does L(G1) ∩ L(G2) ≠ ∅ hold?

is undecidable.

Proof: We state a reduction from PCP to the intersection problem of DCFL. Let C = ((x1, y1), …, (xk, yk)) be a PCP instance over Σ = {0, 1}. Define the grammar G1 over the alphabet Σ = {0, 1, $, a1, …, ak} as G1 = ({S, A, B}, Σ, P, S) with the following production rules:

S → A$B
A → a1Ax1 | … | akAxk | a1x1 | … | akxk
B → y1RBa1 | … | ykRBak | y1Ra1 | … | ykRak

(wR is the string w mirrored.) The language produced by G1 is:

L(G1) = { ain … ai1 xi1 … xin $ yjmR … yj1R aj1 … ajm | n, m ≥ 1, iµ, jν ∈ {1, …, k} }.

Define the grammar G2 = ({S, T}, Σ, P, S) with the following production rules:

S → a1Sa1 | … | akSak | T
T → 0T0 | 1T1 | $


The language produced by G2 is:

L(G2) = { uv$vRuR | u ∈ {a1, …, ak}∗, v ∈ {0, 1}∗ }.

Obviously it holds that L(G1), L(G2) ∈ DCFL. Further it holds that: C has a solution i1, …, in iff

ain … ai1 xi1 … xin $ yinR … yi1R ai1 … ain ∈ L(G1) ∩ L(G2).

Hence C ↦ ⟨G1, G2⟩ is a reduction from PCP to the intersection problem of DCFL.

Corollary:

– The intersection problem for CFL is undecidable.

– The intersection problem for deterministic pushdown automata, i.e., the problem

  Given: Two deterministic pushdown automata M1, M2.
  Question: Does L(M1) ∩ L(M2) ≠ ∅ hold?

is undecidable.

Theorem: The equivalence problem for cf. languages, i.e., the problem

Given: Cf. grammars G1, G2.
Question: Does L(G1) = L(G2) hold?

is undecidable.

Proof: Reduction from the intersection problem of deterministic pushdown automata. Let M1, M2 be deterministic pushdown automata, let M2C be a DPDA with L(M2C) = Σ∗ \ L(M2) (DCFL is closed under complement), and let M3 be a PDA with L(M3) = L(M1) ∪ L(M2C). Then it holds that

L(M1) ∩ L(M2) = ∅ ⇔ L(M1) ⊆ Σ∗ \ L(M2) = L(M2C)
                  ⇔ L(M1) ∪ L(M2C) = L(M2C)
                  ⇔ L(M3) = L(M2C).

Thus (M1, M2) ∉ intersection problem ⇔ (M3, M2C) is a member of the equivalence problem, and (M1, M2) ↦ (M3, M2C) is the desired reduction.

Note: The equivalence problem for DCFL, resp. for DPDAs, is decidable.

Theorem: The problem Non-empty Complement (NEC) for cf. languages:

Given: Cf. grammar G over the alphabet Σ.
Question: Does L(G) ≠ Σ∗ hold?

is undecidable.

Proof: Let G1 and G2 be defined as in the proof for the intersection problem on page 41. Then it holds that:

C ∈ PCP ⇔ L(G1) ∩ L(G2) ≠ ∅
        ⇔ Σ∗ \ (L(G1) ∩ L(G2)) ≠ Σ∗
        ⇔ (Σ∗ \ L(G1)) ∪ (Σ∗ \ L(G2)) ≠ Σ∗
        ⇔ L(G4) ≠ Σ∗,

where G4 is a cf. grammar with L(G4) = (Σ∗ \ L(G1)) ∪ (Σ∗ \ L(G2)). (Such a grammar exists: the complements of the deterministic context-free languages L(G1), L(G2) are again in DCFL ⊆ CFL, and CFL is closed under union.) The mapping C ↦ G4 reduces PCP to NEC.

Theorem: The problem

Given: Cf. grammars G1 and G2.
Question: Is L(G1) ∩ L(G2) context-free?

is undecidable.

Proof: Let C be an instance of Post's correspondence problem, and let G1 and G2 be the grammars from the proof of the undecidability of the intersection problem on page 41.

Claim: C is solvable iff L(G1) ∩ L(G2) ∉ CFL holds.

With this claim, the complement of PCP is reducible to the problem of the theorem.

Proof of claim: Let C be not solvable. Then it holds that L(G1) ∩ L(G2) = ∅, hence we have L(G1) ∩ L(G2) ∈ CFL.
Let C be solvable. Then C has infinitely many solutions, whence |L(G1) ∩ L(G2)| = ∞ holds. Set L := L(G1) ∩ L(G2). We show that L ∉ CFL holds with the help of the Pumping Lemma: assume that L is context-free; then there exists an n ∈ N as in the Pumping Lemma. Choose z ∈ L with

z = aim … ai1 xi1 … xim $ yimR … yi1R ai1 … aim

and m > n; such a z exists as |L| = ∞. Thus it holds that |z| > n. Now let z = uvwxy be a decomposition of z with |vx| > 0 and |vwx| ≤ n. We show that uwy ∉ L holds.

Case 1: vx contains the symbol $. ⇒ uwy contains no $, hence uwy ∉ L.


Case 2: w contains the symbol $. ⇒ vx ∈ {0, 1}∗, as |vwx| ≤ n holds. ⇒ uwy ∉ L(G1), hence uwy ∉ L.
Case 3: vwx contains no $. ⇒ v and x are both left of $, or both are right of $. ⇒ uwy ∉ L(G2), hence uwy ∉ L.

Theorem: The problem

Given: Cf. grammar G.
Question: Is Σ∗ \ L(G) context-free?

is undecidable.

Proof: Let C be an instance of PCP, and let G1 and G2 be defined as above. Construct a cf. grammar G4 with L(G4) = Σ∗ \ (L(G1) ∩ L(G2)) (as in the proof for the NEC problem, page 42). Then it holds that:

C is not solvable ⇔ L(G1) ∩ L(G2) ∈ CFL ⇔ Σ∗ \ L(G4) ∈ CFL.

(This is a reduction from the complement of PCP.)

Theorem: The problem

Given: Cf. grammar G.
Question: Is L(G) regular?

is undecidable.

Proof: Let C, G1, G2 and G4 be defined as above. Then it holds that:

C is not solvable ⇔ L(G1) ∩ L(G2) = ∅
                  ⇔ L(G1) ∩ L(G2) ∈ REG
                  ⇔ Σ∗ \ L(G4) ∈ REG
                  ⇔ L(G4) ∈ REG,

where REG is the set of all regular languages; the last step uses that REG is closed under complement.

Theorem: The problem

Given: Cf. grammar G.
Question: Does L(G) ∈ DCFL hold?

is undecidable.


Proof: Let C, G1, G2 and G4 be defined as above. Then it holds that

C is not solvable ⇔ L(G1) ∩ L(G2) = ∅
                  ⇔ L(G1) ∩ L(G2) ∈ DCFL
                  ⇔ Σ∗ \ L(G4) ∈ DCFL
                  ⇔ L(G4) ∈ DCFL,

where the last step uses that DCFL is closed under complement.

Decidable Problems for Context-free Languages

Lemma: Let G = (V, Σ, P, S) be a cf. grammar in Chomsky normal form. Then it holds that:
(i) L(G) ≠ ∅ ⇔ there exists an x ∈ L(G) with |x| < 2^|V|.
(ii) |L(G)| = ∞ ⇔ there exists an x ∈ L(G) with 2^|V| ≤ |x| < 2^(|V|+1).

Proof of (i): Application of the Pumping Lemma for cf. languages; the constant n in the Pumping Lemma is in this case 2^|V|.
"⇐": clear.
"⇒": Let z be a shortest string in L(G). We show that |z| < 2^|V|. Assume |z| ≥ 2^|V|. Then, by the Pumping Lemma, z is decomposable into z = uvwxy with |vx| > 0, |vwx| ≤ 2^|V| and u v^i w x^i y ∈ L(G) for all i ∈ N, hence in particular we have uwy ∈ L(G). But |uwy| = |z| − |vx| < |z| holds, contradicting our choice of z.
Proof of (ii): see exercise class.

Theorem: The emptiness problem for cf. languages

Given: Cf. grammar G.
Question: Does L(G) = ∅ hold?

is decidable.
Proof: (Exercise)

Theorem: The finiteness problem for cf. languages

Given: Cf. grammar G.
Question: Does |L(G)| = ∞ hold?

is decidable.


Proof: Analogously to the algorithm above, using part (ii) of the previous lemma.

Overview of the decidability questions:

                       REG   CFL
word problem           X     X
emptiness problem      X     X
finiteness problem     X     X
equivalence problem    X
intersection problem   X
nonempty complement    X

(X = decidable)
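The emptiness check left as an exercise above can be sketched by computing the productive variables of the grammar with a fixed-point iteration (the encoding of productions as pairs is my own; the length bound from the lemma is not even needed for this approach):

```python
def is_empty(productions, start):
    """L(G) = {} iff the start variable is not productive.  A variable
    is productive if some rule rewrites it into a string of terminals
    and productive variables; iterate until a fixed point is reached.
    productions: (head, body) pairs, body a list of symbols; every
    symbol that occurs as a head counts as a variable."""
    variables = {head for head, _ in productions}
    productive = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in productive and all(
                    s not in variables or s in productive for s in body):
                productive.add(head)
                changed = True
    return start not in productive
```

For S → AB, A → a, B → AB the variable B (and hence S) is unproductive, so the language is empty; adding the rule B → b makes it non-empty.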

3 Context-sensitive Languages and Type-0-Languages

Let G = (V, Σ, P, S) be a grammar. If P ⊆ (V ∪ Σ)+ × (V ∪ Σ)∗ holds, then G is a grammar of type 0. If, in addition, (α, β) ∈ P always implies |α| ≤ |β|, then G is of type 1 or context-sensitive (short: CSG for context-sensitive grammar). (Exception: the rule S → ε is allowed if S does not appear on the right side of any production rule.) CSL is the class of all context-sensitive languages, i.e., of all languages for which there exists a grammar of type 1.

A linear bounded automaton (LBA) is a nondeterministic Turing machine which uses space ≤ n on inputs of length n, i.e., the head never leaves the area of the tape on which the input is written. We therefore assume that the right end of the input is marked with a special character (see below).

3.1 Machine characterizations for languages of type 0 and type 1

Theorem: There exists a grammar of type 1 for a given language iff the language is accepted by an LBA.

Proof: "⇒": Let L = L(G) with G = (V, Σ, P, S). The following algorithm verifies on input w ∈ Σ∗ whether S ⇒∗G w holds:

Input: w ∈ Σ∗
Method:
Repeat until nothing changes:
    Choose nondeterministically a rule α → β;
    Choose nondeterministically an occurrence of β on the working tape;
    If such an occurrence exists, then replace β by α;
If only S is left, then accept;

Since |α| ≤ |β| holds for all rules, the sentential form on the tape never grows, so this nondeterministic procedure runs within the space bound of an LBA.


47




a

b

c

d



...

z

This situation corresponds to the configuration a(z, b)cd. It holds that |a(z, b)cd| = 4. The initial configuration of M on input w = a1 . . . an is (z0 , a1 )a2 a3 . . . an−1 ˆan (ˆan symbolizes the special mark on the right end). Define G = (V, Σ, P, S) as follows: At first define the production rule set P 0 which simulates on configurations exactly the computation of M . The transition δ(z, a) 3 (z 0 , b, L) δ(z, a) 3 (z 0 , b, R) δ(z, a) 3 (z 0 , b, N )

leads to the rule leads to the rule leads to the rule

c(z, a) → (z 0 , c)b (z, a)c → b(z 0 , c) (z, a) → (z 0 , b).

for all c ∈ Γ \ {}, for all c ∈ Γ \ {},

Hence it holds for configurations K, K 0 of M that K `M K 0 iff K ⇒∗ K 0 with rules from P 0. Now let ∆ = Γ ∪ (Z × Γ) be an alphabet of configurations of M . Set V = {S, A} ∪ (∆ × Σ). Then the rule set P consists of the following rules

3.2

Decidability and Closure Properties

(i)

S S A A

→ → → →

((z0 , ˆa), a) A(ˆa, a) A(a, a) ((z0 , a), a)

48 for for for for

a∈Σ a∈Σ a∈Σ a∈Σ

With these rules, from S all words of the following form can be produced:

((z0, a1), a1)(a2, a2)(a3, a3) … (an−1, an−1)(ân, an),  for a1, …, an ∈ Σ.

The sequence of first components corresponds to the initial configuration of M on a1 … an; the sequence of second components corresponds to the input a1 … an.

(ii)

(A1 , a)(A2 , b) (A, a)

→ →

(B1 , a)(B2 , b) (B, a)

for A1 A2 → B1 B2 ∈ P 0 for A → B ∈ P 0

(A, B ∈ ∆ und a, b ∈ Σ, A, B, A1 , A2 , B1 , B2 ∈ ∆) With these rules the computation of M is simulated on the first component. The input string on the second component is not changed. (iii)

((z, a), b) (a, b)

→ →

b b

for z ∈ E and a ∈ Γ, b ∈ Σ for a ∈ Γ, b ∈ Σ

With these rules all first components can be deleted after reaching a final configuration. Afterwards a1 . . . an remains. Hence if w ∈ L(M ), then it holds that S ⇒∗G w. The converse can be shown analogously. Finally we get that L(M ) = L(G) holds.
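The translation of transitions into rules, as used for P′ above, can be generated mechanically. A sketch, assuming δ is given as a dictionary mapping (state, symbol) to a set of triples (state, symbol, move); a head position is encoded as a (state, symbol) pair, as in the construction:

```python
def configuration_rules(delta, gamma, blank):
    """Build the rule set P' from the transition relation of an LBA.
    A rule is a pair (lhs, rhs) of tuples of tape symbols, where a symbol
    is either a plain working-tape symbol or a (state, symbol) pair."""
    rules = []
    for (z, a), steps in delta.items():
        for (z2, b, move) in steps:
            if move == "L":                      # c(z,a) -> (z2,c)b
                for c in gamma:
                    if c != blank:
                        rules.append(((c, (z, a)), ((z2, c), b)))
            elif move == "R":                    # (z,a)c -> b(z2,c)
                for c in gamma:
                    if c != blank:
                        rules.append((((z, a), c), (b, (z2, c))))
            else:                                # N:  (z,a) -> (z2,b)
                rules.append((((z, a),), ((z2, b),)))
    return rules

# Illustrative transition relation (made-up states and symbols):
delta = {("z0", "a"): {("z1", "b", "R")}, ("z1", "b"): {("z1", "b", "N")}}
```

The R-move for ("z0", "a") yields one rule per non-blank symbol c, exactly mirroring the "for all c ∈ Γ \ {□}" side conditions of the table.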

3.2 Decidability and Closure Properties

Corollary: The word problem for type-1-grammars is decidable.

Corollary: Let L be a language. Then there exists a grammar of type 0 for L iff there exists a TM accepting L iff L is recursively enumerable.

Proof: "⇒": The algorithm in the previous proof is a semi-decision algorithm for the given type-0-language.
"⇐": Construct a grammar from the TM as above, where the rules of P′ are now constructed for all c ∈ Γ, and additionally to item (iii) the rules ((z, a), □) → ε and (a, □) → ε are added for all a ∈ Γ.

The equivalence and intersection problems for CSLs are not decidable, as they are already undecidable for CFLs. Consider the emptiness problem, which is decidable for CFLs:

Theorem: The emptiness problem for type-1-languages is undecidable.


Proof: Reduction from the complement of the intersection problem. Given two type-1-grammars G1, G2, construct a type-1-grammar G with L(G) = L(G1) ∩ L(G2) (this is possible as CSL is closed under intersection, see below). Then it holds that
(G1, G2) ∈ intersection problem ⇔ G ∉ emptiness problem.

Theorem: CSL is closed under union.

Proof: Let L1 = L(G1), L2 = L(G2) for type-1-grammars Gi = (Vi, Σ, Pi, Si), i = 1, 2, with w.l.o.g. S ∉ V1 ∪ V2 and V1 ∩ V2 = ∅. Then L1 ∪ L2 = L(G) for the context-sensitive grammar
G = (V1 ∪ V2 ∪ {S}, Σ, P1 ∪ P2 ∪ {S → S1, S → S2}, S).

Theorem: CSL is closed under intersection.

Proof: Let Li = L(Mi) for LBAs Mi, i = 1, 2. Define an LBA M for L1 ∩ L2 as follows:

Input: w ∈ Σ∗
Method:
Simulate M1 on w (using pairs as above to ensure that w is preserved);
Simulate M2 on w;
Accept if both simulations accept.
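The union construction from the proof above can be sketched as follows; the grammar representation (variables, terminals, rules, start) and the renaming scheme are assumptions for illustration. Renaming the variables of both grammars apart makes them disjoint and keeps the new start symbol S fresh:

```python
def union_grammar(g1, g2):
    """L(G) = L(G1) ∪ L(G2): rename variables apart, add S -> S1 | S2.
    A grammar is (variables, terminals, rules, start); a rule is a pair
    of tuples of symbols."""
    def rename(g, tag):
        variables, terminals, rules, start = g
        ren = lambda s: tuple(x + tag if x in variables else x for x in s)
        return ({v + tag for v in variables}, terminals,
                [(ren(l), ren(r)) for l, r in rules], start + tag)
    v1, t1, p1, s1 = rename(g1, "_1")
    v2, t2, p2, s2 = rename(g2, "_2")
    rules = p1 + p2 + [(("S",), (s1,)), (("S",), (s2,))]
    return v1 | v2 | {"S"}, t1 | t2, rules, "S"
```

After renaming, any original variable "S" becomes "S_1" or "S_2", so the fresh start symbol never collides, matching the w.l.o.g. assumptions of the proof.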

Now we examine the closure of CSL under complement. Let s : N → N. NSPACE(s) is the class of languages accepted by nondeterministic Turing machines in space O(s). Hence CSL = NSPACE(n). coNSPACE(s) is the class of complements of languages in NSPACE(s). The question whether CSL is closed under complement is thus the question whether
NSPACE(n) = coNSPACE(n)
holds.

We will consider this question more generally. A function s : N → N is said to be space constructible if there exists a deterministic TM which uses exactly s(n) tape cells on input of a string of length n.
For s(n) < n we use TMs with a working tape and a separate input tape. The used space is the number of used cells on the working tape. For s(n) ≥ n this model coincides with one-tape TMs (see the lecture "Komplexität von Algorithmen").
A configuration of a TM with separate input tape is a tuple (state, content of the working tape, head position on the working tape, head position on the input tape).


For s(n) ≥ log n the following holds: the length of a configuration of M on an input of length n is O(s(n)). Hence the number of all such configurations is 2^{O(s(n))}, and every accepting path has length at most 2^{c·s(n)} for some c ∈ N.

Theorem (Immerman, Szelepcsényi): Let s(n) ≥ log n be space constructible. Then it holds that NSPACE(s) = coNSPACE(s).

Proof: Let A ∈ NSPACE(s) and let M be a nondeterministic TM as above which accepts A in space s. Without loss of generality all computation paths of M on input x have the same length 2^{c·s(|x|)}.

(Picture: the computation tree of M on input x, rooted at the initial configuration K^x_start, branching on the guesses 0, 1 and ending in accepting and rejecting leaves.)

Notation: K1 ⊢^t_{M,x} K2, if M on input x can reach configuration K2 from K1 in exactly t steps.

Assume we know the number
n^x_M := |{K | K is s(|x|)-space bounded and K^x_start ⊢^N_{M,x} K}|,
where N = 2^{c·s(|x|)} and K^x_start is the initial configuration of M on input x. Then the following NTM accepts the complement Ā of A:

Input: x
Method:
K′ := K^x_start; m := 0;
for i := 1 to n^x_M do begin
    guess nondeterministically a configuration K with K > K′ (in the lexicographic ordering, in which K^x_start is minimal) and K not accepting;
    guess a path Π of length 2^{c·s(|x|)};
    if K^x_start ⊢^{2^{c·s(|x|)}}_{M,x} K via Π then m := m + 1;
    K′ := K;
end;
if m = n^x_M then accept else reject;
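The number n^x_M can be made concrete on a finite configuration graph. A small deterministic sketch (the successor map and all names are illustrative assumptions): it computes |{K : K_start ⊢^t K}| by iterating the step relation t times. Storing the reachable layer explicitly costs space proportional to the number of configurations, i.e. 2^{O(s)}; the nondeterministic procedure only ever stores the count itself.

```python
def n_exact(successors, start, t):
    """|{K : start reaches K in exactly t steps}| for a nondeterministic
    step relation given as an explicit successor map."""
    layer = {start}                              # exactly-0-reachable
    for _ in range(t):
        layer = {k2 for k in layer for k2 in successors[k]}
    return len(layer)

# Illustrative configuration graph with a self-looping configuration "acc":
succ = {"start": {"a", "b"}, "a": {"acc"}, "b": {"acc"}, "acc": {"acc"}}
```

Because "acc" has a self-loop, reachability within t steps and in exactly t steps coincide for it, the same trick the normalization below the theorem uses.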

Problem: How do we compute n^x_M?

Let M be normalized such that there exists a unique accepting configuration K_acc = (q+, ε, 1, 1), and let K_acc ⊢^1_{M,x} K_acc be the only transition out of K_acc, i.e., further steps do not leave K_acc. Let N = 2^{c·s(|x|)} be an upper bound on the number of configurations on input x. Then it holds that

M accepts x iff K^x_start ⊢^N_{M,x} K_acc.

Define n^x_M(t) := |{K | K is s(|x|)-space bounded and K^x_start ⊢^t_{M,x} K}| for 0 ≤ t ≤ N. It holds that n^x_M = n^x_M(N). We construct an s-space bounded NTM which computes n^x_M(t + 1) from n^x_M(t) (inductive counting).

Computation of n^x_M(0): n^x_M(0) = 1.

Computation of n^x_M(t + 1): Let n^x_M(t) be given. Verify for each s(|x|)-space bounded configuration K whether K^x_start ⊢^{t+1}_{M,x} K holds. The number of those K equals n^x_M(t + 1).

How can we verify K^x_start ⊢^{t+1}_{M,x} K? Let K_{i1}, . . . , K_{ir} be all predecessor configurations of K (r is constant and determined by M). Verify whether K^x_start ⊢^t_{M,x} K_{iµ} holds for some 1 ≤ µ ≤ r.

How can we verify K^x_start ⊢^t_{M,x} K_{iµ}? Note: n^x_M(t) is known!

Method:
m := 0;
for all configurations K do begin
    guess nondeterministically (A) or (B):
    (A): simulate t steps of M;
         if K^x_start ⊢^t_{M,x} K then m := m + 1;
    (B): do nothing;
end;
if m ≠ n^x_M(t) then reject
else output "yes" if K_{iµ} was a configuration for which (A) has been guessed, otherwise output "no";

The complete algorithm is defined as:

Input: x
Method:

(∗ If n = n^x_M(t) holds, then: reach(x, n, t, K) = true iff K^x_start ⊢^t_{M,x} K ∗)
function reach(x, n, t, K): boolean;
begin
    m := 0; b := false;
    for all configurations K′ do begin
        guess nondeterministically (A) or (B):
        (A): if K^x_start ⊢^t_{M,x} K′ then   (∗ simulation of t steps of M ∗)
             begin
                 m := m + 1;
                 if K′ = K then b := true;
             end;
        (B): nop;
    end;
    if m = n then return b else reject;
end;

(∗ Main program: ∗)
N := 2^{c·s(|x|)}; n := 1;
for t := 1 to N do   (∗ loop invariant: n = n^x_M(t − 1) ∗)
begin
    k := 0;
    for all configurations K do begin
        f := false;
        for all predecessor configurations K′ of K do
            if reach(x, n, t − 1, K′) then f := true;
        if f then k := k + 1;
    end;   (∗ k = n^x_M(t) ∗)
    n := k;
end;   (∗ n = n^x_M(N) = n^x_M ∗)
if reach(x, n, N, K_acc) then reject else accept;

The variables N, K, K′, b, f, n, k, t each need O(s) space, hence the algorithm needs O(s) space in total.

Corollary:

– Let A ⊆ Σ∗. Then it holds that A ∈ CSL ⇔ Ā ∈ CSL.
– NSPACE(n) = coNSPACE(n).
– NL = coNL.

Note: It is an open problem whether deterministic LBAs suffice to accept all context-sensitive languages (the LBA problem). In complexity-theoretic notation this is the question whether NSPACE(n) = SPACE(n) holds.
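The inductive-counting loop can be illustrated deterministically on an explicit configuration graph (graph and names are assumptions for illustration): n(t) is obtained from the exactly-(t-1)-reachable layer via the predecessor relation, and at the end the machine for the complement accepts iff the accepting configuration was not reached. Unlike the O(s)-space NTM, this sketch stores whole layers.

```python
def inductive_counting(successors, start, acc, N):
    """Deterministic analogue of the inductive-counting main loop.
    Computes the exactly-t-reachable layer for t = 1..N and finally
    accepts (returns True) iff acc is NOT reachable in exactly N steps,
    mirroring the machine for the complement language."""
    preds = {k: set() for k in successors}       # predecessor relation
    for k, succs in successors.items():
        for k2 in succs:
            preds[k2].add(k)
    layer = {start}                              # n(0) = 1
    for t in range(1, N + 1):
        layer = {k for k in successors if preds[k] & layer}
        n = len(layer)                           # loop invariant: n = n(t)
    return acc not in layer                      # reject iff acc reachable

# Illustrative graph: "acc" self-loops, so exactly-N subsumes within-N:
succ = {"s": {"a"}, "a": {"acc"}, "acc": {"acc"}, "d": {"d"}}
```

From "s" the accepting configuration is reached, so the complement machine rejects; from the isolated "d" it is never reached, so the complement machine accepts.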

Symbols list

H : The common halting problem. 38
# : Bottom stack symbol. 27
∼L : Natural equivalence relation of L. 4
M/∼ : Set of all equivalence classes of ∼. 3
[x]∼ : Equivalence class of x. 3
∼, ≈ : Equivalence relation. 3
Σ, Γ, ∆ : Alphabets. 1
δ̂ : Extended transition function of an automaton. 1
δ : Transition function of an automaton. 1
K : Configuration. 19, 28
ε : The empty word. 1
λ : Output function of a Moore-, resp., Mealy-automaton. 14
λ̂ : Extended output function of a Moore-, resp., Mealy-automaton. 17
≤ : Reduction function. 38
N(M) : Language accepted via empty stack. 28
L(M) : Language accepted via acceptance states. 1, 2, 28
≡L : Syntactic congruence of L. 9
Mon(L) : Syntactic monoid of L. 9