Formal Languages and Automata

Author: Loreen Lamb
Formal Languages and Automata
Chapter 6: Simplification of Context-Free Grammars and Normal Forms
Chuan-Ming Liu

Department of Computer Science and Information Engineering National Taipei University of Technology Taipei, TAIWAN

Mobile Computing and Software Engineering – p. 1/3

Objectives
Study several transformations and substitutions
Investigate normal forms for context-free grammars (cfg’s)
  Chomsky normal form
  Greibach normal form
Discuss useless, λ-, and unit-productions


Contents
Methods for Transforming Grammars
Two Important Normal Forms
A Membership Algorithm for cfg’s∗


Methods for Transforming Grammars
The empty string plays a rather singular role in many theorems and proofs.
We restrict our discussion to λ-free languages.


A Useful Substitution Rule
Theorem: Let G = (V, T, S, P) be a context-free grammar (cfg). Suppose that P contains a production of the form A → x1 B x2. Assume that A and B are different variables and that B → y1 | y2 | · · · | yn is the set of all productions in P which have B as the left side. Let Ĝ = (V, T, S, P̂) be the grammar in which P̂ is constructed by deleting A → x1 B x2 from P and adding to it A → x1 y1 x2 | x1 y2 x2 | · · · | x1 yn x2. Then L(Ĝ) = L(G).


Example 1
Consider G = ({A, B}, {a, b, c}, A, P) with productions
  A → a | aaA | abBc,
  B → abbA | b.
What is Ĝ as described in the above theorem?
Note:
1. The substitution rule discussed here requires that A and B be distinct. What happens when A = B?
2. Consider the productions associated with B after the substitution.
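
The substitution in Example 1 can be sketched in Python. This is only an illustration under stated assumptions: the function name substitute and the dictionary representation (variable mapped to a list of right-hand-side strings, one character per symbol) are not from the slides.

```python
def substitute(productions, a, rhs, b):
    """Replace the production a -> rhs (rhs = x1 + b + x2) by
    a -> x1 y x2 for every alternative y of b.
    Only the first occurrence of b in rhs is substituted."""
    if a == b:
        raise ValueError("the rule requires A and B to be distinct")
    if rhs not in productions[a] or b not in rhs:
        raise ValueError("a -> rhs must be a production containing b")
    x1, _, x2 = rhs.partition(b)
    new = {var: list(alts) for var, alts in productions.items()}
    # Delete A -> x1 B x2, then add one production per B-alternative.
    new[a] = [alt for alt in productions[a] if alt != rhs]
    new[a] += [x1 + y + x2 for y in productions[b]]
    return new


# Grammar of Example 1 (hypothetical encoding).
P = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
P_hat = substitute(P, "A", "abBc", "B")
print(P_hat["A"])  # ['a', 'aaA', 'ababbAc', 'abbc']
print(P_hat["B"])  # ['abbA', 'b']
```

The productions for B are left untouched by the rule, which is the point of the second note: after the substitution B no longer appears in any right-hand side of A.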


Removing Useless Productions
Definition: (Useful Variable) Let G = (V, T, S, P) be a context-free grammar (cfg). A variable A is said to be useful if and only if there exists at least one w ∈ L(G) such that S ⇒∗ xAy ⇒∗ w, with x, y ∈ (V ∪ T)∗.
Definition: (Useless Variable) A variable that is not useful is called useless. A production is useless if it involves any useless variable.


Example 2
Consider G = ({A, B, S}, {a, b}, S, P) with productions
  S → A,
  A → aA | λ,
  B → bA.
B is useless and B → bA is a useless production.
Note: There are two reasons why a variable can be useless:
1. it cannot be reached from the start symbol;
2. it cannot derive a terminal string.


Example 3
Eliminate useless variables and productions from G = ({A, B, C, S}, {a, b}, S, P) with P consisting of
  S → aS | A | C,
  A → a,
  B → aa,
  C → aCb.

Solution:


Rule 1
Theorem: Let G = (V, T, S, P) be a context-free grammar. Then there exists an equivalent grammar Ĝ = (V̂, T̂, S, P̂) that does not contain any useless variables or productions.
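
One common way to construct such a grammar is in two passes: first drop variables that cannot derive a terminal string, then drop variables that cannot be reached from the start symbol. The sketch below assumes that approach; the function name remove_useless and the encoding (uppercase letters are variables, right-hand sides are strings, λ is the empty string) are illustrative choices, not taken from the slides.

```python
def remove_useless(productions, start):
    """productions: dict variable -> list of right-hand-side strings."""

    def usable(alt, allowed):
        # True if every variable occurring in alt belongs to `allowed`.
        return all(not s.isupper() or s in allowed for s in alt)

    # Pass 1: variables that can derive some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, alts in productions.items():
            if var not in generating and any(usable(a, generating) for a in alts):
                generating.add(var)
                changed = True
    kept = {var: [a for a in alts if usable(a, generating)]
            for var, alts in productions.items() if var in generating}

    # Pass 2: variables reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in kept:
            continue
        reachable.add(var)
        for alt in kept[var]:
            stack.extend(s for s in alt if s.isupper())
    return {var: alts for var, alts in kept.items() if var in reachable}


# Grammar of Example 3 (hypothetical encoding).
P = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
P_hat = remove_useless(P, "S")
print(P_hat)  # {'S': ['aS', 'A'], 'A': ['a']}
```

The order of the passes matters in this sketch: removing non-generating variables first can make further variables unreachable, so the reachability pass runs second.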


Removing λ-Productions
Definition: (λ-production) Any production of a cfg of the form A → λ is called a λ-production.
Definition: (Nullable Variable) Any variable A for which the derivation A ⇒∗ λ is possible is called nullable.


Example 4
Consider the grammar
  S → a S1 b,
  S1 → a S1 b | λ,
which generates the λ-free language {a^n b^n : n ≥ 1}. The λ-production S1 → λ can be removed by adding S → ab and S1 → ab.


Rule 2
Theorem: Let G = (V, T, S, P) be a context-free grammar with λ ∉ L(G). Then there exists an equivalent grammar Ĝ = (V̂, T̂, S, P̂) that does not contain any λ-productions.


Example 5
Find a cfg without λ-productions equivalent to the grammar defined by
  S → ABaC,
  A → BC,
  B → b | λ,
  C → D | λ,
  D → d.


Removing Unit-Productions
Definition: (Unit-Production) Any production of a cfg of the form A → B, where A, B ∈ V, is called a unit-production.


Rule 3
Theorem: Let G = (V, T, S, P) be a context-free grammar without λ-productions. Then there exists an equivalent cfg Ĝ = (V̂, T̂, S, P̂) that does not contain any unit-productions.


Example 6
Remove all unit-productions from
  S → Aa | B,
  B → A | bb,
  A → a | bc | B.
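
A sketch of one way to work Example 6 in Python: for each variable, collect every variable reachable through unit-productions alone, then copy over the non-unit productions of those variables. The function name remove_unit and the encoding (single-character variables, right-hand sides as strings) are assumptions for illustration.

```python
def remove_unit(productions):
    """productions: dict variable -> list of right-hand-side strings.
    A right-hand side equal to a variable name is a unit-production."""
    variables = set(productions)
    result = {}
    for var in productions:
        # All variables reachable from var using unit-productions only.
        closure, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for alt in productions[current]:
                if alt in variables and alt not in closure:
                    closure.add(alt)
                    stack.append(alt)
        # Copy the non-unit productions of every variable in the closure.
        alternatives = []
        for member in sorted(closure):
            for alt in productions[member]:
                if alt not in variables and alt not in alternatives:
                    alternatives.append(alt)
        result[var] = alternatives
    return result


# Grammar of Example 6 (hypothetical encoding).
P = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
P_hat = remove_unit(P)
print(P_hat["S"])  # ['a', 'bc', 'bb', 'Aa']
print(P_hat["A"])  # ['a', 'bc', 'bb']
print(P_hat["B"])  # ['a', 'bc', 'bb']
```

In the resulting grammar B no longer appears on any right-hand side, so B and its productions have become useless and could be removed in a further step.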


Theorem
Let L be a context-free language that does not contain λ. Then there exists a cfg that generates L and that does not have any useless productions, λ-productions, or unit-productions.


Contents
Methods for Transforming Grammars
Two Important Normal Forms
A Membership Algorithm for cfg’s∗


Chomsky Normal Form
Note: The string on the right side of a production consists of no more than two symbols.
Definition: A cfg is in Chomsky normal form if all productions are of the form A → BC or A → a, where A, B, C ∈ V and a ∈ T.


Example 7
The grammar S → AS | a, A → SA | b is in Chomsky normal form.
The grammar S → AS | AAS, A → SA | aa is not in Chomsky normal form.
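
The definition can be checked mechanically. The following is a small sketch, assuming a hypothetical helper is_cnf and an encoding in which every symbol is one character and each right-hand side is a string.

```python
def is_cnf(productions, terminals):
    """True if every production has the form A -> BC or A -> a."""
    variables = set(productions)
    for alts in productions.values():
        for alt in alts:
            two_variables = len(alt) == 2 and all(s in variables for s in alt)
            one_terminal = len(alt) == 1 and alt in terminals
            if not (two_variables or one_terminal):
                return False
    return True


# The two grammars of Example 7 (hypothetical encoding).
G1 = {"S": ["AS", "a"], "A": ["SA", "b"]}
G2 = {"S": ["AS", "AAS"], "A": ["SA", "aa"]}
print(is_cnf(G1, {"a", "b"}))  # True
print(is_cnf(G2, {"a", "b"}))  # False
```

The second grammar fails on S → AAS (three symbols) and on A → aa (two terminals).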


Theorem
Any cfg G = (V, T, S, P) with λ ∉ L(G) has an equivalent grammar Ĝ = (V̂, T̂, S, P̂) in Chomsky normal form.


Example 8
Convert the grammar with productions S → abA, A → aab, B → Ac to Chomsky normal form.
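
A sketch of the usual two-step conversion, applied to Example 8: replace terminals inside long right-hand sides by stand-in variables, then split right-hand sides longer than two into a chain of pairs. The code assumes the input has no λ-productions and no unit-productions. The function name to_cnf and the generated variable names B_a and D_1 are hypothetical and assumed not to clash with existing variables.

```python
def to_cnf(productions, terminals):
    """productions: dict variable -> list of tuples of symbols.
    Assumes no lambda-productions and no unit-productions."""
    new = {}
    stand_in = {}  # terminal -> variable deriving exactly that terminal
    counter = 0

    def add(var, rhs):
        new.setdefault(var, []).append(tuple(rhs))

    for var, alts in productions.items():
        for alt in alts:
            if len(alt) == 1:
                add(var, alt)  # already of the form A -> a
                continue
            # Step 1: replace each terminal by its stand-in variable.
            symbols = []
            for s in alt:
                if s in terminals:
                    s = stand_in.setdefault(s, "B_" + s)
                symbols.append(s)
            # Step 2: split into a chain of two-variable productions.
            head = var
            while len(symbols) > 2:
                counter += 1
                fresh = f"D_{counter}"
                add(head, (symbols[0], fresh))
                head, symbols = fresh, symbols[1:]
            add(head, symbols)
    for terminal, var in stand_in.items():
        add(var, (terminal,))
    return new


# Grammar of Example 8 (hypothetical encoding).
P = {"S": [("a", "b", "A")], "A": [("a", "a", "b")], "B": [("A", "c")]}
cnf = to_cnf(P, {"a", "b", "c"})
for var, alts in cnf.items():
    print(var, "->", " | ".join(" ".join(alt) for alt in alts))
```

For this input the sketch yields S → B_a D_1, D_1 → B_b A, A → B_a D_2, D_2 → B_a B_b, B → A B_c, together with B_a → a, B_b → b, and B_c → c.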


Greibach Normal Form
Note: We put restrictions on the positions in which terminals and variables can appear.
Definition: A cfg is in Greibach normal form if all productions are of the form A → ax, where a ∈ T and x ∈ V∗.
Note: The form A → ax is common to both Greibach normal form and s-grammars, but Greibach normal form does not impose the restriction that the pair (A, a) occur at most once.


Example 9
The grammar S → AB, A → aA | bB | b, B → b is not in Greibach normal form.


Example 10
Convert the grammar S → abSb | aa to Greibach normal form.
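
Example 10 is an easy special case: every right-hand side already starts with a terminal, so it is enough to replace the remaining terminals by new variables. The sketch below handles only that special case and is not a general Greibach conversion. The function name to_gnf_simple and the generated names X_a and X_b are hypothetical.

```python
def to_gnf_simple(productions, terminals):
    """productions: dict variable -> list of tuples of symbols.
    Handles only right-hand sides that already begin with a terminal."""
    new = {}
    stand_in = {}  # terminal -> variable deriving exactly that terminal
    for var, alts in productions.items():
        new[var] = []
        for alt in alts:
            if not alt or alt[0] not in terminals:
                raise ValueError("right-hand side must start with a terminal")
            # Keep the leading terminal; replace every later terminal.
            tail = [stand_in.setdefault(s, "X_" + s) if s in terminals else s
                    for s in alt[1:]]
            new[var].append((alt[0], *tail))
    for terminal, var in stand_in.items():
        new[var] = [(terminal,)]
    return new


# Grammar of Example 10 (hypothetical encoding).
P = {"S": [("a", "b", "S", "b"), ("a", "a")]}
gnf = to_gnf_simple(P, {"a", "b"})
for var, alts in gnf.items():
    print(var, "->", " | ".join(" ".join(alt) for alt in alts))
```

For this input the result is S → a X_b S X_b | a X_a with X_a → a and X_b → b, where every production has the form A → ax with x ∈ V∗.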


Theorem
Any cfg G = (V, T, S, P) with λ ∉ L(G) has an equivalent grammar Ĝ = (V̂, T̂, S, P̂) in Greibach normal form.


Contents
Methods for Transforming Grammars
Two Important Normal Forms
A Membership Algorithm for cfg’s∗


CYK Algorithm
An algorithm to decide whether a given string belongs to the language generated by a given cfg.
It follows the dynamic programming design paradigm.
The given cfg G = (V, T, S, P) is assumed to be in Chomsky normal form.
Given a string $w = a_1 a_2 \cdots a_n$, we define substrings $w_{ij} = a_i \cdots a_j$ and subsets $V_{ij} = \{A \in V : A \Rightarrow^* w_{ij}\} \subseteq V$.
Clearly, $w \in L(G) \iff S \in V_{1n}$.


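Computing $V_{ij}$
Consider the two production forms allowed in Chomsky normal form:
For each $i$, $A \in V_{ii} \iff A \to a_i$ is a production;
For $j > i$, $A \Rightarrow^* w_{ij} \iff$ there exists a production $A \to BC$ with $B \Rightarrow^* w_{ik}$ and $C \Rightarrow^* w_{k+1,j}$ for some $i \le k < j$.
In other words,
$$V_{ij} = \bigcup_{k \in \{i, i+1, \ldots, j-1\}} \{A : A \to BC, \text{ with } B \in V_{ik},\ C \in V_{k+1,j}\}$$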


Bottom-up Approach
Use a bottom-up approach to compute all $V_{ij}$ with the equations discussed:
Compute $V_{11}, V_{22}, \ldots, V_{nn}$,
Compute $V_{12}, V_{23}, \ldots, V_{n-1,n}$,
Compute $V_{13}, V_{24}, \ldots, V_{n-2,n}$,
and so on, until $V_{1n}$ has been computed.


Algorithm - Pseudocode
MEMBERSHIP(G, w)
  n ← |w|
  for i ← 1 to n
    do V_ii ← {A : A → a_i is a production}
  for l ← 2 to n
    do for i ← 1 to n − l + 1
      do j ← i + l − 1
         V_ij ← ∅
         for k ← i to j − 1
           do for each production A → BC with B ∈ V_ik and C ∈ V_{k+1,j}
             do V_ij ← V_ij ∪ {A}
  if S ∈ V_1n
    then w ∈ L(G)
    else w ∉ L(G)


Example 11
Determine whether the string w = aabbb is in the language generated by the grammar
  S → AB,
  A → BB | a,
  B → AB | b.
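
The membership question in Example 11 can be checked with a short Python sketch of the CYK table computation. The function name cyk and the encoding (single-character symbols, right-hand sides as strings, table keyed by the 1-based pair (i, j)) are assumptions made for this illustration.

```python
def cyk(productions, start, w):
    """productions: dict variable -> list of right-hand-side strings,
    assumed to be in Chomsky normal form. Returns (member, V) where
    V[i, j] is the set of variables deriving a_i ... a_j (1-based)."""
    n = len(w)
    if n == 0:
        return False, {}  # a CNF grammar cannot derive the empty string
    V = {}
    # Substrings of length 1: A is in V[i, i] iff A -> a_i is a production.
    for i in range(1, n + 1):
        V[i, i] = {A for A, alts in productions.items() if w[i - 1] in alts}
    # Longer substrings, shortest first (bottom-up).
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            V[i, j] = set()
            for k in range(i, j):
                for A, alts in productions.items():
                    for alt in alts:
                        if (len(alt) == 2 and alt[0] in V[i, k]
                                and alt[1] in V[k + 1, j]):
                            V[i, j].add(A)
    return start in V[1, n], V


# Grammar and string of Example 11 (hypothetical encoding).
P = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
member, V = cyk(P, "S", "aabbb")
print(member)           # True
print(sorted(V[1, 5]))  # ['B', 'S']
```

Since S ∈ V[1, 5], the string aabbb belongs to the language. The three nested loops over length, i, and k give a running time cubic in |w| for a fixed grammar.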

