Formal Languages and Automata Chapter 6 Simplification of Context-Free Grammars and Normal Forms Chuan-Ming Liu
Department of Computer Science and Information Engineering National Taipei University of Technology Taipei, TAIWAN
Mobile Computing and Software Engineering – p. 1/3
Objectives
Study several transformations and substitutions
Investigate normal forms for context-free grammars (cfg's): Chomsky normal form and Greibach normal form
Discuss useless productions, λ-productions, and unit-productions
Contents
Methods for Transforming Grammars
Two Important Normal Forms
A Membership Algorithm for cfg's*
Methods for Transforming Grammars
The empty string plays a rather singular role in many theorems and proofs. We therefore restrict our discussion to λ-free languages.
A Useful Substitution Rule
Theorem: Let G = (V, T, S, P) be a context-free grammar (cfg). Suppose that P contains a production of the form A → x_1 B x_2. Assume that A and B are different variables and that B → y_1 | y_2 | ··· | y_n is the set of all productions in P which have B as the left side. Let Ĝ = (V, T, S, P̂) be the grammar in which P̂ is constructed by deleting A → x_1 B x_2 from P, and adding to it A → x_1 y_1 x_2 | x_1 y_2 x_2 | ··· | x_1 y_n x_2. Then L(Ĝ) = L(G).
Example 1
Consider G = ({A, B}, {a, b, c}, A, P) with A → a | aaA | abBc, B → abbA | b. What is Ĝ as described in the above theorem?
Note:
1. The substitution rule discussed here requires that A and B be distinct. What about A = B?
2. Consider the productions associated with B after the substitution.
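The substitution in this example can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `substitute` and the encoding (a dict from a variable to its right sides, single characters as symbols, uppercase for variables) are assumptions made for the sketch.

```python
def substitute(productions, a, rhs, b):
    """Replace the production a -> rhs (of the form x1 B x2) by the
    productions a -> x1 y x2, one for every production b -> y.
    The encoding and the function name are assumptions of this sketch."""
    if a == b:
        raise ValueError("the rule requires A and B to be different variables")
    i = rhs.index(b)                      # position of B inside x1 B x2
    x1, x2 = rhs[:i], rhs[i + 1:]
    new_rules = [x1 + y + x2 for y in productions[b]]
    result = {v: list(alts) for v, alts in productions.items()}
    result[a] = [r for r in result[a] if r != rhs] + new_rules
    return result


# Grammar of Example 1: A -> a | aaA | abBc, B -> abbA | b
P = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
P_hat = substitute(P, "A", "abBc", "B")
print(P_hat["A"])   # ['a', 'aaA', 'ababbAc', 'abbc']
print(P_hat["B"])   # ['abbA', 'b']
```

In the result, no right side of A mentions B any more, which is the situation the second note asks about.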
Removing Useless Productions
Definition: (Useful Variable) Let G = (V, T, S, P) be a context-free grammar (cfg). A variable A is said to be useful if and only if there exists at least one w ∈ L(G) such that S ⇒* xAy ⇒* w, with x, y ∈ (V ∪ T)*.
Definition: (Useless Variable) A variable that is not useful is called useless. A production is useless if it involves any useless variable.
Example 2
Consider G = ({A, B, S}, {a, b}, S, P) with S → A, A → aA | λ, B → bA. B is useless and B → bA is a useless production.
Note: There are two reasons why a variable can be useless:
1. it cannot be reached from the start symbol;
2. it cannot derive a terminal string.
Example 3
Eliminate useless variables and productions from G = ({A, B, C, S}, {a, b}, S, P) with P consisting of
S → aS | A | C,
A → a,
B → aa,
C → aCb.
Solution:
Rule 1
Theorem: Let G = (V, T, S, P) be a context-free grammar. Then there exists an equivalent grammar Ĝ = (V̂, T̂, S, P̂) that does not contain any useless variables or productions.
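One common way to obtain such a grammar is to apply the two reasons for uselessness in turn: first drop variables that cannot derive a terminal string, then drop variables that cannot be reached from the start symbol. The Python sketch below follows that order and runs on the grammar of Example 3. The function name `remove_useless` and the encoding (single-character symbols, uppercase for variables, lowercase for terminals, `""` for λ) are assumptions of the sketch, not notation from the slides.

```python
def remove_useless(productions, start):
    """Drop useless variables and productions (illustrative sketch)."""
    def usable(rhs, good):
        # every symbol is a terminal or a variable already known to be good
        return all(s.islower() or s in good for s in rhs)

    # Step 1: variables that can derive a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, alts in productions.items():
            if var not in generating and any(usable(r, generating) for r in alts):
                generating.add(var)
                changed = True
    kept = {var: [r for r in alts if usable(r, generating)]
            for var, alts in productions.items() if var in generating}

    # Step 2: variables reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in kept:
            continue
        reachable.add(var)
        for rhs in kept[var]:
            stack.extend(s for s in rhs if s.isupper())
    return {var: alts for var, alts in kept.items() if var in reachable}


# Grammar of Example 3
P = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
print(remove_useless(P, "S"))   # {'S': ['aS', 'A'], 'A': ['a']}
```

Here C is removed because it never derives a terminal string, and B is removed because it cannot be reached from S.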
Removing λ-Productions
Definition: (λ-production) Any production of a cfg of the form A → λ is called a λ-production.
Definition: (Nullable Variable) Any variable A for which the derivation A ⇒* λ is possible is called nullable.
Example 4
Consider the grammar S → aS_1 b, S_1 → aS_1 b | λ, which generates the λ-free language {a^n b^n : n ≥ 1}. The λ-production S_1 → λ can be removed by adding S → ab and S_1 → ab.
Rule 2
Theorem: Let G = (V, T, S, P) be a context-free grammar with λ ∉ L(G). Then there exists an equivalent grammar Ĝ = (V̂, T̂, S, P̂) that does not contain any λ-productions.
Example 5
Find a cfg without λ-productions equivalent to the grammar defined by
S → ABaC,
A → BC,
B → b | λ,
C → D | λ,
D → d.
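A possible way to work this example is to find the nullable variables first and then, for every production, add all variants obtained by dropping nullable variables from the right side (never adding λ itself). The sketch below does this in Python. The name `remove_lambda` and the encoding (single-character symbols, `""` for λ) are assumptions of the sketch, and it does not clean up variables that end up with no productions.

```python
from itertools import product


def remove_lambda(productions):
    """Return (new productions without lambda-productions, nullable set).
    Illustrative sketch; "" encodes lambda."""
    # Find all nullable variables by fixed-point iteration.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, alts in productions.items():
            if var not in nullable and any(
                all(s in nullable for s in rhs) for rhs in alts
            ):
                nullable.add(var)
                changed = True

    result = {}
    for var, alts in productions.items():
        new_alts = []
        for rhs in alts:
            # each nullable symbol may either stay or be dropped
            choices = [(s, "") if s in nullable else (s,) for s in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate and candidate not in new_alts:
                    new_alts.append(candidate)
        result[var] = new_alts
    return result, nullable


# Grammar of Example 5
P = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
P_hat, nullable = remove_lambda(P)
print(sorted(nullable))   # ['A', 'B', 'C']
print(P_hat["S"])         # ['ABaC', 'ABa', 'AaC', 'Aa', 'BaC', 'Ba', 'aC', 'a']
print(P_hat["A"])         # ['BC', 'B', 'C']
```

S itself is not nullable because its only right side contains the terminal a, which agrees with the requirement λ ∉ L(G).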
Removing Unit-Productions
Definition: (Unit-Production) Any production of a cfg of the form A → B, where A, B ∈ V, is called a unit-production.
Rule 3
Theorem: Let G = (V, T, S, P) be a context-free grammar without λ-productions. Then there exists an equivalent cfg Ĝ = (V̂, T̂, S, P̂) that does not contain any unit-productions.
Example 6
Remove all unit-productions from S → Aa | B, B → A | bb, A → a | bc | B.
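One way to approach this example: for each variable, collect every variable it can reach through unit-productions alone, then copy over the non-unit productions of those variables. The Python sketch below assumes a grammar without λ-productions, single-character symbols, and uppercase letters for variables. The name `remove_units` is hypothetical.

```python
def remove_units(productions):
    """Remove unit-productions (illustrative sketch, assumes no lambda-productions)."""
    def is_unit(rhs):
        return len(rhs) == 1 and rhs.isupper()

    result = {}
    for var in productions:
        # all variables B with var =>* B using unit-productions only
        reach, stack = {var}, [var]
        while stack:
            cur = stack.pop()
            for rhs in productions.get(cur, []):
                if is_unit(rhs) and rhs not in reach:
                    reach.add(rhs)
                    stack.append(rhs)
        # copy the non-unit productions of every reached variable
        alts = []
        for b in sorted(reach):
            for rhs in productions.get(b, []):
                if not is_unit(rhs) and rhs not in alts:
                    alts.append(rhs)
        result[var] = alts
    return result


# Grammar of Example 6
P = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
P_hat = remove_units(P)
print(P_hat["S"])   # ['a', 'bc', 'bb', 'Aa']
print(P_hat["A"])   # ['a', 'bc', 'bb']
print(P_hat["B"])   # ['a', 'bc', 'bb']
```

After the removal, B no longer appears on any right side reachable from S, so B and its productions have become useless.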
Theorem
Let L be a context-free language that does not contain λ. Then there exists a cfg that generates L and that does not have any useless productions, λ-productions, or unit-productions.
Two Important Normal Forms
Chomsky Normal Form
Note: The string on the right of a production consists of no more than two symbols.
Definition: A cfg is in Chomsky normal form if all productions are of the form A → BC or A → a, where A, B, C ∈ V and a ∈ T.
Example 7
The grammar S → AS | a, A → SA | b is in Chomsky normal form. The grammar S → AS | AAS, A → SA | aa is not in Chomsky normal form.
Theorem
Any cfg G = (V, T, S, P) with λ ∉ L(G) has an equivalent grammar Ĝ = (V̂, T̂, S, P̂) in Chomsky normal form.
Example 8
Convert the grammar with productions S → abA, A → aab, B → Ac to Chomsky normal form.
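A usual two-step conversion can be sketched in Python for this example: first replace every terminal in a right side of length two or more by a new variable that derives only that terminal, then split right sides longer than two with further new variables. The sketch assumes the input has no λ-productions or unit-productions. The function name `to_cnf` and the new variable names `T_a`, `D1`, `D2`, ... are hypothetical and assumed not to clash with existing variables.

```python
def to_cnf(productions):
    """Convert to Chomsky normal form (illustrative sketch).
    Input right sides are strings of single-character symbols,
    lowercase = terminal; output right sides are tuples of symbols."""
    term_var = {}                          # terminal -> new variable name

    # Step 1: replace terminals inside right sides of length >= 2.
    step1 = []
    for var, alts in productions.items():
        for rhs in alts:
            rhs = tuple(rhs)
            if len(rhs) >= 2:
                rhs = tuple(term_var.setdefault(s, "T_" + s) if s.islower() else s
                            for s in rhs)
            step1.append((var, rhs))
    for t, v in term_var.items():
        step1.append((v, (t,)))

    # Step 2: split right sides longer than two symbols.
    result, counter = {}, 0
    for var, rhs in step1:
        while len(rhs) > 2:
            counter += 1
            new = f"D{counter}"
            result.setdefault(var, []).append((rhs[0], new))
            var, rhs = new, rhs[1:]
        result.setdefault(var, []).append(rhs)
    return result


# Grammar of Example 8
P = {"S": ["abA"], "A": ["aab"], "B": ["Ac"]}
for var, alts in to_cnf(P).items():
    print(var, "->", " | ".join(" ".join(r) for r in alts))
```

For this input the sketch yields S → T_a D1, D1 → T_b A, A → T_a D2, D2 → T_a T_b, B → A T_c, together with T_a → a, T_b → b, T_c → c.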
Greibach Normal Form
Note: We put restrictions on the positions in which terminals and variables can appear.
Definition: A cfg is in Greibach normal form if all productions are of the form A → ax, where x ∈ V* and a ∈ T.
Note: The form A → ax is common to both Greibach normal form and s-grammars, but Greibach normal form does not impose the restriction that the pair (A, a) occur at most once.
Example 9
The grammar S → AB, A → aA | bB | b, B → b is not in Greibach normal form, because the right side of S → AB does not begin with a terminal.
Example 10
Convert the grammar S → abSb | aa to Greibach normal form.
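In this example every right side already begins with a terminal, so it is enough to replace the remaining terminals by new variables. The Python sketch below covers only that easy case and is not a general Greibach conversion. The name `to_gnf_leading_terminal` and the new variable names `T_a`, `T_b` are hypothetical.

```python
def to_gnf_leading_terminal(productions):
    """Greibach normal form for the easy case in which every right side
    already starts with a terminal (illustrative sketch only).
    Lowercase = terminal; output right sides are tuples of symbols."""
    result, term_var = {}, {}
    for var, alts in productions.items():
        result[var] = []
        for rhs in alts:
            if not rhs or not rhs[0].islower():
                raise ValueError(f"{var} -> {rhs!r} does not start with a terminal")
            tail = []
            for s in rhs[1:]:
                if s.islower():
                    # terminal after the first position: use a new variable
                    tail.append(term_var.setdefault(s, "T_" + s))
                else:
                    tail.append(s)
            result[var].append((rhs[0], *tail))
    for t, v in term_var.items():
        result[v] = [(t,)]
    return result


# Grammar of Example 10
P = {"S": ["abSb", "aa"]}
for var, alts in to_gnf_leading_terminal(P).items():
    print(var, "->", " | ".join(" ".join(r) for r in alts))
```

The result is S → a T_b S T_b | a T_a with T_b → b and T_a → a, where each right side is a terminal followed by variables only.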
Theorem
Any cfg G = (V, T, S, P) with λ ∉ L(G) has an equivalent grammar Ĝ = (V̂, T̂, S, P̂) in Greibach normal form.
A Membership Algorithm for cfg's*
CYK Algorithm
An algorithm to verify whether a given string belongs to the language generated by a given cfg. It follows the dynamic programming design paradigm. The given cfg G = (V, T, S, P) is assumed to be in Chomsky normal form. Given a string w = a_1 a_2 ··· a_n, we define substrings w_ij = a_i ··· a_j and subsets V_ij = {A ∈ V : A ⇒* w_ij} ⊆ V. Clearly, w ∈ L(G) ⇔ S ∈ V_1n.
Computing V_ij
Consider the two production forms allowed in Chomsky normal form:
For each i, A ∈ V_ii ⇔ (A → a_i) ∈ P;
For j > i, A ⇒* w_ij ⇔ there exists a production A → BC with B ⇒* w_ik and C ⇒* w_{k+1,j}, for some i ≤ k < j.
In other words,
V_ij = ⋃_{k ∈ {i, i+1, ..., j−1}} {A : A → BC, with B ∈ V_ik, C ∈ V_{k+1,j}}.
Bottom-up Approach
Use a bottom-up approach to compute all V_ij with the equations discussed:
Compute V_11, V_22, ..., V_nn,
Compute V_12, V_23, ..., V_{n−1,n},
Compute V_13, V_24, ..., V_{n−2,n},
and so on, until V_1n has been computed.
Algorithm - Pseudocode
MEMBERSHIP(G, w)
    n ← |w|
    for i ← 1 to n
        do V_ii ← {A : (A → a_i) ∈ P}
    for l ← 2 to n
        do for i ← 1 to n − l + 1
            do j ← i + l − 1
               V_ij ← ∅
               for k ← i to j − 1
                   do for each production (A → BC) ∈ P
                       do if B ∈ V_ik and C ∈ V_{k+1,j}
                           then V_ij ← V_ij ∪ {A}
    if S ∈ V_1n
        then return w ∈ L(G)
        else return w ∉ L(G)
Example 11
Determine whether the string w = aabbb is in the language generated by the grammar S → AB, A → BB | a, B → AB | b.
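The membership test for this example can be sketched in Python, filling the sets V_ij bottom-up by substring length. The function names `cyk_table` and `member` and the dict encoding of the grammar are assumptions of the sketch. The grammar is assumed to be in Chomsky normal form with single-character symbols.

```python
def cyk_table(productions, w):
    """Return the table V with V[i, j] = set of variables deriving
    w_i ... w_j (1-based, inclusive). Illustrative sketch."""
    n = len(w)
    V = {}
    # substrings of length 1: productions of the form A -> a_i
    for i in range(1, n + 1):
        V[i, i] = {A for A, alts in productions.items() if w[i - 1] in alts}
    # longer substrings, shortest first
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            V[i, j] = set()
            for k in range(i, j):
                for A, alts in productions.items():
                    for rhs in alts:
                        if (len(rhs) == 2 and rhs[0] in V[i, k]
                                and rhs[1] in V[k + 1, j]):
                            V[i, j].add(A)
    return V


def member(productions, start, w):
    """True if w is generated from start; a CNF grammar never derives lambda."""
    if not w:
        return False
    return start in cyk_table(productions, w)[1, len(w)]


# Grammar and string of Example 11
G = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
V = cyk_table(G, "aabbb")
print(sorted(V[1, 5]))           # ['B', 'S']
print(member(G, "S", "aabbb"))   # True
```

Since S ∈ V_15, the string aabbb is in the language. The table costs O(n^3) combinations of (i, j, k), each checked against every production.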