Pumping Lemma for Context-Free Languages
The Pumping Lemma for CFLs is a tool for proving that certain languages are not context-free. Like the pumping lemma for regular languages, it's a necessary but not sufficient condition.
Statement
If
(the "pumped" portion plus context is bounded) (at least one of or is non-empty) - For all
, (both and can be pumped simultaneously)
Intuition
Consider a parse tree for a string in a CNF grammar. If the string is long enough, some path from root to leaf must be longer than the number of variables — by the pigeonhole principle, some variable must repeat.
The repeated variable corresponds to a recursive structure: the subtree rooted at the upper occurrence can be "pumped" by repeating (or removing) the lower occurrence.
and are the terminal strings generated from the two occurrences of the repeated variable is what's between them and are the prefix and suffix
Proof of the Pumping Lemma
Let
be a CFG in CNF for . Let
be the number of variables. Choose
. Consider a parse tree for string
with . In a binary tree, the longest path has length , so it contains variables — by pigeonhole, some variable appears twice on this path. Let the upper
generate (eventually) and the lower generate . From the derivation chain:
We can pump:
: (remove the step) : Insert the step twice : Insert times
because the subtree rooted at the upper has depth , so its yield length . because CNF has no unit productions — at least one leaf must be generated.
Proving Non-Context-Free
Strategy
- Assume
is context-free, let be the PL constant - Choose
with — critically, must consider ALL legal decompositions - Consider any decomposition
with and - Find some
such that - Contradiction —
is not context-free
Classic Non-CFL Examples
1.
Choice of
Since
Case analysis:
- If
or contains and : has 's and 's interleaved — not in - If
or contains and : has 's and 's interleaved — not in - If
and are all from one or two symbols: pumping changes the count of one or two symbols but not all three, breaking the equal-count property
In all cases,
2.
Choice of
Since
Alternative choice:
3.
Choice of
Since
- If
contains no 2's: pumping up adds 0's or 1's, can make or - If
contains 2's but no 0's: pumping down removes 2's, can make
Each case leads to a string violating
Ogden's Lemma (Extension)
A stronger version of the Pumping Lemma that allows marking distinguished positions. Useful for proving languages like
Statement
If
contains at most distinguished positions contains at least one distinguished position for all
Comparison: Regular vs. Context-Free Pumping Lemma
| Feature | Regular PL | CFL PL |
|---|---|---|
| Pumped portions | One substring | Two substrings |
| Bound | ||
| Non-empty | ||
| Derives from | State repetition (cycle) | Variable repetition (recursion) |
| Recognizes | Finite memory | Stack memory (nested structure) |