Skip to content

Pumping Lemma for Context-Free Languages

The Pumping Lemma for CFLs is a tool for proving that certain languages are not context-free. Like the pumping lemma for regular languages, it's a necessary but not sufficient condition.


Statement

If L is a context-free language, then there exists a constant n1 (the pumping length) such that for all strings zL with |z|n, we can write z=uvwxy such that:

  1. |vwx|n (the "pumped" portion plus context is bounded)
  2. |vx|1 (at least one of v or x is non-empty)
  3. For all i0, uviwxiyL (both v and x can be pumped simultaneously)

Intuition

Consider a parse tree for a string in a CNF grammar. If the string is long enough, some path from root to leaf must be longer than the number of variables — by the pigeonhole principle, some variable must repeat.

The repeated variable corresponds to a recursive structure: the subtree rooted at the upper occurrence can be "pumped" by repeating (or removing) the lower occurrence.

  • v and x are the terminal strings generated from the two occurrences of the repeated variable
  • w is what's between them
  • u and y are the prefix and suffix

Proof of the Pumping Lemma

  1. Let G be a CFG in CNF for L{ε}.

  2. Let m=|V| be the number of variables.

  3. Choose n=2m.

  4. Consider a parse tree for string z with |z|n. In a binary tree, the longest path has length m+1, so it contains m+1 variables — by pigeonhole, some variable A appears twice on this path.

  5. Let the upper A generate vAx (eventually) and the lower A generate w.

  6. From the derivation chain:

    SuAyuvAxyuvwxy=z
  7. We can pump:

    • i=0: SuAyuwy (remove the AvAx step)
    • i=2: Insert the AvAx step twice
    • i=k: Insert k times
  8. |vwx|n because the subtree rooted at the upper A has depth m, so its yield length 2m=n.

  9. |vx|1 because CNF has no unit productions — at least one leaf must be generated.


Proving Non-Context-Free

Strategy

  1. Assume L is context-free, let n be the PL constant
  2. Choose zL with |z|n — critically, must consider ALL legal decompositions
  3. Consider any decomposition z=uvwxy with |vwx|n and |vx|1
  4. Find some i such that uviwxiyL
  5. ContradictionL is not context-free

Classic Non-CFL Examples

1. L={anbncnn1}

Choice of z: z=anbncn (where n is the PL constant).

Since |vwx|n, the substring vwx can span at most two of the three blocks (a's, b's, c's).

Case analysis:

  • If v or x contains a and b: uv2wx2y has a's and b's interleaved — not in L
  • If v or x contains b and c: uv2wx2y has b's and c's interleaved — not in L
  • If v and x are all from one or two symbols: pumping changes the count of one or two symbols but not all three, breaking the equal-count property

In all cases, uv2wx2yLcontradiction. L is not context-free.

2. L={www{0,1}}

Choice of z: z=0n1n0n1n (palindromic-like choice exploits the bounded window).

Since |vwx|n, vwx falls entirely within one of the four quarters. Pumping changes only that quarter, breaking the ww pattern.

Alternative choice: z=0n10n1 — simpler, similar argument.

3. L={0i1j2ki<j<k}

Choice of z: z=0n1n+12n+2

Since |vwx|n, vwx spans at most two symbols.

  • If vx contains no 2's: pumping up adds 0's or 1's, can make ij or jk
  • If vx contains 2's but no 0's: pumping down removes 2's, can make kj

Each case leads to a string violating i<j<kcontradiction.


Ogden's Lemma (Extension)

A stronger version of the Pumping Lemma that allows marking distinguished positions. Useful for proving languages like {aibjckij or jk} are inherently ambiguous.

Statement

If L is context-free, there exists n such that for any zL, if we mark at least n positions in z as distinguished, we can write z=uvwxy such that:

  1. vwx contains at most n distinguished positions
  2. vx contains at least one distinguished position
  3. uviwxiyL for all i0

Comparison: Regular vs. Context-Free Pumping Lemma

FeatureRegular PLCFL PL
Pumped portionsOne substring yTwo substrings v,x (pumped in tandem)
Bound|xy|n|vwx|n
Non-empty|y|>0|vx|1
Derives fromState repetition (cycle)Variable repetition (recursion)
RecognizesFinite memoryStack memory (nested structure)