Skip to content

Pushdown Automata

A pushdown automaton (PDA) is essentially an ε-NFA with a stack. The stack provides unlimited memory, making PDAs more powerful than finite automata.


Formal Definition

A PDA is a 7-tuple P=(Q,Σ,Γ,δ,q0,Z0,F):

  • Q: finite set of states
  • Σ: input alphabet
  • Γ: stack alphabet
  • δ: transition function Q×(Σ{ε})×Γ finite subsets of Q×Γ
  • q0Q: start state
  • Z0Γ: initial stack symbol
  • FQ: accepting states

Transition Meaning

δ(q,a,X)={(p,γ)} means:

In state q, reading input a, with X on top of stack, go to state p and replace X by γ on the stack.

  • If a=ε: move allowed without consuming input (spontaneous)
  • If γ=ε: pop X (push nothing)
  • If γ=YZ: push Z, then Y (convention: leftmost symbol goes on top)

Instantaneous Descriptions (ID)

(q,w,α) represents:

  • State q
  • Remaining input w
  • Stack contents α (top at left)

Moves

(q,aw,Xβ)(p,w,γβ) if (p,γ)δ(q,a,X)

= zero or more moves.


Two Notions of Acceptance

Acceptance by Final State

L(P)={w(q0,w,Z0)(q,ε,α) for some qF,αΓ}

The PDA accepts if, after reading all input, it reaches a final state (stack contents irrelevant).

Acceptance by Empty Stack

N(P)={w(q0,w,Z0)(q,ε,ε)}

The PDA accepts if, after reading all input, the stack is empty (final state irrelevant).

Equivalence

These two notions are equivalent. Given a PDA accepting by final state, we can construct one accepting by empty stack (and vice versa).


Equivalence of PDA and CFG

Theorem: A language is context-free iff it is accepted by some PDA.

Direction 1: CFG PDA

Let L=L(G) for some CFG G. Construct PDA P that simulates leftmost derivations.

Intuition: At each step, P represents some left-sentential form (LSF) that has already been partially matched against the input. The stack stores the unmatched suffix of the LSF (variables and terminals, with leftmost at top).

Transition function:

  1. Type 1: δ(q,a,a)={(q,ε)} for aΣ
    • Match input symbol with terminal on top of stack
  2. Type 2: For each production Aα: δ(q,ε,A)={(q,α)}
    • Expand variable by its production body (replace A with α on stack)

Proof of correctness: We prove (q,wx,S)(q,x,α) iff Slmwα (leftmost derivation).

This ensures L(P)=L(G).

Direction 2: PDA CFG

Given PDA P accepting by empty stack, construct CFG G.

Key idea: Variables of G are of the form [pXq], representing:

Starting in state p with X on top of stack, eventually pop X and reach state q (with empty stack relative to X).

Productions:

  • Basis: If (r,ε)δ(p,a,X), then [pXr]a (pop X in one move)
  • One intermediate: If (r,Y)δ(p,a,X), then [pXq]a[rYq] for all states q
  • Two intermediate: If (r,YZ)δ(p,a,X), then [pXq]a[rYs][sZq] for all states s,q
  • General: If (r,Y1Y2Yk)δ(p,a,X), we need variables for each intermediate state

Start symbol: S with productions S[q0Z0q] for all states q.

Proof: (q0,w,Z0)(p,ε,ε) iff [q0Z0p]w.


Properties of Context-Free Languages

Decision Properties

Given a CFG (or PDA), we can decide:

  1. Emptiness: Is L(G)=? — Check if start symbol is a "useful" variable (derives some terminal string). Solved during useless-variable elimination.
  2. Membership: Is wL(G)? — Use the CYK algorithm, O(n3) where n=|w|.
  3. Infiniteness: Is L(G) infinite? — Similar to regular languages: find if there is a cycle in the variable dependency graph that generates non-empty strings.

Non-Decision Properties (Undecidable)

Questions that cannot be decided for CFLs (but can for regular languages):

  • Are two CFLs equal? (L(G1)=L(G2))
  • Are two CFLs disjoint? (L(G1)L(G2)=)
  • Is a CFL equal to Σ?
  • Is the complement of a CFL also context-free?

These require Turing machine theory to prove.

CYK Algorithm (Cocke-Younger-Kasami)

Input: A CFG G in CNF and string w=a1a2an. Output: Whether wL(G).

Dynamic Programming approach: Let Xij be the set of variables that derive aiai+1aj.

  • Basis (j=i): Xii={AAai is a production}
  • Induction (i<j): Xij={AABC, and k,ik<j:BXik,CXk+1,j}
  • Accept: wL(G) iff SX1n

The algorithm fills an n×n table bottom-up. Time complexity: O(n3).

Closure Properties

CFLs are closed under:

  • Union (LM): Add SSLSM to combined grammar
  • Concatenation (LM): Add SSLSM to combined grammar
  • Kleene Star (L): Add SSLSε
  • Reversal (LR): Reverse all right sides in productions
  • Homomorphism (h(L)): Replace each terminal a by h(a)
  • Substitution: Replace each terminal a by La (a CFL)
  • Inverse Homomorphism (h1(L)): Use PDA construction — buffer h(a) in finite control

CFLs are NOT closed under:

  • Intersection: {0n1n2n} is intersection of two CFLs ({0n1n2i} and {0i1n2n}), but is not context-free
  • Complement: If closed, intersection would be closed (L1L2=L1L2)
  • Difference: Similar argument

However, the intersection of a CFL with a regular language IS a CFL (simulate DFA and PDA in parallel — product construction with stack from PDA).


Deterministic PDA (DPDA)

A PDA is deterministic if for every configuration, at most one move is possible:

  1. For each q,a,X, at most one choice (including ε-moves)
  2. If δ(q,ε,X), then δ(q,a,X)= for all aΣ

DPDAs accept a proper subclass of CFLs called deterministic CFLs (DCFLs). DCFLs are important because they can be parsed efficiently (LR parsers). DCFLs are closed under complement.