flowchart LR

classDef blue fill:#2374f7,stroke:#000,stroke-width:2px,color:#fff
classDef pink fill:#eb3dd6,stroke:#000,stroke-width:2px,color:#fff
classDef orange fill:#fc822b,stroke:#000,stroke-width:2px,color:#fff
classDef red fill:#ed2633,stroke:#000,stroke-width:2px,color:#fff
classDef green fill:#16b522,stroke:#000,stroke-width:2px,color:#fff

init[Build a CNF]

init --> s1(Eliminate Useless Variable):::orange
init --> s2(Remove Epsilon):::blue
init --> s3(Remove Unit Production):::green

s1 --> s11(Derive-Nothing Symbols)
s1 --> s12(Unreachable Symbols)

I. Eliminating Useless Variables


1. Derive-Nothing Symbols

<aside> 💡 Discovery Algorithms

Untitled

  1. Whether a variable derives some terminal string

    Basis: If there is a production $A → w$, where $w$ has no variables, then A derives a terminal string.

    Induction: If there is a production $A → a$, where $a$ consists only of terminals and variables known to derive a terminal string, then A derives a terminal string.

  2. Algorithm to eliminate variables that derive nothing

    1. Discover all variables that derive terminal strings.
    2. For all other variables, remove all productions in which they appear in either the head or body.

2. Unreachable Symbols

Another way a terminal or variable deserves to be eliminated is if it cannot appear in any derivation from the start symbol.

  1. Whether a symbol is reachable

    Basis: We can reach $S$ (the start symbol).

    Induction: if we can reach $A$, and there is a production $A \rightarrow \alpha$, then we can reach all symbols of $\alpha$.

  2. Algorithm to remove symbols that are not reachable

    Remove from the grammar all symbols not discovered reachable from $S$ and all productions that involve these symbols.

3. Together

II. Removing Epsilon


Epsilon Productions: productions of the form $\rm A \rightarrow \epsilon$

ε cannot be in the language of any grammar that has no ε–productions, and hence should be removed