flowchart LR
classDef blue fill:#2374f7,stroke:#000,stroke-width:2px,color:#fff
classDef pink fill:#eb3dd6,stroke:#000,stroke-width:2px,color:#fff
classDef orange fill:#fc822b,stroke:#000,stroke-width:2px,color:#fff
classDef red fill:#ed2633,stroke:#000,stroke-width:2px,color:#fff
classDef green fill:#16b522,stroke:#000,stroke-width:2px,color:#fff
init[Build a CNF]
init --> s1(Eliminate Useless Variable):::orange
init --> s2(Remove Epsilon):::blue
init --> s3(Remove Unit Production):::green
s1 --> s11(Derive-Nothing Symbols)
s1 --> s12(Unreachable Symbols)
<aside> 💡 Discovery Algorithms
Whether a variable derives some terminal string
Basis: If there is a production $A → w$, where $w$ has no variables, then A derives a terminal string.
Induction: If there is a production $A → a$, where $a$ consists only of terminals and variables known to derive a terminal string, then A derives a terminal string.
Algorithm to eliminate variables that derive nothing
Another way a terminal or variable deserves to be eliminated is if it cannot appear in any derivation from the start symbol.
Whether a symbol is reachable
Basis: We can reach $S$ (the start symbol).
Induction: if we can reach $A$, and there is a production $A \rightarrow \alpha$, then we can reach all symbols of $\alpha$.
Algorithm to remove symbols that are not reachable
Remove from the grammar all symbols not discovered reachable from $S$ and all productions that involve these symbols.
Epsilon Productions: productions of the form $\rm A \rightarrow \epsilon$
ε cannot be in the language of any grammar that has no ε–productions, and hence should be removed