# Grammar

By introducing more or less restrictive criteria on the form of the rules of grammar, one obtains classes of hierarchical grammars, ordered by inclusion. The classification of grammars, defined in 1957 by Noam Chomsky, distinguishes four classes.

## Chomsky’s Classes

Type 0 : no restriction on the rules.

Type 1 : context-sensitive grammars. The rules of R are of the form:

• uAv → uwv with A ∈ N, u, v ∈ (N ∪ T)∗ and w ∈ (N ∪ T)+
• In other words, the non-terminal symbol A is replaced by w if we have the contexts u on the left and v on the right.

Type 2 : context-free grammars. The rules of R are of the form:

• A → w with A ∈ N and w ∈ (N ∪ T)
• In other words, the left-hand side of each rule consists of a single non-terminal symbol.

Type 3 : regular grammars

• to the right. The rules of R are of the form
A → aB or A → a with A, B ∈ N and a ∈ T
• to the left. The rules of R are of the form
A → Ba or A → a with A, B ∈ N and a ∈ T
• In other words, the left-hand side of each rule consists of a single non-terminal symbol, and the right-hand side consists of a terminal symbol and possibly a non-terminal symbol. For right regular grammars, the non-terminal symbol should always be on the right of the terminal symbol, while for left regular grammars it should be on the left.

## Example 1

We consider the language L of words on {0, 1} which represent unsigned integers in base 2 (the words of this language all end with 0 and do not start with 0, except for the null integer). Formally define L and build a regular grammar describing L.

The language is as following: L = {0, 1u0/ u ∈ {0, 1}}

We define a right regular grammar G = (T, N, S, R) where
T = { a, b, c }
N = { S, U }
R = { S → 0 | 1U
U → 1U | 0U | 0 }
or a left regular grammar G = (T, N, S, R) where
T = { a, b, c }
N = { S, U }
R = { S → 0 | U0
U → U1 | U0 | 1 }

Preferably, we always take the right regular grammar.

## Example 2

Here are the grammar rules: P → aP, P → aQ, Q → bP, Q → R, R → bR, R → cQ, R → bP, R → ε. The non-terminals of the grammar are {P, Q, R}, the initial symbol is P.

Let Xp, Xq, Xr be the languages from non-terminals P, Q and R. The system of equations that describes those languages is as follows: 