## Grammars

By introducing more or less restrictive criteria on the form of the grammar rules, we obtain hierarchical grammar classes (types of grammars), ordered by inclusion. The classification of grammars, defined in 1957 by Noam Chomsky, distinguishes four classes.

## Chomsky classification

**Type 0:** no restriction on the rules.

**Type 1:** context sensitive or contextual grammars. The rules of R are of the form:

- uAv → uwv with A ∈ N, u, v ∈ (N ∪ T)
^{∗ }and w ∈ (N ∪ T)^{+} - In other words, the non-terminal symbol A is replaced by w if we have the contexts u on the left and v on the right.

**Type 2:** context-free grammars. The rules of R are of the form:

- A → w with A ∈ N and w ∈ (N ∪ T)
^{∗} - In other words, the left side of each rule consists of a single non-terminal symbol.

**Type 3:** regular grammars

- to the right. The rules of R are of the form

A → aB or A → a with A, B ∈ N and a ∈ T - to the left. The rules of R are of the form

A → Ba or A → a with A, B ∈ N and a ∈ T - In other words, the left side of each rule consists of a single non-terminal symbol, and the right side consists of a terminal symbol and possibly a non-terminal symbol. For regular grammars on the right, the non-terminal symbol must always be to the right of the terminal symbol while for regular grammars on the left it must be on the left.

## Example 1

We consider the language L of words on {0, 1} which represent even unsigned base 2 integers (the words of this language all end with 0 and do not start with 0, except for the zero integer). Formally define L and construct a regular grammar describing L.

The language is of the following form: L = {0, 1u0 / u ∈ {0, 1}^{∗}}

We define the regular right-hand grammar G = (T, N, S, R) where

T = {0, 1}

N = {S, U}

R = {S → 0 | 1U

U → 1U | 0U | 0}

as well as the regular left grammar G = (T, N, S, R) where

T = {0, 1}

N = {S, U}

R = {S → 0 | U0

U → U1 | U0 | 1}

Preferably, we always take the regular right-hand grammar for the study of future states in automata.

## Example 2

Here are the grammar productions: P → aP, P → aQ, Q → bP, Q → R, R → bR, R → cQ, R → bP, R → ε. The non-terminals of the grammar are {P, Q, R}, the initial symbol is P.

By denoting with X_{p}, X_{q}, X_{r} the languages accepted from the states P, Q and R respectively, the system of equations for these languages is: