Thompson's algorithm or Thompson's construction makes it possible to construct a automaton non-deterministic with epsilon-transition from a regular expression. The automaton is built recursively from basic patterns:
Unit elements like ∅, ε and a are recognized by
We consider the operators of the regular expression R in decreasing order of exteriority. The outermost operator makes it possible to divide R in R1 | R2, R1.R2 or R1 * which are recognized by
Here is an example of construction with the expression (a | b) (a ∗ | ba ∗ | b ∗) ∗
We start the recursion by creating (A) (B) - the content is only indicative, we will see how to fill A and B as we go.
Then by placing the star on (B) * - here we take the opportunity to do the OR function in (A) = (a | b).
The next step is to do the OR function in (B) = (C | D | E)
The process contained by descending recursively until obtaining the final Thompson automaton.
The latter should then be determined and minimized for better reading.
Some simple examples
- (0 + 1)01
- 00(0 + 1)*