First set

The FIRST set of a symbol in a grammar is the set of tokens that can appear at the beginning of some string derived from that symbol. For each non-terminal symbol (e.g., S, E), the FIRST set includes:

  1. Terminal tokens that appear first in any rule for that symbol.
  2. Tokens from the FIRST set of other symbols if those symbols can appear first in the derivation.

Constructing the FIRST set

Follow set

The FOLLOW set of a symbol in a grammar represents the set of tokens that can appear immediately after that symbol in some derivation of the grammar. This set is crucial for constructing parsers, particularly for handling situations like predictive parsing (LL parsers).

  1. End-of-input token ($): The start symbol always has $ in its FOLLOW set since it should be followed by the end of the input.
  2. Tokens after a symbol: If a symbol S is followed by some token t in a rule, t will be in the FOLLOW set of S.
  3. Tokens from FIRST sets: If a symbol S is followed by another symbol A, and A can produce some token t (i.e., t is in the FIRST set of A), then t will also be in the FOLLOW set of S.
  4. Propagation from FOLLOW sets: If a symbol S is followed by another symbol A at the end of a rule, then all tokens in the FOLLOW set of A are added to the FOLLOW set of S.

Constructing the FOLLOW set

For more info visit the dotlr library docs