Ambiguities and Conflicts

The preceding grammar is ambiguous. For instance, a phrase of the form exp '-' exp followed by a minus '-' can be parsed in more than one way. An expression like:

               4 - 3 - 1

is ambiguous. If you can't see it, it is because after so many years in school your mind has ruled out one of the interpretations. The two interpretations of this phrase are:

               (4 - 3) - 1
               4 - (3 - 1)
On our planet the first interpretation, which evaluates to 0, is preferred over the second, which evaluates to 2.

The activity of an LALR(1) parser (the family of parsers to which Eyapp belongs) consists of a sequence of shift and reduce actions. For the input NUM - NUM - NUM the activity will be as follows (the dot marks the position of the next input token):

.NUM - NUM - NUM # shift
 NUM.- NUM - NUM # reduce exp: NUM 
 exp.- NUM - NUM # shift
 exp -.NUM - NUM # shift
 exp - NUM.- NUM # reduce exp: NUM
 exp - exp.- NUM # shift/reduce conflict
At this point two different decisions can be taken. The next configuration can be
 exp.- NUM # reduce by exp: exp '-' exp
or:
 exp - exp -.NUM # shift '-'
This is called a shift-reduce conflict: the parser must decide whether to shift the next '-' or to reduce by the rule exp: exp '-' exp. A shift-reduce conflict means that the parser cannot decide whether to group the phrase processed so far (left association) or to keep reading input and group later (right association). This inability usually arises because the grammar is ambiguous, but it can also have other causes, such as the myopia of the parser, which can see only one token ahead.
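
The two decisions can be tried out with a toy shift-reduce loop. The following is a minimal sketch in Python, not Eyapp code: the function name parse and the on_conflict parameter are made up for this illustration, and the input is assumed to be a well-formed list of numbers separated by '-'.

```python
def parse(tokens, on_conflict):
    """Parse NUM ('-' NUM)* with a toy shift-reduce loop.

    on_conflict ('reduce' or 'shift') is the decision taken whenever the
    stack ends in exp '-' exp and the next token is another '-'.
    Returns (parenthesized text, value) for the whole expression.
    """
    stack = []            # '-' tokens and (text, value) pairs standing for exp
    rest = list(tokens)   # input not yet consumed
    while True:
        handle = (len(stack) >= 3 and isinstance(stack[-1], tuple)
                  and stack[-2] == '-')
        if handle and not (rest and on_conflict == 'shift'):
            # reduce by exp: exp '-' exp
            rtext, rval = stack.pop()
            stack.pop()
            ltext, lval = stack.pop()
            stack.append((f"({ltext} - {rtext})", lval - rval))
        elif rest:
            token = rest.pop(0)
            if token == '-':
                stack.append(token)                  # shift '-'
            else:
                stack.append((str(token), token))    # shift NUM, reduce exp: NUM
        else:
            return stack[0]


print(parse([4, '-', 3, '-', 1], 'reduce'))   # left association
print(parse([4, '-', 3, '-', 1], 'shift'))    # right association
```

Choosing to reduce at the conflict yields the grouping (4 - 3) - 1, while choosing to shift yields 4 - (3 - 1).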

Another kind of conflict is the reduce-reduce conflict. It arises when more than one right-hand side (rhs) can be applied in a reduction action.
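
As a rough illustration, the Python sketch below lists the rules whose rhs matches the top of the stack. The rule set is hypothetical (the nonterminal index is invented for the example and is not part of the grammar discussed here), and a real LALR(1) generator is more selective: it reports a reduce-reduce conflict only when both reductions are valid in the same parser state for the same lookahead token.

```python
# Hypothetical rules, not taken from the grammar discussed in the text.
RULES = [
    ("exp", ["NUM"]),
    ("exp", ["exp", "-", "exp"]),
    ("index", ["NUM"]),      # a second rule with the same right-hand side
]


def applicable_reductions(stack):
    """Return the rules whose right-hand side matches the top of the stack."""
    return [(lhs, rhs) for lhs, rhs in RULES
            if len(rhs) <= len(stack) and stack[-len(rhs):] == rhs]


candidates = applicable_reductions(["NUM"])
if len(candidates) > 1:
    print("reduce-reduce conflict between:", candidates)
```

With NUM on top of the stack both exp: NUM and index: NUM apply, so the sketch cannot tell which reduction to perform.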

The precedence declarations in the head section tell the parser what to do in case of ambiguity.

By associating priorities with tokens the programmer can tell Eyapp what syntax tree to build in case of conflict.

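The declarations %nonassoc, %left and %right declare and associate a priority with the tokens that follow them. Tokens declared on the same line share the same priority, and tokens declared on later lines have higher priority than those declared earlier.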

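When there is a shift-reduce conflict, the precedence of the rule and the precedence of the incoming token are compared: if the rule has higher precedence the parser reduces, and if the token has higher precedence it shifts. By default, the precedence of a rule is that of the last token in its right-hand side. Thus, in the example we are saying that '+' and '-' have the same precedence, which is higher than that of '='. The net effect of '-' having higher precedence than '=' is that an expression like a=4-5 is interpreted as a=(4-5) and not as (a=4)-5.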

The use of %left applied to '-' indicates that, in case of ambiguity and a tie between precedences, the parser must build the tree corresponding to a left parenthesization. Thus, 4-5-9 is interpreted as (4-5)-9.
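
The way precedence and associativity settle a shift-reduce conflict can be sketched as a small Python function. This is an illustration of the general rule followed by yacc-style parsers, not Eyapp's actual implementation. The names DECLARATIONS, PRECEDENCE and resolve are invented here, and declaring '=' as right associative is an assumption (the text only states that '=' has the lowest priority).

```python
# Declarations listed from lowest to highest priority, as in a head section:
#   %right '='        (associativity of '=' is an assumption)
#   %left  '+' '-'
DECLARATIONS = [
    ("right", ["="]),
    ("left", ["+", "-"]),
]

# Map each token to (priority level, associativity).
PRECEDENCE = {tok: (level, assoc)
              for level, (assoc, toks) in enumerate(DECLARATIONS, start=1)
              for tok in toks}


def resolve(rule_token, lookahead):
    """Decide a shift-reduce conflict.

    rule_token is the token that gives the rule its precedence (by default
    the last token of its right-hand side); lookahead is the incoming token.
    """
    rule_level, _ = PRECEDENCE[rule_token]
    look_level, look_assoc = PRECEDENCE[lookahead]
    if rule_level > look_level:
        return "reduce"
    if rule_level < look_level:
        return "shift"
    # Same level: associativity decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[look_assoc]


print(resolve("-", "-"))   # stack: exp '-' exp, next token '-'
print(resolve("=", "-"))   # stack: exp '=' exp, next token '-'
```

For 4-5-9 the rule exp: exp '-' exp ties with the incoming '-', and %left selects the reduction, giving (4-5)-9. For a=4-5 the incoming '-' outranks the rule containing '=', so the parser shifts and builds a=(4-5).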

As mentioned earlier, the %prec directive can be used when a rhs is involved in a conflict and either contains no tokens or contains tokens whose last one has a precedence that leads to an incorrect interpretation. A rhs can be followed by an optional %prec token directive, which gives the production the precedence of that token:

exp:   '-' exp %prec NEG { -$_[2] }
This solves the conflict in - NUM - NUM between (- NUM) - NUM and - (NUM - NUM). Since NEG has higher priority than '-', the first interpretation wins.
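
The effect of the precedence given to the unary minus rule can be checked with a toy evaluator. The Python sketch below is only an illustration under stated assumptions: the function evaluate and its unary_prec parameter (which plays the role of %prec) are invented names, and LOW is a hypothetical token with lower priority than '-', included only to show what the other choice would produce.

```python
# Priority levels, as if declared in this order (later lines bind tighter):
#   %left LOW    (hypothetical token, used here only for contrast)
#   %left '-'
#   %left NEG
LEVEL = {"LOW": 0, "-": 1, "NEG": 2}


def evaluate(tokens, unary_prec="NEG"):
    """Evaluate numbers combined with binary and unary '-'.

    unary_prec names the token whose precedence is given to the rule
    exp: '-' exp, playing the role of %prec.
    """
    stack = []            # ints stand for exp; "unary"/"binary" mark shifted '-'
    rest = list(tokens)
    while True:
        top_is_exp = bool(stack) and isinstance(stack[-1], int)
        if top_is_exp and len(stack) >= 2:
            kind = stack[-2]
            rule_level = LEVEL[unary_prec] if kind == "unary" else LEVEL["-"]
            # The lookahead is either '-' or the end of the input.
            # Reduce at end of input, or when the rule wins or ties
            # (a tie reduces because '-' is left associative).
            if not rest or rule_level >= LEVEL["-"]:
                right = stack.pop()
                stack.pop()
                if kind == "unary":
                    stack.append(-right)                # exp: '-' exp
                else:
                    stack.append(stack.pop() - right)   # exp: exp '-' exp
                continue
        if not rest:
            return stack[0]
        token = rest.pop(0)
        if token == "-":
            stack.append("binary" if top_is_exp else "unary")
        else:
            stack.append(token)                         # NUM, reduced to exp at once


print(evaluate(["-", 4, "-", 3], unary_prec="NEG"))   # (- 4) - 3
print(evaluate(["-", 4, "-", 3], unary_prec="LOW"))   # - (4 - 3)
```

With the precedence of NEG the unary rule is reduced before the second '-' is shifted, so the input is read as (- 4) - 3. With a precedence lower than that of '-', the parser shifts instead and the input is read as - (4 - 3).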

Procesadores de Lenguajes 2010-01-31