6

I'm trying to learn basics of the Lemon parser generator, but I stuck quickly.

Here's a tiny grammar:

%right PLUS_PLUS.
%left DOT.

program ::= expr.

member_expr ::= expr DOT IDENTIFIER.

lhs_expr ::= member_expr.

expr ::= lhs_expr.
expr ::= PLUS_PLUS lhs_expr.

It causes 1 parsing conflict:

State 3:
      (3) expr ::= lhs_expr *
      (4) expr ::= PLUS_PLUS lhs_expr *

                           DOT reduce       3      expr ::= lhs_expr
                           DOT reduce       4       ** Parsing conflict **
                     {default} reduce       4      expr ::= PLUS_PLUS lhs_expr

Whereas, if I rewrite the last rule as follows:

expr ::= PLUS_PLUS expr DOT IDENTIFIER.

Then it causes no conflicts. But I don't think that this is the right way to go.

I'd appreciate if someone could explain what's the right way, and why.

2
  • I suggest you think of this without the extra names for things. Everywhere you have lhs_expr you could simply write expr DOT IDENTIFIER to see clearly what is really being asked for. If everything is in terms of expr you can see the conflict more clearly. Commented Jan 6, 2017 at 22:48
  • Thing is, lhs_expr can be something other than expr DOT IDENTIFIER. This particular grammar does not contain other rules for it, because I wanted it to be very minimal, but it can also be e.g. IDENTIFIER, or expr LBRACKET expr RBRACKET, etc. And I want to write the rule PLUS_PLUS lhs_expr just once, and cover all possible left-hand side expressions being pre-incremented. Commented Jan 6, 2017 at 22:52

1 Answer 1

5
+150

So you wrote an ambiguous grammar, which says to accept:

 ++ x . y

with two interpretations:

 [++ x ] . y

and

 ++ [x . y]

where the [ ] are just my way to show to the groupings.

Lemon is an L(AL)R parser, and such parsers simply don't handle ambiguities (multiple interpretations). The reduce-reduce conflict reported is what happens when the parser hits that middle dot; does it group "++ x" as "[++ x] ." or as "++ [ x .]"? Both choices are valid, and it can't choose safely.

If you stick with Lemon (or another LALR parser generator), you have to get rid of the problem by changing the grammar. [You could use a GLR parser generator; it would accept and give you both parses. But all you've done then is to push the problem of resolving the ambiguity to the post-parsing phrase. As you don't want an ambiguity, you might as well avoid it during parsing if you can. In this case I think you can.]

I think you are trying to build a C-like language. So you want something like this:

primitive_target ::= IDENTIFIER ;
primitive_target ::= IDENTIFIER '[' expr ']' ;
access_path ::= primitive_target ;
access_path ::= access_path '.' primitive_target ;

lhs ::= access_path ;
lhs ::= PLUS_PLUS access_path ;
lhs ::= access_path PLUS_PLUS ;

program ::= expr ;

expr ::= term ;
expr ::= expr '+' term ;
term :::= '(' expr ')' ;
term ::= lhs ;
term ::= lhs '=' expr ;
term ::= constant ;
Sign up to request clarification or add additional context in comments.

7 Comments

Thanks for the answer. But, shouldn't the ambiguity of ++ x . y be solved by the priorities of PLUS_PLUS and DOT? Given my grammar, DOT has a higher priority (because it appears after PLUS_PLUS).
I'm don't know ANTLR's %right and %left declarations. What you in essence want is something that forces a shift when you encounter dot with left context containing ++. It isn't clear to me that %right and %left force shifting; my guess is they force associativity on binary operators, but PLUS_PLUS isn't a binary operator so it might be meaningless. My cure was to avoid those special declarations and force the grammar to do what I (well, you) wanted directly.
It's not ANTLR, but Lemon. Anyway, thank you so much!
Oh, yes, I knew that when I wrote the answer, lost in fog over last several days. You're welcome.
Still, this grammar won't parse expressions like foo.bar.baz.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.