0

I wrote a pretty simple left-recursive grammar in yacc, based on data types for a simple C-like language. The generated yacc parser reduces before the left recursion suffix, and I have no clue why.

Here is the source code:

%%
start: type {
        printf("Reduced to start.\n");
    };

type: pointer {
        printf("Reduced pointer to type.\n");
    } 
    | char {
        printf("Reduced char to type.\n");
    };

char: KW_CHAR {
        printf("Reduced to char.\n");
    };

pointer: type ASTERISK {
        printf("Reduced to pointer.\n");
    };
%%

Given the input char * (KW_CHAR ASTERISK):

Reduced to char.
Reduced char to type.
syntax error
2
  • 1
    I think you have something wrong with your source code beyond the grammar, or that you didn't build it right. It's hard to say because not enough is shown. Filling in the details, it works fine for me. Commented Sep 30, 2021 at 23:24
  • Most likely a lexer problem -- perhaps the space is causing problems. Try running with YYDEBUG enabled (set yydebug = 1; before calling yyparse();) Commented Oct 1, 2021 at 0:11

1 Answer 1

0

You don't seem to be asking about the parse error you're receiving (which, as noted in the comments, is likely an issue with the token types being returned by yylex()), but rather why the parser reductions are performed in the order shown by your trace. That's the question I will try to address here, even though it's quite possible that it's an XY problem, because understanding reduction order is important.

In this case, the reduction order is pretty simple. If you have a production:

pointer: type ASTERISK 

then type must be reduced before ASTERISK is shifted. In an LR parser, reductions must be done when the right-hand side to be reduced ends exactly at the last consumed input token. (The lookahead token has been identified by the lexical scanner but not yet consumed by the parser. So it's not part of the reduction and can be used also to identify other reductions ending at the same point.)

I hope it's more evident why with the production

type: char

char needs to be reduced before type. Until char is reduced, it's not available to be used in the reduction of type.

Really, these two examples show the same behaviour. Reductions are performed left-to-right, and from the bottom up (that is, children first). Hence the name for this kind of parsing.

So the reduction order shown by your parser (first char, then type, and only after the * is shifted, pointer) is precisely what would be expected.

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you. It was a lexer problem, however I didn't understand the parsing sufficiently, so I thought it was a parser problem.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.