I'm new to ANTLR4 and context-free grammars (CFGs), and I'm trying to write a parser for a boolean query DSL of the form

(id AND id AND id AND (id OR id OR id))

The logic can also take the form

(id OR id OR (id AND id AND id))

A more complex example might be:

(((id AND id AND (id OR id OR (id AND id))))) (enclosed in an arbitrary number of parentheses)

I've tried two things. First, I wrote a very simple grammar, which ended up grouping everything left to right:

grammar filter;

filter: expression EOF;

expression
    : LPAREN expression RPAREN
    | expression (AND expression)+
    | expression (OR expression)+
    | atom;

atom 
    : INT;

I got the following parse tree for input:

( 60 ) AND ( 55 ) AND ( 53 ) AND ( 3337 OR 2830 OR 23)

[Image: parse tree for attempt 1]

This "works", but ideally I want to be able to separate my AND and OR blocks. Trying to split these blocks into separate rules leads to left-recursion errors. Secondly, I want the operands of each AND or OR block grouped together under one node instead of nested left to right. For example, on input (id AND id AND id), I want:

(and id id id)

not

(and id (and id (and id)))

as it currently is.

The second thing I've tried is making OR blocks direct descendants of AND blocks (i.e. the first case).

grammar filter;

filter: expression EOF;

expression
    : LPAREN expression RPAREN
    | and_expr;

and_expr
    : term (AND term)* ;

term
    : LPAREN or_expr RPAREN
    | LPAREN atom RPAREN ;

or_expr
    : atom (OR atom)+;

atom: INT ;

For the same input, I get the following parse tree, which is closer to what I'm looking for but has one main problem: the DSL has no fixed hierarchy between OR and AND blocks, so this grammar doesn't work for the second case, where an AND block is nested inside an OR block. This approach also seems a bit hacky for what I'm trying to do.

[Image: parse tree for attempt 2]

What's the best way to proceed? Again, I'm not too familiar with parsing and CFGs, so some guidance would be great.

1 Answer

Both grammars are equally able to parse your sample input. If you simplify your input by removing the unnecessary parentheses, the output of this grammar looks pretty good too:

grammar filter;
filter: expression EOF;
expression
    : LPAREN expression RPAREN
    | expression (AND expression)+
    | expression (OR expression)+
    | atom;
atom : INT;
INT : DIGITS;
fragment DIGITS : [0-9]+;
AND : 'AND';
OR : 'OR';
LPAREN : '(';
RPAREN : ')';
WS: [ \t\r\n]+ -> skip;

I suspect this is what your first grammar looks like in its entirety.

Your second grammar requires too many parentheses for my liking (mainly in term), and breaking AND and OR up into separate rules instead of alternatives doesn't seem as clean to me.

You can simplify even more though:

grammar filter;
filter: expression EOF;
expression
    : LPAREN expression RPAREN   # ParenExp
    | expression AND expression  # AndBlock
    | expression OR expression   # OrBlock
    | atom                       # AtomExp
    ;
atom : INT;
INT : DIGITS;
fragment DIGITS : [0-9]+;
AND : 'AND';
OR : 'OR';
LPAREN : '(';
RPAREN : ')';
WS: [ \t\r\n]+ -> skip;

This gives a tree with a different shape, since every AND or OR node now has exactly two operands, but it is still equivalent. Because ANTLR4 gives earlier alternatives of a left-recursive rule higher precedence, AND binds more tightly than OR when the input has no parentheses. Note the use of the # AndBlock and # OrBlock labels: these "alternative labels" cause your generated listener or visitor to have separate methods for each alternative, allowing you to separate the two completely in your code, semantically as well as syntactically. Perhaps that's what you're looking for?

I like this one best because it is the simplest, its recursion is the clearest, and it gives you distinct code paths for AND and OR.
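
To make the role of the labels more concrete, here is a minimal sketch in Python of the kind of per-alternative logic they allow. It does not use the ANTLR runtime or any generated code: the classes AndBlock, OrBlock, ParenExp and AtomExp are hand-written stand-ins named after the labels above, and flatten is a hypothetical helper. It assumes you want chains of the same operator merged into one group and redundant parentheses dropped, which is the (and id id id) shape asked for in the question.

```python
from dataclasses import dataclass
from typing import Union


# Hand-written stand-ins for the labeled alternatives; NOT ANTLR-generated classes.
@dataclass
class AtomExp:
    value: int


@dataclass
class ParenExp:
    inner: "Node"


@dataclass
class AndBlock:
    left: "Node"
    right: "Node"


@dataclass
class OrBlock:
    left: "Node"
    right: "Node"


Node = Union[AtomExp, ParenExp, AndBlock, OrBlock]


def flatten(node):
    """Return an int for an atom, or a tuple ("and" | "or", child, child, ...)."""
    if isinstance(node, AtomExp):
        return node.value
    if isinstance(node, ParenExp):
        # Parentheses only group; they add no node of their own.
        return flatten(node.inner)
    op = "and" if isinstance(node, AndBlock) else "or"
    children = []
    for side in (node.left, node.right):
        child = flatten(side)
        # Merge a child group that uses the same operator into this group.
        if isinstance(child, tuple) and child[0] == op:
            children.extend(child[1:])
        else:
            children.append(child)
    return (op, *children)


# ( 60 ) AND ( 55 ) AND ( 3337 OR 2830 ), built as nested binary nodes.
tree = AndBlock(
    AndBlock(ParenExp(AtomExp(60)), ParenExp(AtomExp(55))),
    ParenExp(OrBlock(AtomExp(3337), AtomExp(2830))),
)
print(flatten(tree))  # ('and', 60, 55, ('or', 3337, 2830))
```

In a real visitor, each isinstance branch would presumably live in the method generated for the corresponding label, with the operands read from the parse-tree context rather than from left and right fields.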

2 Comments

Thanks for the help. While I agree that this works, what I want in the final parse tree is to have the AND and OR blocks grouped together (like in my second attempt). I'm not completely sure: is this what you were trying to achieve with your labels? Ultimately, I hope to use the parse tree to convert the DSL to a JSON string (if you have any advice on how to get started with that, that would be great).
The value of the labels doesn't become apparent until you generate your listener and/or visitor. Then you'll see how they come in handy.
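
On the JSON question raised in the first comment, one possible starting point is sketched below in Python. The JSON layout is an assumption, since the question does not specify one: each group becomes an object with a single key, "and" or "or", whose value is the list of operands. The input format (a nested tuple whose first element is the operator) and the helper name to_json_obj are also hypothetical; in practice a visitor over the parse tree would produce that structure.

```python
import json


def to_json_obj(group):
    """Turn ("and", 60, ("or", 1, 2)) into {"and": [60, {"or": [1, 2]}]}."""
    if isinstance(group, int):
        # A bare id is emitted as a JSON number.
        return group
    op, *children = group
    return {op: [to_json_obj(child) for child in children]}


# Assumed grouped form of: 60 AND 55 AND 53 AND (3337 OR 2830 OR 23)
query = ("and", 60, 55, 53, ("or", 3337, 2830, 23))
print(json.dumps(to_json_obj(query)))
# {"and": [60, 55, 53, {"or": [3337, 2830, 23]}]}
```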
