I'm implementing a parser that processes an input string, extracts its components, validates them, and then builds a SQLAlchemy query from the result. I'm currently working on the first part of the parser and have run into an issue: I want to raise a custom exception when a filter is invalid.

Filter definition:

filter_term = Combine(Optional(space) + Word(alphas) + Optional(space)).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")

I would like to add extra validation for the filter. Only specific words may be used as filters; they will be defined as a dictionary that maps each alias to its canonical name, for example:

{
    "animal": "animal",
    "dog": "animal",
    "cat": "animal",
    "pet": "animal"
}

In the code below, I use a simple placeholder check: if the filter equals 'w', I raise an exception.

    if t[0] == "w":
        raise FilterException("Invalid filter")

However, FilterException never surfaces. The parser does raise an exception, but it is unrelated to filter validation:

ParseException: Expected end of text, found 'and' (at char 15), (line:1, col:16) FAIL: Expected end of text, found 'and' (at char 15), (line:1, col:16)

Could I ask for your help in solving this problem?

parser:

from pyparsing import Word, Combine, Optional, DelimitedList, alphanums, Suppress, Group, one_of, alphas, \
    CaselessLiteral, infix_notation, opAssoc, OneOrMore, Keyword, CaselessKeyword, pyparsing_common, Forward, \
    ParseException, ParseSyntaxException, ZeroOrMore



class OrOperation:
    def __init__(self, instring, loc, toks):
        raise ParseException(instring, loc, "invalid OR given")


class AndOperation:
    def __init__(self, instring, loc, toks):
        raise ParseException(instring, loc, "invalid AND given")


class FilterException(ParseException):
    def __init__(self, pstr):
        super().__init__(pstr)


def filter_validator(s, l, t):
    if t[0] == "w":
        raise FilterException("Invalid filter")


# utils:
comma = Suppress(",")
space = Suppress(" ")
lbrace = Suppress("(")
rbrace = Suppress(")")
and_operator = Suppress(CaselessKeyword("AND"))
or_operator = CaselessKeyword("OR")

search_parser = Forward().set_name("search_expression")
literal_value = Forward().set_name("literal_value").set_results_name("literal_value")

delimited_list_delim = Optional(comma + Optional(space))
delimited_list = DelimitedList(literal_value, delim=delimited_list_delim).set_parse_action(
    lambda tokens: ", ".join(tokens))

string_literal = Word(alphanums + "_")
wildcard_literal = Combine(string_literal + "*").set_parse_action(lambda tokens: tokens[0].replace("*", "?"))
delimited_list_literal = lbrace + delimited_list + rbrace

filter_term = Combine(Optional(space) + Word(alphas) + Optional(space)).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")
literal_value <<= delimited_list_literal | wildcard_literal | string_literal

equals_operator = one_of("= :")
comparison_operator = one_of("> >= < <= ")
not_equals_operator = CaselessLiteral("!=")
contains_operator = CaselessLiteral("~").set_parse_action(lambda tokens: "LIKE")
not_contains_operator = CaselessLiteral("!~").set_parse_action(lambda tokens: "NOT LIKE")
operator = equals_operator | not_equals_operator | contains_operator | not_contains_operator | comparison_operator
operator_term = Combine(Optional(space) + operator + Optional(space)).set_results_name("operator")
expression_term = Group(filter_term + operator_term + literal_value).set_parse_action(filter_validator) | Group(
    literal_value)

search_parser <<= infix_notation(expression_term,
                                 [
                                     (and_operator, 2, opAssoc.LEFT,
                                      lambda instring, loc, toks: AndOperation(instring, loc, toks)),
                                     (or_operator, 2, opAssoc.LEFT,
                                      lambda instring, loc, toks: OrOperation(instring, loc, toks))
                                 ])

try:
    result = search_parser.parse_string("w~(a, b c, d)")
    print(result.dump())
except FilterException as e:
    print("Filter failed:", e)

search_parser.run_tests('''
asas
was*
(as, b,c d)
((as, b,c d))
w=a
w=a*
w=(a, b c, d)
w:(a, b c, d)
w!=(a, b c, d)
w~(a, b c, d)
w!~(a, b c, d)
w>=(a, b c, d)
a>=(a, b c, d) and a=(a, b c, d)
w>=(a, b c, d) and w=(a, b c, d) and w=(a, b c, d)
w>=(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
(w>=(a, b c, d) or w!~(a, b c, d))  or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d)  or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d)  or w=(a, b c, d) and w=(a, b c, d)
a>=(a, b c, d) and w!~(a, b c, d)  or w=(a, b c, d) and w!=(a, b c, d)
''')

1 Answer

Pyparsing's internal logic uses ParseExceptions pretty heavily, as it works through the parser structure of nested ParserElements. Since FilterException extends ParseException, it gets pulled in with all the rest of this try-and-retry internal exception raising and handling.

I changed your exception to this, and I think this gets things to come out closer to what you expect:

class FilterException(Exception):
    def __init__(self, pstr):
        self.msg = pstr
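
To make the difference concrete, here is a minimal sketch, separate from your parser. The names SwallowedError, raise_parse_subclass, raise_plain, swallowing and propagating are made up for illustration. It assumes pyparsing 3.x and relies on the alternation operator | catching ParseException from one alternative and moving on to the next:

```python
from pyparsing import ParseException, Word, alphas


class SwallowedError(ParseException):
    """Hypothetical name: a ParseException subclass, like the original FilterException."""


class FilterException(Exception):
    """Plain Exception, as in the changed version above."""


def raise_parse_subclass(s, loc, toks):
    # Raised as a ParseException subclass, so pyparsing treats it as a failed match.
    if toks[0] == "w":
        raise SwallowedError(s, loc, "Invalid filter")


def raise_plain(s, loc, toks):
    # Not a ParseException, so pyparsing's alternation does not catch it.
    if toks[0] == "w":
        raise FilterException("Invalid filter")


# In both parsers the second alternative accepts any word, including "w".
swallowing = Word(alphas).set_parse_action(raise_parse_subclass) | Word(alphas)
propagating = Word(alphas).set_parse_action(raise_plain) | Word(alphas)

# The first alternative fails quietly and the second one matches instead.
print(swallowing.parse_string("w"))

try:
    propagating.parse_string("w")
except FilterException as e:
    print("Filter failed:", e)
```

The same thing appears to happen in your expression_term: when the first Group fails with a ParseException subclass, the parser falls back to Group(literal_value), matches only part of the input, and then reports the unrelated "Expected end of text" error.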

A couple other notes on your parser:

  • Optional(space) for space skipping isn't going to work well, given that pyparsing implicitly skips spaces already. Instead, try:
    filter_term = Word(alphas, as_keyword=True).set_results_name("filter").set_parse_action(
      filter_validator).set_name("filter")
    
  • AndOperation and OrOperation take constructor signatures that already align with parse action signatures, so they can be used in infix_notation as just:
    search_parser <<= infix_notation(expression_term,
                                     [
                                         (and_operator, 2, opAssoc.LEFT, AndOperation),
                                         (or_operator, 2, opAssoc.LEFT, OrOperation)
                                     ])
    
  • Lastly, I split your tests into two parts, to test expression_term separately from search_parser:
    expression_term.run_tests('''
        asas
        was*
        (as, b,c d)
        ((as, b,c d))
        w=a
        w=a*
        w=(a, b c, d)
        w:(a, b c, d)
        w!=(a, b c, d)
        w~(a, b c, d)
        w!~(a, b c, d)
        w>=(a, b c, d)
        ''')

    search_parser.run_tests('''
        a>=(a, b c, d) and a=(a, b c, d)
        w>=(a, b c, d) and w=(a, b c, d) and w=(a, b c, d)
        w>=(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
        (w>=(a, b c, d) or w!~(a, b c, d))  or (w=(a, b c, d) and w=(a, b c, d))
        w>=(a, b c, d) or w!~(a, b c, d)  or (w=(a, b c, d) and w=(a, b c, d))
        w>=(a, b c, d) or w!~(a, b c, d)  or w=(a, b c, d) and w=(a, b c, d)
        a>=(a, b c, d) and w!~(a, b c, d)  or w=(a, b c, d) and w!=(a, b c, d)
        ''')
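
For the alias dictionary from your question, a validator along these lines might work in place of the 'w' check. Treat it as a sketch under assumptions: the name FILTER_ALIASES, the lower-casing of the filter name, and the wording of the error message are mine, not something from your code. The function keeps the (s, loc, toks) parse-action signature, so it could be attached with set_parse_action; a value returned from a parse action replaces the matched tokens, which would also normalize each alias to its canonical name.

```python
# Alias table copied from the question; the name FILTER_ALIASES is an assumption.
FILTER_ALIASES = {
    "animal": "animal",
    "dog": "animal",
    "cat": "animal",
    "pet": "animal",
}


class FilterException(Exception):
    """Raised for an unknown filter name; deliberately not a ParseException."""


def filter_validator(s, loc, toks):
    # Parse-action signature: original string, match location, matched tokens.
    name = toks[0].strip().lower()  # lower-casing is an assumption
    if name not in FILTER_ALIASES:
        raise FilterException(f"Invalid filter {name!r} at position {loc}")
    # Returning a value lets pyparsing substitute it for the matched token.
    return FILTER_ALIASES[name]


# Called directly here, without a parser, just to show the behaviour.
print(filter_validator("dog=rex", 0, ["dog"]))

try:
    filter_validator("w=a", 0, ["w"])
except FilterException as e:
    print("Filter failed:", e)
```

Note that with this table the single-letter filters in the tests above (w, a) would all be rejected, so the test strings would need real filter names.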
1 Comment

Thank you! Your comment helps me to fix my case :D
