I am migrating a few pyparsing patterns from pyparsing version 2 to version 3.

The contents of the sample file I use for parsing:

this is a sample page to test parsing
line 0001 line 1
line 0002 line 2

The pattern used to create the parser:

p.Literal('line ') + p.Regex(r'(?P<abc>\d+)') + p.SkipTo(p.LineEnd().suppress())

When I use locatedExpr from version 2, I get the following output:

{'locn_start': 55, 'abc': '0002', 'value': ['line ', '0002', ' line 2'], 'locn_end': 71}

When I use Located from version 3, I get the following output for the same pattern:

{'locn_start': 55, 'value': {'abc': '0002'}, 'locn_end': 71}

However, if I remove the named capturing group from the pattern, as in the example below,

[p.Literal('line ') + p.Regex(r'\d+')] + p.SkipTo(p.LineEnd().suppress())

I get the same output as with locatedExpr:

{'locn_start': 55, 'value': ['line ', '0002', ' line 2'], 'locn_end': 71}

However, I like to group the information in my parsing, and I was wondering if anybody knows the difference between Located and locatedExpr.

In all of the above cases I am using parse_with_tabs.

  • Is this pyparsing 3.0.9? Commented Apr 12, 2023 at 22:45
  • Yes this is pyparsing 3.0.9 Commented Apr 12, 2023 at 22:51
  • What is the purpose of the []'s in [p.Literal('line ') + p.Regex(r'\d+')] + p.SkipTo(p.LineEnd().suppress())? Commented Apr 16, 2023 at 23:26
  • locatedExpr is deprecated in favor of Located, introduced in 3.0.0. Commented Apr 16, 2023 at 23:41
  • Why did you edit the question so severely? Don't make it non understandable. Commented Apr 19, 2023 at 20:58

1 Answer

I think you may be using as_dict() to view the contents of the parsed results. as_dict() returns only the named results, so unnamed elements are not displayed. Please use the dump() method instead, which shows both the token list and the named results. Pyparsing's run_tests method uses dump() to display the parsed results:

import pyparsing as p

tests = """
    line 0002 line 2
"""

# Pattern with a named regex group (abc)
parser = p.Literal('line ') + p.Regex(r'(?P<abc>\d+)') + p.SkipTo(p.LineEnd().suppress())
p.Located(parser).run_tests(tests)
p.locatedExpr(parser).run_tests(tests)

# Same pattern without the named group
parser = p.Literal('line ') + p.Regex(r'\d+') + p.SkipTo(p.LineEnd().suppress())
p.Located(parser).run_tests(tests)
p.locatedExpr(parser).run_tests(tests)

prints

line 0002 line 2
[0, ['line ', '0002', 'line 2'], 16]
- locn_end: 16
- locn_start: 0
- value: ['line ', '0002', 'line 2']
  - abc: '0002'
[0]:
  0
[1]:
  ['line ', '0002', 'line 2']
  - abc: '0002'
[2]:
  16

line 0002 line 2
[[0, 'line ', '0002', 'line 2', 16]]
[0]:
  [0, 'line ', '0002', 'line 2', 16]
  - abc: '0002'
  - locn_end: 16
  - locn_start: 0
  - value: ['line ', '0002', 'line 2']

line 0002 line 2
[0, ['line ', '0002', 'line 2'], 16]
- locn_end: 16
- locn_start: 0
- value: ['line ', '0002', 'line 2']
[0]:
  0
[1]:
  ['line ', '0002', 'line 2']
[2]:
  16

line 0002 line 2
[[0, 'line ', '0002', 'line 2', 16]]
[0]:
  [0, 'line ', '0002', 'line 2', 16]
  - locn_end: 16
  - locn_start: 0
  - value: ['line ', '0002', 'line 2']

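The new Located class is more consistent in how it reports the parsed value, whether or not it contains any named items or regex groups: the parsed tokens are always kept as a nested group under value, and any names defined by the wrapped expression (such as abc) stay inside that group. locatedExpr instead flattens the tokens between the two locations and puts abc at the same level as locn_start and locn_end.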
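
To make the difference concrete, here is a minimal sketch (assuming pyparsing 3.0.x and the same pattern as in the question) that compares as_dict() with dump() and reads the named group back out of a Located result. The variable names as_dict_view, abc and tokens are only for illustration.

```python
import pyparsing as p

parser = p.Literal("line ") + p.Regex(r"(?P<abc>\d+)") + p.SkipTo(p.LineEnd().suppress())
result = p.Located(parser).parse_string("line 0002 line 2")

# as_dict() reports only named results, so the unnamed tokens are hidden
as_dict_view = result.as_dict()
print(as_dict_view)

# dump() shows the token list as well as the named results
print(result.dump())

# With Located, the named regex group is nested under "value"
abc = result["value"]["abc"]

# The full token list is still available from the nested group
tokens = result["value"].as_list()
print(result["locn_start"], abc, tokens, result["locn_end"])
```

If the grouped form is what you want, the nested result["value"] is probably the simplest place to read both the tokens and the names from, since it has the same shape whether or not the pattern defines named groups.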
