3,304 questions
-1
votes
1
answer
57
views
Syntactically required and semantically relevant tokens in Python grammars
I am trying to parse code into an AST. I want to keep the minimum set of keywords and delimiters in the AST while preserving the semantics.
In the Python function definition def foo():, the trailing : is syntactically ...
5
votes
1
answer
139
views
An ambiguity in Rust
Since Rust has no parentheses () around if and while conditions, some situations may be ambiguous, e.g.
while 0 > n {} {}
is parsed either as while (0 > n) {} followed by {} (two statements),
or ...
0
votes
0
answers
79
views
Antlr4 Grammar ambiguity
Having trouble understanding the ambiguity reported by Antlr4 (4.7.1):
grammar Temp;
start: expr;
expr:
IDENTIFIER '(' expr (',' expr)* ')' // function call
| expr '.' slice2 ...
0
votes
0
answers
84
views
Parsing first-order logic formulas using parsing expression grammar
I am trying to implement a parser for first-order logic formulas in Rust using pest (which seems to be the reference in Rust).
Apparently, this library is not based on context-free grammars but on ...
0
votes
0
answers
329
views
How to change a verb to its noun form (gerund) using the Stanford-NLP library?
I have individual verbs in a collection of String objects (thanks to How to find whether a word is a noun or a verb using Stanford parser?). Now for each, I would like to obtain the appropriate noun ...
1
vote
0
answers
69
views
Compiler Design - Question on third-order recursion in relation to expression grammar
In the "Compiler Design in C" book, below is a table on a simple expression grammar.
According to the below excerpts:
Note that the grammar is recursive. For example, Production 2 has ...
3
votes
1
answer
99
views
Python text tokenize code to output results from horizontal to vertical with grammar recognition
The code below tokenises the text and identifies the grammar of each tokenised word.
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import wordnet as wn
#nltk....
2
votes
1
answer
79
views
Why does this grammar work using the Earley parser but not LALR(1)
I wrote this grammar to describe a simple patch procedure we use at work, operating on source code files. Engineers check in files of the form:
{ignored space}
//find_start {possible comment or ...
0
votes
1
answer
81
views
chars a-z used multiple times in set [A-Za-z_]
How do I remove this warning:
warning(180): AlgoGrammar.g4:54:13: chars a-z used multiple times in set [A-Za-z_]
warning(180): AlgoGrammar.g4:54:23: chars a-z used multiple times in set [A-Za-z0-9_]
...
1
vote
1
answer
113
views
Dart multiple generic type evaluation
I saw an article about the Result pattern on the official Flutter site (https://docs.flutter.dev/app-architecture/design-patterns/result), and I thought it was a good structure, so I tried to apply it ...
3
votes
1
answer
103
views
Is it possible to understand when semantic predicates work and when they don't?
Below are two grammars.
In this grammar, semantic predicates do "work". I.e. if they are false, rules don't match and if they are true, rules do match:
expr
: term
| expr asterisk ...
0
votes
0
answers
31
views
Is this correct grammar for Nearley?
Is the following good grammar for Nearley.js? It describes logical expressions with variables, negation, conjunction and parentheses.
It's a bit convoluted (?), but I didn't know another way to make ...
1
vote
1
answer
56
views
S/R conflict in Bison
I have a problem with a [relatively simple] grammar, a distilled version of a much bigger grammar. Bison reports one shift/reduce conflict, which is resolved incorrectly: the resulting parser fails to ...
5
votes
3
answers
153
views
I need a Raku grammar to change itself based on a different match
I want to write a Raku grammar to check a monthly report about work contribution factors, written in Racket/Scribble.
The report is divided, at the highest level, into monthly sections, and beneath ...
1
vote
1
answer
78
views
Issue with parse order in ANTLR4 grammar
Below is a very simplified grammar to illustrate the problem. I likely can handle the existing result in generated code, but suspect there is some more elegant way to instead control the parser. ...
2
votes
1
answer
109
views
Can an SLR(1) Parser Have Accept/Reduce Conflicts?
Consider the following grammar:
S → A
A → S | a
This grammar would have an accept/reduce conflict in the SLR(1) parsing table in the state with the following kernel when reading the end symbol($):
...
1
vote
0
answers
46
views
Transform grammar of functional language to LL(1)
I am trying to transform a grammar for a functional language into an LL(1) one. I am eliminating left recursion in favor of right recursion, and then the first condition on non-overlapping FIRST sets is ...
0
votes
2
answers
59
views
How to Restrict return Statements to Function Declarations in ANTLR Grammar?
Question:
I'm working on a custom parser using ANTLR to define a small programming language. One of the requirements is that return statements can only appear inside the body of a function. If a ...
1
vote
0
answers
77
views
How do you parse a Synchronous Context Free Grammar?
I am trying to parse a grammar of this type:
g = """
S -> A{1} B{2}, S -> B{2} A{1}
A -> A{1} A{2}, A -> A{2} A{1}
B -> B{1} A{2}, B -> A{2} B{1}
A -> a, A -> e
...
1
vote
1
answer
535
views
What do reduce/reduce and shift/reduce conflicts mean in terms of the grammar structure? [closed]
I have to construct an LR(0) table to know whether there are any conflicts. Is there a way to look at the grammar and tell if there's a conflict without constructing the table? So yeah, the question ...
0
votes
1
answer
92
views
Parsing extended lambda calculus using recursive descent
I'm writing an interpreter for a simple lambda-calculus-based language in C. The EBNF for the language is
S ::= E
E ::= 'fn' var '->' E | T {'+' T} | T {'-' T}
T ::= F {'*' F} | F {'/' F}
F ::= P {P}
P ::= var |...
1
vote
1
answer
80
views
define equality predicate Lambda-Calculus nltk
I am trying to define a Lambda-Calculus representation of the word 'are', which is an equality predicate for this ccg:
ccg = '''
# CCG grammar
# complete the lexical entries with their categories and ...
0
votes
1
answer
42
views
Issue with LL(1) Grammar Transformation – Parser Generator Error
I am working on transforming a grammar into LL(1) form, but when I try to use an online LL(1) Parser Generator, it reports an error. I have followed the standard procedure for the transformation, but ...
2
votes
0
answers
76
views
antlr4 grammar - perf issues, ambiguities, and loads of single-char tokens
I am writing an antlr4 grammar for splitting semicolon-separated statements. Below is a minimal version of the grammar. The full grammar has multiple types of comments, strings, identifiers, etc.
The ...
0
votes
1
answer
28
views
Antlr4 lexer seems to have a problem processing token 'AX', and no semantic predicate runs on rule REG
In the following example, the input token 'AX' seems to cause errors for an unknown reason. The parse tree shows that other rule matches that contain register tokens such as 'DX' are working fine. I'...
1
vote
0
answers
50
views
Does the RFC9110 field-content grammar imply content can not be 2 characters wide?
When reading the field-content grammar specified in RFC9110
field-content = field-vchar [ 1*( SP / HTAB / field-vchar ) field-vchar ]
I came to the conclusion that this grammar does allow the field ...
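A literal reading of the quoted rule can be checked mechanically. The sketch below transcribes field-content into a regular expression and tries strings of different lengths. It assumes field-vchar is limited to visible ASCII (%x21-7E); the names FIELD_CONTENT and is_field_content are illustrative only, and the sketch does not consider any surrounding rules of the RFC.

```python
import re

# Assumption: field-vchar is restricted to visible ASCII (%x21-7E).
FIELD_VCHAR = r"[\x21-\x7e]"

# field-content = field-vchar [ 1*( SP / HTAB / field-vchar ) field-vchar ]
FIELD_CONTENT = re.compile(
    rf"{FIELD_VCHAR}(?:[ \t\x21-\x7e]+{FIELD_VCHAR})?"
)


def is_field_content(s: str) -> bool:
    """Return True if the whole string matches the quoted field-content rule."""
    return FIELD_CONTENT.fullmatch(s) is not None


if __name__ == "__main__":
    # Try one-, two-, and three-character candidates.
    for sample in ["a", "ab", "abc", "a b"]:
        print(repr(sample), is_field_content(sample))
```

Under this transcription the optional group always contributes at least two characters, so a single field-content match has either one character or three or more.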
1
vote
1
answer
66
views
Infinite recursion in pyparsing grammar for method signatures
Below is my pyparsing grammar for parsing the method signature of a Solidity function, and an example signature to parse:
from pyparsing import Word, alphas, alphanums, oneOf, Group, Forward, ...
2
votes
2
answers
140
views
LALR Grammar for transforming text to csv
I have a processor trace output that has the following format:
Time Cycle PC Instr Decoded instruction Register and memory contents
905ns 86 00000e36 00a005b3 c.add ...
1
vote
1
answer
355
views
ANTLR4 - Token recognition error and mismatched input
I am fairly new to the ANTLR grammar. Here is what I have in my g4 file:
tptp_file : tptp_input* EOF;
tptp_input : annotated_formula | include;
annotated_formula : ...
0
votes
0
answers
59
views
What could be wrong in this PYTHON PYPEG2 grammar definition?
I need to define a grammar to parse the code below:
P1 ATTACKS p2 // P1 represents a pawn in chess game and p2 represents an opponent pawn
OR
P1 DEFENDS P2
OR
P1 IATTACKS p2,p3
OR
P1 IDEFENDS P2,P3
...
0
votes
1
answer
89
views
Parser recognizing variables
I am trying to generate a parser using antlr4.
My content seems quite simple. But let's have a look at my grammar first:
Lexer:
DOLLAR: '$' -> pushMode(VAR_MODE); // as soon as there's an "$&...
2
votes
0
answers
91
views
How to convert a parse tree of one grammar (in CNF form) and apply it to the original, unadulterated grammar
I've written a C# program that takes a list of "Productions" (LHS nonterminal, RHS "recipe"), of a grammar, and applies these transformations on it to reduce the grammar to another ...
1
vote
0
answers
62
views
Problems trying to update ANTLR's default TypeScript grammar
I'm trying to update the default ANTLR TypeScript grammar since it seems to support only TypeScript up to version 2.7.
One of the new constructions is the conditional types, which required me to alter ...
1
vote
1
answer
638
views
How to build the antlr grammar provided?
I would like to build a C++ parser in C++, and I'm using ANTLR4. I noticed there is a 'grammar' section in the official ANTLR grammars repository on GitHub, and I've downloaded it. While opening the CPP ...
0
votes
1
answer
89
views
Antlr grammars that generate actual class inheritance
Need to know, is it possible to generate parsers, lexers, listeners, etc, by importing subset-grammars? I see that the supergrammar subgrammar pattern is possible, but I'm not sure I see a true class ...
3
votes
1
answer
86
views
How to resolve the mismatched input 'token' expecting 'LEXER_RULE'
I have the following grammar defined in ANTLR4 (g4):
grammar SimpleExpr2;
expr: entityName '(' paramList ')' SEMICOLON;
entityName: ENTITY_NAME;
paramList: param (SEPARATOR param)*;
param: ...
0
votes
1
answer
108
views
Lexing Issue in ANTLR4 Grammar for Fortran 2018: Token Misclassification
I am developing a Fortran 2018 grammar in ANTLR4 using the ISO standard. I am encountering an issue during the lexing phase with some of the lexer rules. Specifically, certain keywords are being ...
0
votes
1
answer
107
views
How does this grammar get rid of infinite recursion in left-associated recursion?
"Suppose we have a grammar like this, where alpha could be any sequence of terminals and nonterminals:
A -> A alpha | B
We can rewrite this grammar as:
A -> B A'
A' -> alpha A' | ...
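To see why the rewritten form does not recurse forever, here is a minimal recursive-descent sketch of it. It makes two assumptions that are not in the excerpt: the elided alternative of A' is taken to be the empty production (as in the usual transformation), and B and alpha are replaced by the hypothetical single tokens "b" and "x".

```python
# Hypothetical concrete instance: B is the token "b", alpha is the token "x".
# Assumption: the elided alternative of A' is the empty production.

def parse_A(tokens, pos=0):
    """A -> B A'. Returns the position after the match or raises SyntaxError."""
    if pos >= len(tokens) or tokens[pos] != "b":
        raise SyntaxError(f"expected 'b' at position {pos}")
    return parse_A_prime(tokens, pos + 1)


def parse_A_prime(tokens, pos):
    """A' -> alpha A' | (empty). A token is consumed before every recursive call."""
    if pos < len(tokens) and tokens[pos] == "x":
        return parse_A_prime(tokens, pos + 1)
    return pos  # empty alternative: consume nothing


def accepts(tokens):
    """True if the whole token list is derived from A."""
    try:
        return parse_A(tokens) == len(tokens)
    except SyntaxError:
        return False
```

A direct transcription of A -> A alpha would call parse_A from inside parse_A before consuming anything. In the rewritten form each recursive call happens only after one token has been consumed, so the recursion depth is bounded by the input length.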
0
votes
1
answer
127
views
Visual Studio Code TextMate match pattern with maximum possible length first
I am writing a TextMate grammar for a syntax highlighting extension in VS Code, and I discovered that if I define match rules for constants, it matches them multiple times in a row. More specifically, ...
2
votes
2
answers
169
views
Operator precedence of `EXISTS`
In Postgres, does the EXISTS operator have the highest precedence of all? For example:
SELECT 1 + EXISTS (SELECT 1)::int;
It seems to be missing from the manual page. Though the highest one is ::, ...
0
votes
1
answer
48
views
Terminology for data literal/identifier expression
In the following SQL statement:
SELECT
1,
myField,
myField+1
FROM
myTbl
The third column is often referred to as an "operator expression". Is there a name for the type of ...
1
vote
1
answer
309
views
How do I parse an input string in an SLR(1) parser with a grammar that has epsilon productions?
This is my Grammar:
S → (S)S | ε
and my input string, which I want to parse using SLR(1), is:
()()
I tried building the DFA with the method specified in this question, but I wasn't able to parse it :(
SLR(1) ...
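As a cross-check on what any parser for this grammar should accept, the language of S → (S)S | ε can be recognized with a few lines of recursive descent. This is a different technique from SLR(1) table construction and does not reproduce the DFA from the question; the function names are illustrative only.

```python
def parse_S(text, pos=0):
    """S -> ( S ) S | empty. Returns the position reached after matching S."""
    if pos < len(text) and text[pos] == "(":
        pos = parse_S(text, pos + 1)           # inner S
        if pos >= len(text) or text[pos] != ")":
            raise SyntaxError(f"expected ')' at position {pos}")
        return parse_S(text, pos + 1)          # trailing S
    return pos                                 # empty production


def accepts_balanced(text):
    """True if the whole string is derived from S."""
    try:
        return parse_S(text) == len(text)
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(accepts_balanced("()()"))
```

The empty production is chosen whenever the next character is not "(", which mirrors the reduce-by-ε decision an LR parser has to make on ")" or at the end of input.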
1
vote
0
answers
112
views
Is a grammar in an EBNF form equivalent to one in a Pydantic model form?
Python Pydantic models and EBNF (context-free grammar) are two competing ways to have an LLM generate structured, sensible output.
Although the links reference the Outlines library, I am also curious ...
1
vote
1
answer
241
views
Can I generate a list of sentences from a list of words?
I want to take a group of words, ideally at least 100, and then get sentences that actually make sense.
Any grammar check API I've seen can correct sentences if they are in a form that at least ...
1
vote
1
answer
116
views
Bison parser shift/reduce conflict
I'm new to Bison and I'm trying to write a parser. I already wrote a scanner in flex. I came up with the following grammar for the parser:
%token NUMBER
%token IDENTIFIER
%start Program
%%
Program: ...
0
votes
0
answers
36
views
yacc returns without parsing full sentence
I am currently trying to write a parser in lex and yacc to parse all valid time formats. The relevant .l and .y codes are included below.
^[0-9]{1,2} { printf("HOUR: %s\n", yytext); return ...
0
votes
0
answers
84
views
Using Antlr grammar to parse Solidity comments
I'm trying to modify this grammar to parse Solidity comments.
I want to obtain two types of comment: the multi-line comment, which has the form '/* any string */', and the single-line comment, which is ...
0
votes
1
answer
62
views
A question about Python's pop and what does the grammar have to say?
This is something I tried out in IPython; the behavior is quite clear: when creating the dictionary in lines 3 and 6, the dictionaries are created as if invoked by dict(**kwargs) and the kwargs are ...
0
votes
1
answer
105
views
Parsing with my example ANTLR grammar. Out of Memory Errors, despite 8GB heap
I have been working on this grammar for several days now, with various improvements, but I am now parsing many files, some with syntax errors and stack overflow errors (which I or an AI have fixed). Now I'...
0
votes
1
answer
361
views
Need clarification on pumping lemma for context free languages
I am doing a problem where I am applying the pumping lemma to the CFL L = {a^n b^n c^n : n >= 0}. Here is the start of a proof I was looking at:
Assume L is a CFL, so there exists a pumping length p for ...