Debugging in Jison

I am using Jison to write a parser. This is my grammar:

    {
        "program": [
            ["statements EOF", "return $1;"]
        ],
        "statements": [
            ["statement", "$$ = $1;"],
            ["statements statement", "$$ = $1 + '\\n' + $2;"]
        ],
        "statement": [
            ["expression NEWLINE", "$$ = $1 + ';';"]
        ],
        "expression": [
            ["NUMBER", "$$ = yytext;"],
            ["expression expression", "$$ = $1 + ', ' + $2;"]
        ]
    }

When I run it, I get the following error message:

    Conflict in grammar: multiple actions possible when lookahead token is NUMBER in state 9
    - reduce by rule: expression -> expression expression
    - shift token (then go to state 5)

    States with conflicts:
    State 9
      expression -> expression expression . #lookaheads= NEWLINE NUMBER
      expression -> expression .expression
      expression -> .NUMBER
      expression -> .expression expression

What should I do with this debug message? How would you explain it in plain English? What does the period in expression -> expression expression . mean? What are .expression and .NUMBER? How do they differ from expression and NUMBER respectively?

1 answer

What should I do with this debug message?

A grammar conflict means that the parser can reach a state in which more than one rule could apply, but it does not have enough information to decide which one to follow (or, even worse, the grammar is ambiguous). You will need to adjust the grammar to resolve the conflicts. This often takes practice.

What does the period in expression -> expression expression . mean?

The period marks the parser's position within a rule. In this rule, the parser has just parsed two expressions and is now in state 9. When the period is at the end of a rule, the rule can be "reduced": the symbols it matched are grouped into a single nonterminal, expression in this case. However, the reduction is only allowed if the next token (the lookahead) is NEWLINE or NUMBER.

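In expression -> .NUMBER, the period sits before NUMBER: the parser has not consumed that token yet, but if the next token is NUMBER it can "shift" it and then move to a new state. Likewise, .expression means the parser is about to start parsing an expression. The symbols themselves are the same as expression and NUMBER; the period only shows how far through the rule the parser is.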
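To make the dot notation concrete, here is a small sketch that prints every position the period can take in one rule. The helper name dottedItems is made up for illustration and is not part of Jison; it only imitates the way the conflict report writes its items.

```javascript
// Hypothetical helper (not a Jison API): list every dot position for one rule.
function dottedItems(lhs, rhs) {
  const items = [];
  for (let dot = 0; dot <= rhs.length; dot++) {
    const before = rhs.slice(0, dot).join(" "); // symbols already parsed
    const after = rhs.slice(dot).join(" ");     // symbols still expected
    // Like the conflict report, glue the dot to the symbol that follows it.
    const body = `${before} .${after}`.trim();
    items.push(`${lhs} -> ${body}`);
  }
  return items;
}

console.log(dottedItems("expression", ["expression", "expression"]).join("\n"));
```

The last line printed has the period at the very end, which is the item that can be reduced. The other lines are items where the parser still expects more input.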

The conflict arises because, in state 9, the parser can either reduce or shift when the next token is NUMBER, and nothing tells it which to choose.
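One way to see that both choices are legitimate is to count how many parse trees the original expression rules allow for a run of n NUMBER tokens. The following is only a sketch that I wrote for illustration (the function name countParseTrees is made up), not something Jison provides.

```javascript
// Count the distinct parse trees for n NUMBER tokens under the original rules:
//   expression -> NUMBER
//   expression -> expression expression
function countParseTrees(n, memo = new Map()) {
  if (n === 1) return 1; // a single NUMBER has exactly one tree
  if (memo.has(n)) return memo.get(n);
  let total = 0;
  // Split the tokens into a left expression of k tokens and a right one of n - k.
  for (let k = 1; k < n; k++) {
    total += countParseTrees(k, memo) * countParseTrees(n - k, memo);
  }
  memo.set(n, total);
  return total;
}

for (const n of [1, 2, 3, 4]) {
  console.log(n, countParseTrees(n));
}
```

Three numbers already have two trees: 1 2 3 can be grouped as (1 2) 3 or as 1 (2 3). After reading 1 2 with 3 as the lookahead, reducing builds the first grouping and shifting builds the second, which is exactly the choice reported for state 9.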

Edit: To resolve the conflict, we need to split the expression rule into separate nonterminals. A rule whose right-hand side is a sequence of the same nonterminal (expression expression) will inevitably lead to conflicts.

For example:

    {
        "program": [
            ["statements EOF", "return $1;"]
        ],
        "statements": [
            ["statement", "$$ = $1;"],
            ["statements statement", "$$ = $1 + '\\n' + $2;"]
        ],
        "statement": [
            ["expression NEWLINE", "$$ = $1 + ';';"]
        ],
        "expression": [
            ["expression expression_base", "$$ = $1 + ', ' + $2;"],
            ["expression_base", "$$ = $1;"]
        ],
        "expression_base": [
            ["NUMBER", "$$ = yytext;"]
        ]
    }
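For reference, this is roughly what the fixed grammar should compute, written as a small hand-rolled parser instead of a Jison-generated one. It is only a sketch under assumptions: the question does not show a lexer, so the tokenize function, its digit pattern, and the way it skips other characters are my own guesses, and parseProgram is a made-up name.

```javascript
// Hand-written stand-in for the parser generated from the fixed grammar.
// Assumption: NUMBER is a run of digits, NEWLINE is "\n", anything else is skipped.
function tokenize(input) {
  const tokens = [];
  for (const m of input.matchAll(/\d+|\n/g)) {
    tokens.push(m[0] === "\n" ? { type: "NEWLINE" } : { type: "NUMBER", text: m[0] });
  }
  tokens.push({ type: "EOF" });
  return tokens;
}

function parseProgram(input) {
  const tokens = tokenize(input);
  let pos = 0;
  const statements = [];
  while (tokens[pos].type !== "EOF") {
    // expression: one expression_base followed by zero or more expression_base
    const numbers = [];
    while (tokens[pos].type === "NUMBER") {
      numbers.push(tokens[pos].text);
      pos++;
    }
    if (numbers.length === 0 || tokens[pos].type !== "NEWLINE") {
      throw new Error(`Unexpected ${tokens[pos].type} at token ${pos}`);
    }
    pos++; // consume NEWLINE
    statements.push(numbers.join(", ") + ";"); // statement: expression NEWLINE
  }
  if (statements.length === 0) throw new Error("Expected at least one statement");
  return statements.join("\n"); // statements: statements statement
}

console.log(parseProgram("1 2 3\n4\n"));
```

Because the new expression rule only grows on the left, a list of numbers has a single grouping, so there is never a choice between shifting and reducing.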

Here is a good resource for more information on these types of grammars.
