Why does this grammar conflict?

It is compiled using Lemon, which is a LALR (1) parser generator:

program ::= statement. statement ::= ifstatement Newline. statement ::= returnstatement Newline. ifstatement ::= If Number A statement B. ifstatement ::= If Number A statement B Newline Else A statement B. returnstatement ::= Return Number. 

Error message:

 user@ /tmp > lemon test.lm test.lm:6: This rule can not be reduced. 1 parsing conflicts. 

Debug output:

 State 0: program ::= * statement statement ::= * ifstatement Newline statement ::= * returnstatement Newline ifstatement ::= * If Number A statement B ifstatement ::= * If Number A statement B Newline Else A statement B returnstatement ::= * Return Number If shift 10 Return shift 3 program accept statement shift 13 ifstatement shift 12 returnstatement shift 11 State 1: statement ::= * ifstatement Newline statement ::= * returnstatement Newline ifstatement ::= * If Number A statement B ifstatement ::= * If Number A statement B Newline Else A statement B ifstatement ::= If Number A statement B Newline Else A * statement B returnstatement ::= * Return Number If shift 10 Return shift 3 statement shift 4 ifstatement shift 12 returnstatement shift 11 State 2: statement ::= * ifstatement Newline statement ::= * returnstatement Newline ifstatement ::= * If Number A statement B ifstatement ::= If Number A * statement B ifstatement ::= * If Number A statement B Newline Else A statement B ifstatement ::= If Number A * statement B Newline Else A statement B returnstatement ::= * Return Number If shift 10 Return shift 3 statement shift 8 ifstatement shift 12 returnstatement shift 11 State 3: returnstatement ::= Return * Number Number shift 14 State 4: ifstatement ::= If Number A statement B Newline Else A statement * B B shift 15 State 5: ifstatement ::= If Number A statement B Newline Else * A statement B A shift 1 State 6: ifstatement ::= If Number A statement B Newline * Else A statement B Else shift 5 State 7: (3) ifstatement ::= If Number A statement B * ifstatement ::= If Number A statement B * Newline Else A statement B Newline shift 6 Newline reduce 3 ** Parsing conflict ** State 8: ifstatement ::= If Number A statement * B ifstatement ::= If Number A statement * B Newline Else A statement B B shift 7 State 9: ifstatement ::= If Number * A statement B ifstatement ::= If Number * A statement B Newline Else A statement B A shift 2 State 10: ifstatement ::= If * Number A statement B ifstatement ::= If * Number A statement B Newline Else A statement B Number shift 9 State 11: statement ::= returnstatement * Newline Newline shift 16 State 12: statement ::= ifstatement * Newline Newline shift 17 State 13: (0) program ::= statement * $ reduce 0 State 14: (5) returnstatement ::= Return Number * {default} reduce 5 State 15: (4) ifstatement ::= If Number A statement B Newline Else A statement B * {default} reduce 4 State 16: (2) statement ::= returnstatement Newline * {default} reduce 2 State 17: (1) statement ::= ifstatement Newline * {default} reduce 1 ---------------------------------------------------- Symbols: 0: $: 1: Newline 2: If 3: Number 4: A 5: B 6: Else 7: Return 8: error: 9: program: If Return 10: statement: If Return 11: ifstatement: If 12: returnstatement: Return 
+4
source share
1 answer

Take a look at state 7 from the debug output. It describes the case when the parser has already accepted the following set of tokens:

  ifstatement ::= If Number A statement B * 

Here are two parameters that the parser can select when the Newline token appears in this case:

  • Remember this and switch to state 6. This shift is prescribed by the following rule from your grammar:

     ifstatement ::= If Number A statement B Newline Else A statement B. 
  • Consider the current rule as completed and return to the top-level rule. This abbreviation is determined by this rule from your grammar:

     ifstatement ::= If Number A statement B. 

With LALR (1), the parser has no other option to fail in this case due to the fact that it cannot look forward to the next tokens in the stream. He cannot predict that Else will appear after Newline.

Revise your grammar to avoid this conflict. I can only add that new line characters are usually not included in the grammar of a language. Tokenizers typically treat them as a marker border, similar to other space characters.

+1
source

All Articles