Grammar Specification Shift Resolution / Conflict Reduction

Question

Grammar Specification Shift Resolution / Conflict Reduction

I use Jison (Bison) to create a simple markup language. I am clearly new to this, but small changes work very well. I just don't understand the source of the S / R conflict.

It doesn’t matter that “Text” is returned by two lexer actions (with different launch conditions), and I like it because it allows the grammar to have fewer rules and because error messages are consistent to the user, I tried to make the “Text” rule publicly available regardless of context, and I also tried to give each token a different name, but it does not seem to affect the S / R conflict when all this is together.

SUPPOSED parser for creating json-object with plain text, sub-arrays and various special nodes.

Specification:

/* lexical grammar */ %lex %s bracketed %% <bracketed>(\\.|[^\\\,\[\]])+ { yytext = yytext.replace(/\\(.)/g, '$1'); return 'Text'; } <INITIAL>(\\.|[^\\\[])+ { yytext = yytext.replace(/\\(.)/g, '$1'); return 'Text'; } "[" { this.begin('bracketed'); return '['; } "]" { this.popState(); return ']'; } "," return ',' <<EOF>> return 'END' /lex %start template %% template : sentence END ; sentence : /* empty */ | sentence Text | sentence '[' ']' | sentence '[' dynamic ']' ; dynamic : sentence /*| dynamic ',' sentence*/ ;

Warnings:

 Conflict in grammar: multiple actions possible when lookahead token is ] in state 5 - reduce by rule: sentence -> - shift token (then go to state 6) States with conflicts: State 5 sentence -> sentence [ .] #lookaheads= END Text [ ] sentence -> sentence [ .dynamic ] #lookaheads= END Text [ ] dynamic -> .sentence #lookaheads= ] sentence -> . #lookaheads= ] Text [ sentence -> .sentence Text sentence -> .sentence [ ] sentence -> .sentence [ dynamic ]

Different generator algorithms have more or less problems, but they all seem to have problems.

Thanks!

+6

context-free-grammar bison jison shift-reduce-conflict

Jason kleban Oct 3 '12 at 20:01

source share

1 answer

Chris dodd · Accepted Answer · 2012-10-03T22:22:39+0000

The conflict comes mainly from these two rules:

 sentence: sentence '[' Text ']' | sentence '[' sentenceList ']'

The reason is that after looking at sentence and [ and looking at the next Text token, the parser does not know whether to shift Text by matching the first rule, or consider that Text as the beginning of a sentenceList is suitable for matching the second rule.

Now, if you have a parser generator that uses a 2-token, this will not be a problem, but the LALR (1) bison (1 is one token).

There are a few things you could try:

make an additional look in the lexer to distinguish the text from the subsequent one] from Text-not-follow-by-], because two different tokens then rewrite the rules for using both of these tokens.
Use the bison% glr-parser function to use the GLR parser. This will analyze the proposal in both directions, and then throw out the one that does not match
reorganize the grammar so that it does not need a two-point image.

One refactoring that works in your case is to rewrite sentence rules to make them correct recursive rather than left recursive:

 sentence: /* empty */ | Text sentence | '[' ']' sentence | '[' Text ']' sentence | '[' sentenceList ']' sentence ;

This avoids sentence (or any other rule starting with sentence , such as sentenceList ), with zero reduction of the sentence: /*empty*/ rule sentence: /*empty*/ . Therefore, the analyzer can freely shift a Text in a problematic case, postponing the reduction until it sees the next token. However, this has the consequences of using memory, as it leads to a parser that significantly shifts the entire entry on the parser stack and then reduces it by one sentence at a time.

Another refactor you could do would be to combine the [Text] and [] constructs in [sentenceList] :

 sentence: /* empty */ | sentence Text | sentence '[' sentenceList ']' ; sentenceList: sentence | sentenceList ',' sentence

So now the sentenceList is one or more sentences separated by commas (instead of two or more), and in action for the sentence '[' sentenceList ']' rule, you must check the sentenceList to see if it was two or more sentences and act properly.

Grammar Specification Shift Resolution / Conflict Reduction

More articles: