Grammar Specification Shift Resolution / Conflict Reduction

I use Jison (Bison) to create a simple markup language. I am clearly new to this, but small changes work very well. I just don't understand the source of the S / R conflict.

It doesn’t matter that β€œText” is returned by two lexer actions (with different launch conditions), and I like it because it allows the grammar to have fewer rules and because error messages are consistent to the user, I tried to make the β€œText” rule publicly available regardless of context, and I also tried to give each token a different name, but it does not seem to affect the S / R conflict when all this is together.

SUPPOSED parser for creating json-object with plain text, sub-arrays and various special nodes.

Specification:

/* lexical grammar */ %lex %s bracketed %% <bracketed>(\\.|[^\\\,\[\]])+ { yytext = yytext.replace(/\\(.)/g, '$1'); return 'Text'; } <INITIAL>(\\.|[^\\\[])+ { yytext = yytext.replace(/\\(.)/g, '$1'); return 'Text'; } "[" { this.begin('bracketed'); return '['; } "]" { this.popState(); return ']'; } "," return ',' <<EOF>> return 'END' /lex %start template %% template : sentence END ; sentence : /* empty */ | sentence Text | sentence '[' ']' | sentence '[' dynamic ']' ; dynamic : sentence /*| dynamic ',' sentence*/ ; 

Warnings:

 Conflict in grammar: multiple actions possible when lookahead token is ] in state 5 - reduce by rule: sentence -> - shift token (then go to state 6) States with conflicts: State 5 sentence -> sentence [ .] #lookaheads= END Text [ ] sentence -> sentence [ .dynamic ] #lookaheads= END Text [ ] dynamic -> .sentence #lookaheads= ] sentence -> . #lookaheads= ] Text [ sentence -> .sentence Text sentence -> .sentence [ ] sentence -> .sentence [ dynamic ] 

Different generator algorithms have more or less problems, but they all seem to have problems.

Thanks!

+6
source share
1 answer

The conflict comes mainly from these two rules:

 sentence: sentence '[' Text ']' | sentence '[' sentenceList ']' 

The reason is that after looking at sentence and [ and looking at the next Text token, the parser does not know whether to shift Text by matching the first rule, or consider that Text as the beginning of a sentenceList is suitable for matching the second rule.

Now, if you have a parser generator that uses a 2-token, this will not be a problem, but the LALR (1) bison (1 is one token).

There are a few things you could try:

  • make an additional look in the lexer to distinguish the text from the subsequent one] from Text-not-follow-by-], because two different tokens then rewrite the rules for using both of these tokens.

  • Use the bison% glr-parser function to use the GLR parser. This will analyze the proposal in both directions, and then throw out the one that does not match

  • reorganize the grammar so that it does not need a two-point image.

One refactoring that works in your case is to rewrite sentence rules to make them correct recursive rather than left recursive:

 sentence: /* empty */ | Text sentence | '[' ']' sentence | '[' Text ']' sentence | '[' sentenceList ']' sentence ; 

This avoids sentence (or any other rule starting with sentence , such as sentenceList ), with zero reduction of the sentence: /*empty*/ rule sentence: /*empty*/ . Therefore, the analyzer can freely shift a Text in a problematic case, postponing the reduction until it sees the next token. However, this has the consequences of using memory, as it leads to a parser that significantly shifts the entire entry on the parser stack and then reduces it by one sentence at a time.

Another refactor you could do would be to combine the [Text] and [] constructs in [sentenceList] :

 sentence: /* empty */ | sentence Text | sentence '[' sentenceList ']' ; sentenceList: sentence | sentenceList ',' sentence 

So now the sentenceList is one or more sentences separated by commas (instead of two or more), and in action for the sentence '[' sentenceList ']' rule, you must check the sentenceList to see if it was two or more sentences and act properly.

+14
source

Source: https://habr.com/ru/post/926882/


All Articles