Line continuation breaks my grammar

I have a .g4 grammar for a VBA / VB6 lexer / parser in which the lexer skips line continuation tokens. Not skipping them breaks the parser, so that is not an option. Here is the lexer rule in question:

LINE_CONTINUATION : ' ' '_' '\r'? '\n' -> skip; 

The problem is that whenever a continued line starts in column 1, the parser blows up:

    Sub Test()
        Debug.Print "Some text " & _
    vbNewLine & "Some more text"
    End Sub

I thought, "Hey, I know! I'll just pre-process the string that I feed ANTLR to insert an extra space before the underscore, and change the grammar to accept it!"

So, I changed the rule as follows:

    LINE_CONTINUATION : WS? WS '_' NEWLINE -> skip;
    NEWLINE : WS? ('\r'? '\n') WS?;
    WS : [ \t]+;

... and the VBA test code above gave me this parser error:

extraneous input 'vbNewLine' expecting WS

For now, my only solution is to tell my users to indent their code properly. Is there a way to fix this grammar rule?

(Full grammar file VBA.g4 on GitHub)

1 answer

Basically, you want a line continuation to be treated as whitespace.

OK, then add the lexical definition of the line continuation to the WS token. WS will then pick up the line continuation, and you won't need LINE_CONTINUATION anywhere.

    //LINE_CONTINUATION : ' ' '_' '\r'? '\n' -> skip;
    NEWLINE : WS? ('\r'? '\n') WS?;
    WS : ([ \t]+)|(' ' '_' '\r'? '\n');
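To see why this helps, here is a rough Python sketch of the two token streams. It is not ANTLR, only a toy longest-match tokenizer, and the STRING, IDENT and AMP rules and the `tokenize` helper are made up for illustration (NEWLINE is also simplified). The error message suggests the parser wants a WS token after `&`. With the skipped rule that token never arrives when the next line starts in column 1, while the combined WS rule supplies it.

```python
import re

def tokenize(text, rules, skipped=()):
    """Toy lexer: at each position take the longest match (ties go to the
    earlier rule), which roughly mimics how an ANTLR lexer chooses a token."""
    pos, out = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name not in skipped:   # emulate '-> skip'
            out.append(best_name)
        pos += best_len
    return out

# Hypothetical stand-ins for the rest of the grammar (assumptions, not VBA.g4).
COMMON = [
    ("STRING", r'"[^"\r\n]*"'),
    ("IDENT", r"[A-Za-z][A-Za-z0-9]*"),
    ("AMP", r"&"),
]

# Original lexer: the continuation is its own token and gets skipped.
ORIGINAL = [
    ("LINE_CONTINUATION", r" _\r?\n"),
    ("NEWLINE", r"\r?\n"),
    ("WS", r"[ \t]+"),
] + COMMON

# Answer's lexer: the continuation is one alternative of WS. Python tries
# alternatives in order, so the continuation is listed first here; ANTLR
# itself would pick the longer match regardless of order.
FIXED = [
    ("NEWLINE", r"\r?\n"),
    ("WS", r" _\r?\n|[ \t]+"),
] + COMMON

source = '"Some text " & _\nvbNewLine & "Some more text"'

original_tokens = tokenize(source, ORIGINAL, skipped={"LINE_CONTINUATION"})
fixed_tokens = tokenize(source, FIXED)

print(original_tokens)  # no WS between the first AMP and IDENT
print(fixed_tokens)     # the continuation shows up as WS
```

In the first stream `&` is followed directly by the identifier, which matches the "expecting WS" complaint. In the second, the continuation itself is the WS token the parser is waiting for, so indentation of the continued line no longer matters.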