Parsing comment lines

Question


Given the following basic grammar, I want to understand how to handle comment lines. What is missing is the handling of the <CR><LF> that usually terminates a comment line; the only exception is a last comment line before EOF, e.g.:

    # comment
    abcd := 12 ;
    # comment eof without <CR><LF>

    grammar CommentLine1a;

    //==========================================================
    // Options
    //==========================================================

    //==========================================================
    // Lexer Rules
    //==========================================================

    Int : Digit+ ;

    fragment Digit : '0'..'9' ;

    ID_NoDigitStart : ( 'a'..'z' | 'A'..'Z' ) ( 'a'..'z' | 'A'..'Z' | Digit )* ;

    Whitespace : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN ; } ;

    //==========================================================
    // Parser Rules
    //==========================================================

    code : ( assignment | comment )+ ;

    assignment : id_NoDigitStart ':=' id_DigitStart ';' ;

    id_NoDigitStart : ID_NoDigitStart ;

    id_DigitStart : ( ID_NoDigitStart | Int )+ ;

    comment : '#' ~( '\r' | '\n' )* ;


comments antlr grammar line

ANTLRStarter Aug 15 '11 at 20:56


1 answer

Bart Kiers · Accepted Answer · 2011-08-16T06:07:31+0000

Unless you have a very good reason to put the comment in the parser (which I would like to hear), you should put it in the lexer:

 Comment : '#' ~( '\r' | '\n' )* ;

And since you already account for line breaks in your Whitespace rule, there is no problem with input like # comment eof without <CR><LF>: the Comment rule stops before the line break and does not require one.
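Put together, the lexer-based approach might look like the sketch below. It assumes ANTLR 3 syntax, matching the `$channel = HIDDEN` action already used in the question. The grammar name `CommentLine1b`, the choice to send comments to the hidden channel, and the change of `code` to `assignment*` are assumptions made for illustration, not part of the original grammar.

```antlr
grammar CommentLine1b;  // hypothetical name for the revised grammar

// Lexer rules
Int : Digit+ ;

fragment Digit : '0'..'9' ;

ID_NoDigitStart : ( 'a'..'z' | 'A'..'Z' ) ( 'a'..'z' | 'A'..'Z' | Digit )* ;

Whitespace : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN ; } ;

// '#' followed by any characters except CR and LF. The line break itself
// is left for Whitespace, so a comment ending at EOF matches as well.
// Assumption: comments go to the hidden channel so the parser never sees them.
Comment : '#' ~( '\r' | '\n' )* { $channel = HIDDEN ; } ;

// Parser rules
// With comments hidden, the parser only deals with assignments.
// '*' instead of '+' so that a file containing only comments still parses.
code : assignment* ;

assignment : id_NoDigitStart ':=' id_DigitStart ';' ;

id_NoDigitStart : ID_NoDigitStart ;

id_DigitStart : ( ID_NoDigitStart | Int )+ ;
```

If the comments are needed later (for example, to keep them when regenerating the source), they can still be read from the token stream, since hidden-channel tokens are not discarded.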

In addition, if you use literal tokens inside parser rules, ANTLR automatically creates lexer rules for them behind the scenes. So in your case:

    comment : '#' ~( '\r' | '\n' )* ;

will match a '#' followed by zero or more tokens other than '\r' and '\n', and not zero or more characters other than '\r' and '\n'.

For future reference:

Inside parser rules:

~ negates tokens
. matches any token

Inside lexer rules:

~ negates characters
. matches any character in the range 0x0000 ... 0xFFFF
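The difference can be seen side by side in the isolated fragments below. This is only a sketch in ANTLR 3 syntax: the rule names `untilSemicolon`, `anyToken` and `AnyChar` are hypothetical, and the fragments are meant to be read individually rather than dropped into one grammar as they stand.

```antlr
// Parser rules (names start with a lowercase letter): operands are tokens.
untilSemicolon : ~( ';' )* ';' ;   // zero or more tokens other than ';', then a ';' token
anyToken       : . ;               // exactly one token, of any type

// Lexer rules (names start with an uppercase letter): operands are characters.
Comment          : '#' ~( '\r' | '\n' )* ;  // '#', then zero or more characters other than CR and LF
fragment AnyChar : . ;                      // exactly one character
```

The same `~( ... )*` construct therefore consumes whole tokens in `untilSemicolon` but single characters in `Comment`, which is why the comment rule belongs in the lexer.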
