ANTLR: Get a text representation of a sub lexer rule

Consider the following lexer rules in ANTLR 4:

ID: [a-z]+;
INT: [0-9]+;
ARRAY: ID '[' INT ']';

Is it possible, in a tree-walking scenario where I have access to ctx.ARRAY() (where ctx is the ParserRuleContext subclass generated for the parser rule), to get a textual representation of the lexer rules ID and INT? Currently I get the whole text with ctx.ARRAY().getText() and extract the ID and INT parts with regular expressions, and I wondered whether ANTLR itself offers a cleaner solution.

Note: Due to external dependencies, making ARRAY a parser rule is not an option.

Thanks for any helpful answers.

1 answer

Lexer rules in ANTLR 4 cannot be broken into parts. This was a design decision we made as part of the large speed and memory improvements of ANTLR 4 lexers over ANTLR 3 lexers. ANTLR 3 lexers were recursive-descent recognizers with many of the same features as parsers. In ANTLR 4, a lexer is little more than a DFA recognizer with support for semantic predicates, so the boundaries between the individual components of a token are not tracked at all.

You need to either make ARRAY a parser rule, or separately parse the result of getText() whenever you need to split the token.
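If the parser rule is ruled out, the getText() route can at least be kept in one place. Below is a minimal sketch in Java, assuming the token text always has the shape produced by the ARRAY rule above. The class name ArrayTokenParts and its parse method are hypothetical helpers, not part of the ANTLR runtime. In a listener or visitor you would pass in ctx.ARRAY().getText().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an ANTLR API): splits the text of an ARRAY token
// back into its ID and INT components.
final class ArrayTokenParts {
    // Mirrors the lexer rules: ID: [a-z]+;  INT: [0-9]+;  ARRAY: ID '[' INT ']';
    private static final Pattern ARRAY = Pattern.compile("([a-z]+)\\[([0-9]+)\\]");

    final String id;   // text matched by the ID part
    final String index; // text matched by the INT part, kept as a string
                        // because [0-9]+ may not fit in an int

    private ArrayTokenParts(String id, String index) {
        this.id = id;
        this.index = index;
    }

    // tokenText is assumed to come from ctx.ARRAY().getText()
    static ArrayTokenParts parse(String tokenText) {
        Matcher m = ARRAY.matcher(tokenText);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an ARRAY token: " + tokenText);
        }
        return new ArrayTokenParts(m.group(1), m.group(2));
    }

    public static void main(String[] args) {
        ArrayTokenParts parts = parse("foo[42]");
        System.out.println(parts.id + " / " + parts.index);
    }
}
```

The regular expression has to be kept in sync with the lexer rules by hand, which is the price of not having ARRAY as a parser rule.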
