What does “fragment” mean in ANTLR?

What does a fragment mean in ANTLR?

I saw both rules:

fragment DIGIT : '0'..'9'; 

and

 DIGIT : '0'..'9'; 

What is the difference?

+58
antlr
Jun 27 2018-11-11T00:
3 answers

A fragment is somewhat like an inline function: it makes the grammar more readable and easier to maintain.

A fragment will never be counted as a token; it serves only to simplify the grammar.

Consider:

 NUMBER: DIGITS | OCTAL_DIGITS | HEX_DIGITS;
 fragment DIGITS: '1'..'9' '0'..'9'*;
 fragment OCTAL_DIGITS: '0' '0'..'7'+;
 fragment HEX_DIGITS: '0x' ('0'..'9' | 'a'..'f' | 'A'..'F')+;

In this example, a match always produces a single NUMBER token, regardless of whether the input was "1234", "0xab12", or "0777". The fragments DIGITS, OCTAL_DIGITS, and HEX_DIGITS never appear as tokens themselves.
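The "inline function" idea can be sketched outside ANTLR. The following is only a rough Python model, not how ANTLR is implemented: each fragment is a plain regex snippet that gets spliced into the one real token rule. The helper name `token_type` is made up for this illustration.

```python
import re

# Fragments modeled as plain regex snippets: reusable pieces, never tokens.
DIGITS = r"[1-9][0-9]*"
OCTAL_DIGITS = r"0[0-7]+"
HEX_DIGITS = r"0x[0-9a-fA-F]+"

# The only real token rule; the fragments are inlined into it.
NUMBER = re.compile(rf"(?:{DIGITS}|{OCTAL_DIGITS}|{HEX_DIGITS})")


def token_type(text):
    """Return 'NUMBER' if the whole input matches the NUMBER rule, else None.

    Hypothetical helper for this sketch; there is no DIGITS, OCTAL_DIGITS,
    or HEX_DIGITS result because fragments are not token types.
    """
    return "NUMBER" if NUMBER.fullmatch(text) else None


for sample in ("1234", "0xab12", "0777"):
    print(sample, "->", token_type(sample))
```

Whichever alternative matched, the caller only ever sees NUMBER, which mirrors what the parser sees from the lexer.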

See point 3

+70
Jun 27 2018-11-11T00:

According to The Definitive ANTLR 4 Reference:

Rules with a fragment prefix can be called only from other lexer rules; they are not tokens themselves.

In fact, they improve the readability of your grammar.

Look at this example:

 STRING : '"' (ESC | ~["\\])* '"' ;
 fragment ESC : '\\' (["\\/bfnrt] | UNICODE) ;
 fragment UNICODE : 'u' HEX HEX HEX HEX ;
 fragment HEX : [0-9a-fA-F] ;

STRING is a lexer rule that uses the fragment rule ESC. UNICODE is used in the ESC rule, and HEX is used in the UNICODE fragment rule. The ESC, UNICODE, and HEX rules cannot be referenced from parser rules, because they never produce tokens of their own.
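To make the nesting concrete, here is a rough Python analogue of the same chain (HEX inside UNICODE inside ESC inside STRING). It is a sketch built on ordinary regex composition, not ANTLR itself, and the helper `is_string` is a name invented for this illustration.

```python
import re

# Each "fragment" is just a regex snippet reused by the next one up.
HEX = r'[0-9a-fA-F]'
UNICODE = rf'u{HEX}{HEX}{HEX}{HEX}'
ESC = rf'\\(?:["\\/bfnrt]|{UNICODE})'

# STRING is the only pattern exposed as a token; the rest are building blocks.
STRING = re.compile(rf'"(?:{ESC}|[^"\\])*"')


def is_string(text):
    """Hypothetical helper: True if the whole input is one STRING token."""
    return STRING.fullmatch(text) is not None


print(is_string(r'"line\nbreak"'))   # valid escape sequence \n
print(is_string(r'"caf\u00e9"'))     # valid \u escape with four hex digits
print(is_string(r'"bad\q"'))         # \q is not an allowed escape
```

Only STRING is ever tested against the input directly; ESC, UNICODE, and HEX exist solely to keep its definition readable.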

+8
Jan 31 '16 at 14:14

There is a very clear example in this blog post where fragment is significant:

 grammar number;
 number: INT;
 DIGIT : '0'..'9';
 INT : DIGIT+;

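The grammar recognizes "42", but not "7". For "7", both DIGIT and INT match the same single character, and the lexer resolves such ties in favor of the rule defined first, so it emits a DIGIT token where the parser expects INT. You can fix this by making DIGIT a fragment (or by moving DIGIT after INT).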
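The behavior can be reproduced with a small Python model of the lexer's choice: prefer the longest match, and on a tie prefer the rule listed first. This is a simplified sketch under that assumption, not ANTLR's actual algorithm, and `first_token` is a hypothetical helper.

```python
import re


def first_token(rules, text):
    """Pick the rule with the longest match at position 0.

    Ties go to the rule listed first (a simplified model of how the
    lexer disambiguates; hypothetical helper for this sketch).
    """
    best_name, best_len = None, 0
    for name, pattern in rules:
        m = re.match(pattern, text)
        # Strict '>' keeps the earlier rule when lengths are equal.
        if m and len(m.group()) > best_len:
            best_name, best_len = name, len(m.group())
    return best_name


# DIGIT declared as an ordinary token rule before INT, as in the grammar above.
with_token_digit = [("DIGIT", r"[0-9]"), ("INT", r"[0-9]+")]
# DIGIT declared as a fragment: it is part of INT but not a candidate token.
with_fragment_digit = [("INT", r"[0-9]+")]

print(first_token(with_token_digit, "42"))     # INT
print(first_token(with_token_digit, "7"))      # DIGIT
print(first_token(with_fragment_digit, "7"))   # INT
```

With DIGIT as a real token rule, "7" is claimed by DIGIT before INT gets a chance; once DIGIT is a fragment it is no longer a candidate, so INT wins.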

+4
Mar 19 '16 at 22:13