JavaCC: explanation and solution for "Regular Expression choice : FOO can never be matched as : BAR"?

I am learning to use JavaCC in a hobby project and have a simple grammar for writing a parser. Part of the parser includes the following:

TOKEN : { < DIGIT : (["0"-"9"]) > }
TOKEN : { < INTEGER : (<DIGIT>)+ > }
TOKEN : { < INTEGER_PAIR : (<INTEGER>){2} > }
TOKEN : { < FLOAT : (<NEGATE>)? <INTEGER> | (<NEGATE>)? <INTEGER>  "." <INTEGER>  | (<NEGATE>)? <INTEGER> "." | (<NEGATE>)? "." <INTEGER> > } 
TOKEN : { < FLOAT_PAIR : (<FLOAT>){2} > }
TOKEN : { < NUMBER_PAIR : <FLOAT_PAIR> | <INTEGER_PAIR> > }
TOKEN : { < NEGATE : "-" > }

When compiling with JavaCC, I get the output:

Warning: Regular Expression choice : FLOAT_PAIR can never be matched as : NUMBER_PAIR

Warning: Regular Expression choice : INTEGER_PAIR can never be matched as : NUMBER_PAIR

I am sure this is a simple concept, but I don't understand the warning, being new to parser generation and regular expressions.

What does this warning mean, explained in terms as novice-friendly as possible?

+5
4 answers

I do not know JavaCC, but I am a compiler engineer.

The FLOAT_PAIR rule is ambiguous. Consider the following text:

0.0

This could be FLOAT 0 followed by FLOAT .0, or FLOAT 0. followed by FLOAT 0; both readings give a FLOAT_PAIR. It could also be the single FLOAT 0.0.

More importantly, the same problem shows up with plain integers. Consider this number:

12345

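This could be INTEGER 12 followed by INTEGER 345, giving an INTEGER_PAIR. It could be INTEGER 123 followed by INTEGER 45, another INTEGER_PAIR. Or it could be the single INTEGER 12345, which is a different token. In other words, INTEGER_PAIR is ambiguous too (and the same goes for FLOAT_PAIR).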
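The ambiguity in the two examples above (0.0 and 12345) can be checked mechanically. The following is a minimal sketch in plain Java, not JavaCC: the class `PairSplits`, the method `pairSplits`, and the two `java.util.regex` patterns are hypothetical and only approximate the INTEGER and FLOAT definitions from the question. It lists every position at which a text can be cut into two adjacent tokens of the same kind.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it is not part of JavaCC.
class PairSplits {
    // Approximation of the question's INTEGER: one or more digits.
    static final String INTEGER = "[0-9]+";
    // Approximation of the question's FLOAT, including its optional leading "-".
    static final String FLOAT = "-?(?:[0-9]+\\.[0-9]+|[0-9]+\\.|\\.[0-9]+|[0-9]+)";

    // Returns every way to cut `text` into two non-empty parts that
    // both fully match `tokenRegex`, formatted as "left | right".
    static List<String> pairSplits(String text, String tokenRegex) {
        List<String> splits = new ArrayList<>();
        for (int cut = 1; cut < text.length(); cut++) {
            String left = text.substring(0, cut);
            String right = text.substring(cut);
            if (left.matches(tokenRegex) && right.matches(tokenRegex)) {
                splits.add(left + " | " + right);
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(pairSplits("12345", INTEGER));
        System.out.println(pairSplits("0.0", FLOAT));
    }
}
```

More than one split for the same text means a pair rule without a separator cannot decide where the first token ends.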

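In general, you should not try to handle pairs in the lexer. Let the lexer recognize simple tokens (INTEGER and FLOAT), and handle things like negation and pairing in the parser.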

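(For example, what is "----42"? A parser can treat it as negation applied repeatedly to 42, which a token rule with a single optional NEGATE cannot express.)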

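Also, a single digit matches both INTEGER and DIGIT. I do not know how JavaCC deals with that. DIGIT should be a helper rule, not a token in its own right; that is, DIGIT defined as ([0-9]) should only be something that gets substituted wherever DIGIT is referenced.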
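The division of labour suggested above (simple tokens in the lexer, negation and pairing in the parser) might look roughly like the following hand-written Java sketch. It is not JavaCC output and not JavaCC API: the class `NumberPairParser` and its methods are hypothetical names, and the assumption that whitespace separates tokens is mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the lexer knows three simple tokens,
// the parser handles negation and pairing.
class NumberPairParser {
    // One token per match: "-", a float, or an integer.
    // Leading whitespace is skipped; the sign is NOT part of a number token.
    private static final Pattern TOKEN =
        Pattern.compile("\\s*(-|[0-9]+\\.[0-9]*|\\.[0-9]+|[0-9]+)");

    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        String text = input.trim();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("bad input at offset " + pos);
            }
            tokens.add(m.group(1));
            pos = m.end();
        }
        return tokens;
    }

    private final List<String> tokens;
    private int next = 0;

    NumberPairParser(String input) {
        this.tokens = lex(input);
    }

    // number := ("-")* (INTEGER | FLOAT)
    double number() {
        boolean negative = false;
        while (next < tokens.size() && tokens.get(next).equals("-")) {
            negative = !negative; // each "-" flips the sign
            next++;
        }
        if (next >= tokens.size()) {
            throw new IllegalArgumentException("number expected");
        }
        double value = Double.parseDouble(tokens.get(next++));
        return negative ? -value : value;
    }

    // pair := number number, with nothing left over afterwards
    double[] pair() {
        double first = number();
        double second = number();
        if (next != tokens.size()) {
            throw new IllegalArgumentException("unexpected trailing input");
        }
        return new double[] { first, second };
    }
}
```

With this split, `lex("12345")` yields a single token, so the question of where one integer ends and the next begins never reaches the parser, and `new NumberPairParser("----42").number()` applies the four negations one by one.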

+4

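I have not used JavaCC, but it may be that NUMBER_PAIR is ambiguous.

I think the problem is that the same text can match both FLOAT_PAIR and INTEGER_PAIR, because FLOAT can match an INTEGER.

This is only a guess, though, since I have never seen JavaCC syntax before :)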

0

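This probably means that any input matching FLOAT_PAIR will always come back as a FLOAT_PAIR token and never as NUMBER_PAIR. The FLOAT_PAIR rule is defined first, so JavaCC picks it. I have not used JavaCC, though, so I may be wrong.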
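To illustrate the point about rule order: JavaCC's tokenizer prefers the longest match and, when several expressions match the same length, the one that appears earliest in the grammar file. The following Java sketch models only that selection rule. The class `FirstRuleWins`, its methods, and the single-space separator inside the pair patterns are assumptions made for the illustration; they are not JavaCC code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of "longest match, earliest rule wins on a tie".
class FirstRuleWins {
    static final String INTEGER = "[0-9]+";
    static final String FLOAT = "-?(?:[0-9]+\\.[0-9]+|[0-9]+\\.|\\.[0-9]+|[0-9]+)";

    // Rules in the same relative order as in the question's grammar.
    // A single space between the two numbers is an assumption of this sketch.
    static Map<String, String> rules() {
        Map<String, String> rules = new LinkedHashMap<>();
        String integerPair = INTEGER + " " + INTEGER;
        String floatPair = FLOAT + " " + FLOAT;
        rules.put("INTEGER_PAIR", integerPair);
        rules.put("FLOAT_PAIR", floatPair);
        rules.put("NUMBER_PAIR", "(?:" + floatPair + "|" + integerPair + ")");
        return rules;
    }

    // Returns the name of the rule that wins at the start of `input`,
    // or null if no rule matches there.
    static String winningRule(Map<String, String> rules, String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // Strict ">" keeps the earlier rule when lengths are equal.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }
}
```

Because NUMBER_PAIR is just a choice between two rules declared before it, every text it matches is matched at the same length by an earlier rule, so in this model it never wins. That is what the warning in the question is reporting.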

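In any case, something like NUMBER_PAIR probably belongs in the parser as a production rather than in the lexer as a token.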

0

Thanks for the answers. Based on them, I ended up with the following:

    SKIP : { < #TO_SKIP : " " | "\t" > }
    TOKEN : { < #DIGIT : (["0"-"9"]) > }
    TOKEN : { < #DIGITS : (<DIGIT>)+ > }
    TOKEN : { < INTEGER : <DIGITS> > }
    TOKEN : { < INTEGER_PAIR : (<INTEGER>) (<TO_SKIP>)+ (<INTEGER>) > }
    TOKEN : { < FLOAT : (<NEGATE>)?<DIGITS>"."<DIGITS> | (<NEGATE>)?"."<DIGITS> > } 
    TOKEN : { < FLOAT_PAIR : (<FLOAT>) (<TO_SKIP>)+ (<FLOAT>) > }
    TOKEN : { < #NUMBER : <FLOAT> | <INTEGER> > }
    TOKEN : { < NUMBER_PAIR : (<NUMBER>) (<TO_SKIP>)+ (<NUMBER>) >}
    TOKEN : { < NEGATE : "-" > }

I had completely forgotten to include the whitespace that separates the two tokens. I also used the "#" character, which stops a definition from being matched as a token in its own right, so it can only be used in the definitions of other tokens. JavaCC compiles the above without warnings or errors.

However, as Barry noted, there are good reasons not to handle pairs in the lexer at all.

0