JavaCC explanation and solution for "Regular Expression choice : FOO can never be matched as : BAR"?

Question

I am learning to use JavaCC in a hobby project and have a simple grammar for writing a parser. Part of the parser includes the following:

TOKEN : { < DIGIT : (["0"-"9"]) > }
TOKEN : { < INTEGER : (<DIGIT>)+ > }
TOKEN : { < INTEGER_PAIR : (<INTEGER>){2} > }
TOKEN : { < FLOAT : (<NEGATE>)? <INTEGER> | (<NEGATE>)? <INTEGER>  "." <INTEGER>  | (<NEGATE>)? <INTEGER> "." | (<NEGATE>)? "." <INTEGER> > } 
TOKEN : { < FLOAT_PAIR : (<FLOAT>){2} > }
TOKEN : { < NUMBER_PAIR : <FLOAT_PAIR> | <INTEGER_PAIR> > }
TOKEN : { < NEGATE : "-" > }

When compiling with JavaCC, I get the output:

Warning: Regular Expression choice : FLOAT_PAIR can never be matched as : NUMBER_PAIR

Warning: Regular Expression choice : INTEGER_PAIR can never be matched as : NUMBER_PAIR

I am sure this is a simple concept, but I don't understand the warning, being new to parser generation and regular expressions.

What does this warning mean, in the most novice-friendly terms possible?

+5

regex parsing javacc

Grundlefleck Apr 26 '09 at 20:53

4 answers

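I haven't used JavaCC, but it's possible that NUMBER_PAIR is ambiguous.

I think the problem is that the same text can be matched by both FLOAT_PAIR and INTEGER_PAIR, since FLOAT can also match an INTEGER.

But that's just a guess, having never seen JavaCC syntax :)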

0

Uri Apr 26 '09 at 20:59

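The warning means that, for example, any text matching FLOAT_PAIR will always be tokenized as FLOAT_PAIR, never as NUMBER_PAIR. FLOAT_PAIR is already defined earlier in the grammar, so JavaCC matches it first. I don't know JavaCC, though, so this is a guess.

So whenever the input is a float pair or an integer pair, the NUMBER_PAIR token is never the one produced.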
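JavaCC's tokenizer prefers the longest match and, when two definitions match text of the same length, the one that appears earliest in the grammar file. A minimal fragment illustrating that rule is below; the token names IF and IDENTIFIER are hypothetical and are not part of the grammar in the question.

```javacc
// Hypothetical tokens, only to illustrate the ordering rule.
TOKEN : { < IF : "if" > }                      // defined first
TOKEN : { < IDENTIFIER : (["a"-"z"])+ > }      // defined second

// Input "if"   : both definitions match two characters, so the earlier one (IF) wins.
// Input "iffy" : IDENTIFIER matches four characters, IF only two, so IDENTIFIER wins.
```

By the same rule, text that FLOAT_PAIR can match in full is claimed by FLOAT_PAIR before NUMBER_PAIR is considered, which is consistent with the warning.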

0

sth Apr 26 '09 at 22:24

Following Barry Kelly's answer, this is the grammar I came up with:

    SKIP : { < #TO_SKIP : " " | "\t" > }
    TOKEN : { < #DIGIT : (["0"-"9"]) > }
    TOKEN : { < #DIGITS : (<DIGIT>)+ > }
    TOKEN : { < INTEGER : <DIGITS> > }
    TOKEN : { < INTEGER_PAIR : (<INTEGER>) (<TO_SKIP>)+ (<INTEGER>) > }
    TOKEN : { < FLOAT : (<NEGATE>)?<DIGITS>"."<DIGITS> | (<NEGATE>)?"."<DIGITS> > } 
    TOKEN : { < FLOAT_PAIR : (<FLOAT>) (<TO_SKIP>)+ (<FLOAT>) > }
    TOKEN : { < #NUMBER : <FLOAT> | <INTEGER> > }
    TOKEN : { < NUMBER_PAIR : (<NUMBER>) (<TO_SKIP>)+ (<NUMBER>) >}
    TOKEN : { < NEGATE : "-" > }

I had completely forgotten to include the whitespace that separates the two numbers. I also used the "#" prefix, which marks a definition as private: it is never matched as a token on its own and is only used in the definitions of other tokens. The grammar above compiles with JavaCC without warnings or errors.

However, as Barry noted, there are reasons not to do it this way.

0

Grundlefleck Apr 27 '09 at 20:32

Barry Kelly · Accepted Answer · 2009-04-26T23:51:26+0000

I do not know JavaCC, but I am a compiler engineer.

The FLOAT_PAIR rule is ambiguous. Consider the following text:

0.0

This could be FLOAT 0 followed by FLOAT .0; or FLOAT 0. followed by FLOAT 0; either way the result is a FLOAT_PAIR. Or it could be a single FLOAT 0.0.

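More importantly, though, composing tokens out of other tokens in the lexer like this is unlikely to ever work. Consider this number:

12345

This could be read as INTEGER 12, INTEGER 345, giving an INTEGER_PAIR. Or as INTEGER 123, INTEGER 45, which is another INTEGER_PAIR. Or as INTEGER 12345, a different token altogether. As written, the ambiguity applies to INTEGER_PAIR (and likewise to FLOAT_PAIR).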

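Apart from anything else, pairs should not be handled in the lexer. Leave the basic tokens (INTEGER and FLOAT) to the lexer, and leave interpretation, such as negation and pairing, to the parser.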

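(For example, what about "----42"? Read as four negations applied to 42, that is something for the parser to work out, not the lexer.)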

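Also, be aware that single-digit integers matched by INTEGER will compete with DIGIT. I don't know the correct JavaCC syntax to fix that. You probably want to define DIGIT not as a token but as something you can use in the definitions of other tokens; alternatively, embed the definition of DIGIT ([0-9]) directly wherever you use DIGIT in your rules.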
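The split suggested above, with plain number tokens in the lexer and negation and pairing in the parser, might look roughly like the sketch below. This is not the asker's grammar: the parser class name PairParser, the productions Number and NumberPair, and the choice to return values as double are all assumptions made for illustration.

```javacc
options { STATIC = false; }

PARSER_BEGIN(PairParser)
public class PairParser {}
PARSER_END(PairParser)

// Whitespace separates tokens and is discarded by the lexer.
SKIP : { " " | "\t" | "\r" | "\n" }

TOKEN : {
    < #DIGIT : ["0"-"9"] >      // private: usable in other definitions, never a token itself
  | < INTEGER : (<DIGIT>)+ >
  | < FLOAT : (<DIGIT>)+ "." (<DIGIT>)* | "." (<DIGIT>)+ >
  | < NEGATE : "-" >
}

// Parser rule: any number of minus signs, then one INTEGER or FLOAT token.
double Number() :
{ Token t = null; int negations = 0; }
{
  ( <NEGATE> { negations++; } )*
  ( t = <INTEGER> | t = <FLOAT> )
  {
    double value = Double.parseDouble(t.image);
    return (negations % 2 == 0) ? value : -value;
  }
}

// Parser rule: a pair is simply two numbers in a row.
double[] NumberPair() :
{ double first, second; }
{
  first = Number() second = Number() <EOF>
  { return new double[] { first, second }; }
}
```

With this layout the lexer's longest-match rule turns "12345" into a single INTEGER and "0.0" into a single FLOAT, so neither can be mistaken for a pair, and "----42" is handled by the loop in Number rather than by a token definition.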
