Priority order for matching tokens in Flex

Sorry if the title of this thread is a bit confusing. What I'm asking is: how does Flex (the lexical analyzer generator) deal with precedence issues?

For example, let's say I have two tokens with similar regular expressions written in the following order:

"//"[!\/]{1}      return FIRST;
"//"[!\/]{1}\<    return SECOND;

Given the input "//!<", will FIRST or SECOND be returned? Or both?

The FIRST rule will be reached before the SECOND rule, but it seems that returning SECOND would be the correct behavior.

1 answer

The longest match is returned.

From flex & bison: Text Processing Tools:

How Flex Handles Ambiguous Patterns

Most flex programs are quite ambiguous, with multiple patterns that can match the same input. Flex resolves the ambiguity with two simple rules:

  • Match the longest possible string every time the scanner matches input.
  • In the case of a tie, use the pattern that appears first in the program.
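To make the two rules concrete, here is a rough model of them in plain C. It is only a sketch: real flex compiles all patterns into a single automaton rather than trying rules one at a time, and the names match_first, match_second and pick_rule are made up for this illustration. Each matcher reports how many leading characters of the input its pattern would consume.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: a "rule" returns the number of leading
   characters of s that it matches, or 0 if it does not match. */
typedef size_t (*matcher_fn)(const char *s);

/* Models the pattern "//"[!/] : two slashes, then '!' or '/'. */
size_t match_first(const char *s)
{
    if (strncmp(s, "//", 2) != 0)
        return 0;
    return (s[2] == '!' || s[2] == '/') ? 3 : 0;
}

/* Models the pattern "//"[!/]< : the same prefix, then '<'. */
size_t match_second(const char *s)
{
    size_t n = match_first(s);
    return (n == 3 && s[3] == '<') ? 4 : 0;
}

/* Returns the index of the winning rule, or -1 if nothing matches.
   Rule 1: the longest match wins.
   Rule 2: on a tie the earlier rule wins, because only a strictly
   longer match replaces the current best. */
int pick_rule(const matcher_fn *rules, int nrules, const char *input)
{
    int best = -1;
    size_t best_len = 0;

    for (int i = 0; i < nrules; i++) {
        size_t len = rules[i](input);
        if (len > best_len) {
            best = i;
            best_len = len;
        }
    }
    return best;
}
```

With the rules listed in the order { match_first, match_second }, pick_rule returns 1 (the SECOND rule) for the input //!< because that match is four characters long instead of three. If a space separates the ! and the <, only the first rule matches and the result is 0.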

You can check it yourself, of course:

file: demo.l

%%
"//"[!/]     {printf("FIRST");}
"//"[!/]<    {printf("SECOND");}
%%
int main(int argc, char **argv)
{
  while(yylex() != 0);
  return 0;
}

Note that / and < do not need to be escaped here, and that {1} is redundant.

bart@hades:~/Programming/GNU-Flex-Bison/demo$ flex demo.l
bart@hades:~/Programming/GNU-Flex-Bison/demo$ cc lex.yy.c -lfl
bart@hades:~/Programming/GNU-Flex-Bison/demo$ ./a.out < in.txt
SECOND

where in.txt contains //!<
