Antlr 4.5 Analyzer Runtime Error

Question

Antlr 4.5 Analyzer Runtime Error

I am building a simple grammar for programming laguange for learning purposes.

I come across a strange mistake that makes no sense to me.

line 1:0 missing {'void', 'int', 'bool', 'string', 'union'} at 'void'

I am using prequild lexer and parser from this grammar:

 grammar ProgrammingLanguage; function_definition : type_specifier IDENTIFIER '(' parameter_list_opt ')' compound_statement ; type_specifier : VOID | INT | BOOL | STRING | UNION ; compound_statement : '{' declaration_list statement_list '}' ; statement_list : statement | statement statement_list | ; statement : compound_statement | selection_statement | while_statement | jump_statement | expression_statement | comment_statement ; comment_statement : COMMENT_START COMMENT ; selection_statement : IF '(' expression ')' compound_statement | IF '(' expression ')' compound_statement ELSE compound_statement | SWITCH '(' expression ')' compound_statement ; expression_statement : ';' | expression ';' ; jump_statement : BREAK ';' | CONTINUE ';' ; while_statement : WHILE '(' expression ')' compound_statement ; primary_expression : IDENTIFIER | CONSTANT | '(' expression ')' | IDENTIFIER '(' primary_expression_list ')' ; primary_expression_list : primary_expression | primary_expression primary_expression_list | ; expression : logical_or_expression | additive_expression ; logical_or_expression : logical_and_expression | logical_or_expression '||' logical_and_expression ; logical_and_expression : compare_expression | logical_and_expression '&&' compare_expression ; compare_expression : primary_expression compare_op primary_expression | primary_expression ; compare_op : '<' | '>' | '==' | '!=' | '<=' | '>=' ; additive_expression : multiplicative_expression | additive_expression '+' multiplicative_expression | additive_expression '-' multiplicative_expression ; multiplicative_expression : primary_expression | multiplicative_expression '*' primary_expression | multiplicative_expression '/' primary_expression | multiplicative_expression '%' primary_expression ; assignment_expression : IDENTIFIER '=' expression ; id_list : IDENTIFIER | IDENTIFIER ',' id_list ; declaration : type_specifier id_list ';' ; parameter_list_opt : parameter_list | ; parameter_list : type_specifier IDENTIFIER | type_specifier IDENTIFIER ',' parameter_list ; declaration_list : declaration | declaration declaration_list | ; /**------------------------------------------------------------------ * LEXER RULES *------------------------------------------------------------------ */ WHILE : 'while' ; BREAK : 'break' ; CONTINUE : 'continue' ; SWITCH : 'switch' ; IF : 'if' ; ELSE : 'else' ; COMMENT_START : '//' ; IDENTIFIER : ('a'..'z'|'A'..'Z')('0'..'9'|'a'..'z'|'A'..'Z')*; CONSTANT : FALSE|TRUE|STRING_VALUE|INT_VALUE; STRING_VALUE : '"'COMMENT'"'; COMMENT : ('0'..'9'|'a'..'z'|'A'..'Z')*; INT_VALUE : ('0'..'9')+; FALSE : 'false'; TRUE : 'true'; VOID : 'void'; INT : 'int'; BOOL : 'bool'; STRING : 'string'; UNION : 'union'; WS : (' '|'\t'|'\n'|'\r')+ -> skip;

And I understand this java code:

 import org.antlr.v4.runtime.*; import org.antlr.v4.runtime.tree.ParseTree; import org.antlr.v4.runtime.tree.ParseTreeWalker; import java.io.IOException; public class Main { public static void main(String[] args) throws IOException { ProgrammingLanguageLexer lexer = new ProgrammingLanguageLexer(new ANTLRFileStream("input.txt")); ProgrammingLanguageParser parser = new ProgrammingLanguageParser(new CommonTokenStream(lexer)); ParseTree tree = parser.function_definition(); ParseTreeWalker.DEFAULT.walk(new ProgrammingLanguageBaseListener(), tree); } }

And finally, the line I'm trying to parse:

 void power () {}

+1

java parsing antlr4

Andrey Vasin Mar 22 '15 at 17:47

source share

1 answer

GRosenberg · Accepted Answer · 2015-03-22T19:30:27+0000

The error message means that the expected type of token containing the value "void" does not match the actual tick of the token created by consuming the string "void" from the input. Looking at your lexer rules, you can assume that the input string "void" is consumed by the IDENTIFIER rule, creating a token of type IDENTIFIER, not VOID.

In general, the lexer rule that matches the longest input string wins. For two (or more) rules with the same match length, the first listed winnings. Move all keyword rules above the IDENTIFIER rule.

The useful unit test form resets lex'd tokens and displays the corresponding token types. Sort of:

 CommonTokenStream tokens = ... tokens.fill(); StringBuilder sb = new StringBuilder(); for (Token token : tokens.getTokens()) { sb.append(((YourCustomTokenType) token).toString()); } System.out.print(sb.toString());

The Token.toString () method is usually pretty good. Override the token in your subclass to fit your needs.

Antlr 4.5 Analyzer Runtime Error

More articles: