How to handle language-specific reserved words that appear in expression or variable names

I have been working on this issue for about 4 hours. Here is my ANTLR v4 grammar file, which I have reduced to a simple example.

    grammar Cfscript;

    component           : (statement)* ;
    statement           : 'return' expression? ';'
                        | statementExpression ';'
                        ;
    statementExpression : expression ;
    expression          : primary
                        | expression '.' Identifier
                        ;
    primary             : Identifier ;

    Identifier : [a-zA-Z0-9_]+ ;
    WS         : [ \t\r\n]+ -> skip ;

My file contains

 local.return; 

When I try to parse this file starting at the component rule, I get the following error: mismatched input 'return' expecting Identifier. I cannot understand why this error occurs.

Update

If I understand correctly, this is because return is a reserved word in Java, and that is why they structured their grammar this way. In my ColdFusion CFScript language, return is valid when it is scoped: local.return, variables.return, local["return"]. The same is true for if, else, savecontent and many other words, all of which are valid only within scopes, not as the first member of a variable or expression: if.blah = "something" is invalid, but blah.if = "something" is valid. This means that I will run into this problem with each of these terms, as they will conflict with the parser rule that captures them.

Synthesizing what Bart said, is this a clean way to solve this problem?

    grammar Cfscript;

    component  : (statement)* ;
    statement  : K_Return expression? ';'
               | expression ';'
               ;
    expression : primary
               | expression '.' secondary
               ;
    primary    : Identifier ;
    secondary  : K_Return
               | K_If
               | K_Else
               | Identifier
               ;

    K_Return   : 'return' ;
    K_If       : 'if' ;
    K_Else     : 'else' ;
    Identifier : [a-zA-Z0-9_]+ ;
    WS         : [ \t\r\n]+ -> skip ;
1 answer

Using a literal token such as 'return' inside a parser rule makes the lexer tokenize the string "return" as that keyword token, so it will never be matched as an Identifier in the second alternative of your expression rule:

 expression '.' Identifier 
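
To make the tokenization concrete, here is a small Python simulation of the rule ANTLR's lexer follows: the longest match wins, and on a tie the rule defined first wins. It is only a sketch, not ANTLR itself. It assumes that the literals used in parser rules behave like lexer rules placed ahead of Identifier, and the names T_RETURN, T_DOT and T_SEMI are made up for illustration.

```python
import re

# Token rules in priority order. The implicit literal tokens come first,
# on the assumption that literals used in parser rules take precedence over
# the explicit lexer rules. T_RETURN, T_DOT and T_SEMI are hypothetical names.
RULES = [
    ("T_RETURN", re.compile(r"return")),
    ("T_DOT", re.compile(r"\.")),
    ("T_SEMI", re.compile(r";")),
    ("Identifier", re.compile(r"[a-zA-Z0-9_]+")),
    ("WS", re.compile(r"[ \t\r\n]+")),
]


def tokenize(text):
    """Longest match wins; on a tie, the rule listed first wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is skipped, as in the grammar
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("local.return;"))
# [('Identifier', 'local'), ('T_DOT', '.'), ('T_RETURN', 'return'), ('T_SEMI', ';')]
```

The parser therefore sees a keyword token after the dot where the rule asks for Identifier, which is the reported mismatch. A longer word such as "returns" still comes out as Identifier, because the longer match beats the keyword.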

If you want "return" to work both as a keyword and as an identifier in your language, you need to create a parser rule that matches both Identifier and the keywords:

    expression : primary
               | expression '.' id
               ;
    primary    : id ;
    id         : Identifier
               | K_Return
               ;

    // Better to define keywords explicitly instead of littering them inside parser rules
    K_Return   : 'return' ;
    Identifier : [a-zA-Z0-9_]+ ;
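
The effect of the id rule can be sketched with a tiny hand-written parser in Python. This is an illustration under stated assumptions, not generated ANTLR code: the KEYWORDS table, the ID_TOKENS set and both function names are hypothetical, and the left-recursive expression rule is written as a loop.

```python
import re

# One pattern per token kind. Words are classified afterwards, so a keyword
# gets its own token type (K_Return) instead of Identifier.
TOKEN_RE = re.compile(r"(?P<WS>[ \t\r\n]+)|(?P<DOT>\.)|(?P<WORD>[a-zA-Z0-9_]+)")
KEYWORDS = {"return": "K_Return"}

# The `id` rule: Identifier | K_Return
ID_TOKENS = {"Identifier", "K_Return"}


def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r}")
        kind = m.lastgroup
        if kind == "WORD":
            kind = KEYWORDS.get(m.group(), "Identifier")
        if kind != "WS":
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens


def parse_expression(tokens):
    """expression : id ('.' id)*  (loop form of the left-recursive rule)."""
    pos = 0

    def expect_id():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] not in ID_TOKENS:
            found = tokens[pos][1] if pos < len(tokens) else "<EOF>"
            raise SyntaxError(f"mismatched input {found!r} expecting id")
        pos += 1
        return tokens[pos - 1][1]

    parts = [expect_id()]
    while pos < len(tokens) and tokens[pos][0] == "DOT":
        pos += 1  # consume '.'
        parts.append(expect_id())
    if pos != len(tokens):
        raise SyntaxError(f"extraneous input {tokens[pos][1]!r}")
    return parts


print(parse_expression(tokenize("local.return")))  # ['local', 'return']
```

Because primary also goes through id here, this sketch accepts a keyword as the first member as well. To reject something like if.blah, the first member would have to accept Identifier only.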