How to implement a function call using Antlr so that it can be called before it is defined?

Once the AST is built, what is the best way to implement the tree walker so that functions can be defined and called in any order?

For example, this is valid in PHP:

<?php f(); // function called before it's defined function f() { print 3; } ?> 

I suppose that somehow there must be a second pass or tree transformation, but I cannot find anything interesting on this. The problem is probably not Antlr specific, but if you can tell me an Antlr example how to do this, even better!

+6
antlr abstract-syntax-tree function-call
source share
1 answer

Yes, you are right: this is done in more than one AST pass.

First you create a grammar that builds the AST for the source, then you create a tree grammar that is used to iterate through the tree and detects all the specific functions. Then you can evaluate the script with another tree grammar, which takes the discovered functions from the previous tree grammar.

Demonstration.

Take the source:

 <?php f(); // function called before it's defined function f() { g(); } function g() {} ?> 

which is analyzed in the following AST:

alt text

using (combined) grammar:

 grammar PHPMin; options { output=AST; } tokens { SCRIPT; F_CALL; F_DECL; F_BODY; } parse : script EOF -> script ; script : '<?php' atom* '?>' -> ^(SCRIPT atom*) ; atom : functionCall | functionDecl ; functionCall : Identifier '(' ')' ';' -> ^(F_CALL Identifier) ; functionDecl : 'function' Identifier '(' ')' '{' functionBody '}' -> ^(F_DECL Identifier functionBody) ; functionBody : functionCall* -> ^(F_BODY functionCall*) ; Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')* ; LineComment : '//' ~('\r' | '\n')* ('\r'? '\n' | EOF){skip();} ; Space : (' ' | '\t' | '\r' | '\n'){skip();} ; 

Then find the declared functions using the tree view generated from the following tree grammar:

 tree grammar PHPMinFunctionWalker; options { tokenVocab=PHPMin; ASTLabelType=CommonTree; } @members { java.util.Set<String> declared = new java.util.HashSet<String>(); } discover : script ; script : ^(SCRIPT atom*) ; atom : functionCall | functionDecl ; functionCall : ^(F_CALL Identifier) ; functionDecl : ^(F_DECL Identifier functionBody) {declared.add($Identifier.text);} ; functionBody : ^(F_BODY functionCall*) ; 

To test all this, create a lexer and parser (A), generate a tree view (B), compile all the source files (C):

 // A java -cp antlr-3.2.jar org.antlr.Tool PHPMin.g // B java -cp antlr-3.2.jar org.antlr.Tool PHPMinFunctionWalker.g // C javac -cp antlr-3.2.jar *.java // D java -cp .:antlr-3.2.jar Main // *nix java -cp .;antlr-3.2.jar Main // Windows 

and run the following main class (D):

 import org.antlr.runtime.*; import org.antlr.runtime.tree.*; import org.antlr.stringtemplate.*; public class Main { public static void main(String[] args) throws Exception { String source = "<?php \n" + "f(); // function called before it's defined \n" + "function f() { \n" + " g(); \n" + "} \n" + "function g() {} \n" + "?> \n"; // create a lexer and parser for the source ANTLRStringStream in = new ANTLRStringStream(source); PHPMinLexer lexer = new PHPMinLexer(in); CommonTokenStream tokens = new CommonTokenStream(lexer); PHPMinParser parser = new PHPMinParser(tokens); PHPMinParser.parse_return returnValue = parser.parse(); CommonTree tree = (CommonTree)returnValue.getTree(); // create a tree walker to discover all declared functions CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree); nodes.setTokenStream(tokens); PHPMinFunctionWalker functions = new PHPMinFunctionWalker(nodes); functions.discover(); System.out.println("Declared functions: "+functions.declared); } } 

which produces the following output:

 Declared functions: [f, g] 

Of course, this is just an example of how to approach him, and not how this is best done. I can imagine (using Java to interpret the script) you would not save the declared functions as simple strings in Set<String> , but rather as Map<String, CommonTree> , to easily get the root of the function and evaluate it when called.

Further reading: http://www.antlr.org/wiki/display/ANTLR3/Simple+tree-based+interpeter

Good luck

EDIT

The interval of seconds can then check if all functions in front of it are defined using the previous tree walker:

 tree grammar PHPMinValidateWalker; options { tokenVocab=PHPMin; ASTLabelType=CommonTree; } @members { java.util.Set<String> declared = new java.util.HashSet<String>(); } validate : script ; script : ^(SCRIPT atom*) ; atom : functionCall | functionDecl ; functionCall : ^(F_CALL Identifier) { if(!declared.contains($Identifier.text)) { throw new RuntimeException("no such function: " + $Identifier.text); } } ; functionDecl : ^(F_DECL Identifier functionBody) ; functionBody : ^(F_BODY functionCall*) ; 

Test Usage:

 import org.antlr.runtime.*; import org.antlr.runtime.tree.*; import org.antlr.stringtemplate.*; public class Main { public static void main(String[] args) throws Exception { String source = "<?php \n" + "f(); // function called before it's defined \n" + "function f() { \n" + " g(); \n" + " x(); \n" + "} \n" + "function g() {} \n" + "?> \n"; // create a lexer and parser for the source ANTLRStringStream in = new ANTLRStringStream(source); PHPMinLexer lexer = new PHPMinLexer(in); CommonTokenStream tokens = new CommonTokenStream(lexer); PHPMinParser parser = new PHPMinParser(tokens); PHPMinParser.parse_return returnValue = parser.parse(); CommonTree tree = (CommonTree)returnValue.getTree(); // create a tree walker to discover all declared functions CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree); nodes.setTokenStream(tokens); PHPMinFunctionWalker functions = new PHPMinFunctionWalker(nodes); functions.discover(); System.out.println("Declared functions: "+functions.declared); // PHPMinValidateWalker nodes = new CommonTreeNodeStream(tree); nodes.setTokenStream(tokens); PHPMinValidateWalker validator = new PHPMinValidateWalker(nodes); validator.declared = functions.declared; validator.validate(); } } 

throws an exception since x() not defined anywhere. Removing it from the source will cause the tree-walker to not throw any exceptions.

+8
source share

All Articles