I found this answer to the jGuru question written by Terence Parr who created ANTLR. I copied this explanation from the site linked here :
Only simple, so-called syntactic directional translations can be performed using actions in the parser. These types of translations can only spit out designs that are functions of the information already seen at this point in the parsing. Tree-parsers allow you to go through the intermediate form and manipulate this tree, gradually turning it into several phases of the translation into the final form, which can be easily printed as a new translation.
Imagine a simple translation task in which you want to print an html page called "There are n elements", where n is the number of identifiers found in the input stream. IDs should be printed after the title as follows:
<html> <head> <title>There are 3 items</title> </head> <body> <ol> <li>Dog</li> <li>Cat</li> <li>Velociraptor</li> </body> </html>
from input
Dog Cat Velociraptor
So, with simple steps in your grammar, how can you figure out the name? You cannot not read the entire input. So now we know that we need an intermediate form. The best usually is the AST, which I found as it records the input structure. In this case, this is just a list, but it demonstrates my point.
Well, now you know that a tree is a good thing for anything but simple translations. Given AST, how do you get a way out of it? Imagine simple expression trees. One way is to make nodes in the tree of specific classes, such as PlusNode, IntegerNode, and so on. Then you just ask each node to print. To enter 3 + 4 you will have a tree:
+ | 3 to 4
and classes
class PlusNode extends CommonAST { public String toString() { AST left = getFirstChild(); AST right = left.getNextSibling(); return left + " + " + right; } } class IntNode extends CommonAST { public String toString() { return getText(); } }
Given an expression tree, you can translate it back into text using t.toString (). SO what happened to this? It seems to work fine, right? This seems to work well in this case because it is simple, but I argue that even in this simple example, tree grammars are more readable and are formalized descriptions of what you encoded in PlusNode.toString ().
expr returns [String r] { String left=null, right=null; } :
Note that the concrete class approach ("heterogeneous AST") actually encodes the full recursive descent parser for # (+ INT INT) manually in toString (). As human parser generators, this should make you cringe.;)
The main disadvantage of the heterogeneous AST approach is that it cannot easily access contextual information. In the recursive descent parser, your context is easily accessible, as it can be passed as a parameter. You also know exactly which rule can trigger which other rule (for example, is this expression a WHILE clause or an IF clause?) By looking at the grammar. The PlusNode class above exists in a separate, isolated world, where it has no idea who will reference the toString () method. Worse, the programmer cannot tell in what context he will be called by reading it.
In conclusion, adding actions to your input parser works for very simple translations, where:
- the order of the output constructions matches the order of input
- all designs can be generated from information parsed to the point where you need to spit them out
In addition, you will need an intermediate form - AST - the best form. Using grammar to describe AST structure is similar to using grammar to analyze your input text. Formalized descriptions in a high-level language, such as ANTLR, are better than hand-coded parsers. Actions in the grammar of the tree have a very clear context and can conveniently access information passed from rlues that call. Tree-driven translations for multi-pass translations are also much simpler using tree grammar.