Your question can be interpreted in (at least) two ways:
- splitting the rules of a large grammar into separate grammars;
- parsing a separate language inside your "main" language (an island grammar).
I guess it's the first one, in which case you can import the grammars.
Demo for option 1:
file: L.g
    lexer grammar L;

    Digit
      :  '0'..'9'
      ;
file: Sub.g
    parser grammar Sub;

    number
      :  Digit+
      ;
file: Root.g
    grammar Root;

    import Sub;

    parse
      :  number EOF {System.out.println("Parsed: " + $number.text);}
      ;
file: Main.java
    import org.antlr.runtime.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        L lexer = new L(new ANTLRStringStream("42"));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        RootParser parser = new RootParser(tokens);
        parser.parse();
      }
    }
Run the demo:
    bart@hades:~/Programming/ANTLR/Demos/Composite$ java -cp antlr-3.3.jar org.antlr.Tool L.g
    bart@hades:~/Programming/ANTLR/Demos/Composite$ java -cp antlr-3.3.jar org.antlr.Tool Root.g
    bart@hades:~/Programming/ANTLR/Demos/Composite$ javac -cp antlr-3.3.jar *.java
    bart@hades:~/Programming/ANTLR/Demos/Composite$ java -cp .:antlr-3.3.jar Main
which will print:
Parsed: 42
to the console.
Additional information: http://www.antlr.org/wiki/display/ANTLR3/Composite+Grammars
Demo for option 2:
A good example of a language inside a language is regular expressions. You have the "normal" regex language with its metacharacters, but there is another one inside it: the language that describes a character set (or character class).
Instead of accounting for the metacharacters of a character set (range -, negation ^, etc.) inside your regex grammar, you can simply treat a character set as a single token, consisting of a [ followed by everything up to and including the ] (possibly with \] inside it!). When you then come across a CharSet token in one of your parser rules, you invoke the CharSet parser.
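To make the idea concrete outside of ANTLR, here is a minimal hand-written sketch in plain Java. It is not what ANTLR generates, and the names IslandSketch, tokenize, parseCharSet and parse are made up for illustration. The outer tokenizer keeps each [...] as one opaque token (stepping over escapes such as \]), and only the inner method knows about ^ and - inside a character set. The inner method is deliberately simplified and handles only single-character ranges.

```java
import java.util.ArrayList;
import java.util.List;

public class IslandSketch {

    /** Outer "lexer": splits a regex into tokens, keeping each [...] as ONE token. */
    public static List<String> tokenize(String regex) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < regex.length()) {
            char c = regex.charAt(i);
            if (c == '[') {
                int end = i + 1;
                // Advance to the closing ']', stepping over escapes such as \]
                while (end < regex.length() && regex.charAt(end) != ']') {
                    end += (regex.charAt(end) == '\\') ? 2 : 1;
                }
                if (end >= regex.length()) {
                    throw new IllegalArgumentException("unterminated character set");
                }
                tokens.add(regex.substring(i, end + 1));
                i = end + 1;
            } else if (c == '\\' && i + 1 < regex.length()) {
                tokens.add(regex.substring(i, i + 2)); // escape sequence, e.g. \d
                i += 2;
            } else {
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    /** Inner "parser": only ever sees the text of one character set, e.g. [^\da-f]. */
    public static String parseCharSet(String text) {
        String body = text.substring(1, text.length() - 1); // strip [ and ]
        boolean negated = body.startsWith("^");
        if (negated) {
            body = body.substring(1);
        }
        StringBuilder out =
            new StringBuilder(negated ? "(NEGATED_CHAR_SET" : "(NORMAL_CHAR_SET");
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            if (c == '\\' && i + 1 < body.length()) {
                out.append(' ').append(body, i, i + 2); // escape sequence
                i += 2;
            } else if (i + 2 < body.length()
                    && body.charAt(i + 1) == '-'
                    && body.charAt(i + 2) != '\\') {
                // simplified range: one plain character on each side of '-'
                out.append(" (RANGE ").append(c).append(' ')
                   .append(body.charAt(i + 2)).append(')');
                i += 3;
            } else {
                out.append(' ').append(c);
                i++;
            }
        }
        return out.append(')').toString();
    }

    /** Outer "parser": delegates every character-set token to the inner parser. */
    public static List<String> parse(String regex) {
        List<String> nodes = new ArrayList<>();
        for (String token : tokenize(regex)) {
            nodes.add(token.startsWith("[") ? parseCharSet(token) : token);
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(parse("(x)*[^\\da-f]+"));
    }
}
```

The ANTLR grammars that follow do the same thing with a generated lexer and parser for each language, and build a real AST instead of strings.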
file: Regex.g
    grammar Regex;

    options {
      output=AST;
    }

    tokens {
      REGEX;
      ATOM;
      CHARSET;
      INT;
      GROUP;
      CONTENTS;
    }

    @members {
      public static CommonTree ast(String source) throws RecognitionException {
        RegexLexer lexer = new RegexLexer(new ANTLRStringStream(source));
        RegexParser parser = new RegexParser(new CommonTokenStream(lexer));
        return (CommonTree)parser.parse().getTree();
      }
    }

    parse
      :  atom+ EOF -> ^(REGEX atom+)
      ;

    atom
      :  group quantifier?     -> ^(ATOM group quantifier?)
      |  EscapeSeq quantifier? -> ^(ATOM EscapeSeq quantifier?)
      |  Other quantifier?     -> ^(ATOM Other quantifier?)
      |  CharSet quantifier?   -> ^(CHARSET {CharSetParser.ast($CharSet.text)} quantifier?)
      ;

    group
      :  '(' atom+ ')' -> ^(GROUP atom+)
      ;

    quantifier
      :  '+'
      |  '*'
      ;

    CharSet
      :  '[' (('\\' .) | ~('\\' | ']'))+ ']'
      ;

    EscapeSeq
      :  '\\' .
      ;

    Other
      :  ~('\\' | '(' | ')' | '[' | ']' | '+' | '*')
      ;
file: CharSet.g
    grammar CharSet;

    options {
      output=AST;
    }

    tokens {
      NORMAL_CHAR_SET;
      NEGATED_CHAR_SET;
      RANGE;
    }

    @members {
      public static CommonTree ast(String source) throws RecognitionException {
        CharSetLexer lexer = new CharSetLexer(new ANTLRStringStream(source));
        CharSetParser parser = new CharSetParser(new CommonTokenStream(lexer));
        return (CommonTree)parser.parse().getTree();
      }
    }

    parse
      :  OSqBr ( normal  -> ^(NORMAL_CHAR_SET normal)
               | negated -> ^(NEGATED_CHAR_SET negated)
               )
         CSqBr
      ;

    normal
      :  (EscapeSeq | Hyphen | Other) atom* Hyphen?
      ;

    negated
      :  Caret normal -> normal
      ;

    atom
      :  EscapeSeq
      |  Caret
      |  Other
      |  range
      ;

    range
      :  from=Other Hyphen to=Other -> ^(RANGE $from $to)
      ;

    OSqBr
      :  '['
      ;

    CSqBr
      :  ']'
      ;

    EscapeSeq
      :  '\\' .
      ;

    Caret
      :  '^'
      ;

    Hyphen
      :  '-'
      ;

    Other
      :  ~('-' | '\\' | '[' | ']')
      ;
file: Main.java
    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    import org.antlr.stringtemplate.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        CommonTree tree = RegexParser.ast("((xyz)*[^\\da-f])foo");
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
      }
    }
If you run the main class, you will see the DOT output for the regular expression ((xyz)*[^\\da-f])foo, which corresponds to the following tree:

The magic is in the atom rule of the Regex.g grammar, where I inserted a tree node into the rewrite rule by invoking the static ast method of the CharSetParser class:
CharSet ... -> ^(... {CharSetParser.ast($CharSet.text)} ...)
Note that there must be no semicolons inside such rewrite rules! So this would be wrong: {CharSetParser.ast($CharSet.text);}.
EDIT
And here's how to create tree walkers for both grammars:
file: RegexWalker.g
    tree grammar RegexWalker;

    options {
      tokenVocab=Regex;
      ASTLabelType=CommonTree;
    }

    walk
      :  ^(REGEX atom+) {System.out.println("REGEX: " + $start.toStringTree());}
      ;

    atom
      :  ^(ATOM group quantifier?)
      |  ^(ATOM EscapeSeq quantifier?)
      |  ^(ATOM Other quantifier?)
      |  ^(CHARSET t=. quantifier?) {CharSetWalker.walk($t);}
      ;

    group
      :  ^(GROUP atom+)
      ;

    quantifier
      :  '+'
      |  '*'
      ;
file: CharSetWalker.g
    tree grammar CharSetWalker;

    options {
      tokenVocab=CharSet;
      ASTLabelType=CommonTree;
    }

    @members {
      public static void walk(CommonTree tree) {
        try {
          CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
          CharSetWalker walker = new CharSetWalker(nodes);
          walker.walk();
        } catch(Exception e) {
          e.printStackTrace();
        }
      }
    }

    walk
      :  ^(NORMAL_CHAR_SET normal)  {System.out.println("NORMAL_CHAR_SET: " + $start.toStringTree());}
      |  ^(NEGATED_CHAR_SET normal) {System.out.println("NEGATED_CHAR_SET: " + $start.toStringTree());}
      ;

    normal
      :  (EscapeSeq | Hyphen | Other) atom* Hyphen?
      ;

    atom
      :  EscapeSeq
      |  Caret
      |  Other
      |  range
      ;

    range
      :  ^(RANGE Other Other)
      ;
file: Main.java
    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    import org.antlr.stringtemplate.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        CommonTree tree = RegexParser.ast("((xyz)*[^\\da-f])foo");
        CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
        RegexWalker walker = new RegexWalker(nodes);
        walker.walk();
      }
    }
To run the demo, do:
    java -cp antlr-3.3.jar org.antlr.Tool CharSet.g
    java -cp antlr-3.3.jar org.antlr.Tool Regex.g
    java -cp antlr-3.3.jar org.antlr.Tool CharSetWalker.g
    java -cp antlr-3.3.jar org.antlr.Tool RegexWalker.g
    javac -cp antlr-3.3.jar *.java
    java -cp .:antlr-3.3.jar Main
which will print:
    NEGATED_CHAR_SET: (NEGATED_CHAR_SET \d (RANGE a f))
    REGEX: (REGEX (ATOM (GROUP (ATOM (GROUP (ATOM x) (ATOM y) (ATOM z)) *) (CHARSET (NEGATED_CHAR_SET \d (RANGE a f))))) (ATOM f) (ATOM o) (ATOM o))