I found out how to do it. This may not be the best approach, but it seems to work.
- Antlr parsers receive an ITokenStream parameter
- Antlr lexers are themselves ITokenSources
- ITokenSource is a significantly simpler interface than ITokenStream
- The easiest way to convert an ITokenSource to an ITokenStream is to use a CommonTokenStream, which receives an ITokenSource parameter
So we only need to do two things:
- Adjust the grammar so that it is parser-only
- Implement ITokenSource
Setting up the grammar is very simple. Just remove all lexer declarations and make sure you declare the grammar as a parser grammar. A simple example is posted here for convenience:
    parser grammar mygrammar;

    options
    {
        language=CSharp2;
    }

    @parser::namespace { MyNamespace }

    document: (WORD {Console.WriteLine($WORD.text);} |
               NUMBER {Console.WriteLine($NUMBER.text);})*;
Note that this grammar file will generate a class named mygrammar instead of mygrammarParser.
So now we want to implement a "fake" lexer. I personally used the following pseudo code:
    TokenQueue q = new TokenQueue();
    //Do normal lexer stuff and output to q
    CommonTokenStream cts = new CommonTokenStream(q);
    mygrammar g = new mygrammar(cts);
    g.document();
Finally, we need to define TokenQueue. TokenQueue is not strictly necessary, but I used it for convenience. It must have methods for receiving the lexer's tokens and methods for outputting Antlr tokens. Therefore, if your lexer does not produce Antlr tokens itself, you need to implement a method that converts its tokens to Antlr tokens. In addition, TokenQueue must implement ITokenSource.
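As a rough illustration of the TokenQueue described above, here is a minimal sketch. To keep it self-contained it does not reference the Antlr runtime: ISimpleTokenSource and SimpleToken are hypothetical stand-ins for ITokenSource and an Antlr token, and MyLexerToken is an invented shape for a hand-written lexer's output. The real ITokenSource may require more members than NextToken depending on the runtime version, so treat this as a starting point rather than a drop-in implementation.

```csharp
using System.Collections.Generic;

// Stand-in for the one ITokenSource method this sketch relies on.
// The real Antlr interface may have more members; check your runtime version.
public interface ISimpleTokenSource
{
    SimpleToken NextToken();
}

// Stand-in for an Antlr token (hypothetical, not the runtime's IToken).
public class SimpleToken
{
    public const int EofType = -1;   // assumed end-of-input token type
    public int Type;
    public string Text;
    public int Line;                 // 1-based in this sketch
    public int CharPositionInLine;   // 0-based in this sketch
    public int Channel;              // 0 = normal (non-hidden) channel
}

// Token as produced by a hand-written lexer (hypothetical shape).
public class MyLexerToken
{
    public int Kind;
    public string Value;
    public int Line;
    public int Column;
}

public class TokenQueue : ISimpleTokenSource
{
    private readonly Queue<SimpleToken> tokens = new Queue<SimpleToken>();

    // Input side: accept a custom lexer token and store its converted form.
    public void Enqueue(MyLexerToken t)
    {
        tokens.Enqueue(Convert(t));
    }

    // Conversion step: copy every field the parser may look at.
    private static SimpleToken Convert(MyLexerToken t)
    {
        return new SimpleToken
        {
            Type = t.Kind,
            Text = t.Value,
            Line = t.Line,
            CharPositionInLine = t.Column,
            Channel = 0
        };
    }

    // Output side: return queued tokens in order, then EOF on every further call.
    public SimpleToken NextToken()
    {
        if (tokens.Count > 0)
        {
            return tokens.Dequeue();
        }
        return new SimpleToken { Type = SimpleToken.EofType, Text = "<EOF>", Channel = 0 };
    }
}
```

The point of returning an EOF token indefinitely once the queue is empty is that the parser can keep asking for lookahead without ever receiving null.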
Keep in mind that it is very important to set the token variables correctly. Initially, I had some problems because I was miscalculating CharPositionInLine. If these variables are set incorrectly, the parser may fail. In addition, the normal (non-hidden) channel is 0.
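Since CharPositionInLine is easy to get wrong, here is one way it could be derived from a character offset into the input. This is a sketch, not part of the original solution: TokenPosition and Locate are hypothetical names, and it assumes lines are counted from 1 and the position in the line from 0, which is worth verifying against your runtime.

```csharp
using System;

public static class TokenPosition
{
    // Computes the line (1-based) and position in that line (0-based)
    // of the character at 'offset' in 'input'. Only '\n' starts a new line.
    public static void Locate(string input, int offset, out int line, out int charPositionInLine)
    {
        if (input == null) throw new ArgumentNullException("input");
        if (offset < 0 || offset > input.Length) throw new ArgumentOutOfRangeException("offset");

        line = 1;
        charPositionInLine = 0;
        for (int i = 0; i < offset; i++)
        {
            if (input[i] == '\n')
            {
                // A newline ends the current line, so the column starts over.
                line++;
                charPositionInLine = 0;
            }
            else
            {
                charPositionInLine++;
            }
        }
    }
}
```

For the input "ab\ncd", the character 'd' sits at offset 4, which this helper reports as line 2, position 1.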
It seems to work for me so far. I hope others find this helpful. I am open to feedback. In particular, if you find a better way to solve this problem, feel free to post a separate answer.
luiscubal