There are many advantages to using a parser generator such as bison or Antlr, especially while you are developing a language. You will undoubtedly end up making changes to the grammar as you go, and you will want to end up with documentation of the final grammar. Tools that produce a grammar automatically from the documentation are really useful. They can also help give you confidence that the grammar of the language is (a) what you think it is and (b) not ambiguous.
If your language (unlike C++) is actually LALR(1) or, even better, LL(1), and you are using LLVM tools to build the AST and IR, then it is unlikely that you will need to do much more than write down the grammar and provide a few simple actions to build the AST. That will keep you going for a while.
The usual reason people eventually choose to write their own parsers, other than the "real programmers don't use parser generators" prejudice, is that it is not easy to provide good diagnostics for syntax errors, especially with LR(1) parsing. If that is one of your goals, you should try to make your grammar LL(k)-parseable (it is still not easy to provide good diagnostics with LL(k), but it is a little easier) and use an LL(k) framework such as Antlr.
There is another strategy, which is to first parse the program text in the simplest possible way, using an LALR(1) parser, which is more flexible than LL(1), without even trying to provide diagnostics. If the parse fails, you can parse the text again using a slower, possibly even backtracking parser that does not know how to build an AST, but does keep track of source location and tries to recover from syntax errors. Recovering from syntax errors without invalidating the AST is even more difficult than just continuing to parse, so there is a lot to be said for not trying. In addition, source location tracking is very slow, and it is not very useful if you do not need to produce diagnostics (unless you need it to add debugging annotations), so you can speed up the parse a bit by not bothering with location tracking.
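To make the two-pass idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the toy language (statements of the form `name = number ;` with whitespace-separated tokens) and every name in it (`fast_parse`, `diagnose`, `parse`, `ParseFailed`) are made up for this example, and simple hand-written loops stand in for both the generated LALR(1) parser and the slower recovering parser. The point is just the division of labour: the fast path builds the AST and records no locations, and the slow path builds no AST but tracks line and column and resynchronises after each error.

```python
import re


class ParseFailed(Exception):
    """Raised by the fast path; deliberately carries no diagnostic information."""


def fast_parse(source):
    """Fast path: build the AST, track no locations, give up at the first error."""
    tokens = source.split()
    ast = []
    for i in range(0, len(tokens), 4):
        stmt = tokens[i:i + 4]
        if (len(stmt) != 4 or not stmt[0].isidentifier() or stmt[1] != "="
                or not stmt[2].isdecimal() or stmt[3] != ";"):
            raise ParseFailed()
        ast.append(("assign", stmt[0], int(stmt[2])))
    return ast


def diagnose(source):
    """Slow path: build no AST, but track line/column and recover from errors."""
    expected = ["name", "=", "number", ";"]
    checks = [str.isidentifier, lambda t: t == "=", str.isdecimal, lambda t: t == ";"]
    errors = []
    state = 0          # index of the token kind expected next
    skipping = False   # True while discarding tokens after an error
    lines = source.splitlines()
    for lineno, line in enumerate(lines, start=1):
        for match in re.finditer(r"\S+", line):
            tok = match.group()
            if skipping:
                # Panic-mode recovery: drop tokens until the end of the statement.
                if tok == ";":
                    skipping = False
                continue
            if checks[state](tok):
                state = (state + 1) % 4
            else:
                col = match.start() + 1
                errors.append(f"{lineno}:{col}: expected {expected[state]}, found {tok!r}")
                skipping = tok != ";"   # a ';' already ends the bad statement
                state = 0
    if state != 0:
        line_no = len(lines) or 1
        col = len(lines[-1]) + 1 if lines else 1
        errors.append(f"{line_no}:{col}: unexpected end of input, expected {expected[state]}")
    return errors


def parse(source):
    """Try the fast parser first; only on failure pay for the diagnostic pass."""
    try:
        return fast_parse(source), []
    except ParseFailed:
        return None, diagnose(source)
```

Calling `parse("x = 1 ;\ny = = 2 ;\nz 3 ;")` in this sketch returns no AST and one message per bad statement, while valid input never touches the location-tracking code at all.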
Personally, I am biased against packrat parsing, because it is not clear what the actual language parsed by a PEG is. Other people don't mind that so much, and YMMV.
rici