I need a parser for an exotic programming language. I wrote a grammar for it and used a parser generator (PEGjs) to generate the parser. This works fine, with one exception: macros, which replace a placeholder with predefined text. I do not know how to integrate them into the grammar. Let me illustrate the problem.
A typical program in this language looks like this:
    instructionA parameter1, parameter2
    instructionB parameter1
    instructionC parameter1, parameter2, parameter3
There are no problems so far. But the language also supports macros:
    Define MacroX { foo, bar }
    instructionD parameter1, MacroX, parameter4

    Define MacroY(macroParameter1, macroParameter2) {
        instructionE parameter1, macroParameter1
        instructionF macroParameter2, MacroX
    }
    instructionG parameter1, MacroX
    MacroY
Of course, I could write grammar rules that recognize macro definitions and macro references. But then I do not know how to parse the contents of a macro, because the grammar cannot tell what the macro stands for. It may be a single parameter (the simplest case), several parameters (for example MacroX above, which stands for two parameters), or a whole block of instructions (for example MacroY). Macros may also contain other macros. How can I express this in a grammar when it is not clear in advance what a macro means?
The simplest approach would be to run a preprocessor first that expands all macros, and only then run the parser. But that messes up the line numbers. I want the parser to report the line number when a parsing error occurs, and if I preprocess the input, the reported line numbers no longer match the original source.
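To make the line-number problem concrete, here is a minimal TypeScript sketch of the preprocessor idea. It is only an illustration under several assumptions: the macro table is already extracted from the Define blocks, only parameterless block macros that stand alone on a line are expanded, and columns are ignored. The names `MacroTable`, `expandBlockMacros` and `sourceLine` are made up for this example and are not part of PEGjs. The sketch records, for every expanded line, the source line it came from, which is the information that would be needed to translate a reported line number back.

```typescript
// Hypothetical macro table: macro name -> body lines (assumed to be
// extracted from the "Define" blocks beforehand).
type MacroTable = { [name: string]: string[] };

interface Expanded {
  lines: string[];  // text that would be handed to the generated parser
  origin: number[]; // origin[i] = 1-based source line that produced lines[i]
}

// Replace every line that consists solely of a macro name with the macro
// body, remembering which source line each output line came from.
function expandBlockMacros(source: string, macros: MacroTable): Expanded {
  const lines: string[] = [];
  const origin: number[] = [];
  source.split("\n").forEach((text, index) => {
    const name = text.trim();
    const isMacro = Object.prototype.hasOwnProperty.call(macros, name);
    const body = macros[name];
    const replacement = isMacro && body !== undefined ? body : [text];
    replacement.forEach((out) => {
      lines.push(out);
      origin.push(index + 1);
    });
  });
  return { lines, origin };
}

// Translate a 1-based line number in the expanded text back to the source.
function sourceLine(expanded: Expanded, reportedLine: number): number | undefined {
  return expanded.origin[reportedLine - 1];
}

// Made-up example: MacroY here has a simplified, parameterless body.
const macros: MacroTable = {
  MacroY: ["instructionE parameter1, foo", "instructionF bar, baz"],
};
const source = [
  "instructionA parameter1",
  "MacroY",
  "instructionB parameter1",
].join("\n");

const result = expandBlockMacros(source, macros);
// result.lines has 4 entries, result.origin is [1, 2, 2, 3]:
// "instructionB parameter1" is line 3 in the source but line 4 after expansion.
```

With such a map, an error reported on expanded line 4 could be translated with `sourceLine(result, 4)`, which yields 3. This does not cover macros with parameters, macros used inside a parameter list (like MacroX), or nested macros, which is exactly where I am stuck.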
Any help is really appreciated.