I am writing a very simple web server that has to support an extremely limited server-side scripting language. Basically, all I need to support is echo, addition / subtraction / multiplication (no division) with exactly two operands, a simple function called "date()" that outputs the date, and the "&" operator for string concatenation.
An example would be:
echo "Here is the date: " & date();
echo "9 x 15 = " & 9*15;
I have written the code that produces the tokens, but I'm not sure I'm using the right tokens.
I created tokens for the following:
ECHO - The echo command
WHITESPACE - Any whitespace
STRING - A string inside quotations
DATE - The date() function
CONCAT - The & operator for concatenation
MATH - Any instance of a binary operation (5+4, 9*2, 8-2, etc.)
TERM - The terminating character (;)
The one I'm not particularly sure about is MATH. Usually I see people create one token specifically for integers and then one for each operator, but since I ONLY want to allow binary operations, I thought it made sense to group the whole thing into a single token. If I tokenized everything separately, I would have to do extra work to make sure I never accept "5 + 4 + 1".
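To make the token list concrete, here is a minimal sketch of a tokenizer for it, assuming Python and regular expressions. The exact patterns and the names `TOKEN_SPEC`, `MASTER`, and `tokenize` are illustrative assumptions, not existing code. Note that with MATH as a single token, "5 + 4 + 1" is rejected at the tokenizing stage because the trailing "+" matches nothing.

```python
import re

# Hypothetical patterns for the tokens listed above. Order matters:
# earlier alternatives win when several could match at the same position.
TOKEN_SPEC = [
    ("ECHO", r"echo\b"),
    ("DATE", r"date\s*\(\s*\)"),
    ("STRING", r'"[^"]*"'),            # assumption: no escaped quotes inside strings
    ("MATH", r"\d+\s*[-+*]\s*\d+"),    # exactly two integer operands, no division
    ("CONCAT", r"&"),
    ("TERM", r";"),
    ("WHITESPACE", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs, or raise SyntaxError on unknown input."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize('echo "hi" & 9*15;')` yields ECHO, WHITESPACE, STRING, WHITESPACE, CONCAT, WHITESPACE, MATH, TERM in that order.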
So, question 1: am I on the right track with which tokens to use?
My next question is: what should I do with these tokens in order to enforce correct syntax? The approach I was thinking about was basically to say: "OK, I know that I have this token, and here is a list of tokens that are allowed to follow the current token. Is the next token on the list?"
Based on this, I compiled a list of all my tokens, along with the tokens that are valid immediately after each of them (whitespace is left out for simplicity).
ECHO -> STRING|MATH|DATE
STRING -> TERM|CONCAT
MATH -> TERM|CONCAT
DATE -> TERM|CONCAT
CONCAT -> STRING|MATH|DATE
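One way to apply the table above without a chain of if blocks is to store it as a dictionary and walk the token kinds pairwise. This is only a sketch in Python: the names `FOLLOW` and `check_sequence` are made up for illustration, and the `TERM -> ECHO` entry is an added assumption (that a new statement may start after ";"), since the table does not say what follows TERM.

```python
# The "what may come next" table, keyed by token kind.
FOLLOW = {
    "ECHO": {"STRING", "MATH", "DATE"},
    "STRING": {"TERM", "CONCAT"},
    "MATH": {"TERM", "CONCAT"},
    "DATE": {"TERM", "CONCAT"},
    "CONCAT": {"STRING", "MATH", "DATE"},
    "TERM": {"ECHO"},  # assumption: another statement may follow ';'
}


def check_sequence(kinds):
    """Return None if the sequence of token kinds is valid, else an error message."""
    # Whitespace is ignored here, matching the simplified table.
    kinds = [k for k in kinds if k != "WHITESPACE"]
    if not kinds:
        return None
    if kinds[0] != "ECHO":
        return f"statement must start with ECHO, got {kinds[0]}"
    # Compare every token with its successor against the table.
    for current, following in zip(kinds, kinds[1:]):
        if following not in FOLLOW[current]:
            return f"{following} may not follow {current}"
    if kinds[-1] != "TERM":
        return f"unexpected end of input after {kinds[-1]}"
    return None
```

This validates the whole script before anything is executed; the same lookup could instead be done token by token during execution.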
The problem is that I'm not sure how best to implement this. Strictly speaking, I need to track whitespace to make sure there are spaces between tokens, but that means looking ahead two tokens at a time, which is even more intimidating. I'm also not sure how to manage the "valid next token" checks without a nasty pile of if blocks. Should I validate the syntax before trying to actually execute the script, or should I do it all in one pass and just throw an error when I hit an unexpected token? In this simple example, everything works by parsing strictly left to right; there are no real precedence rules (except within MATH, which is part of why I combined it into one token, even though that doesn't feel right). Even so, I'd like to develop a more scalable and elegant solution.
In my research on writing parsers, I see many references to creating "accept()" and "expect()" functions, but I cannot find a clear description of what they are supposed to do or how they are supposed to work.
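For context, the usual convention in hand-written recursive-descent parsers seems to be that accept() consumes the next token only if it has the requested kind, while expect() does the same but treats a mismatch as a syntax error. Below is a minimal sketch of that idea in Python for this language. It assumes the tokens arrive as (kind, text) pairs using the token names above; `Parser`, `eval_math`, and the `today` parameter are hypothetical names introduced for the example, and the date format is an assumption.

```python
import datetime
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_math(text):
    """Evaluate a MATH token such as '9*15' (two integer operands only)."""
    left, op, right = re.fullmatch(r"\s*(\d+)\s*([-+*])\s*(\d+)\s*", text).groups()
    return OPS[op](int(left), int(right))


class Parser:
    def __init__(self, tokens, today=None):
        # Whitespace is dropped up front, so no two-token lookahead is needed.
        self.tokens = [t for t in tokens if t[0] != "WHITESPACE"]
        self.pos = 0
        # Assumption: date() prints an ISO date; injectable for testing.
        self.today = today or (lambda: datetime.date.today().isoformat())

    def accept(self, kind):
        """Consume and return the next token's text if it has this kind, else None."""
        if self.pos < len(self.tokens) and self.tokens[self.pos][0] == kind:
            text = self.tokens[self.pos][1]
            self.pos += 1
            return text
        return None

    def expect(self, kind):
        """Like accept(), but a mismatch raises SyntaxError."""
        text = self.accept(kind)
        if text is None:
            found = self.tokens[self.pos][0] if self.pos < len(self.tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        return text

    def value(self):
        # value := STRING | MATH | DATE
        text = self.accept("STRING")
        if text is not None:
            return text[1:-1]  # strip the surrounding quotes
        text = self.accept("MATH")
        if text is not None:
            return str(eval_math(text))
        self.expect("DATE")  # last alternative, so a mismatch is an error
        return self.today()

    def statement(self):
        # statement := ECHO value (CONCAT value)* TERM
        self.expect("ECHO")
        parts = [self.value()]
        while self.accept("CONCAT") is not None:
            parts.append(self.value())
        self.expect("TERM")
        return "".join(parts)

    def program(self):
        """Parse and evaluate every statement, returning the combined output."""
        output = []
        while self.pos < len(self.tokens):
            output.append(self.statement())
        return "".join(output)
```

In this sketch, checking and executing happen in the same left-to-right pass: each grammar rule is one method, and the output string is built up as the rules return.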
I think I just don't know how to implement this, or how to actually get the resulting output string at the end of the day.
Am I heading in the right direction, and does anyone know of a resource that could help me understand how best to implement something this simple? I have to do this by hand and cannot use a tool like ANTLR.
Thanks in advance for your help.