How would I go about hand-coding a complex formula parser?

Hm, this is language-agnostic. I would prefer to do it in C# or F#, but this time I'm more interested in the question "how would it work anyway?"

What I want to accomplish:

a) I want to LEARN it. This is about my ego this time: it is for a fun project where I want to show myself that I am really good at this stuff.

b) I know a little EBNF (although I do not yet know how operator precedence works in EBNF; Irony.NET does it right, I checked the examples, but it is a bit mysterious to me).

c) My parser should be able to take this: 5 * (3 + (2 - 9 * (5/7)) + 9), for example, and give me the correct result.

d) To be honest, this seems to be the biggest problem for me in writing a compiler or even an interpreter. I would have no problem generating even 64-bit assembler code (I CAN write assembler by hand), but the formula parser ...

e) Another thought: even simple computers (such as my old Sharp 1246S with about 2 KB of RAM) can do this ... it can't be THAT hard, right? And even very, very old programming languages have formula evaluation ... BASIC dates from 1964, and it could already calculate the kind of formula I gave as an example.

f) A few ideas, a little inspiration would really be enough. I just have no idea how to handle operator precedence and parentheses. I do, however, know that it involves an AST and that many people use a stack.

So what do you think?

+7
compiler-construction language-agnostic parsing interpreter formula
4 answers

You should learn about recursive descent parsers.

Check out this Code Golf exercise, which does just that in 10 different ways:

Code Golf: mathematical expression evaluator (for PEMDAS)

Some of these "golf solutions" are recursive descent parsers, just coded differently.
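To make the idea concrete, here is a minimal sketch in Python (my choice of language; the names `tokenize`, `Parser`, and `evaluate` are made up for illustration, not taken from the linked solutions). It assumes only numbers, the four basic operators, unary minus, and parentheses. Operator precedence falls out of the grammar: each precedence level gets its own rule, and each rule becomes one method.

```python
import re

# Grammar (EBNF), one rule per precedence level:
#   expr   = term   { ("+" | "-") term } ;
#   term   = factor { ("*" | "/") factor } ;
#   factor = NUMBER | "(" expr ")" | "-" factor ;

TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")

def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # Lowest precedence: a sum or difference of terms, left to right.
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # Higher precedence: a product or quotient of factors.
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # Highest precedence: a number, a parenthesized expr, or a negation.
        token = self.advance()
        if token == "(":
            value = self.expr()  # parentheses simply recurse to the top rule
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token == "-":
            return -self.factor()
        if token is None or token in "+*/)":
            raise SyntaxError(f"unexpected token {token!r}")
        return float(token)

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

With this sketch, `evaluate("5 * (3 + (2 - 9 * (5/7)) + 9)")` gives 265/7, roughly 37.857. To build an AST instead of computing directly, each method would return a node rather than a number; the control flow stays the same.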

You will find that parsing expressions is actually the easiest part of a compiler. Parsing the rest of the language is harder, but understanding how the code elements interact and how to generate good code is far more difficult.

You may also be interested in how to express a parser using BNF, and how to do something with that BNF. Here is an example of how to parse and manipulate algebra symbolically, with an explicit BNF and an implicit AST as the foundation. This is not what compilers traditionally do, but the machinery that does it is rooted deeply in compiler technology.

+5

Traditionally, formula processors on computers use POSTFIX notation. They use a stack: operands are pushed as they are read, and each operator pops two items as its operands, applies itself to them, and pushes the result.
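A minimal sketch of that stack loop in Python (the function name `eval_postfix` is made up for illustration, and it assumes the input is already split into tokens and contains only the four binary operators):

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_postfix(tokens):
    """Evaluate a list of postfix (RPN) tokens with a single stack."""
    stack = []
    for token in tokens:
        if token in OPS:
            # The right operand was pushed last, so it is popped first.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]
```

The question's example, 5 * (3 + (2 - 9 * (5/7)) + 9), reads `5 3 2 9 5 7 / * - + 9 + *` in postfix, and `eval_postfix("5 3 2 9 5 7 / * - + 9 + *".split())` gives 265/7, roughly 37.857.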

What you want is an INFIX to POSTFIX notation converter, which is really very simple. Once you have postfix, processing it is the easiest thing you will ever do.
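One well-known way to write such a converter is Dijkstra's shunting-yard algorithm. The following Python sketch is only an illustration under some assumptions: the name `to_postfix` is made up, only the four left-associative binary operators and parentheses are handled, and unary minus is not supported.

```python
import re

# Higher number binds tighter; all four operators are left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(text):
    """Convert an infix expression string into a list of postfix tokens."""
    output, operators = [], []
    # Note: findall silently skips characters it does not recognize.
    for token in re.findall(r"\d+\.?\d*|[-+*/()]", text):
        if token in PRECEDENCE:
            # Flush operators that bind at least as tightly as this one.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                output.append(operators.pop())
            operators.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            # Flush everything back to the matching "(".
            while operators and operators[-1] != "(":
                output.append(operators.pop())
            if not operators:
                raise SyntaxError("unbalanced ')'")
            operators.pop()  # discard the "(" itself
        else:
            output.append(token)  # numbers go straight to the output
    while operators:
        op = operators.pop()
        if op == "(":
            raise SyntaxError("unbalanced '('")
        output.append(op)
    return output
```

For the question's example, `to_postfix("5 * (3 + (2 - 9 * (5/7)) + 9)")` returns the tokens `5 3 2 9 5 7 / * - + 9 + *`, which can then be evaluated with a stack as described above.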

+1

For a stack-based parser implemented in PHP that uses the shunting-yard algorithm to convert infix to postfix notation, with support for functions that take a variable number of arguments, you can look at the source of the PHPExcel calculation engine.

+1

If you want to use an existing solution, I can recommend a working PSR-0 compatible implementation of the shunting-yard algorithm: https://github.com/andig/php-shunting-yard/tree/dev

0