You probably want to implement a term rewriting system. For basic math, check out WikiPedia .
Structure of the term rewrite module
Since I recently implemented a solution ...
First prepare the CExpression class, which models the structure of your expression.
Inject a CRule
that contains a template and a replacement. Use special characters as template variables that must be bound during pattern matching and replaced in the replacement expression.
Then we implement the CRule
class. The main applyRule(CExpression, CRule)
method applyRule(CExpression, CRule)
tries to match the rule with any applicable subexpression of the expression. If it matches, return the result.
To complete, implement the CRuleSet
class, which is just a collection of CRule objects. The basic reduce(CExpression)
method applies a set of rules until more rules can be applied, and then the reduced expression is returned.
In addition, you need a CBindingEnvironment
class that matches already matched characters with matched values.
Try rewriting the expression in normal form
Do not forget that this approach works until a certain point, but is likely to be incomplete. This is because all of the following rules perform local overwrites.
To make this local rewrite logic stronger, try to transform the expressions into what I would call a normal form. This is my approach:
If the term contains literal meanings, try translating the term as far to the right as possible.
In the end, this literal meaning can be displayed as right as possible and can be evaluated as part of a fully literal expression.
When to evaluate a fully literal expression
An interesting question is when to evaluate a fully literal expression. Suppose you have an expression
x * ( 1 / 3 )
which would decrease to
x * 0.333333333333333333
Now suppose x is replaced by 3. This will give something like
0.999999999999999999999999
Thus, the expected score returns a slightly incorrect value.
On the other hand, if you hold (1/3) and first replace x with 3
3 * ( 1 / 3 )
rewrite rule would give
1
Thus, it may be useful to evaluate a completely literal expression at the end.
Sample rewrite rules
This is how my rules appear inside the application: The characters _1, _2, ... match any subexpression:
addRule( new TARuleFromString( '0+_1', // left hand side :: pattern '_1' // right hand side :: replacement ) );
or a little harder
addRule( new TARuleFromString( '_1+_2*_1', '(1+_2)*_1' ) );
Some special characters correspond only to special subexpressions. For example. _Literal1, _Literal2, ... correspond only to literal values:
addRule( new TARuleFromString( 'exp(_Literal1) * exp(_Literal2 )', 'exp( _Literal1 + _Literal2 )' ) );
This rule moves a non-literal expression to the left:
addRule( new TARuleFromString( '_Literal*_NonLiteral', '_NonLiteral*_Literal' ) );
Any name starting with the character '_' is a template variable. Although the system complies with the rule, it stores a stack of assignments of already matched characters.
Finally, do not forget that rules can give endless replacement sequences. Thus, reducing the expression, remember the process, which intermediate expressions have already been achieved earlier.
In my implementation, I do not store intermediate expressions directly. I keep an array of MD5 () hashes of the intermediate expression.
A set of rules as a starting point
Here are the rules to get you started:
addRule( new TARuleFromString( '0+_1', '_1' ) ); addRule( new TARuleFromString( '_Literal2=0-_1', '_1=0-_Literal2' ) ); addRule( new TARuleFromString( '_1+0', '_1' ) ); addRule( new TARuleFromString( '1*_1', '_1' ) ); addRule( new TARuleFromString( '_1*1', '_1' ) ); addRule( new TARuleFromString( '_1+_1', '2*_1' ) ); addRule( new TARuleFromString( '_1-_1', '0' ) ); addRule( new TARuleFromString( '_1/_1', '1' ) ); // Rate = (pow((EndValue / BeginValue), (1 / (EndYear - BeginYear)))-1) * 100 addRule( new TARuleFromString( 'exp(_Literal1) * exp(_Literal2 )', 'exp( _Literal1 + _Literal2 )' ) ); addRule( new TARuleFromString( 'exp( 0 )', '1' ) ); addRule( new TARuleFromString( 'pow(_Literal1,_1) * pow(_Literal2,_1)', 'pow(_Literal1 * _Literal2,_1)' ) ); addRule( new TARuleFromString( 'pow( _1, 0 )', '1' ) ); addRule( new TARuleFromString( 'pow( _1, 1 )', '_1' ) ); addRule( new TARuleFromString( 'pow( _1, -1 )', '1/_1' ) ); addRule( new TARuleFromString( 'pow( pow( _1, _Literal1 ), _Literal2 )', 'pow( _1, _Literal1 * _Literal2 )' ) ); // addRule( new TARuleFromString( 'pow( _Literal1, _1 )', 'ln(_1) / ln(_Literal1)' ) ); addRule( new TARuleFromString( '_literal1 = pow( _Literal2, _1 )', '_1 = ln(_literal1) / ln(_Literal2)' ) ); addRule( new TARuleFromString( 'pow( _Literal2, _1 ) = _literal1 ', '_1 = ln(_literal1) / ln(_Literal2)' ) ); addRule( new TARuleFromString( 'pow( _1, _Literal2 ) = _literal1 ', 'pow( _literal1, 1 / _Literal2 ) = _1' ) ); addRule( new TARuleFromString( 'pow( 1, _1 )', '1' ) ); addRule( new TARuleFromString( '_1 * _1 = _literal', '_1 = sqrt( _literal )' ) ); addRule( new TARuleFromString( 'sqrt( _literal * _1 )', 'sqrt( _literal ) * sqrt( _1 )' ) ); addRule( new TARuleFromString( 'ln( _Literal1 * _2 )', 'ln( _Literal1 ) + ln( _2 )' ) ); addRule( new TARuleFromString( 'ln( _1 * _Literal2 )', 'ln( _Literal2 ) + ln( _1 )' ) ); addRule( new TARuleFromString( 'log2( _Literal1 * _2 )', 'log2( _Literal1 ) + log2( _2 )' ) ); addRule( new TARuleFromString( 'log2( _1 * _Literal2 )', 'log2( _Literal2 ) + log2( _1 )' ) ); addRule( new TARuleFromString( 'log10( _Literal1 * _2 )', 'log10( _Literal1 ) + log10( _2 )' ) ); addRule( new TARuleFromString( 'log10( _1 * _Literal2 )', 'log10( _Literal2 ) + log10( _1 )' ) ); addRule( new TARuleFromString( 'ln( _Literal1 / _2 )', 'ln( _Literal1 ) - ln( _2 )' ) ); addRule( new TARuleFromString( 'ln( _1 / _Literal2 )', 'ln( _Literal2 ) - ln( _1 )' ) ); addRule( new TARuleFromString( 'log2( _Literal1 / _2 )', 'log2( _Literal1 ) - log2( _2 )' ) ); addRule( new TARuleFromString( 'log2( _1 / _Literal2 )', 'log2( _Literal2 ) - log2( _1 )' ) ); addRule( new TARuleFromString( 'log10( _Literal1 / _2 )', 'log10( _Literal1 ) - log10( _2 )' ) ); addRule( new TARuleFromString( 'log10( _1 / _Literal2 )', 'log10( _Literal2 ) - log10( _1 )' ) ); addRule( new TARuleFromString( '_Literal1 = _NonLiteral + _Literal2', '_Literal1 - _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 = _NonLiteral * _Literal2', '_Literal1 / _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 = _NonLiteral / _Literal2', '_Literal1 * _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 =_NonLiteral - _Literal2', '_Literal1 + _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_NonLiteral + _Literal2 = _Literal1 ', '_Literal1 - _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_NonLiteral * _Literal2 = _Literal1 ', '_Literal1 / _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_NonLiteral / _Literal2 = _Literal1 ', '_Literal1 * _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_NonLiteral - _Literal2 = _Literal1', '_Literal1 + _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_NonLiteral - _Literal2 = _Literal1 ', '_Literal1 + _Literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal2 - _NonLiteral = _Literal1 ', '_Literal2 - _Literal1 = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 = sin( _NonLiteral )', 'asin( _Literal1 ) = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 = cos( _NonLiteral )', 'acos( _Literal1 ) = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 = tan( _NonLiteral )', 'atan( _Literal1 ) = _NonLiteral' ) ); addRule( new TARuleFromString( '_Literal1 = ln( _1 )', 'exp( _Literal1 ) = _1' ) ); addRule( new TARuleFromString( 'ln( _1 ) = _Literal1', 'exp( _Literal1 ) = _1' ) ); addRule( new TARuleFromString( '_Literal1 = _NonLiteral', '_NonLiteral = _Literal1' ) ); addRule( new TARuleFromString( '( _Literal1 / _2 ) = _Literal2', '_Literal1 / _Literal2 = _2 ' ) ); addRule( new TARuleFromString( '_Literal*_NonLiteral', '_NonLiteral*_Literal' ) ); addRule( new TARuleFromString( '_Literal+_NonLiteral', '_NonLiteral+_Literal' ) ); addRule( new TARuleFromString( '_Literal1+(_Literal2+_NonLiteral)', '_NonLiteral+(_Literal1+_Literal2)' ) ); addRule( new TARuleFromString( '_Literal1+(_Literal2+_1)', '_1+(_Literal1+_Literal2)' ) ); addRule( new TARuleFromString( '(_1*_2)+(_3*_2)', '(_1+_3)*_2' ) ); addRule( new TARuleFromString( '(_2*_1)+(_2*_3)', '(_1+_3)*_2' ) ); addRule( new TARuleFromString( '(_2*_1)+(_3*_2)', '(_1+_3)*_2' ) ); addRule( new TARuleFromString( '(_1*_2)+(_2*_3)', '(_1+_3)*_2' ) ); addRule( new TARuleFromString( '(_Literal * _1 ) / _Literal', '_1' ) ); addRule( new TARuleFromString( '(_Literal1 * _1 ) / _Literal2', '(_Literal1 * _Literal2 ) / _1' ) ); addRule( new TARuleFromString( '(_1+_2)+_3', '_1+(_2+_3)' ) ); addRule( new TARuleFromString( '(_1*_2)*_3', '_1*(_2*_3)' ) ); addRule( new TARuleFromString( '_1+(_1+_2)', '(2*_1)+_2' ) ); addRule( new TARuleFromString( '_1+_2*_1', '(1+_2)*_1' ) ); addRule( new TARuleFromString( '_literal1 * _NonLiteral = _literal2', '_literal2 / _literal1 = _NonLiteral' ) ); addRule( new TARuleFromString( '_literal1 + _NonLiteral = _literal2', '_literal2 - _literal1 = _NonLiteral' ) ); addRule( new TARuleFromString( '_literal1 - _NonLiteral = _literal2', '_literal1 - _literal2 = _NonLiteral' ) ); addRule( new TARuleFromString( '_literal1 / _NonLiteral = _literal2', '_literal1 * _literal2 = _NonLiteral' ) );
Make rules top-notch
An interesting point: since the above rules are a special expression that is correctly evaluated by the expression parser, users can even add new rules and thereby improve the rewriting capabilities of the application.
Parsing expressions (or more general ones: languages)
For Cocoa / OBjC applications , Dave DeLong DDMathParser is an ideal candidate for parsing mathematical expressions.
For other languages, our old friends Lex and Yacc or the newer GNU Bison can help.
A younger and richer set of ready-to-use syntax files , ANTLR is a modern Java-based parser generator. In addition to the pure command line, ANTLRWorks provides a GUI interface for creating and debugging ANTLR-based parsers. ANTLR generates grammars for various host languages , such as JAVA, C, Python, PHP, or C # . ActionScript runtime is currently disabled .
If you want to learn how to parse expressions (or languages ββin general) from the bottom up, I would suggest this free text book from Niklaus Wirth (or the German book version ), the famous inventor of Pascal and Modula-2.