How to write a parser in C or Objective-C without a parser generator?

I'm trying to make a calculator in C or Objective-C that takes a string along the lines of

8/2+4(3*9)^2 

and returns the answer 2920. I would prefer not to use a generator like Lex or Yacc, so I want to code it from scratch. How should I go about it? Other than the Dragon Book, are there any recommended texts covering this subject?

+8
c objective-c parsing lexical-analysis
7 answers

Dave DeLong's DDMathParser class can save you a lot of time and trouble.

+4

If I remember correctly, you can solve this problem with two stacks, one for operators, the other for operands.

    // OPTR stack: holds operators
    // OPND stack: holds operands
    // OP: predefined set of operators
    // '#' marks both the bottom of OPTR and the end of the input
    OperandType EvaluateExpression() {
        InitStack(OPTR); Push(OPTR, '#');
        InitStack(OPND);
        c = getchar();
        while (c != '#' || GetTop(OPTR) != '#') {
            if (!In(c, OP)) {        // not an operator: push the operand
                Push(OPND, c);
                c = getchar();
            } else {
                switch (Precede(GetTop(OPTR), c)) {
                case '<':            // top of stack has lower priority: push the new operator
                    Push(OPTR, c); c = getchar(); break;
                case '=':            // equal priority, e.g. ')' meeting its '(': discard the pair
                    Pop(OPTR, x); c = getchar(); break;
                case '>':            // top of stack has higher priority: apply it and push the result
                    Pop(OPTR, theta);
                    Pop(OPND, b); Pop(OPND, a);
                    Push(OPND, Operate(a, theta, b));
                    break;
                }
            }
        }
        return GetTop(OPND);
    }
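For something you can actually compile, here is a minimal sketch of the same two-stack idea in plain C. It is not the code the pseudocode was taken from: the names `eval_two_stacks`, `prec` and `operate` are made up for illustration, the input is assumed to be a NUL-terminated string rather than stdin, `^` is assumed to be right-associative with a non-negative integer exponent, and multiplication has to be written explicitly (`4*(3*9)`, not `4(3*9)`). Malformed input simply trips an `assert`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define STACK_MAX 64

/* Binding strength of a binary operator; 0 for anything else. */
static int prec(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    case '^':           return 3;
    default:            return 0;
    }
}

/* Apply one binary operator. Assumption: '^' gets a non-negative integer exponent. */
static double operate(double a, char op, double b) {
    double r = 1.0;
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    case '/': return a / b;
    default:  /* '^' by repeated multiplication */
        for (int i = 0; i < (int)b; i++) r *= a;
        return r;
    }
}

/* Hypothetical helper: evaluate an infix string using two stacks. */
double eval_two_stacks(const char *s) {
    double opnd[STACK_MAX]; int nd = 0;   /* operand stack  */
    char   optr[STACK_MAX]; int nr = 0;   /* operator stack */

    for (;;) {
        while (isspace((unsigned char)*s)) s++;
        char c = *s;
        if (isdigit((unsigned char)c)) {  /* operand: parse and push */
            char *end;
            assert(nd < STACK_MAX);
            opnd[nd++] = strtod(s, &end);
            s = end;
            continue;
        }
        if (c == '(') {                   /* open paren: always push */
            assert(nr < STACK_MAX);
            optr[nr++] = c; s++;
            continue;
        }
        /* Reduce while the stacked operator binds tighter than c, or equally
           tight when c is left-associative. At ')' or end of input, prec(c)
           is 0, so everything down to the nearest '(' is reduced. */
        while (nr > 0 && optr[nr - 1] != '(' &&
               (prec(c) == 0 || prec(optr[nr - 1]) > prec(c) ||
                (prec(optr[nr - 1]) == prec(c) && c != '^'))) {
            assert(nd >= 2);
            double b = opnd[--nd];
            double a = opnd[--nd];
            opnd[nd++] = operate(a, optr[--nr], b);
        }
        if (c == '\0') break;
        if (c == ')') {                   /* discard the matching '(' */
            assert(nr > 0);
            nr--; s++;
            continue;
        }
        assert(prec(c) > 0 && nr < STACK_MAX);
        optr[nr++] = c; s++;              /* binary operator: push */
    }
    assert(nd == 1 && nr == 0);
    return opnd[0];
}
```

With the multiplication spelled out, `eval_two_stacks("8/2+4*(3*9)^2")` gives 2920, the answer the question asks for.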
+2

The shunting yard algorithm has already been mentioned. Another classic is a simple recursive descent. Here is a pretty short one that I wrote many years ago:

    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>

    void expression(void);

    /* Emit one operand or operator of the RPN output. */
    void show(int ch) {
        putchar(ch);
        putchar(' ');
    }

    /* Return the next non-whitespace character from stdin. */
    int token(void) {
        int ch;
        while (isspace(ch = getchar()))
            ;
        return ch;
    }

    /* factor := '(' expression ')' | single-character operand */
    void factor(void) {
        int ch = token();
        if (ch == '(') {
            expression();
            ch = token();
            if (ch != ')') {
                fprintf(stderr, "Syntax error. Expected close paren, found: %c\n", ch);
                exit(EXIT_FAILURE);
            }
        } else
            show(ch);
    }

    /* term := factor { ('*' | '/') factor }, looping so that a/b/c means (a/b)/c */
    void term(void) {
        int ch;
        factor();
        while ((ch = token()) == '*' || ch == '/') {
            factor();
            show(ch);
        }
        ungetc(ch, stdin);
    }

    /* expression := term { ('+' | '-') term }, looping so that a-b-c means (a-b)-c */
    void expression(void) {
        int ch;
        term();
        while ((ch = token()) == '+' || ch == '-') {
            term();
            show(ch);
        }
        ungetc(ch, stdin);
    }

    int main(void) {
        expression();
        return 0;
    }

Note that this particular one only parses the input and converts it to RPN. If you want to evaluate the expression instead, replace each call that prints an operand or operator with code that computes the value of that part of the expression.
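As a rough illustration of that substitution, here is a sketch in which every parsing function returns the value of the piece it parsed instead of printing it. Several things in it are assumptions that go beyond the code above: it reads from a string rather than stdin, operands are multi-digit numbers parsed with `strtod`, there is an extra `power` level for a right-associative `^` (non-negative integer exponents only), and a `(` directly after a value is treated as implicit multiplication so that the question's `4(3*9)` works. The names `eval_recursive`, `peek`, `primary` and `power` are made up for the example, and error handling is left out.

```c
#include <ctype.h>
#include <stdlib.h>

static const char *p;               /* cursor into the input string */

static double expression(void);

/* Skip whitespace and return the next character without consuming it. */
static char peek(void) {
    while (isspace((unsigned char)*p)) p++;
    return *p;
}

/* primary := number | '(' expression ')' */
static double primary(void) {
    if (peek() == '(') {
        p++;
        double v = expression();
        if (peek() == ')') p++;     /* sketch: a missing ')' is silently accepted */
        return v;
    }
    char *end;
    double v = strtod(p, &end);     /* parses the number and finds where it ends */
    p = end;
    return v;
}

/* power := primary [ '^' power ]   (right-associative) */
static double power(void) {
    double base = primary();
    if (peek() == '^') {
        p++;
        double e = power();
        double r = 1.0;
        for (int i = 0; i < (int)e; i++) r *= base;   /* integer exponents only */
        return r;
    }
    return base;
}

/* term := power { '*' power | '/' power | power starting with '(' } */
static double term(void) {
    double v = power();
    for (;;) {
        char c = peek();
        if (c == '*')      { p++; v *= power(); }
        else if (c == '/') { p++; v /= power(); }
        else if (c == '(') { v *= power(); }          /* implicit multiplication */
        else return v;
    }
}

/* expression := term { ('+' | '-') term } */
static double expression(void) {
    double v = term();
    for (;;) {
        char c = peek();
        if (c == '+')      { p++; v += term(); }
        else if (c == '-') { p++; v -= term(); }
        else return v;
    }
}

/* Hypothetical entry point: evaluate a NUL-terminated infix expression. */
double eval_recursive(const char *s) {
    p = s;
    return expression();
}
```

Called as `eval_recursive("8/2+4(3*9)^2")`, this returns 2920. The shape of the functions mirrors the grammar one-to-one, which is the main appeal of recursive descent: adding a precedence level means adding one function.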

+2

I think this is close to what you want: http://www.codeproject.com/KB/recipes/alxparser.aspx

+1

I did this in CSE340: Introduction to Programming Languages while studying CS in college. So if you really want to code the parser from scratch, be prepared for it to turn into a "semester-long project".

You will need to handle tokenization, parsing, building an abstract syntax tree, evaluation, and so on.
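To give a feel for how those stages fit together, here is a deliberately small sketch in C, not the course's design. It assumes only numbers, `+ - * /` and parentheses, and all of the names (`Token`, `Node`, `next_token`, `parse_expr`, `eval_ast` and so on) are invented for the example. The tokenizer hands the parser one token at a time, the parser builds a tree, and a separate pass walks the tree to compute the value.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { TOK_NUM, TOK_OP, TOK_END } TokKind;
typedef struct { TokKind kind; double num; char op; } Token;

typedef struct Node {
    char op;                        /* 0 for a number leaf, else '+', '-', '*', '/' */
    double num;                     /* value, used only by leaves */
    struct Node *lhs, *rhs;
} Node;

static const char *src;             /* remaining input */
static Token tok;                   /* current lookahead token */

/* Tokenization: turn the next characters into one token. */
static void next_token(void) {
    while (isspace((unsigned char)*src)) src++;
    if (*src == '\0') { tok.kind = TOK_END; tok.op = 0; return; }
    if (isdigit((unsigned char)*src)) {
        char *end;
        tok.kind = TOK_NUM;
        tok.num = strtod(src, &end);
        src = end;
        return;
    }
    tok.kind = TOK_OP;              /* any other character is an operator or paren */
    tok.op = *src++;
}

static Node *new_node(char op, double num, Node *lhs, Node *rhs) {
    Node *n = malloc(sizeof *n);
    if (!n) abort();
    n->op = op; n->num = num; n->lhs = lhs; n->rhs = rhs;
    return n;
}

static int is_op(char c) { return tok.kind == TOK_OP && tok.op == c; }

static Node *parse_expr(void);

/* factor := number | '(' expr ')' */
static Node *parse_factor(void) {
    if (is_op('(')) {
        next_token();
        Node *n = parse_expr();
        if (is_op(')')) next_token();
        return n;
    }
    if (tok.kind != TOK_NUM) { fprintf(stderr, "syntax error\n"); exit(EXIT_FAILURE); }
    Node *leaf = new_node(0, tok.num, NULL, NULL);
    next_token();
    return leaf;
}

/* term := factor { ('*' | '/') factor } */
static Node *parse_term(void) {
    Node *n = parse_factor();
    while (is_op('*') || is_op('/')) {
        char op = tok.op;
        next_token();
        n = new_node(op, 0.0, n, parse_factor());
    }
    return n;
}

/* expr := term { ('+' | '-') term } */
static Node *parse_expr(void) {
    Node *n = parse_term();
    while (is_op('+') || is_op('-')) {
        char op = tok.op;
        next_token();
        n = new_node(op, 0.0, n, parse_term());
    }
    return n;
}

/* Evaluation: post-order walk of the tree. */
static double eval(const Node *n) {
    if (n->op == 0) return n->num;
    double a = eval(n->lhs), b = eval(n->rhs);
    switch (n->op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  return a / b;
    }
}

static void free_tree(Node *n) {
    if (!n) return;
    free_tree(n->lhs); free_tree(n->rhs); free(n);
}

/* Hypothetical entry point: tokenize, parse, evaluate, clean up. */
double eval_ast(const char *s) {
    src = s;
    next_token();
    Node *root = parse_expr();
    double v = eval(root);
    free_tree(root);
    return v;
}
```

Keeping the tree around costs a few allocations compared with evaluating while parsing, but it is what lets you later add passes such as pretty-printing or simplification without touching the parser.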

We used Louden's Programming Languages: Principles and Practice. I liked it, although it did not walk you through the implementation process.

Of course, this really would be "coding from scratch." You will need to write a grammar and then build a parser to process its rules ... other than as a learning exercise, I'm not sure why you would want to do this.

0

In Objective-C, NSLinguisticTagger might be a good solution:

 - (void)enumerateTagsInRange:(NSRange)range scheme:(NSString *)tagScheme options:(NSLinguisticTaggerOptions)opts usingBlock:(void (^)(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop))block 
0
