How can I get started implementing a simple stack-based programming language?

I am interested in expanding my knowledge of computer programming by implementing a stack-based programming language. I am looking for advice on where to begin, as I intend for it to have instructions like "pushint 1", which would push an integer with the value 1 onto the top of the stack, and flow control via labels like "L01: jump L01:".

So far I have written a C# implementation of how I want my language to work (I wanted to link to it, but IDEOne is blocked), but it is very messy and needs to be optimized. It converts the input to XML and then parses it. My goal is to move to a lower-level language (possibly C/C++), but my problems are implementing a stack that can hold different data types and does not have a fixed size.

Eventually, I would also like to implement arrays and functions. Also, I think I need a better lexer, and I wonder whether a parser would be a good idea for such a simple language.

Any advice/criticism is welcome, and please keep in mind that I am still new to programming (I just completed AP CompSci I). Links to open-source stack-based languages are also welcome.

Here is the basic program that I would like to try to interpret/compile (where [this is a comment]):

 [Hello World!]
 pushchar '\n'
 pushstring "Hello World!"
 print

 [Count to 5 and then count down!]
 pushint 1
 setlocal 0
 L01:
 pushchar '\n'
 getlocal 0
 print [print x + '\n']
 getlocal 0
 increment
 setlocal 0 [x = x + 1]
 pushint 5
 getlocal 0
 lessthan [x < 5]
 iftrue L01
 L02:
 pushchar '\n'
 getlocal 0
 print [print x + '\n']
 getlocal 0
 decrement
 setlocal 0 [x = x - 1]
 pushint 0
 getlocal 0
 greaterthan [x > 0]
 iftrue L02

Expected Result:

 Hello World!
 1
 2
 3
 4
 5
 4
 3
 2
 1
2 answers

A stack-based language such as Factor has the following syntax:

 2 3 + 5 - print 

This is equivalent to the following C-style code:

 print(2 + 3 - 5); 

The advantage of a stack-based language is that it is easy to implement. In addition, if the language uses reverse Polish notation, as most stack-based languages do, then all you need for the front end of your language is a lexer. You do not need to parse the tokens into a syntax tree, as there is only one way to decode the token stream.
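As a minimal sketch of that point, the one-line program above can be evaluated by splitting it on whitespace and handling each token as it arrives, with no syntax tree. The function name `runRpn` and the choice to collect printed values in an array are assumptions made for illustration; this is not how Factor itself is implemented.

```javascript
// Hypothetical helper: evaluates a tiny RPN program token by token.
// Only non-negative integers, "+", "-" and "print" are supported here.
function runRpn(source) {
  const stack = [];
  const output = [];
  for (const token of source.trim().split(/\s+/)) {
    if (/^\d+$/.test(token)) {
      stack.push(parseInt(token, 10));
    } else if (token === "+" || token === "-") {
      if (stack.length < 2) throw new Error("Stack underflow at " + token);
      const right = stack.pop(); // the top of the stack is the right operand
      const left = stack.pop();
      stack.push(token === "+" ? left + right : left - right);
    } else if (token === "print") {
      if (stack.length < 1) throw new Error("Stack underflow at print");
      output.push(stack.pop());
    } else {
      throw new Error("Unknown token: " + token);
    }
  }
  return output;
}

console.log(runRpn("2 3 + 5 - print")); // [ 0 ]
```

Each token is either pushed or applied immediately, so the token stream is its own execution order.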

What you are trying to create is not a stack-based programming language but a stack-based virtual machine. Application virtual machines can be stack-based or register-based. For example, the Java Virtual Machine is stack-based. It executes Java bytecode (which is what you are creating: bytecode for a virtual machine). However, the programming languages that compile to this bytecode (e.g. Java, Scala, Groovy, etc.) are not stack-based.
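To make the stack/register distinction concrete, here is a small sketch that computes 2 + 3 on two toy machines. Both instruction sets (`push`, `add`, `load`, and the register names `r0` to `r2`) are invented for this illustration and are not taken from the JVM or any real virtual machine.

```javascript
// Toy stack machine: operands are implicit, they live on the stack.
function runStack(code) {
  const stack = [];
  for (const [op, arg] of code) {
    if (op === "push") stack.push(arg);
    else if (op === "add") stack.push(stack.pop() + stack.pop());
    else throw new Error("Unknown op: " + op);
  }
  return stack.pop();
}

// Toy register machine: every instruction names its operands explicitly.
function runRegister(code) {
  const regs = {};
  for (const [op, dst, a, b] of code) {
    if (op === "load") regs[dst] = a;
    else if (op === "add") regs[dst] = regs[a] + regs[b];
    else throw new Error("Unknown op: " + op);
  }
  return regs.r0; // by convention in this sketch, the result ends up in r0
}

const stackResult = runStack([["push", 2], ["push", 3], ["add"]]);
const registerResult = runRegister([
  ["load", "r1", 2],
  ["load", "r2", 3],
  ["add", "r0", "r1", "r2"],
]);
console.log(stackResult, registerResult); // 5 5
```

The stack version has no operands to decode for `add`, which is one reason such machines tend to be simpler to write.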

What you are trying to create is similar to an assembly-level language for your own virtual machine, which happens to be stack-based. That said, it will be fairly easy to do: stack-based virtual machines are easier to implement than register-based ones. Again, all you need is a lexer such as flex. Here is a small example in JavaScript using a library called lexer:

 var program = "[print(2 + 3)]";
 program += "\n push 2";
 program += "\n push 3";
 program += "\n add";
 program += "\n print";

 lexer.setInput(program);

 var token;
 var stack = [];
 var push = false;

 while (token = lexer.lex()) {
     switch (token) {
     case "NUMBER":
         if (push) stack.push(lexer.yytext);
         else alert("Unexpected number.");
         break;
     case "ADD":
         if (push) alert("Expected number.");
         else stack.push(stack.pop() + stack.pop());
         break;
     case "PRINT":
         if (push) alert("Expected number.");
         else alert(stack.pop());
         break;
     }

     push = token === "PUSH";
 }

 <script src="https://rawgit.com/aaditmshah/lexer/master/lexer.js"></script>
 <script>
 var lexer = new Lexer;

 lexer.addRule(/\s+/, function () {
     // matched whitespace - discard it
 });

 lexer.addRule(/\[.*\]/, function () {
     // matched a comment - discard it
 });

 lexer.addRule(/\d+/, function (lexeme) {
     this.yytext = parseInt(lexeme);
     return "NUMBER";
 });

 lexer.addRule(/push/, function () {
     return "PUSH";
 });

 lexer.addRule(/add/, function () {
     return "ADD";
 });

 lexer.addRule(/print/, function () {
     return "PRINT";
 });
 </script>
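Going one step further toward the instruction set in the question, the sketch below adds labels, locals and conditional jumps. It swaps the lexer library for a single regular expression so that it runs standalone. The function names (`tokenize`, `literal`, `assemble`, `run`) are hypothetical, and the instruction semantics are assumptions inferred from the comments in the question: comparisons test "top OP second", and `print` pops two values and writes the top one first.

```javascript
// Sketch of a label-aware stack VM. Instruction names follow the question;
// their exact semantics here are assumptions, not the asker's specification.
function tokenize(source) {
  // Alternatives: [comment] | "string" | 'char' | any other run of non-space
  const pattern = /\[[^\]]*\]|"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|\S+/g;
  return (source.match(pattern) || []).filter((t) => !t.startsWith("["));
}

function literal(token) {
  if (/^-?\d+$/.test(token)) return parseInt(token, 10);
  // Reuse JSON string syntax to decode escapes such as \n in "..." and '...'
  return JSON.parse('"' + token.slice(1, -1) + '"');
}

const TAKES_OPERAND = new Set([
  "pushint", "pushchar", "pushstring", "setlocal", "getlocal", "iftrue", "jump",
]);

// First pass: build the instruction list and record where each label points.
function assemble(source) {
  const tokens = tokenize(source);
  const code = [];
  const labels = new Map();
  for (let i = 0; i < tokens.length; i++) {
    const t = tokens[i];
    if (t.endsWith(":")) labels.set(t.slice(0, -1), code.length);
    else if (TAKES_OPERAND.has(t)) code.push({ op: t, arg: tokens[++i] });
    else code.push({ op: t });
  }
  return { code, labels };
}

// Second pass: execute. Returns everything "printed" as one string.
function run(source) {
  const { code, labels } = assemble(source);
  const stack = []; // a JS array grows on demand and can hold mixed types
  const locals = [];
  let out = "";
  const pop = () => {
    if (stack.length === 0) throw new Error("Stack underflow");
    return stack.pop();
  };
  for (let pc = 0; pc < code.length; pc++) {
    const { op, arg } = code[pc];
    switch (op) {
      case "pushint": case "pushchar": case "pushstring":
        stack.push(literal(arg)); break;
      case "setlocal": locals[literal(arg)] = pop(); break;
      case "getlocal": stack.push(locals[literal(arg)]); break;
      case "increment": stack.push(pop() + 1); break;
      case "decrement": stack.push(pop() - 1); break;
      case "lessthan": { const top = pop(); stack.push(top < pop()); break; }
      case "greaterthan": { const top = pop(); stack.push(top > pop()); break; }
      case "print": out += String(pop()) + String(pop()); break;
      case "jump": case "iftrue":
        if (!labels.has(arg)) throw new Error("Unknown label: " + arg);
        // "- 1" because the for loop increments pc after this iteration
        if (op === "jump" || pop() === true) pc = labels.get(arg) - 1;
        break;
      default: throw new Error("Unknown instruction: " + op);
    }
  }
  return out;
}

const program = `
  [Count from 1 to 3]
  pushint 1 setlocal 0
  L01:
  pushchar '\\n' getlocal 0 print      [print x + '\\n']
  getlocal 0 increment setlocal 0      [x = x + 1]
  pushint 4 getlocal 0 lessthan        [x < 4]
  iftrue L01
`;
console.log(JSON.stringify(run(program))); // "1\n2\n3\n"
```

Resolving labels in a separate first pass lets a jump refer to a label that appears later in the source. In a lower-level language such as C, the mixed-type stack would need something like a tagged union and a growable buffer, which the JavaScript array provides for free here.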

It is really easy. You can play with the program and modify it according to your needs. Good luck.


I think you will find the META II paper really educational. It shows how to define a pushdown stack machine and a compiler for it, in 10 short but readable pages. See this answer: https://stackoverflow.com/a/166958/2129/12/12/16/16/12/16/12/16/12/16/16/16/16.jpg
