A simple, formally defined language for teaching compilers

I am looking for a simple, formally defined language that can be used to study compiler construction. It should be simple to implement the first pass, and then succumb to further optimization efforts.

Feel free to point me in the lisps direction, but I'm specifically looking for other options.

+6
compiler-construction
source share
4 answers

Can I suggest Jack's programming language from http://www.nand2tetris.org/ ?

This is especially suitable for teaching the compiler as part of an academic course.

I'm in the middle of a blog post about writing a compiler for this language, in C #, with code generation in C. Messages that I already have: http://blogs.microsoft.co.il/blogs/sasha/archive/tags /Compiler/default.aspx

+5
source share

I thought chapter 8 of Kernigan and Pike. It covers most of the programming in a Unix environment, and it implements a programming language.

Chapter 8 is called "Program Development." It discusses the development of a non-trivial program at different stages of design. This non-trivial program is a high-speed calculator. For more information about hoc, see http://en.wikipedia.org/wiki/Hoc_(programming_language )

This is a great practical introduction to implementing a simple language using the standard yacc and lex tools. yacc and lex are too much to cover here, but by following the examples in this book and doing the exercises, you will develop an understanding of them.

Development continues at various stages; in the first stage, you don’t even have variables in the language. In the third step, you have variables, defined constants (PI, E, etc.) and built-in functions like sin () and log (). At the last stage, you have a fully implemented language.

Now, is the best language to try and implement? I have no idea, but I know that the Unix programming environment was an excellent book to read in parallel with the traditional compiler book. When I started reading the Aho compiler book (the book of dragons), I re-read chapter 8 of TUPE and followed the examples and exercises. Of course, anyone can rewrite the code from the book, but the exercises require that you well understand what is happening.

In the end, I don’t think it’s exactly which language you prefer to do, but the process that you are carrying out is in its implementation.

+3
source share

I would suggest Wirth PL / 0 .

Why?

  • The grammar is small, but still enough to get a good taste for compiler development:

    program = block "." . block = [ "const" ident "=" number {"," ident "=" number} ";"] [ "var" ident {"," ident} ";"] { "procedure" ident ";" block ";" } statement . statement = [ ident ":=" expression | "call" ident | "begin" statement {";" statement } "end" | "if" condition "then" statement | "while" condition "do" statement ]. condition = "odd" expression | expression ("="|"#"|"<"|"<="|">"|">=") expression . expression = [ "+"|"-"] term { ("+"|"-") term} . term = factor {("*"|"/") factor} . factor = ident | number | "(" expression ")" . 
  • You can implement a virtual machine compiler for PL / 0 in C in about 1000 lines of code.

    • Large enough to be nontrivial, but small enough to be doable.
  • There are three books associated with it:

    • Wirth, Niklaus (1975), Algorithms + Data Structures = Programs, ISBN 0-13-022418-9 (original specification and implementation of PL / 0 (in Pascal)) An excellent gentle introduction to compilation.

    • Liffick, Blaise W., Ed (1979), Byte Book of Pascal, ISBN 0-07-037823-1 (authors develop a minor PL / 0 add-on in Northstar Basic for the early CP / M computer).

    • Wirth, Nicklaus (1986), Compilatorbau, B.G. Teubner, Stuttgart ISBN 3-519-32338-9 (minor addition PL / 0 implemented in Module 2. In German).

  • The Internet is full of examples.

    • I found implementations in C, C ++, Pascal, Modula 2, Java, and Ruby. I bet there are even more.
  • There is an entry on Wikipedia :: -)

  • In addition, several useful groups with a large number of people who want to help answer questions about the compiler:

+2
source share

The Oberon specification is small enough for your purposes: http://www-vs.informatik.uni-ulm.de:81/projekte/Oberon-2.Report/

R5RS or its purely functional subset is not that big (if you ignore the number tower).

+1
source share

All Articles