What is a good approach to creating a new compiler?

I have experience with compiler phases and am interested in the “Programming Languages” and “Compilers” fields. Could someone explain a good approach to writing a compiler from scratch for a new programming language? (I mean the STEPS.)

+6
c compiler-construction programming-languages
4 answers

The first step is to read the Dragon Book.

It offers a good introduction to the whole field of compiler construction, but also goes into enough detail for you to actually build your own.

For the next steps, I suggest following the chapters of the book. It is not written as a tutorial, but it nevertheless offers plenty of practical advice, which makes it an ideal hub for your own ideas and research.

+9

Please do not use the Dragon Book; it is old and mostly outdated (and uses strange names for most of the material).

As for books, I would recommend Appel’s Tiger Book or Cooper’s Engineering a Compiler. I highly recommend using a framework like LLVM, so you don’t need to re-implement a bunch of things for code generation, etc.

Here is a tutorial on creating your language using LLVM: http://llvm.org/docs/tutorial/

+5

I would look at integrating your language / front end with the GNU Compiler Collection (GCC).

That way, you only (ONLY!) have to write the parser and the translator to GCC’s portable object format. You get the optimizer, object code generation for the chip of your choice, the linker, etc. for free.

Another alternative would be to target the Java JVM. The virtual machine is well documented, and the JVM instruction set is much more robust than x86 machine code.
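
To give a feel for what targeting a stack machine like the JVM involves, here is a minimal sketch in Python. It walks an expression tree in post-order and lists JVM integer instructions as text. The mnemonics (`iconst_*`, `bipush`, `sipush`, `ldc`, `iadd`, `isub`, `imul`, `idiv`) are real JVM instructions, but the tuple-based tree, the function names, and the plain-text output are assumptions made for illustration; a real back end would also have to write a class file or feed an assembler.

```python
# Hypothetical helper names; only the instruction mnemonics come from the JVM.
BINARY_OPS = {"+": "iadd", "-": "isub", "*": "imul", "/": "idiv"}


def emit_int(value):
    """Pick the shortest JVM instruction that pushes a 32-bit int constant."""
    if -1 <= value <= 5:
        return "iconst_m1" if value == -1 else f"iconst_{value}"
    if -128 <= value <= 127:
        return f"bipush {value}"
    if -32768 <= value <= 32767:
        return f"sipush {value}"
    return f"ldc {value}"  # larger constants are loaded from the constant pool


def emit_expr(node):
    """Post-order walk: operands first, then the operator.

    A node is either an int or a tuple (op, left, right).
    """
    if isinstance(node, int):
        return [emit_int(node)]
    op, left, right = node
    return emit_expr(left) + emit_expr(right) + [BINARY_OPS[op]]


if __name__ == "__main__":
    # 1 + 2 * 300
    for instruction in emit_expr(("+", 1, ("*", 2, 300))):
        print(instruction)
```

Because the JVM is stack-based, no register allocation is needed: each operand is pushed and each operator pops its two inputs, which is a large part of why it is a comfortable first target.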

+3

I managed to write a compiler without any specific book (although I had read some compiler books in the past, just not in any real depth).

The first thing you need to do is play with one of the “compiler compiler” tools (flex, bison, antlr, javacc) and get your grammar working. Grammars are mostly straightforward, but there are always niggling bits that get in the way and wreck everything. Especially things like expressions, precedence, etc.

Some of the older, simpler languages are simpler for a reason: it makes the parsers Just Work. Consider a Pascal variant, which can be processed with nothing more than recursive descent.
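
As a rough illustration of recursive descent and of how precedence falls out of the grammar, here is a minimal sketch in Python for arithmetic expressions. The grammar, the class and function names, and the tuple-based syntax tree are all assumptions made for this example, not taken from any particular compiler.

```python
import re

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split an arithmetic expression into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    # One method per grammar rule:
    #   expr   := term   (('+' | '-') term)*
    #   term   := factor (('*' | '/') factor)*
    #   factor := NUMBER | '(' expr ')'
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is not None and token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    """Parse a whole expression and reject trailing tokens."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


if __name__ == "__main__":
    print(parse("1 + 2 * 3"))
```

Multiplication binds tighter than addition only because `term` is called from inside `expr`; getting that layering wrong is exactly the kind of small detail that breaks a grammar.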

I mention this because without your grammar you have no language. If you cannot lex and parse it correctly, you won’t get anywhere very fast. And watching a dozen lines of sample code in your new language turn into a bunch of tokens and syntax nodes is really, really awesome, in a “wow, that really works” kind of way. It is almost literally “everything works” or “none of this works,” especially at the beginning. Once it actually works, you feel that you can really pull it off.
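
For a sense of what "turning sample code into tokens" looks like, here is a minimal hand-written lexer sketch in Python built on the standard `re` module. The token kinds (`NUMBER`, `IDENT`, `OP`) and the toy language they describe are assumptions for illustration; a generated lexer from flex or antlr would do the same job from a specification file.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    line: int


# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t]+"),
    ("MISMATCH", r"."),
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def lex(source: str) -> Iterator[Token]:
    """Yield tokens with line numbers; raise on any unknown character."""
    line = 1
    for match in MASTER_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "SKIP":
            continue
        elif kind == "MISMATCH":
            raise SyntaxError(f"line {line}: unexpected character {text!r}")
        else:
            yield Token(kind, text, line)


if __name__ == "__main__":
    for token in lex("a = 1 + 1\nb = a * 2"):
        print(token)
```

Tracking the line number in the lexer is a small design choice that pays off later, since every error message the parser produces can point at a location.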

And to some extent this is true, because once you have that part done, you need to get your basic runtime in place. Once you can compile "a = 1 + 1", the bulk of the new work is behind you, and now you just need to implement the rest of the statements. This is mostly an exercise in managing lookup tables and references, as well as keeping track of where you are at any point in the process.
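
Here is a minimal sketch, in Python, of what getting "a = 1 + 1" through a compiler can look like end to end. It handles only assignments with `+` and `-`, keeps a symbol table that maps variable names to slots, emits code for a made-up stack machine, and then runs it. The instruction names (`PUSH`, `LOAD`, `STORE`, `ADD`, `SUB`) and both functions are hypothetical, and the tokenizer silently ignores characters it does not know, which a real compiler should not do.

```python
import re


def compile_program(source):
    """Compile lines like `name = operand (+|-) operand ...` to stack code."""
    symbols = {}  # symbol table: variable name -> slot index
    code = []     # emitted instructions as (opcode, argument) pairs

    def operand(token, lineno):
        if token.isdigit():
            code.append(("PUSH", int(token)))
        elif token in symbols:
            code.append(("LOAD", symbols[token]))
        elif token.isidentifier():
            raise NameError(f"line {lineno}: undefined variable {token!r}")
        else:
            raise SyntaxError(f"line {lineno}: expected operand, got {token!r}")

    for lineno, line in enumerate(source.splitlines(), start=1):
        tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+=]", line)
        if not tokens:
            continue  # blank line
        well_formed = (
            len(tokens) >= 3
            and len(tokens) % 2 == 1
            and tokens[0].isidentifier()
            and tokens[1] == "="
        )
        if not well_formed:
            raise SyntaxError(f"line {lineno}: expected `name = expression`")
        target, rest = tokens[0], tokens[2:]
        operand(rest[0], lineno)
        for op, token in zip(rest[1::2], rest[2::2]):
            if op not in ("+", "-"):
                raise SyntaxError(f"line {lineno}: expected + or -, got {op!r}")
            operand(token, lineno)
            code.append(("ADD" if op == "+" else "SUB", None))
        # Allocate the slot only after the right-hand side has compiled,
        # so `a = a + 1` fails when `a` was never assigned.
        slot = symbols.setdefault(target, len(symbols))
        code.append(("STORE", slot))
    return code, symbols


def run(code, slot_count):
    """Execute the stack code and return the final variable slots."""
    stack, slots = [], [0] * slot_count
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(slots[arg])
        elif opcode == "STORE":
            slots[arg] = stack.pop()
        else:  # ADD or SUB
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if opcode == "ADD" else left - right)
    return slots


if __name__ == "__main__":
    code, symbols = compile_program("a = 1 + 1\nb = a + 40 - 2")
    print(code)
    print(symbols, run(code, len(symbols)))
```

Adding more statement kinds from here is mostly a matter of more cases in the compile loop and more entries in the symbol table.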

You can strike out on your own with a brand-new syntax, an innovative runtime, etc. But if you have the time, it may be best to implement a language that already exists, just to understand and carry out all the steps, and to think about how, if you were writing the language you really want, you would do things differently from the existing one.

There are a lot of mechanics involved in writing a compiler, and simply getting through the process successfully once will give you much more confidence when you come back to do it again with your own new language.

+2