How difficult is it to write an interpreter for a language if you already have an AST?

I already have a parser for the language I'm working on. Is interpreting it difficult? I thought it would be easy. Parsing and syntax checking are done; I just have a tree of objects. Every time an object is created, I create a branch and save its type (string, int, float, class/obj). Then each time a new member is added to the object, I create a branch and repeat.

I am trying to keep it simple. I still need to check whether object A can be added to object B, and the like.

Is it really quite simple once the AST and syntax checking are done, or is there a lot more work?

+7
3 answers

Usually you need to build symbol tables and do type checking. For some languages you can do it on the fly; for others, I think you pretty much have to do name resolution and type checking first, or you won't be able to interpret the program correctly (C++ comes to mind).
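As a rough illustration of that first pass, here is a minimal sketch of name resolution and type checking over a symbol table. The mini-language (`Expr`, `Type`) and the function `check` are hypothetical and invented for this example; they are not the asker's language. The symbol table is a plain association list to keep the sketch Prelude-only.

```haskell
-- Hypothetical mini-language for illustration: ints, strings, names, +, let.
data Type = TInt | TStr deriving (Eq, Show)

data Expr
  = IntLit Int
  | StrLit String
  | Ref String            -- use of a name
  | Add Expr Expr         -- defined only for two ints in this sketch
  | Let String Expr Expr  -- let name = e1 in e2
  deriving Show

-- The symbol table maps each name in scope to its type.
-- Newer bindings are consed on the front, so they shadow older ones.
type SymTab = [(String, Type)]

-- Resolve names and check types in one pass; Left carries an error message.
check :: SymTab -> Expr -> Either String Type
check _   (IntLit _) = Right TInt
check _   (StrLit _) = Right TStr
check tab (Ref x)    = maybe (Left ("undeclared name: " ++ x)) Right (lookup x tab)
check tab (Add a b)  = do
  ta <- check tab a
  tb <- check tab b
  if ta == TInt && tb == TInt
    then Right TInt
    else Left ("cannot add " ++ show ta ++ " and " ++ show tb)
check tab (Let x e body) = do
  t <- check tab e
  check ((x, t) : tab) body
```

In ghci, `check [] (Add (IntLit 1) (StrLit "x"))` yields `Left "cannot add TInt and TStr"`, which is the "can A be added to B" question from the original post answered before anything is executed.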

After building the symbol tables, you can write an interpreter by walking the tree in execution order and doing what the operators say. Basic arithmetic is quite simple. Managing strings and dynamic storage is more complicated; you have to figure out how you are going to handle storage allocation and deallocation, and for languages that manage storage for you, you will have to implement some sort of garbage collector. At this point, life becomes complicated.
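The "walk the tree in execution order" part can be sketched as follows. The statement language (`Stmt`, `Expr`), the `Store` type, and the functions `evalE` and `exec` are assumptions made up for this example. It covers only integer arithmetic and sidesteps the storage-management problems mentioned above by using an immutable association list as the variable store.

```haskell
-- Hypothetical statement language for illustration only.
data Expr = Lit Int | Var String | Add Expr Expr | Sub Expr Expr
data Stmt = Assign String Expr | While Expr [Stmt] | Print Expr

-- Variable store: one entry per variable name.
type Store = [(String, Int)]

-- Evaluate an expression against the current store.
evalE :: Store -> Expr -> Int
evalE _ (Lit n)   = n
evalE s (Var x)   = maybe (error ("unbound variable " ++ x)) id (lookup x s)
evalE s (Add a b) = evalE s a + evalE s b
evalE s (Sub a b) = evalE s a - evalE s b

-- Execute statements in order, threading the store and collecting output.
exec :: Store -> [Stmt] -> (Store, [String])
exec s [] = (s, [])
exec s (Assign x e : rest) =
  -- Replace any previous binding of x with the new value.
  exec ((x, evalE s e) : filter ((/= x) . fst) s) rest
exec s (Print e : rest) =
  let (s', out) = exec s rest
  in (s', show (evalE s e) : out)
exec s (While c body : rest)
  | evalE s c /= 0 =
      -- Run the body once, then re-enter the same loop with the new store.
      let (s1, out1) = exec s body
          (s2, out2) = exec s1 (While c body : rest)
      in (s2, out1 ++ out2)
  | otherwise = exec s rest
```

For example, a countdown program `[Assign "n" (Lit 3), While (Var "n") [Print (Var "n"), Assign "n" (Sub (Var "n") (Lit 1))]]` run with `exec []` produces the output `["3","2","1"]` and leaves `n` bound to 0.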

You will most likely find language features that you did not take into account. Exception handling? Multiple assignment? Local scopes? Lambdas? Closures? You will find out quite quickly how many features modern languages have that make them useful.

When you start writing more complex programs, you will need a debugger. Breakpoints? Single stepping? Variable inspection? Variable update? Starting at arbitrary places? A read-eval-print loop?
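One reason single stepping costs extra work is that a recursive evaluator runs to completion in one call. A debugger needs an evaluator that can stop after each reduction. The following is only a sketch of that idea on a hypothetical arithmetic AST; `Expr`, `step`, and `states` are names invented here.

```haskell
-- Hypothetical arithmetic AST for illustration.
data Expr = Lit Int | Add Expr Expr | Mul Expr Expr deriving (Eq, Show)

-- Perform exactly one reduction, left operand first.
-- Nothing means the expression is already a value.
step :: Expr -> Maybe Expr
step (Lit _)               = Nothing
step (Add (Lit a) (Lit b)) = Just (Lit (a + b))
step (Mul (Lit a) (Lit b)) = Just (Lit (a * b))
step (Add a b) = case step a of
  Just a' -> Just (Add a' b)
  Nothing -> Add a <$> step b
step (Mul a b) = case step a of
  Just a' -> Just (Mul a' b)
  Nothing -> Mul a <$> step b

-- Every intermediate state, which a debugger could show one at a time.
states :: Expr -> [Expr]
states e = e : maybe [] states (step e)
```

Here `states (Add (Lit 1) (Mul (Lit 2) (Lit 3)))` gives the three states `Add (Lit 1) (Mul (Lit 2) (Lit 3))`, `Add (Lit 1) (Lit 6)`, and `Lit 7`. A breakpoint is then a predicate checked between calls to `step`.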

You still need to bind the language to external libraries; most people want to talk to consoles and files. Do you need buffered files, or are you OK with one character at a time and the corresponding performance? You get to wrestle with character representations (7-bit ASCII? 8-bit? UTF-8 with variable-width characters? Full Unicode?) and standard support libraries (string concatenation, search, number conversion [including exact floating-point conversions in both directions], big-number arithmetic, floating-point traps, ...). The list of issues is quite long if you want a useful programming language.

The core of the interpreter is likely to be quite small. You will find that the other stuff probably burns one or two orders of magnitude more effort. Somewhere in here, if you want anyone to use the language, you need to document all the choices you made. And heaven help you if you change the interpreter a little after someone has a big application running.

Then someone will complain about performance. Now you get to tune your implementation, and start regretting that you wrote an interpreter instead of a compiler.

Enjoy. If you have an AST, you have barely scratched the surface. If you go through with this, you will learn to truly appreciate what modern languages provide out of the box, and how much effort it took to provide it.

+4

It depends on how complicated the language you want to interpret is, and on your choice of tools. Simple interpreters are simple.

Consider the following AST definition in Haskell, for a language that supports higher-order functions and numbers:

data Exp = Lam String Exp | App Exp Exp | Var String | Num Int 

Now you can write an interpreter for it as a simple "eval" function:

    eval (App e1 e2) env = (unF (eval e1 env)) (eval e2 env)
    eval (Lam x e) env = F (\v -> (eval e ((x,v):env)))
    eval (Num n) env = N n
    eval (Var x) env = case (lookup x env) of
      Just v -> v
      Nothing -> error ("Unbound variable " ++ x)

That's it. A few boring auxiliary definitions follow.

    data Val = F (Val -> Val) | N Int
    unF (F x) = x
    instance Show Val where
      show (F _) = "<procedure>"
      show (N n) = show n

In other words, if you copy these three blocks of code into the Haskell source file, you will have a working interpreter that you can check with ghci as follows:

    *Main> eval (App (Lam "x" (Var "x")) (Num 1)) []
    1
    *Main> eval (Var "x") []
    *** Exception: Unbound variable x
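The `eval` above reports an unbound variable with `error`, and `unF` fails with a pattern-match error if a number is applied as a function, for example `App (Num 1) (Num 2)`. As a sketch of one possible variation (not part of the answer's code), the same interpreter can return `Either String Val` so that both failures become ordinary values. The `Env` type synonym and the "Not a procedure" message are assumptions added here.

```haskell
data Exp = Lam String Exp | App Exp Exp | Var String | Num Int

-- Procedures may now fail, so they return Either as well.
data Val = F (Val -> Either String Val) | N Int

instance Show Val where
  show (F _) = "<procedure>"
  show (N n) = show n

-- Environment as an association list, as in the original eval.
type Env = [(String, Val)]

eval :: Exp -> Env -> Either String Val
eval (Num n)   _   = Right (N n)
eval (Var x)   env = maybe (Left ("Unbound variable " ++ x)) Right (lookup x env)
eval (Lam x e) env = Right (F (\v -> eval e ((x, v) : env)))
eval (App e1 e2) env = do
  f <- eval e1 env
  a <- eval e2 env
  case f of
    F g -> g a
    N n -> Left ("Not a procedure: " ++ show n)
```

With this version, `eval (Var "x") []` returns `Left "Unbound variable x"` instead of throwing. One design consequence: the argument is now evaluated before the call, so an error in an unused argument is reported, whereas the original version would lazily skip it.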

You can read about creating languages in the classics SICP or EOPL, or a small book. If you want to create a typed language, you may need to read something else as well.

However, if you intend to create languages, I highly recommend reading a lot first. Firstly, it is very rewarding. And secondly, too many disgusting languages have been inflicted on the world by people who do not know how to create languages (many of which have become very popular for various historical reasons), and we are stuck with them.

+3

I would say that the hard part (and actually the most fun part) starts after you have the AST.

Take a look at LLVM; it has bindings for a large number of languages (I have used only C++ and Haskell, so I can't speak for the others), and it should help you write a just-in-time compiler for your language. In fact, LLVM makes writing a compiler easier than writing an interpreter!

0