I can't say how this should be done, but I did it in F# for a C# compiler here
The approach was basically this: build the AST from the source, leaving things like type information unconstrained. So AST.fs is basically an AST that contains type names, function names, etc., rather than resolved types.
As the AST starts to be compiled to (in this case) .NET IL, we end up with more type information (we create the types declared in the source; let's call these type stubs). This then gives us the information needed to create method stubs (the code can have signatures that include type stubs as well as built-in types). From here we have enough type information to resolve any type names or method signatures in the code.
I store the result in the file TypedAST.fs. I do this in a single pass, though that approach may be naive.
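To make the two phases concrete, here is a minimal sketch of the idea. It is not the code from AST.fs or TypedAST.fs: every name in it (`Expr`, `TypedExpr`, `ResolvedType`, `MethodStubs`, `resolve`) is hypothetical, and method lookup ignores overloads and argument types to keep it short.

```fsharp
// Hypothetical sketch, not the actual contents of AST.fs / TypedAST.fs.

// Phase 1: what the parser produces. Types are only names at this point.
type TypeName = string

type Expr =
    | IntLit of int
    | StringLit of string
    | New of TypeName                            // e.g. `new Foo()`
    | MethodInvoke of Expr * string * Expr list  // target, method name, arguments

// Phase 2: the same shapes, but every node knows its resolved type.
type ResolvedType =
    | BuiltIn of string    // e.g. "int", "string"
    | TypeStub of string   // a type declared in the source being compiled

type TypedExpr =
    | TIntLit of int
    | TStringLit of string
    | TNew of ResolvedType
    | TMethodInvoke of TypedExpr * string * TypedExpr list * ResolvedType

// Method stubs: (declaring type, method name) -> return type.
// Assumption: no overloading, so argument types are not part of the key.
type MethodStubs = Map<ResolvedType * string, ResolvedType>

let typeOf expr =
    match expr with
    | TIntLit _ -> BuiltIn "int"
    | TStringLit _ -> BuiltIn "string"
    | TNew t -> t
    | TMethodInvoke (_, _, _, t) -> t

// Resolve names against the type stubs and method stubs collected earlier.
let rec resolve (typeStubs: Set<string>) (methods: MethodStubs) expr =
    match expr with
    | IntLit i -> TIntLit i
    | StringLit s -> TStringLit s
    | New name ->
        if Set.contains name typeStubs then TNew (TypeStub name)
        else failwithf "unknown type %s" name
    | MethodInvoke (target, name, args) ->
        let target' = resolve typeStubs methods target
        let args' = List.map (resolve typeStubs methods) args
        match Map.tryFind (typeOf target', name) methods with
        | Some ret -> TMethodInvoke (target', name, args', ret)
        | None -> failwithf "unknown method %s" name

// Usage: `new Foo().Bar(1)` where the source declares `class Foo { int Bar(int x) ... }`.
let typeStubs = set [ "Foo" ]
let methods : MethodStubs = Map.ofList [ (TypeStub "Foo", "Bar"), BuiltIn "int" ]
let typed = resolve typeStubs methods (MethodInvoke (New "Foo", "Bar", [ IntLit 1 ]))
```

The cost of this design is visible here: `Expr` and `TypedExpr` are near duplicates, which is the mirroring the question asks about.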
Now that we have a fully typed AST, you can compile it, fully analyse it, or do whatever else you like with it.
So, in response to the question "Does this mean that I need to define two types for the AST node, one for the syntax phase and one for the semantic phase?", I can't say definitively that this is so, but it is certainly what I did, and it looks like what MS did with Roslyn (although they essentially decorated the original tree with type info, IIRC).
"Are there purely functional programming tricks that help the compiler author with this problem?" Given that the two ASTs mostly mirror each other in my case, one could make the AST generic over the type information and transform the tree, but the code may end up (more) horrendous.
i.e.
    type 'type AST =
        | MethodInvoke of 'type * Name * 'type list
        | ....
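As a rough illustration of that generic approach, the sketch below uses a single tree shape parameterised by the type annotation, plus a `mapType` function that turns an AST carrying type names into one carrying resolved types. The names `Literal`, `Resolved`, `mapType` and `resolveName` are hypothetical, and the arguments are modelled as sub-trees (rather than a list of types) so that the transform has something to recurse into.

```fsharp
// Hypothetical sketch of a generic AST; not taken from the actual compiler.
type Name = string

// 'ty is whatever we know about types in the current phase:
// a plain string after parsing, a resolved type after analysis.
type AST<'ty> =
    | Literal of int * 'ty
    | MethodInvoke of 'ty * Name * AST<'ty> list

// Rebuild the tree, replacing every type annotation using f.
let rec mapType (f: 'a -> 'b) (ast: AST<'a>) : AST<'b> =
    match ast with
    | Literal (value, ty) -> Literal (value, f ty)
    | MethodInvoke (ty, name, args) ->
        MethodInvoke (f ty, name, List.map (mapType f) args)

// A toy resolved-type representation and resolver, for illustration only.
type Resolved =
    | BuiltIn of string
    | Stub of string

let resolveName (n: string) : Resolved =
    if n = "int" || n = "string" then BuiltIn n else Stub n

// Usage: the syntax-phase tree holds names, the semantic-phase tree holds Resolved.
let untyped : AST<string> = MethodInvoke ("Foo", "Bar", [ Literal (1, "int") ])
let typedTree : AST<Resolved> = mapType resolveName untyped
```

Only one tree type is needed this way, but every function over the tree now has to carry the type parameter around, which is where the code can start to get unwieldy.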
neil danson