Currently, we are still doing mostly the same thing that Eric described. We prototyped some experiments, but found that the cost in API clarity was too high to pay. Instead, the biggest change we made was to reduce heap allocations and GC cost by making SyntaxToken a struct in the red model. This is a significant savings, since in average source files about half of the nodes in the syntax tree are terminals (SyntaxTokens). Green nodes are not structs, but on the other hand, since they do not contain parent pointers or positions, they are interned: we use the same green node instance for all identical nodes (same kind, trivia, and text).
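To make that concrete, here is a minimal sketch against the public Roslyn API (the Microsoft.CodeAnalysis.CSharp package); the class name and the sample source string are just illustrative:

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

class TokenDemo
{
    static void Main()
    {
        // SyntaxToken is a value type in the red model, so the roughly
        // half of all tree nodes that are terminals cost no heap allocation.
        Console.WriteLine(typeof(SyntaxToken).IsValueType); // True

        var tree = CSharpSyntaxTree.ParseText("class C { int x; }");
        foreach (var token in tree.GetRoot().DescendantTokens())
            Console.WriteLine($"{token.Kind(),-18} \"{token.Text}\" @ {token.SpanStart}");
    }
}
```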
I'm not sure what you mean by the "cost" of an edit. Time / space / complexity / etc.? In general, we have an incremental lexer that re-scans the region affected by the edit. Then we have an incremental parser that, at best, re-parses only the statement that intersects the newly lexed tokens, and then rebuilds the spine of the green tree back up to the root, while reusing the rest of the green nodes in the tree. By definition, none of the red nodes can be reused: there is a new root node for the new tree, and since red nodes carry parent pointers, the root is reachable from every other node.
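To illustrate, here is a sketch of an incremental reparse through the public API, assuming a Roslyn version recent enough to have SyntaxNode.IsIncrementallyIdentical (which compares the underlying green nodes); the class name and sample source string are illustrative:

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Text;

class IncrementalDemo
{
    static void Main()
    {
        var oldTree = CSharpSyntaxTree.ParseText(
            "class C { void M() { int x = 1; } void N() { } }");

        // Edit the literal inside M; WithChangedText triggers the
        // incremental lexer/parser instead of a full reparse.
        var oldText = oldTree.GetText();
        var editPos = oldText.ToString().IndexOf('1');
        var newText = oldText.WithChanges(
            new TextChange(new TextSpan(editPos, 1), "2"));
        var newTree = oldTree.WithChangedText(newText);

        var oldN = oldTree.GetRoot().DescendantNodes()
                          .OfType<MethodDeclarationSyntax>().Last();
        var newN = newTree.GetRoot().DescendantNodes()
                          .OfType<MethodDeclarationSyntax>().Last();

        // Red nodes are never shared across trees...
        Console.WriteLine(ReferenceEquals(oldN, newN));          // False
        // ...but N's green node, untouched by the edit, is typically reused.
        Console.WriteLine(oldN.IsIncrementallyIdentical(newN));  // True
    }
}
```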
I'm not aware of any other optimizations in the compiler itself. We may reuse the cached tree in the IDE after a cancellation, but I'm not sure about that.
Kevin Pilch