Abstract syntax tree for a subset of C

For training purposes, we are building a step-by-step JavaScript interpreter for a subset of C.

Basically we support int, float, ..., arrays, functions, for, while, ..., but no pointers. The JavaScript interpreter runs the program and lets us explain how a logical expression is evaluated, show the stack of variables, and so on.

Currently, we manually convert our C examples into JavaScript that runs and builds a stack of actions (assignment, function call, ...), which can later be replayed step by step. Since we are limited to a subset of C, this is fairly easy to do.
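To make the idea concrete, here is a minimal sketch of what such hand-converted JavaScript could look like. The `createRecorder` helper, its `assign`/`read` methods, and the shape of each recorded action are hypothetical names chosen for illustration, not our actual code.

```javascript
// Hypothetical recorder: all names and the action shape are illustrative only.
function createRecorder() {
  const actions = []; // ordered log of everything the program did
  const vars = {};    // current values of the C variables
  return {
    actions,
    // Record an assignment together with a snapshot of all variables after it.
    assign(name, value) {
      vars[name] = value;
      actions.push({ kind: "assign", name, value, snapshot: { ...vars } });
      return value;
    },
    // Read the current value of a variable.
    read(name) {
      return vars[name];
    },
  };
}

// Hand-converted form of the C fragment:
//   int a = 2;
//   int b = a * 3;
const rec = createRecorder();
rec.assign("a", 2);
rec.assign("b", rec.read("a") * 3);

// Replaying the log gives the step-by-step view.
for (const step of rec.actions) {
  console.log(step.kind, step.name, "=", step.value, JSON.stringify(step.snapshot));
}
```

Each entry carries its own snapshot, so a viewer can move forwards or backwards through the log without re-running the program.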

Now we would like to compile C code into our JavaScript representation. All we need is an abstract syntax tree of the C code; generating the JavaScript from it is simple.
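As a rough sketch of that generation step, assume the parser hands back a tree of plain objects. The node types used here (`Num`, `Id`, `BinaryOp`, `Decl`, `Assign`, `While`) are made up for the example; a real parser will have its own node names and fields.

```javascript
// Hypothetical AST node shapes; a real C parser will use different names.
function gen(node) {
  switch (node.type) {
    case "Num":
      return String(node.value);
    case "Id":
      return node.name;
    case "BinaryOp":
      // Parenthesize every operation so precedence is always preserved.
      return `(${gen(node.left)} ${node.op} ${gen(node.right)})`;
    case "Decl":
      return `let ${node.name} = ${gen(node.init)};`;
    case "Assign":
      return `${node.name} = ${gen(node.value)};`;
    case "While":
      return `while (${gen(node.cond)}) { ${node.body.map(gen).join(" ")} }`;
    default:
      throw new Error(`unsupported node type: ${node.type}`);
  }
}

// Hand-built AST for the C fragment:
//   int i = 0;
//   while (i < 3) i = i + 1;
const id = (name) => ({ type: "Id", name });
const num = (value) => ({ type: "Num", value });
const program = [
  { type: "Decl", name: "i", init: num(0) },
  {
    type: "While",
    cond: { type: "BinaryOp", op: "<", left: id("i"), right: num(3) },
    body: [
      {
        type: "Assign",
        name: "i",
        value: { type: "BinaryOp", op: "+", left: id("i"), right: num(1) },
      },
    ],
  },
];

const js = program.map(gen).join("\n");
console.log(js);
```

This sketch emits plain JavaScript only. To feed the step-by-step view, each case would emit calls into the action recorder instead, and C semantics that differ from JavaScript (for example, truncating `int` division) would need explicit handling.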

Do you know a good C parser that can generate such a tree? It does not need to be written in JavaScript (though that would be great); any language is fine, since the conversion can be done offline.

I looked at Emscripten (https://github.com/kripken/emscripten), but it is a full C => JavaScript compiler, which is not what we want.

2 answers

I recently used Eli Bendersky's pycparser to work with ASTs built from C code. I think it will work well for your purposes.


I think ANTLR has a full C parser.

To complete the translation task, I suspect you will need full symbol table support; you need to know what each symbol means. Most parsers will not help you here, because they do not build a complete symbol table. I think ANTLR does not, but I could be wrong.
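To illustrate what a symbol table gives you, here is a minimal JavaScript sketch with nested scopes. The `SymbolTable` class and the fields stored per symbol are hypothetical and only for illustration; this is not the API of ANTLR, DMS, or any other tool.

```javascript
// Hypothetical symbol table: one instance per scope, chained to its parent.
class SymbolTable {
  constructor(parent = null) {
    this.parent = parent;
    this.symbols = new Map();
  }

  // Declare a name in this scope; C forbids redeclaring it in the same scope.
  declare(name, info) {
    if (this.symbols.has(name)) {
      throw new Error(`redeclaration of ${name}`);
    }
    this.symbols.set(name, info);
  }

  // Resolve a name by searching this scope, then each enclosing scope.
  lookup(name) {
    for (let scope = this; scope !== null; scope = scope.parent) {
      if (scope.symbols.has(name)) return scope.symbols.get(name);
    }
    return undefined;
  }
}

// Scopes for the C fragment:
//   int x;
//   float f(int n) { float x; ... }
const globals = new SymbolTable();
globals.declare("x", { kind: "variable", ctype: "int" });
globals.declare("f", { kind: "function", ctype: "float", params: ["int"] });

const body = new SymbolTable(globals);
body.declare("n", { kind: "parameter", ctype: "int" });
body.declare("x", { kind: "variable", ctype: "float" });

console.log(body.lookup("x").ctype);    // inner x shadows the global one
console.log(globals.lookup("x").ctype); // global x is unaffected
console.log(body.lookup("f").kind);     // found in the enclosing scope
```

With this information the translator can tell, for any use of `x`, which declaration it refers to and what its type is, which a bare syntax tree does not provide.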

Our DMS Software Reengineering Toolkit with its C Front End provides a full C parser and builds complete symbol tables. (You may not need it for your application, but it also includes a full C preprocessor.) It also provides control flow analysis, data flow analysis, points-to analysis, and call graph construction, which can be useful when translating C to whatever your target virtual machine is.
