Libclang: missing some statements in the AST?

I wrote a test program (parse_ast.c) that parses a C source file (tt.c) to see how libclang works. Its output is the hierarchical structure of the AST.

Here is the test file:

    /* tt.c */                                    // line 1
    #include <unistd.h>
    #include <stdio.h>

    typedef ssize_t (*write_fn_t)(int, const void *, size_t);

    void indirect_write(write_fn_t write_fn) {    // line 7
        (*write_fn)(1, "indirect call\n", 14);
    }

    void direct_write() {                         // line 11
        write(1, "direct call\n", 12);            // line 12 missing in the AST?
    }

    int main() {                                  // line 15
        direct_write();
        indirect_write(write);                    // line 17 missing in the AST?

        return 0;
    }

The result shows the following:

    ... ...
    inclusion directive at tt.c (2, 1) to (2, 20)
    inclusion directive at tt.c (3, 1) to (3, 19)
    TypedefDecl at tt.c (5, 1) to (5, 57)
      TypeRef at tt.c (5, 9) to (5, 16)
      ParmDecl at tt.c (5, 31) to (5, 35)
      ParmDecl at tt.c (5, 36) to (5, 49)
      ParmDecl at tt.c (5, 50) to (5, 56)
    FunctionDecl at tt.c (7, 1) to (9, 2)
      ParmDecl at tt.c (7, 21) to (7, 40)
        TypeRef at tt.c (7, 21) to (7, 31)
      CompoundStmt at tt.c (7, 42) to (9, 2)
        CallExpr at tt.c (8, 5) to (8, 42)
          UnexposedExpr at tt.c (8, 5) to (8, 16)
            ParenExpr at tt.c (8, 5) to (8, 16)
              UnaryOperator at tt.c (8, 6) to (8, 15)
                UnexposedExpr at tt.c (8, 7) to (8, 15)
                  DeclRefExpr at tt.c (8, 7) to (8, 15)
          IntegerLiteral at tt.c (8, 17) to (8, 18)
          UnexposedExpr at tt.c (8, 20) to (8, 37)
            UnexposedExpr at tt.c (8, 20) to (8, 37)
              StringLiteral at tt.c (8, 20) to (8, 37)
          IntegerLiteral at tt.c (8, 39) to (8, 41)
    FunctionDecl at tt.c (11, 1) to (13, 2)
      CompoundStmt at tt.c (11, 21) to (13, 2)
        <- XXX no line 12?
    FunctionDecl at tt.c (15, 1) to (20, 2)
      CompoundStmt at tt.c (15, 12) to (20, 2)
        CallExpr at tt.c (16, 5) to (16, 19)
          UnexposedExpr at tt.c (16, 5) to (16, 17)
            DeclRefExpr at tt.c (16, 5) to (16, 17)
        <- XXX no line 17?
        ReturnStmt at tt.c (19, 5) to (19, 13)
          IntegerLiteral at tt.c (19, 12) to (19, 13)

We can see the three functions (indirect_write on line 7, direct_write on line 11, main on line 15), and most of the statements and expressions can be found in the AST, but I can't find anything that represents the statements on line 12 and line 17. Does anyone know the reason?

I am compiling on Debian 2.6.32, and I checked with both clang 3.1 and 3.2 (built from source).

Here is the parse_ast.c program:

    #include <stddef.h>
    #include <stdio.h>
    #include <clang-c/Index.h>

    enum CXChildVisitResult
    visit_fn(CXCursor cr, CXCursor parent, CXClientData client_data)
    {
        unsigned depth;
        unsigned line, column, offset;
        enum CXCursorKind kind;
        CXSourceRange extent;
        CXSourceLocation start, end;
        CXString kind_spelling, filename;
        CXFile file;

        depth = (unsigned)client_data;

        // print cursor kind
        kind = clang_getCursorKind(cr);
        kind_spelling = clang_getCursorKindSpelling(kind);
        fprintf(stdout, "%*s%s at", depth, " ", clang_getCString(kind_spelling));
        clang_disposeString(kind_spelling);

        // get extent
        extent = clang_getCursorExtent(cr);
        start = clang_getRangeStart(extent);
        end = clang_getRangeEnd(extent);

        // print start position
        clang_getExpansionLocation(start, &file, &line, &column, &offset);
        filename = clang_getFileName(file);
        fprintf(stdout, " %s (%u, %u) to", clang_getCString(filename), line, column);
        clang_disposeString(filename);

        // print end position
        clang_getExpansionLocation(end, &file, &line, &column, &offset);
        fprintf(stdout, " (%u, %u)\n", line, column);

        // recursive
        clang_visitChildren(cr, visit_fn, (CXClientData)(depth + 1));

        return CXChildVisit_Continue;
    }

    int main(int argc, const char * const *argv)
    {
        CXIndex Index = clang_createIndex(0, 0);
        CXTranslationUnit TU = clang_parseTranslationUnit(Index, NULL, argv, argc,
                                   0, 0, CXTranslationUnit_DetailedPreprocessingRecord);

        clang_visitChildren(clang_getTranslationUnitCursor(TU), visit_fn, 0);

        clang_disposeTranslationUnit(TU);
        clang_disposeIndex(Index);
        return 0;
    }
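A side note on the visitor: it carries the recursion depth through the opaque client-data pointer with a plain `(unsigned)client_data` cast, and `"%*s"` with a one-space string always prints at least one space, so depths 0 and 1 look the same. The following is only a standalone sketch of that pattern without libclang. The helper names (`depth_to_client_data`, `client_data_to_depth`, `format_indented`) are hypothetical and not part of any library; it assumes `uintptr_t` is available and goes through it so the round trip is well defined on 64-bit targets.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: pack a small nesting depth into a pointer-sized
   value, as one might for a callback's client-data argument. */
static void *depth_to_client_data(unsigned depth)
{
    return (void *)(uintptr_t)depth;
}

/* Hypothetical helper: recover the depth; going through uintptr_t avoids
   a direct pointer-to-unsigned cast. */
static unsigned client_data_to_depth(void *client_data)
{
    return (unsigned)(uintptr_t)client_data;
}

/* Hypothetical helper: write two spaces of indentation per level, then
   the label. "%*s" with an empty string pads to exactly the requested
   width, so depth 0 produces no leading space. */
static int format_indented(char *buf, size_t size, unsigned depth,
                           const char *label)
{
    return snprintf(buf, size, "%*s%s", (int)(2 * depth), "", label);
}
```

In a visitor like the one above, the recursive call would pass `depth_to_client_data(depth + 1)` and the callback would start with `client_data_to_depth(client_data)`.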

Update:

The problem was caused by a missing stddef.h header file. It was answered on the libclang mailing list: http://clang-developers.42468.n3.nabble.com/libclang-missing-some-statements-in-the-AST-td4029641.html
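One way to make such a header visible to the parser is to add an include directory to the command-line arguments given to clang_parseTranslationUnit(). The sketch below only builds that argument vector, so it compiles without libclang. The helper name `append_include_dir` is hypothetical, and which directory actually holds stddef.h on a given installation is an assumption to verify (for clang it is typically a `lib/clang/<version>/include` directory under the install prefix).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated argument vector holding
   the argc original arguments followed by "-I<include_dir>".
   The caller frees the last element and then the array itself.
   Returns NULL if an allocation fails. */
static const char **append_include_dir(int argc, const char *const *argv,
                                       const char *include_dir, int *out_argc)
{
    size_t flag_len = strlen("-I") + strlen(include_dir) + 1;
    const char **args = malloc(((size_t)argc + 1) * sizeof *args);
    char *flag = malloc(flag_len);

    if (args == NULL || flag == NULL) {
        free(args);
        free(flag);
        return NULL;
    }

    snprintf(flag, flag_len, "-I%s", include_dir);

    /* Copy the original arguments unchanged, then add the new flag. */
    for (int i = 0; i < argc; ++i)
        args[i] = argv[i];
    args[argc] = flag;

    *out_argc = argc + 1;
    return args;
}
```

The returned array and count would take the place of `argv` and `argc` in the clang_parseTranslationUnit() call. Checking the diagnostics afterwards should confirm whether the header is now found.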

2 answers

Check the diagnostics generated by clang_parseTranslationUnit(). Even if errors occur, an AST is still generated, but it is certainly not guaranteed to be meaningful.

I found that commenting out the #include lines resulted in compilation errors, but an AST was still created, and it looked like yours (in particular, line 17 was missing).

Replacing the #include lines with typedefs for size_t and ssize_t (as int) led to a compilation warning about the implicit declaration of write(), but the resulting AST did contain line 17.

Therefore, I assume there is a problem with your header files that the diagnostics should identify. The diagnostics can be retrieved like this, for example:

    for (unsigned I = 0, N = clang_getNumDiagnostics(TU); I != N; ++I) {
        CXDiagnostic Diag = clang_getDiagnostic(TU, I);
        CXString String = clang_formatDiagnostic(Diag,
                              clang_defaultDiagnosticDisplayOptions());
        fprintf(stderr, "%s\n", clang_getCString(String));
        clang_disposeString(String);
    }

I use libclang to parse and optimize C code, but when I parse my source files with your code, I cannot see any CXCursor_BinaryOperator inside the CompoundStmt.

For instance:

    void OCTS_C_TimerMiliseconds_reset_Timers(
      OCTS_outC_C_TimerMiliseconds_Timers *outC)
    {
      outC->init = kcg_true;
      /* 1 */ OCTS_Sign_INT_reset_Math(&outC->_1_Context_1);
      /* 2 */ OCTS_Sign_INT_reset_Math(&outC->Context_2);
      /* 1 */ OCTS_FallingEdge_reset_Edge(&outC->Context_1);
    }

Result:

    FunctionDecl at s.cpp (10, 1) to (17, 2) OCTS_C_TimerMiliseconds_reset_Timers
      ParmDecl at s.cpp (11, 3) to (11, 44) outC
        TypeRef at s.cpp (11, 3) to (11, 38) OCTS_outC_C_TimerMiliseconds_Timers
      CompoundStmt at s.cpp (12, 1) to (17, 2)