Find all references to the declaration of a specific function in libclang (Python)

I am trying to find all references (row and column position) to the declaration of a specific function when parsing a C++ source file using libclang in Python.

For example:

    #include <iostream>
    using namespace std;

    int addition (int a, int b)
    {
      int r;
      r=a+b;
      return r;
    }

    int main ()
    {
      int z, q;
      z = addition (5,3);
      q = addition (5,5);
      cout << "The first result is " << z;
      cout << "The second result is " << q;
    }

So, for the source file above, given the declaration of the function addition on line 5, I would like find_all_function_decl_references (see below) to return the references to addition on lines 15 and 16.

I tried this (adapted from here):

    import clang.cindex
    import ccsyspath

    index = clang.cindex.Index.create()
    translation_unit = index.parse(filename, args=args)

    for node in translation_unit.cursor.walk_preorder():
        node_definition = node.get_definition()

        if node.location.file is None:
            continue
        if node.location.file.name != sourcefile:
            continue
        if node_def is None:
            pass
        if node.kind.name == 'FUNCTION_DECL':
            if node.kind.is_reference():
                find_all_function_decl_references(node_definition.displayname)  # TODO

Another approach would be to collect all the function declarations found into a list and run the find_all_function_decl_references method on each of them.

Does anyone have an idea of how to approach this? What would this find_all_function_decl_references method look like? (I am very new to libclang and Python.)

I saw this, where the function find_typerefs finds all references to some type, but I'm not sure how to adapt it to my needs.

Ideally, I would like to get all references for any declaration; not only functions, but also variable declarations, parameter declarations (for example, a and b in the example above on line 7), class declarations, etc.

EDIT: Following Andrew's comments, here are some details about my setup:

  • LLVM 3.8.0-win64
  • libclang-py3 3.8.1
  • Python 3.5.1 (on Windows; I assume CPython)
  • For args, I tried both those suggested in the answer here and those from another answer.

* Please note that, given my limited programming experience, I would appreciate an answer with a brief explanation of how it works.

1 answer

What really makes this problem challenging is the complexity of C++.

Consider what is callable in C++: functions, lambdas, the function call operator, member functions, template functions, and member function templates. So even if you only want to match call expressions, you need to be able to disambiguate these cases.

In addition, libclang does not offer a perfect view of the clang AST (some nodes are not fully exposed, particularly some nodes related to templates). Consequently, it is possible (even likely) that an arbitrary piece of code will contain some construct for which libclang's view of the AST is insufficient to associate the call expression with a declaration.
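As a rough illustration of the two points above, the sketch below labels call expressions by the kind of declaration they resolve to, and counts the ones that do not resolve at all. It only uses plain attribute access (referenced and kind.name, both of which clang.cindex cursors provide), so it does not import libclang itself. The names CALLEE_LABELS, classify_call and summarize_calls, and the label strings, are made up for this example. I would expect calls through operator() (including lambdas) to show up as CXX_METHOD, but treat that as an assumption to check against your own code.

```python
from collections import Counter

# CursorKind names that a resolved call expression may point at.
# The labels on the right are illustrative, not libclang terminology.
CALLEE_LABELS = {
    "FUNCTION_DECL": "free function",
    "CXX_METHOD": "member function",
    "FUNCTION_TEMPLATE": "function template",
    "CONSTRUCTOR": "constructor",
    "CONVERSION_FUNCTION": "conversion function",
}


def classify_call(call):
    """Label a CALL_EXPR-like cursor by the kind of declaration it resolves to."""
    target = call.referenced  # None when the callee could not be resolved
    if target is None:
        return "unresolved"
    kind_name = target.kind.name
    # Anything not in the table is reported under its raw kind name.
    return CALLEE_LABELS.get(kind_name, "other: " + kind_name)


def summarize_calls(calls):
    """Count how many calls fall under each label."""
    return Counter(classify_call(c) for c in calls)
```

With the real bindings you would pass in the CALL_EXPR cursors collected from tu.cursor.walk_preorder(). A large "unresolved" count is a hint that the code uses constructs libclang does not expose well.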

However, if you are willing to limit yourself to a subset of the language, you may be able to make some progress. For example, the following sample tries to associate call sites with function declarations. It does this by making a single pass over all the nodes in the AST, collecting function declarations and call expressions, and then matching the calls against the declarations.

    from clang.cindex import CursorKind, Index

    def is_function_call(funcdecl, c):
        """ Determine whether a call-expression cursor refers to a particular
        function declaration
        """
        defn = c.get_definition()
        return (defn is not None) and (defn == funcdecl)

    def fully_qualified(c):
        """ Retrieve a fully qualified function name (with namespaces)
        """
        res = c.spelling
        c = c.semantic_parent
        while c.kind != CursorKind.TRANSLATION_UNIT:
            res = c.spelling + '::' + res
            c = c.semantic_parent
        return res

    def find_funcs_and_calls(tu):
        """ Retrieve lists of function declarations and call expressions in a
        translation unit
        """
        filename = tu.cursor.spelling
        calls = []
        funcs = []
        for c in tu.cursor.walk_preorder():
            if c.location.file is None:
                pass
            elif c.location.file.name != filename:
                pass  # skip anything that comes from an included header
            elif c.kind == CursorKind.CALL_EXPR:
                calls.append(c)
            elif c.kind == CursorKind.FUNCTION_DECL:
                funcs.append(c)
        return funcs, calls

    idx = Index.create()
    args = '-x c++ --std=c++11'.split()
    tu = idx.parse('tmp.cpp', args=args)
    funcs, calls = find_funcs_and_calls(tu)
    for f in funcs:
        print(fully_qualified(f))
        for c in calls:
            if is_function_call(f, c):
                # print the call site's location, not the cursor object
                print('-', c.location)
        print()

To show how well this works, you need a slightly more challenging example to parse:

    // tmp.cpp
    #include <iostream>
    using namespace std;
    namespace impl {
        int addition(int x, int y) {
            return x + y;
        }

        void f() {
            addition(2, 3);
        }
    }

    int addition (int a, int b) {
      int r;
      r=a+b;
      return r;
    }

    int main () {
      int z, q;
      z = addition (5,3);
      q = addition (5,5);
      cout << "The first result is " << z;
      cout << "The second result is " << q;
    }

And I get the output:

    impl::addition
    - <SourceLocation file 'tmp.cpp', line 10, column 9>

    impl::f

    addition
    - <SourceLocation file 'tmp.cpp', line 22, column 7>
    - <SourceLocation file 'tmp.cpp', line 23, column 7>

    main

Scaling this up to consider more types of declarations would (IMO) be non-trivial, and an interesting project in its own right.
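One possible starting point for that, as a sketch rather than a complete solution: instead of matching call expressions against function declarations, ask every cursor in the file what it refers to (the referenced property of clang.cindex cursors) and group the positions by the USR of the target declaration. In principle that covers variables, parameters and types as well as functions, subject to the libclang limitations mentioned above. The function names index_references, find_all_references and demo are invented for this example. Only demo needs libclang, and it assumes the same kind of input file and compiler flags as above.

```python
from collections import defaultdict


def index_references(root, filename):
    """Map the USR of each referenced declaration to the sorted
    (line, column) positions in `filename` that refer to it."""
    refs = defaultdict(set)
    for cur in root.walk_preorder():
        loc = cur.location
        if loc.file is None or loc.file.name != filename:
            continue  # skip the root cursor, headers and builtins
        target = cur.referenced  # the entity this cursor refers to, if any
        if target is None or target == cur:
            continue  # a declaration refers to itself; not a use
        usr = target.get_usr()
        if usr:
            # A set, because nested cursors (for example a call and the
            # name inside it) can start at the same position.
            refs[usr].add((loc.line, loc.column))
    return {usr: sorted(positions) for usr, positions in refs.items()}


def find_all_references(root, filename, decl):
    """Positions in `filename` that refer to the declaration cursor `decl`."""
    return index_references(root, filename).get(decl.get_usr(), [])


def demo(path, clang_args=("-x", "c++", "-std=c++11")):
    """Hypothetical driver: requires the clang Python bindings and libclang."""
    from clang.cindex import CursorKind, Index  # imported lazily on purpose

    tu = Index.create().parse(path, args=list(clang_args))
    index = index_references(tu.cursor, tu.cursor.spelling)
    for cur in tu.cursor.walk_preorder():
        if cur.kind == CursorKind.FUNCTION_DECL and cur.get_usr() in index:
            print(cur.spelling, cur.location.line, index[cur.get_usr()])
```

Calling demo('tmp.cpp') prints each function declaration that is referenced from the file, followed by the positions of those references. Dropping the FUNCTION_DECL test in demo is the obvious way to extend it to other declaration kinds, though how well each kind resolves is something you would need to check case by case.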

Addressing comments

Given that there have been some questions about whether the code in this answer produces the results I've shown, I've added a gist of the code (which reproduces the contents of this question) and a very minimal Vagrant machine image that you can use to experiment. Once the machine has booted, you can clone the gist and reproduce the answer with the commands:

    git clone https://gist.github.com/AndrewWalker/daa2af23f34fe9a6acc2de579ec45535 find-func-decl-refs
    cd find-func-decl-refs
    export LD_LIBRARY_PATH=/usr/lib/llvm-3.8/lib/ && python3 main.py