What really makes this problem hard is the complexity of C++.
Consider what is callable in C++: functions, lambdas, the function call operator, member functions, function templates, and member function templates. So even if you only want to match call expressions, you need to disambiguate all of these cases.
In addition, libclang does not offer a perfect view of the clang AST (some nodes are not fully exposed, particularly some nodes related to templates). Consequently, it is possible (even likely) that an arbitrary piece of code will contain some construct where libclang's view of the AST is insufficient to associate the call expression with a declaration.
However, if you are willing to restrict yourself to a subset of the language, it may be possible to make some headway. For example, the following sample tries to associate call sites with function declarations. It does this by making a single pass over all the nodes in the AST, matching function declarations with call expressions.
    from clang.cindex import CursorKind, Index


    def is_function_call(funcdecl, c):
        """Determine whether a call-expression cursor refers to a particular
        function declaration."""
        defn = c.get_definition()
        return (defn is not None) and (defn == funcdecl)


    def fully_qualified(c):
        """Retrieve a fully qualified function name (with namespaces)."""
        res = c.spelling
        c = c.semantic_parent
        # Walk up the enclosing scopes until the translation unit is reached.
        while c.kind != CursorKind.TRANSLATION_UNIT:
            res = c.spelling + '::' + res
            c = c.semantic_parent
        return res


    def find_funcs_and_calls(tu):
        """Retrieve lists of function declarations and call expressions in a
        translation unit."""
        filename = tu.cursor.spelling
        calls = []
        funcs = []
        for c in tu.cursor.walk_preorder():
            if c.location.file is None:
                pass  # no source file for this cursor
            elif c.location.file.name != filename:
                pass  # comes from an included header
            elif c.kind == CursorKind.CALL_EXPR:
                calls.append(c)
            elif c.kind == CursorKind.FUNCTION_DECL:
                funcs.append(c)
        return funcs, calls


    idx = Index.create()
    args = '-x c++ --std=c++11'.split()
    tu = idx.parse('tmp.cpp', args=args)
    funcs, calls = find_funcs_and_calls(tu)
    for f in funcs:
        print(fully_qualified(f))
        for c in calls:
            if is_function_call(f, c):
                # Print the call site's location, as in the output below.
                print('-', c.location)
        print()
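One limitation of the `get_definition()` lookup used here is that it yields `None` when the callee is only declared, not defined, in the translation unit. A minimal sketch of a fallback follows, assuming the `referenced` attribute that the Python bindings expose on cursors. `resolve_callee` is a hypothetical helper name, and the stand-in objects at the bottom exist only so the snippet runs without libclang installed.

```python
from types import SimpleNamespace


def resolve_callee(call):
    """Return the cursor a call expression points at, or None.

    Prefers the definition; falls back to the referenced declaration.
    """
    defn = call.get_definition()
    if defn is not None:
        return defn
    # Assumption: the cursor exposes `referenced`, as clang.cindex cursors do.
    return getattr(call, "referenced", None)


# Stand-in objects (not real libclang cursors), for illustration only.
decl = SimpleNamespace(spelling="puts")
call_declared_only = SimpleNamespace(get_definition=lambda: None, referenced=decl)
call_unresolved = SimpleNamespace(get_definition=lambda: None, referenced=None)

print(resolve_callee(call_declared_only).spelling)  # puts
print(resolve_callee(call_unresolved))              # None
```

If you compare the result against a `FUNCTION_DECL` cursor, the referenced declaration may be a different redeclaration of the same function, so comparing USRs or canonical cursors is probably safer than comparing the cursors directly.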
To show how well this works, you need a slightly more challenging example to parse:
// tmp.cpp
And I get the output:
    impl::addition
    - <SourceLocation file 'tmp.cpp', line 10, column 9>

    impl::f

    addition
    - <SourceLocation file 'tmp.cpp', line 22, column 7>
    - <SourceLocation file 'tmp.cpp', line 23, column 7>

    main
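The script checks every call against every function, which is fine for a small file. For larger inputs, one possible alternative is to group the calls by their target in a single pass. This is only a sketch: it keys on `get_usr()` (the Unified Symbol Resolution string of a cursor), `group_calls_by_target` and `fake_call` are hypothetical names, and the fake cursors are stand-ins so the snippet runs without libclang.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_calls_by_target(calls):
    """Map the USR of each resolved callee to the calls that refer to it."""
    by_usr = defaultdict(list)
    for call in calls:
        target = call.get_definition()
        if target is None:
            continue  # callee could not be resolved; skip it
        by_usr[target.get_usr()].append(call)
    return dict(by_usr)


def fake_call(line, usr):
    """Build a stand-in for a CALL_EXPR cursor (illustration only)."""
    target = None if usr is None else SimpleNamespace(get_usr=lambda: usr)
    return SimpleNamespace(line=line, get_definition=lambda: target)


calls = [fake_call(10, "usr-a"), fake_call(22, "usr-b"),
         fake_call(23, "usr-b"), fake_call(30, None)]
grouped = group_calls_by_target(calls)
print({usr: [c.line for c in cs] for usr, cs in grouped.items()})
# {'usr-a': [10], 'usr-b': [22, 23]}
```

With real cursors, the call sites for a function `f` would then be looked up with `grouped.get(f.get_usr(), [])`.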
Extending this to consider more kinds of declarations would (IMO) be non-trivial, and an interesting project in its own right.
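As a starting point for that extension, the declaration filter could accept a set of cursor kinds instead of only `FUNCTION_DECL`. The sketch below matches on the kind's `name` so that it runs without libclang. The set of kinds is an assumption about what you might want to include, and `is_callable_decl` and `fake_cursor` are hypothetical names. Lambdas and call operators are not covered and would need separate handling.

```python
from types import SimpleNamespace

# Names of CursorKind values that declare something callable.
# Assumption: this selection is illustrative, not exhaustive.
CALLABLE_DECL_KINDS = frozenset({
    "FUNCTION_DECL",
    "CXX_METHOD",
    "CONSTRUCTOR",
    "DESTRUCTOR",
    "CONVERSION_FUNCTION",
    "FUNCTION_TEMPLATE",
})


def is_callable_decl(cursor):
    """Return True if the cursor's kind is one of the callable declarations."""
    return cursor.kind.name in CALLABLE_DECL_KINDS


def fake_cursor(kind_name):
    """Build a stand-in for a libclang cursor (illustration only)."""
    return SimpleNamespace(kind=SimpleNamespace(name=kind_name))


print(is_callable_decl(fake_cursor("CXX_METHOD")))  # True
print(is_callable_decl(fake_cursor("VAR_DECL")))    # False
```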
Addressing comments
Given that there have been some questions about whether the code in this answer produces the results I've shown, I've added a gist of the code (which reproduces the content of this answer) and a very minimal vagrant machine image that you can use to experiment. Once the machine has booted, you can clone the gist and reproduce the answer with the commands:
    git clone https://gist.github.com/AndrewWalker/daa2af23f34fe9a6acc2de579ec45535 find-func-decl-refs
    cd find-func-decl-refs
    export LD_LIBRARY_PATH=/usr/lib/llvm-3.8/lib/ && python3 main.py