What you want is not "test coverage"; it is the transitive closure of "can call" from the root of the computation. (In threaded applications, you must also include "can fork".)
You want to designate a small set (possibly only one) of functions that make up the entry points of your application, and then trace all the possible callees (conditional or unconditional) of this small set. This is the set of functions that you must have.
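As a minimal sketch of that closure, assume the call relation is already available as a dictionary mapping each caller to the set of functions it might call. The function names and the graph below are hypothetical, chosen only for illustration; building the relation for a real program is the hard part and is not shown here.

```python
from collections import deque

def reachable(can_call, roots):
    """Return every function reachable from `roots` via the caller -> callees map."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        caller = queue.popleft()
        for callee in can_call.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

# Hypothetical call graph: caller -> functions it may call.
can_call = {
    "main": {"parse_args", "run"},
    "run": {"load", "report"},
    "report": {"format_row"},
    "legacy_export": {"format_row"},  # nothing reachable from main calls this
}

needed = reachable(can_call, {"main"})
unneeded = set(can_call) - needed
```

With `main` as the only root, `legacy_export` never enters `needed`, even though it calls a function that is needed.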
Python makes this very difficult in general (IIRC; I am not a deep Python expert) because of dynamic dispatch and especially because of "eval". Reasoning about which function can get called can be quite tricky for static analyzers applied to highly dynamic languages.
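To illustrate the difficulty, here is a small sketch of a deliberately naive static scan that collects only the names appearing directly in call expressions. The `Service` class and its methods are hypothetical. Real analyzers are more capable than this, but a method name that is assembled at run time defeats any purely syntactic approach.

```python
import ast

SOURCE = '''
class Service:
    def handle_start(self):
        return "started"

    def dispatch(self, command):
        # The callee's name is built at run time.
        return getattr(self, "handle_" + command)()
'''

def statically_named_callees(source):
    """Collect names that appear directly as the target of a call expression."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                names.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                names.add(node.func.attr)
    return names

static_callees = statically_named_callees(SOURCE)

# Run the same source to confirm that handle_start really is called.
namespace = {}
exec(SOURCE, namespace)
result = namespace["Service"]().dispatch("start")
```

The scan sees a call to `getattr` and nothing else, so `handle_start` looks dead to it even though `dispatch("start")` does execute it.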
You can use test coverage as a way to seed the "can call" relation with specific "did call" facts; that can catch a lot of dynamic dispatches (depending on the coverage of your test suite). Then the result you want is the transitive closure of the "can or did call" relation. This may still be erroneous, but is likely to be less so.
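One way to collect such observed-call facts, sketched here with the standard `sys.setprofile` hook rather than a coverage tool, is to record caller/callee pairs while the tests run and merge them with whatever a static pass produced. The `Service` class, `main`, and the `static_can_call` dictionary are hypothetical stand-ins, and keying on bare function names is a simplification that would conflate same-named functions in a real code base.

```python
import sys
from collections import defaultdict

def record_calls(entry, *args, **kwargs):
    """Run `entry` and return the observed caller -> callees pairs (by name)."""
    did_call = defaultdict(set)

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls only.
        if event == "call" and frame.f_back is not None:
            did_call[frame.f_back.f_code.co_name].add(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        entry(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return did_call

class Service:
    def handle_start(self):
        return "started"

    def dispatch(self, command):
        return getattr(self, "handle_" + command)()

def main():
    return Service().dispatch("start")

# Assumed output of a static pass that missed the dynamic dispatch.
static_can_call = {"main": {"dispatch"}}

did_call = record_calls(main)

# Union of the static edges and the observed edges.
can_or_did = defaultdict(set)
for graph in (static_can_call, did_call):
    for caller, callees in graph.items():
        can_or_did[caller] |= callees
```

The merged relation now contains the `dispatch` to `handle_start` edge that the static pass could not see. It is only as complete as the runs that were recorded.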
Once you get the set of "necessary" functions, the next problem will be removing the unnecessary functions from the source files that you have. If the number of files you start with is large, the manual effort to remove the dead stuff can be quite high. Even worse, you are likely to revise your application, and then the answer as to what to keep changes. Therefore, for each change (release) you need to reliably recompute this answer.
My company builds a tool that does this analysis for Java packages (with corresponding caveats regarding dynamic loads and reflection): the input is a set of Java files and (as mentioned above) a designated set of root functions. The tool computes the call graph, also finds all the dead member variables, and produces two outputs: a) a list of purportedly dead methods and members, and b) a revised set of files with all the "dead" material removed. If you believe a), then you use b). If you think that a) is wrong, then you add the elements listed in a) to the set of roots and repeat the analysis until you think that a) is correct. To do this, you need a static analysis tool that parses Java, computes the call graph, and then revises the code modules to remove the dead entries. The basic idea applies to any language.
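The review loop itself is language-independent and can be sketched in a few lines of Python. This is not the tool described above, only an illustration of the iteration; the call graph and function names are hypothetical, and the step that rewrites source files is left out.

```python
def reachable(can_call, roots):
    """Return every function reachable from `roots` via the caller -> callees map."""
    seen = set(roots)
    stack = list(roots)
    while stack:
        for callee in can_call.get(stack.pop(), ()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen

def dead_report(all_functions, can_call, roots):
    """Output a): the functions that appear dead for the given roots."""
    return sorted(set(all_functions) - reachable(can_call, roots))

# Hypothetical program: plugin_hook is only ever invoked reflectively.
all_functions = {"main", "run", "report",
                 "plugin_hook", "plugin_helper", "legacy_export"}
can_call = {
    "main": {"run"},
    "run": {"report"},
    "plugin_hook": {"plugin_helper"},
    "legacy_export": {"report"},
}

roots = {"main"}
first_report = dead_report(all_functions, can_call, roots)

# The reviewer knows plugin_hook is live, so it becomes a root and the
# analysis is repeated.
roots.add("plugin_hook")
second_report = dead_report(all_functions, can_call, roots)
```

Promoting the disputed entry to a root also rescues everything it calls, so the second report shrinks to the entries the reviewer is willing to accept as dead.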
You will need a similar tool for Python, I would expect.
Perhaps you can stick with simply deleting files that are completely unused, although this can still be a lot of work.
Ira Baxter