How do we account for exceptions in static code analysis?

I wrote a utility for creating CFG (Control Flow Graph) for a java method, the nodes of which are base blocks instead of instructions.

I could not consider exceptions which should be edges in CFG. The reasons are as follows:

  • Each command in a try block can potentially throw exceptions / errors that can be handled by any of the nest try-catch blocks. If we consider exceptions as edges, the number of paths to the process increases sharply, as well as the number of nodes in the CFG.
  • We need to know the inheritance hierarchy for exceptions before we can decide which jumps are possible.

How do static code analyzers solve this problem?

I am stuck at this point. If I need to continue, how do I do this?

Edit: in my case, I can limit support to those use cases that can indicate where and which exceptions are thrown. This solved my second problem. I would still like to know how general static code analyzers do this.

+7
java decompiler
source share
1 answer

This is how I dealt with the problem in the Krakatau decompiler :

We need to know the inheritance hierarchy for exceptions before we can decide which jumps are possible.

Krakatau requires that the class definitions of any reference classes be available, so he knows the inheritance hierarchy. However, if I did, I would not. The requirement to define classes makes the decompiler difficult for users, since finding and adding dependencies is a huge pain. In fact, you do not need this if you are okay with an analysis that will be a little less accurate. Instead, you can simply assume that all exceptions can reach all handlers. In practice, I expect this to produce almost the same results.

Each command in a try block can potentially throw exceptions / errors that can be handled by any of the nest catch-catch blocks. If we consider exceptions as edges, the number of paths to process increases dramatically, as well as the number of nodes in the CFG.

Krakatau includes exceptions as edges in the CFG, which leads to identified problems. To reduce the number of edges, I pretended that only certain commands can be executed (method calls, array calls, division, etc.). This is not technically correct, but it is suitable for real-world code. I have never seen anything that really cares about exceptions arising from a link error, Thread.Stop or the like. I later added an option to disable this behavior.

In any case, this was good enough for most codes, but sometimes it caused performance problems. In particular, really large methods with a lot of field calls or method calls will result in huge CFGs that made decompilation very slow. I tried some tricks to optimize this, but in the end the solution was to switch from basic blocks to extended main blocks.

Extended base blocks are similar to base blocks, except that the edges of the exceptions are semi-implicit, resulting in significantly less CFG. An EBB consists of a straight line code with no entry points or exit points in the middle, except for the edges of the exception, and where each command in the block is covered by the same set of exception handlers. Thus, instead of having an exception margin for each command, you have one for each block, which makes things much more efficient.

Even Java methods with thousands of method calls usually have only a few tries / catches and therefore only a few EBBs.

+6
source share

All Articles