Tracking uninitialized static variables

I need to debug an ugly and huge C math library, probably generated by f2c at some point. The code abuses local static variables and, unfortunately, somewhere seems to rely on the fact that they are automatically initialized to 0. If its entry function is called twice with the same input, it gives different results. If I unload the library and load it again, it works correctly. It needs to be fast, so I would like to get rid of the loading and unloading.

My question is how to uncover these errors with valgrind or any other tool without having to go through all the code manually.

I am hunting for places where a local static variable is declared, then read first and only written later. The problem is further complicated by the fact that the static variables are sometimes passed around through pointers (yes, it is that ugly).

I understand that one can argue that bugs like this need not be detectable by an automatic tool, since in some scenarios this is exactly the intended behavior. Still, is there a way to make auto-initialized local static variables "dirty"?

+7
source share
5 answers

The devil is in the details, but this may work for you:

First get Frama-C. If you are using Unix, your distribution may have a package. The package will not be the latest version, but it may be good enough, and installing it this way will save you some time.

Say your example is as below, only so much larger that it is unclear what is wrong:

    int add(int x, int y)
    {
        static int state;
        int result = x + y + state; // I tested it once and it worked.
        state++;
        return result;
    }

Enter the command:

 frama-c -lib-entry -main add -deps ugly.c 

The options -lib-entry -main add mean "look at the function add". The option -deps computes functional dependencies. You will find these "functional dependencies" in the log:

    [from] Function add:
             state FROM state; (and default:false)
             \result FROM x; y; state; (and default:false)

The actual inputs that the results of add depend on, and the actual outputs computed from these inputs, are listed here, including the static variables read and modified. A static variable that was properly initialized before being used would normally not appear as an input, unless the analyzer could not determine that it was always initialized before being read.

The log shows state as a dependency of \result. If you expected the returned result to depend only on the arguments (meaning that two calls with the same arguments produce the same result), this is a hint that something may be wrong with the variable state.

Another hint shown in the lines above is that the function modifies state.
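The expectation discussed above (same arguments, same result) can also be checked at run time. The following is only a minimal sketch and not part of the Frama-C workflow: add is copied from the example, while returns_same_twice is a hypothetical helper name introduced here for illustration.

```c
/* The function from the example: its result depends on hidden state. */
int add(int x, int y)
{
    static int state;            /* implicitly zero-initialized, never reset */
    int result = x + y + state;
    state++;
    return result;
}

/* Hypothetical helper: returns 1 if two consecutive calls with the same
   arguments give the same result, and 0 otherwise. */
int returns_same_twice(int (*f)(int, int), int x, int y)
{
    int first = f(x, y);
    int second = f(x, y);
    return first == second;
}
```

Calling returns_same_twice(add, 1, 2) yields 0 here, because the second call sees the state left behind by the first one. Such a check only tells you that hidden state exists, not where it lives.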

This may or may not help. The -lib-entry option means that the analyzer does not assume that any non-const static variable still has its initial value when the function under analysis is called, which may be too imprecise for your code. There are ways around this, but then it is up to you to decide whether you want to gamble the time it takes to learn them.

EDIT: here is a more complex example:

    void initialize_1(int *p)
    {
        *p = 0;
    }

    void initialize_2(int *p)
    {
        *p; // I made a mistake here.
    }

    int add(int x, int y)
    {
        static int state1;
        static int state2;
        initialize_1(&state1);
        initialize_2(&state2);
        // This is safe because I have initialized state1 and state2:
        int result = x + y + state1 + state2;
        state1++;
        state2++;
        return result;
    }

In this example, the same command gives the results:

    [from] Function initialize_1:
             state1 FROM p
    [from] Function initialize_2:
    [from] Function add:
             state1 FROM \nothing
             state2 FROM state2
             \result FROM x; y; state2

What you see for initialize_2 is an empty list of dependencies, meaning that the function does not assign anything. I will make this case clearer by displaying an explicit message rather than just an empty list. If you know what each of the functions initialize_1, initialize_2 and add is supposed to do, you can compare this a priori knowledge with the results of the analysis and see that something is wrong for initialize_2 and add.
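For comparison, here is one possible corrected version of the second example. It is a sketch under the assumption that initialize_2 was meant to zero the variable it is given, exactly as initialize_1 does; the source does not say what the intended behavior was.

```c
void initialize_1(int *p)
{
    *p = 0;
}

/* Assumed fix: write through the pointer instead of only reading it. */
void initialize_2(int *p)
{
    *p = 0;
}

int add(int x, int y)
{
    static int state1;
    static int state2;
    initialize_1(&state1);
    initialize_2(&state2);
    /* Both variables are now reset on every call. */
    int result = x + y + state1 + state2;
    state1++;
    state2++;
    return result;
}
```

With this change two calls with the same arguments return the same value, so one would expect state2 to no longer appear as an input of \result in the analysis, although that output is not reproduced here.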

SECOND EDIT: my example now shows something odd for initialize_1, so perhaps I should explain it. The variable state1 depends on p in the sense that p is used to write to state1, and if p had been different, the final value of state1 would have been different. Here is one last example:

    int t[10];

    void initialize_index(int i)
    {
        t[i] = 1;
    }

    int main(int argc, char **argv)
    {
        initialize_index(argv[1][0]-'0');
    }

With the command frama-c -deps t.c, the dependencies computed for initialize_index are:

 [from] Function initialize_index: t[0..9] FROM i (and SELF) 

This means that each of the cells depends on i (it may be modified, if i is the index of that particular cell). Each cell may also keep its value (if i designates another cell): this is indicated by the mention (and SELF) in the latest version, and by the more obscure (and default:true) in earlier versions.
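The "modified or kept" reading can be seen concretely by running the function. In this small sketch t and initialize_index come from the example, while count_nonzero is a hypothetical helper added only to observe the array.

```c
int t[10];

void initialize_index(int i)
{
    t[i] = 1;   /* only the cell selected by i is written */
}

/* Hypothetical helper: counts the cells of t that differ from zero. */
int count_nonzero(void)
{
    int n = 0;
    for (int k = 0; k < 10; k++) {
        if (t[k] != 0)
            n++;
    }
    return n;
}
```

After initialize_index(3), only t[3] has changed and the nine other cells have kept their previous value. Since the analyzer does not know which i will be passed, it has to report both possibilities for every cell.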

+5
source

Static code analysis tools are pretty good at finding common programming errors, such as using uninitialized variables. Here is a list of free tools that do this for C.

Unfortunately, I cannot vouch for any of the tools on the list. I am familiar with two commercial products, Coverity and Klocwork. Coverity is very good (and expensive). Klocwork is so-so (but less expensive).

+2
source

What I did in the end was to remove all static qualifiers from the code using '#define static'. This turns the use of a zero-initialized static into the use of an uninitialized automatic variable, which is the kind of misuse that tools can detect.

In my actual case this was enough to locate the error, but in a more general situation it has to be refined for statics that really do something important, by gradually adding the 'static' qualifiers back wherever the code stops working without them.
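A minimal sketch of the '#define static' trick follows. The DROP_STATIC switch and the running_sum function are hypothetical names used for illustration; the real library code will look different.

```c
/* Build with -DDROP_STATIC to strip the keyword. Assumption: this block is
   placed after any system headers, since redefining a keyword before
   including them is not allowed. */
#ifdef DROP_STATIC
#define static
#endif

/* Hypothetical stand-in for a library routine that relies on
   the implicit zero initialization of a local static. */
double running_sum(double x)
{
    static double total;   /* with DROP_STATIC this reads `double total;` */
    total += x;            /* then a read of an uninitialized variable */
    return total;
}
```

Built normally, the function accumulates across calls. Built with -DDROP_STATIC, the read of total becomes a read of an uninitialized automatic variable, which is something compiler warnings such as gcc's -Wuninitialized or a run under valgrind may be able to report. Note that the macro also strips static from file-scope declarations, which can cause duplicate-symbol errors at link time.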

+2
source

I don’t know a single library that will do this for you, but I would consider using regular expressions to find them. Something like

    rgrep "static\s*int" path/to/src/root | grep -v = | grep -v "("

This should return all static int variables declared without an equals sign, and the last pipe should remove anything with parentheses (to get rid of functions). There is a good chance that this will not work for you exactly as written, but playing around with grep may be the fastest way for you to track this down.

Of course, once you find one that works, you can replace int with all the other variable types to search for those too. HTH

+1
source

My question is how to uncover these errors ...

But these are not errors: expecting a static variable to be initialized to 0 is perfectly valid, as is assigning some other value to it.

Therefore, asking for a tool that automatically flags them is unlikely to produce a satisfying result.

From your description, it seems that somefunc() returns the correct result on the first call and the wrong result on subsequent calls.

The easiest way to debug such problems is to have two GDB sessions side by side: one freshly loaded (which will compute the correct answer) and one in its "second iteration" (which will compute the wrong answer). Then step through both sessions "in parallel" and see where their computation or control flow starts to diverge.

Since you can usually bisect the problem effectively, it often does not take long to find the bug. Bugs that reproduce every time are the easiest of all. Just do it.
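If stepping through two debugger sessions is impractical, the same "find the first divergence" idea can be approximated inside the process. This is a different technique from the GDB approach described above, shown only as a sketch: the trace_* helpers are hypothetical, and somefunc here is a toy stand-in for the real library function.

```c
#define TRACE_MAX 1024

static double trace_ref[TRACE_MAX]; /* values recorded during the first call */
static int trace_len;               /* number of values recorded */
static int trace_pos;               /* current position during the replay */
static int trace_replaying;         /* 0 = recording, 1 = comparing */
static int trace_first_diff = -1;   /* first checkpoint that differed, or -1 */

/* Hypothetical checkpoint: call it on intermediate values in the library. */
void trace_point(double v)
{
    if (!trace_replaying) {
        if (trace_len < TRACE_MAX)
            trace_ref[trace_len++] = v;
    } else {
        if (trace_first_diff < 0 && trace_pos < trace_len
            && trace_ref[trace_pos] != v)
            trace_first_diff = trace_pos;
        trace_pos++;
    }
}

/* Switch from recording the first call to checking the second one. */
void trace_start_replay(void)
{
    trace_replaying = 1;
    trace_pos = 0;
}

int trace_first_divergence(void)
{
    return trace_first_diff;
}

/* Toy stand-in for the library function with a stale static. */
double somefunc(double x)
{
    static double scale;   /* relies on zero initialization, never reset */
    trace_point(x);        /* checkpoint 0 */
    scale += 1.0;
    trace_point(scale);    /* checkpoint 1 */
    return x * scale;
}
```

Call somefunc once, call trace_start_replay(), then call somefunc again with the same input; trace_first_divergence() gives the index of the first checkpoint whose value changed. Adding checkpoints between the last matching one and the first differing one narrows the search in the same bisecting fashion.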

0
source
