Is finding pointers in C / C ++ code statically equivalent to Slowing?

I am not too deeply rooted in the very formal side of static code analysis, hence this question.

A few years ago I read that distinguishing code from data using static code analysis is equivalent to "Stopping Problem . " (A quote is needed, but I don’t have it anymore. Stackoverflow has themes here or here .) At least for general computer architectures based on von Neumann architecture , where the code and data have the same memory, it seemed made sense.

Now I look at static analysis of C / C ++ code and pointer analysis; the program is not running. Somehow I get the feeling that tracking all creations and using pointer values ​​is statically similar to the stop problem, because I can’t determine if a given value in memory is a pointer value, that is, I can’t track the value of a pointer value stream through Memory. Parsing an alias can narrow down the problem, but it seems to become less useful in front of multi-threaded code.

(You might even think about tracking arbitrary values, not just pointers: building a full stream of values ​​for any given “interesting” value seems equivalent to a stop problem.)

Since this is just a guess, my question is: are there more formal conclusions on this subject that I can refer to? Am I mistaken?

+7
c ++ c pointers static-analysis halting-problem
source share
4 answers

You can always code this:

extern bool some_program_halts(); extern int* invalid_pointer(); #include <iostream> int main() { using namespace std; if( some_program_halts() ) { cout << *invalid_pointer() << endl; } } 

Checking that this program shares an invalid pointer is equivalent to detecting if the call to some_program_halts() , uh will stop.

+3
source share

This is almost certainly equivalent modulo the fact that C is not an equivalent turing language (this implementation of C is a giant finite state machine, not a turing machine due to the type representation). Pointers should not be stored in their original representations in objects whose effective type is a pointer type; you can check the presentation and perform arbitrary operations on it, for example, encrypt pointers and decrypt them later. Determining whether an arbitrary computation is reversible, or two interdependencies between each other, is (careless), probably equivalent to stopping.

+1
source share

If you understood correctly: yes, checking that a C or C ++ program is accessing an invalid pointer is equivalent to a stop problem (in any case, C or C ++ programs).

Suppose you have a tool that told you if the program was accessible to an invalid pointer, and the program that you wanted to check to stop. By adding additional information to each pointer, you can make it checkable (at run time) whether the pointer is valid or not; add such checks with an infinite loop on failure. Now you have a program without invalid pointer references. Replacing all the places that the program can end with invalid access to the pointer, you get a program with invalid access to the pointer if and only if the original program terminates.

0
source share

Static analysis is almost always an approximate, often provable way of reducing to the problem of stopping programs, such as the one in Alf's answer . However, the approximation may be mistaken on the side of false positives or false negatives.

  • A “conservative” static check will have false negatives . He will never accept a “bad” program, but it will inevitably reject some “fairly complex” good programs.
  • A “liberal” static check will have false positives . Sometimes it accepts a bad program by mistake, but (usually) it also accepts all good programs.

Some examples:

  • A Java type system is conservative: a variable of type T will always contain an instance of type T (or a subtype of T or null) at runtime no matter what.
  • The GCC option for warning about uninitialized variables is liberal: it does not find all the potential uses of an uninitialized variable. Here is an example of a false positive program.
  • In contrast, Java performs a conservative check on an uninitialized variable for local variables. He refuses to compile the program if he sees any potential execution path using a potentially uninitialized variable.

Liberal checks are often used by compilers to issue warnings and external static analysis tools . Things like type systems and compiler optimizations usually rely on conservative checks to be correct.

Many tasks have several reasonable conservative and liberal algorithms of varying accuracy. Alias ​​analysis is certainly one of these .

For more information, see any good compiler tutorial, such as a dragon book .

0
source share

All Articles