Calling printf through a templated functor segfaults (64-bit only; valgrind clean on 32-bit)

I am currently debugging C++ code written in the late 1990s that parses scripts to load data, perform simple operations, print results, and so on.

The people who wrote the code used functors to map string keywords in the file being parsed to actual function calls. The functors are templated (with a maximum of 8 arguments) to handle the many function interfaces that a user can call from a script.
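For readers unfamiliar with this kind of design, the keyword lookup can be pictured roughly as follows. This is a simplified sketch, not the original code: `UnaryMath`, `make_keyword_table`, and `eval_unary` are made-up names, and the real interpreter stores untyped `const void *` pointers rather than typed ones.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for the interpreter's keyword table.
// Every entry has the same, correctly declared pointer type.
typedef double (*UnaryMath)(double);

// Small wrappers sidestep overload resolution on std::cos and friends.
static double call_cos(double x)  { return std::cos(x); }
static double call_sin(double x)  { return std::sin(x); }
static double call_tan(double x)  { return std::tan(x); }
static double call_sqrt(double x) { return std::sqrt(x); }

std::map<std::string, UnaryMath> make_keyword_table() {
    std::map<std::string, UnaryMath> table;
    table["cos"]  = &call_cos;
    table["sin"]  = &call_sin;
    table["tan"]  = &call_tan;
    table["sqrt"] = &call_sqrt;
    return table;
}

// Looks up `keyword`; on success stores the result in `out` and returns true.
bool eval_unary(const std::map<std::string, UnaryMath>& table,
                const std::string& keyword, double arg, double& out) {
    std::map<std::string, UnaryMath>::const_iterator it = table.find(keyword);
    if (it == table.end()) return false;  // unknown keyword
    assert(it->second != 0);
    out = it->second(arg);
    return true;
}
```

A script line such as sqrt( 4 ) would then resolve to `eval_unary(table, "sqrt", 4.0, result)`. The real code has to cope with many different signatures, which is where the templates come in.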

For the most part this all works just fine, except that in recent years it has started to segfault on some of our 64-bit build systems. Running it through valgrind, I found to my surprise that the errors seem to occur inside "printf", which is one of the registered functors. Here are some code snippets to show how this works.

First, the script being parsed contains the following line:

printf( "%5.7f %5.7f %5.7f %5.7f\n", cos( j / 10 ), tan( j / 10 ), sin( j / 10 ), sqrt( j / 10 ) ); 

where cos, tan, sin and sqrt are also functors that map to libm (this part is not important: if I replace them with fixed numerical values, I get the same result).

When it comes to calling printf, this is done as follows. First, the template functor:

    template<class R, class T1, class T2, class T3, class T4,
             class T5, class T6, class T7, class T8>
    class FType
    {
    public :
        FType( const void * f ) { _f = (R (*)(T1,T2,T3,T4,T5,T6,T7,T8))f; }
        R operator()( T1 a1,T2 a2,T3 a3,T4 a4,T5 a5,T6 a6,T7 a7,T8 a8 )
        {
            return _f( a1,a2,a3,a4,a5,a6,a7,a8);
        }
    private :
        R (*_f)(T1,T2,T3,T4,T5,T6,T7,T8);
    };

The code that calls it lives inside another class template. Here is the prototype and the relevant fragment that uses FType (plus some extra code I inserted for debugging):

    template<class T1, class T2, class T3, class T4,
             class T5, class T6, class T7, class T8>
    static Token evalF( const void * f, unsigned int nrargs,
                        T1 a1, T2 a2, T3 a3, T4 a4, T5 a5, T6 a6, T7 a7, T8 a8,
                        vtok & args, const Token & returnType )
    {
        Token result;

        printf("Count: %i\n",++_count);
        if( _count == 2 )
        {
            const char *fmt = *((const char **) &a1);
            result = printf(fmt,a2,a3,a4,a5,a6,a7,a8);

            FType<int, const void*,T2,T3,T4,T5,T6,T7,T8> f1(f);
            result = f1("Hello, world.\n",a2,a3,a4,a5,a6,a7,a8);
            result = f1("Hello, world2 %5.7f\n",a2,a3,a4,a5,a6,a7,a8);
            result = f1(fmt,a2,a3,a4,a5,a6,a7,a8);
        }
        else
        {
            FType<int, T1,T2,T3,T4,T5,T6,T7,T8> f1(f);
            result = f1(a1,a2,a3,a4,a5,a6,a7,a8);
        }
    }

I inserted the if( _count == 2 ) branch for debugging (this function is called several times). Under normal circumstances only the else clause runs: it constructs an FType (with int as the return type) from "f", which is the pointer to printf (verified in the debugger). Once f1 is constructed, it calls the overloaded call operator with all the templated arguments, and valgrind starts complaining:

    ==29358== Conditional jump or move depends on uninitialised value(s)
    ==29358==    at 0x92E3683: __printf_fp (printf_fp.c:406)
    ==29358==    by 0x92E05B7: vfprintf (vfprintf.c:1629)
    ==29358==    by 0x92E88D8: printf (printf.c:35)
    ==29358==    by 0x5348C45: FType<int, void const*, double, double, double, double, void const*, void const*, void const*>::operator()(void const*, double, double, double, double, void const*, void const*, void const*) (Interpreter.cc:321)
    ==29358==    by 0x51BAB6D: Token evalF<void const*, double, double, double, double, void const*, void const*, void const*>(void const*, unsigned int, void const*, double, double, double, double, void const*, void const*, void const*, std::vector<Token, std::allocator<Token> >&, Token const&) (Interpreter.cc:542)

This led to the experiments inside the if() clause. First, I tried calling printf directly with the same arguments (note the cast trick on parameter a1, the format, to get it to compile; otherwise the compiler complains about the many template instances in which T1 is not the (char *) that printf expects). This works fine.

Then I tried calling f1 with a replacement format string that contains no conversions ("Hello, world."). This also works fine.

Then I added a single conversion ("Hello, world2 %5.7f") and started seeing the valgrind errors shown above.

If I run this code on a 32-bit system, it is valgrind clean (with otherwise identical versions of glibc and gcc).

Running on several different Linux systems (all 64-bit), I sometimes get a segfault (e.g. RHEL5.8 / libc2.5 and openSUSE11.2 / libc-2.10.1) and sometimes not (e.g. libc2.15 with Fedora 17 and Ubuntu 12.04), but valgrind always complains in the same way on all of them, which makes me think it is a matter of luck whether it works or not.

All this makes me suspect some kind of bug in the 64-bit version of glibc, although I would be much happier if someone could find something wrong with this code!

One hunch I had was that it is somehow related to the handling of variable argument lists. How exactly do these interact with templates? It is not actually clear to me how this works: the format string is not known until runtime, so how does the compiler know which specific template instances to generate? However, this does not explain why everything seems fine in the 32-bit build.

Update in response to comments

Thanks to everyone for the useful discussion. I think the answer from ora regarding the %al register is probably the correct explanation, although I have not verified it yet. For the benefit of the discussion, here is a complete, minimal program that reproduces the error on my 64-bit system, for others to play with. If you #define _VOID_PTR at the top, it uses void * pointers to pass function pointers around, as in the original code (and triggers the valgrind errors). If you comment out #define _VOID_PTR, it uses correctly prototyped function pointers instead, as suggested by WhosCraig. The problem in that case is that I could not simply write int (*f)(const char *, double, double) = &printf; because the compiler complains about a prototype mismatch (maybe I'm just being thick and there is a way to do this? I assume this is the problem the original author was trying to work around with void * pointers). To handle this particular case, I created the wrap_printf() function with the correct explicit argument list. When I run this version of the code, it is valgrind clean. Unfortunately, this does not tell us whether the problem lies in storing function pointers as void * or is something related to %al; I think most of the evidence points to the latter, and I suspect that wrapping printf() in a function with a fixed argument list made the compiler "do the right thing":

    #include <cstdio>

    #define _VOID_PTR // set if using void pointers to pass around function pointers

    template<class R, class T1, class T2, class T3>
    class FType
    {
    public :
    #ifdef _VOID_PTR
        FType( const void * f ) { _f = (R (*)(T1,T2,T3))f; }
    #else
        typedef R (*FP)(T1,T2,T3);
        FType( R (*f)(T1,T2,T3 )) { _f = f; }
    #endif
        R operator()( T1 a1,T2 a2,T3 a3) { return _f( a1,a2,a3); }
    private :
        R (*_f)(T1,T2,T3);
    };

    template <class T1, class T2, class T3>
    int wrap_printf( T1 a1, T2 a2, T3 a3 )
    {
        const char *fmt = *((const char **) &a1);
        return printf(fmt, a2, a3);
    }

    int main( void )
    {
    #ifdef _VOID_PTR
        void *f = (void *)printf;
    #else
        // this doesn't work because function pointer arguments don't match printf prototype:
        // int (*f)(const char *, double, double) = &printf;

        // Use this wrapper instead:
        int (*f)(const char *, double, double) = &wrap_printf;
    #endif

        char a1[]="%5.7f %5.7f\n";
        double a2=1.;
        double a3=0;

        FType<int, const char *, double, double> f1(f);

        printf(a1,a2,a3);
        f1(a1,a2,a3);

        return 0;
    }
2 answers

In the System V amd64 ABI, which is used by 64-bit Linux (and many other Unixes), functions with a fixed number of arguments and functions with a variable number of arguments have slightly different calling conventions.

To quote from the "System V Application Binary Interface, AMD64 Architecture Processor Supplement", Draft 0.99.5 [2], section 3.2.3 "Parameter Passing":

For calls that may call functions that use varargs or stdargs (prototype-less calls or calls to functions containing ellipsis (...) in the declaration) %al is used as hidden argument to specify the number of vector registers used.

Now a sequence of three steps:

  1. printf(3) is such a variable-argument function. It therefore expects the %al register to be set correctly.

  2. Your FType::_f is declared as a pointer to a function with a fixed number of arguments. The compiler therefore does not bother with %al when it calls something through it.

  3. When printf() is called through FType::_f, it expects %al to be set correctly (because of 1), but the compiler did not set it (because of 2), and as a result printf() finds "garbage" in %al.

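Using garbage instead of a properly initialized value can easily lead to all sorts of undesirable results, including the segfault you are observing. The 32-bit x86 calling convention has no such hidden argument, which is consistent with your 32-bit build being valgrind clean.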
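One way to avoid the mismatch is to keep the ellipsis in the stored pointer type, so that every call through the functor is compiled as a variadic call. The following is only a minimal sketch under that assumption: `VarFType` and `print_two` are hypothetical names, not part of your code, and the trailing arguments are limited to types that can legally pass through an ellipsis (arithmetic types and pointers).

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical variant of the question's FType: the stored pointer type
// keeps the ellipsis, so calls through it use the variadic convention.
template <class R, class T1>
class VarFType {
public:
    typedef R (*FP)(T1, ...);
    explicit VarFType(FP f) : _f(f) { assert(f != 0); }

    // Trailing arguments are passed through the ellipsis.
    template <class T2, class T3>
    R operator()(T1 a1, T2 a2, T3 a3) { return _f(a1, a2, a3); }

private:
    FP _f;
};

// Prints two doubles through the functor and returns printf's result,
// i.e. the number of characters written.
int print_two(double x, double y) {
    VarFType<int, const char*> f(&std::printf);
    return f("%5.7f %5.7f\n", x, y);
}
```

Because `FP` matches printf's real declaration, no cast through `void *` is needed and the compiler knows it is making a variadic call.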

For more information see:
[1] http://en.wikipedia.org/wiki/X86_calling_conventions#x86-64_calling_conventions
[2] http://x86-64.org/documentation/abi.pdf


If your compiler supports C++11 and can therefore handle variadic templates, and you are free to reorder the parameters, you could do something like:

    template<typename F, typename ...A>
    static Token evalF(vtok& args, const Token& resultType, F f, A... a)
    {
        Token result;

        f(a...);

        return result;
    }

This works well; see, for example, this example.
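A self-contained sketch of the same idea, without the Token and vtok types from the question, might look like this. The names `invoke_any` and `format_pair` are hypothetical, and snprintf stands in for printf only so that the result is easy to inspect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: forwards any argument list to any callable.
// F is deduced from the argument, so a variadic function keeps its
// "..." in the deduced pointer type.
template <typename F, typename... A>
auto invoke_any(F f, A... a) -> decltype(f(a...)) {
    return f(a...);
}

// Formats two doubles by calling snprintf through invoke_any.
// Returns an empty string on error or if the output would not fit.
std::string format_pair(const char* fmt, double x, double y) {
    assert(fmt != 0);
    char buf[128];
    // F is deduced as snprintf's real pointer type, ellipsis included.
    int n = invoke_any(&std::snprintf, buf, sizeof buf, fmt, x, y);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof buf) {
        return std::string();
    }
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Since the function type is never erased to `void *`, the call inside `invoke_any` is compiled against the real variadic prototype.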
