Different declarations of the same function / global variable in two files

I have two questions about differing declarations of the same function and of the same global variable in two files, in both C and C++.

  • Different function declarations

    Consider the following code snippets:

    file_1.c

        void foo(int a);

        int main(void)
        {
            foo('A');
        }

    file_2.c

        #include <stdio.h>

        void foo(char a)
        {
            printf("%c", a); // prints 'A' (gcc)
        }

    As we can see, the prototype differs from the definition in file_2.c, yet the program prints the expected value.

    In C++, the above program is invalid: linking fails with an undefined reference to foo(int). This is presumably because C++ encodes the parameter types in the symbol name, whereas in C the symbol name carries no extra characters describing the argument types.

    But what about C? Since the symbol name is the same regardless of the number and types of the parameters, the linker does not report an error. But what conversions are performed here? Is it 'A' → int → back to char? Or is this behavior undefined / implementation-defined?

  • Different global variable declarations

    We have two files and two different declarations of the same global variable:

    file_1.c

        #include <stdio.h>

        extern int a;

        int main(void)
        {
            printf("%d", a); // prints 65 (g++ and gcc)
        }

    file_2.c

     char a = 'A'; 

    Both the C and the C++ build print 65.

    However, I would like to know what both standards say about this situation.

    In the C11 standard, I found the following snippet:

    J.5.11 Multiple external definitions (Annex J.5, Common extensions)
    There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).

    Note that this refers to two or more definitions, whereas my code has only one, so I'm not sure whether this paragraph is a good starting point here...

+6
4 answers

Q1. According to C99, section 6.5.2.2, paragraph 9, this behavior is undefined in C:

If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.

The expression "points to" a function that takes an int, and the function is defined as taking a char.

Q2. The variable case is also undefined, because you are reading an int from an object that is defined as a char. Assuming 4-byte ints, the read accesses three bytes past the location where the char actually lives. You can verify this by defining more variables next to it, for example:

    char a = 'A';
    char b = 'B';
    char c = 'C';
    char d = 'D';
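
One way to see what such a read could pick up, without performing the out-of-bounds access itself, is to copy adjacent bytes into an int with memcpy. This is only a sketch: read_as_int is a hypothetical helper, and real neighboring variables are not guaranteed to be laid out contiguously the way an array is.

```c
#include <string.h>

/* Hypothetical helper: builds an int from sizeof(int) consecutive bytes,
   which is roughly what reading a char object through `extern int a`
   ends up doing on typical implementations. Using memcpy on a buffer
   that really is sizeof(int) bytes long avoids the out-of-bounds read. */
int read_as_int(const unsigned char *bytes)
{
    int value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

On a little-endian machine with 4-byte ints, a buffer holding 'A' followed by zero bytes comes back as 65, while 'A', 'B', 'C', 'D' comes back as a much larger number, which is the effect the extra variables are meant to expose.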
+5

This is why you put declarations in a header that both files include, so that even the C compiler can catch the problem.
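
A minimal sketch of that layout, shown as a single listing for brevity. The header name foo.h and the parameter name ch are assumptions made for illustration. The point is that both files see the same declarations, so a mismatched definition is diagnosed at compile time as conflicting types instead of silently producing undefined behavior.

```c
/* ---- foo.h (hypothetical shared header) ---- */
#ifndef FOO_H
#define FOO_H

void foo(char ch);  /* the one prototype every file sees   */
extern char a;      /* the one declaration of the global   */

#endif /* FOO_H */

/* ---- file_2.c: would start with #include "foo.h" ---- */
#include <stdio.h>

char a = 'A';       /* checked against `extern char a;`    */

void foo(char ch)   /* checked against the prototype above */
{
    printf("%c", ch);
}

/* ---- file_1.c: would also #include "foo.h" and call foo(a) ---- */
```

If file_2.c defined foo with an int parameter, or a as an int, while including this header, the compiler would reject it.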

1)

The results of this are fairly unpredictable; in your case, the char parameter may well be passed as an int (for example, in a register, or on the stack with padding to preserve alignment). Or you are lucky because of the endianness, which stores the low byte first.

2)

Most likely, this works by luck, due to endianness and some zero bytes added as padding to fill the segment. Again, do not rely on this.

+2

Overloaded functions in C++ work because the compiler encodes each function name together with its parameter list into a unique name for the linker. This encoding process is called mangling, and the reverse is demangling.

C has no such thing. When the compiler encounters a symbol (either the name of a variable or the name of a function) that is not defined in the current module, it assumes that it is defined in some other module, generates a symbol table entry for it, and leaves it for the linker to resolve. The parameters are not checked at this point.

Also, no type conversion takes place here. You simply pass the value to foo. Here is the assembly code:

    movl $65, (%esp)
    call foo

And foo reads it from the stack. Since the parameter is declared as char, foo keeps only the low byte, held in the al register, and stores it in a local slot:

 movb %al, -4(%ebp) 

So, for input values of 256 or more, the value of a seen inside foo wraps around modulo 256.
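
The same wrap-around can be reproduced in well-defined C with an explicit conversion to unsigned char, without relying on the mismatched prototype. This is a sketch only: low_byte is a hypothetical name, and the concrete values assume 8-bit bytes.

```c
/* Hypothetical helper: keeps only the low byte of an int, mirroring the
   `movb %al, ...` store above. Conversion to unsigned char is defined
   to reduce the value modulo UCHAR_MAX + 1 (256 with 8-bit bytes). */
unsigned char low_byte(int value)
{
    return (unsigned char)value;
}
```

With 8-bit bytes, low_byte(65) and low_byte(256 + 65) both yield 65, and low_byte(256) yields 0.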

About your second question: in C, the symbols for initialized variables and for functions are strong, and multiple strong symbols with the same name are not allowed, but I'm not sure whether this applies to C++ as well.

+1

Just for the record, I came across a paragraph in the C11 standard that covers both issues, 6.2.7, paragraph 2:

All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.

+1

Source: https://habr.com/ru/post/923852/

