Why does printf("%s", (char[]){'H','i','\0'}) work like printf("%s", "Hi"), but printf("%s", (char*){'H','I','\0'}); fail?

I really need help with this. It has shaken my foundations in C. Long and detailed answers would be much appreciated. I have divided my question into two parts.

A: Why does printf("%s",(char[]){'H','i','\0'}); work and print Hi just like the regular printf("%s","Hi"); does? Can we use (char[]){'H','i','\0'} as a replacement for "Hi" anywhere in our C code? I mean, when we write "Hi" in C, it usually means that Hi is stored somewhere in memory and a pointer to it is passed. Can the same be said of (char[]){'H','i','\0'}, and how closely do the two match?

B: If printf("%s",(char[]){'H','i','\0'}) succeeds, just like printf("%s","Hi"), why does printf("%s",(char*){'A','B','\0'}) fail badly and segfault if I run it despite the warnings? It amazes me, because in C doesn't a char[] decay into a char*, for example when we pass it as a function argument? Why doesn't that happen here, and why does char* fail? Isn't passing char demo[] as a function argument the same as passing char *demo? Why then are the results here not the same?

Please help me with this. I feel that I have not yet understood the very basics of C, and I am very frustrated. Thanks!

+7
5 answers

Regarding snippet #2:

The code works because of a feature new in C99 called compound literals. You can read about them in several places, including the GCC documentation, Mike Ash's article, and a bit of Google searching.

Essentially, the compiler creates a temporary array on the stack and fills it with 3 bytes: 0x48, 0x69, and 0x00. That temporary array, once created, decays to a pointer and is passed to the printf function. A very important thing to note about compound literals is that they are non-const by default, like most C strings.

Regarding snippet #3:

Here you aren't actually creating an array. You are converting the first element of the scalar initializer, which in this case is H, or 0x48, into a pointer. You can see that by changing the %s in your printf statement to %p, which gives me this output:

  0x48

So you have to be very careful about what you do with compound literals: they are a powerful tool, but it's easy to shoot yourself in the foot with them.

+7

Your third example:

 printf("%s",(char *){'H','i','\0'}); 

is not even legal (strictly speaking, it is a constraint violation), and you should have gotten at least one warning when compiling it. When I compiled it with gcc using default settings, I got 6 warnings:

 cc:3:5: warning: initialization makes pointer from integer without a cast [enabled by default]
 cc:3:5: warning: (near initialization for '(anonymous)') [enabled by default]
 cc:3:5: warning: excess elements in scalar initializer [enabled by default]
 cc:3:5: warning: (near initialization for '(anonymous)') [enabled by default]
 cc:3:5: warning: excess elements in scalar initializer [enabled by default]
 cc:3:5: warning: (near initialization for '(anonymous)') [enabled by default]

The second argument to printf is a compound literal. It is legal (but odd) to have a compound literal of type char*, but in this case the initializer-list portion of the compound literal is invalid.

After printing the warnings, what gcc seems to be doing is (a) converting the expression 'H', which is of type int, to char*, yielding a garbage pointer value, and (b) ignoring the remaining initializer elements, 'i' and '\0'. The result is a char* pointer value that points to the (probably virtual) address 0x48, assuming an ASCII-based character set.

Ignoring excess initializers is valid (but worthy of a warning), but there is no implicit conversion from int to char* (apart from the special case of a null pointer constant, which does not apply here). gcc has done its job by issuing a warning, but it could (and IMHO should) have rejected the code with a fatal error message. It will do so with the -pedantic-errors option.

If your compiler warned you about these lines, you should have included those warnings in your question. If it didn't, crank up the warning level or get a better compiler.

In more detail about what happens in each of the three cases:

 printf("%s","Hi"); 

A C string literal like "%s" or "Hi" creates an anonymous, statically allocated array of char. (This object is not const, but attempting to modify it has undefined behavior; this is not ideal, but there are historical reasons for it.) A terminating '\0' null character is appended to make it a valid string.

An expression of array type, in most contexts (the exceptions are when it is the operand of the unary sizeof or & operator, or when it is a string literal in an initializer used to initialize an array object), is implicitly converted to ("decays to") a pointer to the array's first element. So the two arguments passed to printf are of type char*; printf uses those pointers to traverse the respective arrays.

 printf("%s",(char[]){'H','i','\0'}); 

This uses a feature added to the language by C99 (the 1999 edition of the ISO C standard), called a compound literal. It is similar to a string literal in that it creates an anonymous object and refers to the value of that object. A compound literal has the form:

 ( type-name ) { initializer-list } 

and the object has the specified type and is initialized to the value given by the initializer list.

The above is nearly equivalent to:

 char anon[] = {'H', 'i', '\0'};
 printf("%s", anon);

Again, the second argument to printf refers to an array object, and it decays to a pointer to the array's first element; printf uses that pointer to traverse the array.

Finally, this:

 printf("%s",(char*){'A','B','\0'}); 

as you say, fails. The type of a compound literal is usually an array or structure (or union); it hadn't actually occurred to me that it could be a scalar type such as a pointer. The above is nearly equivalent to:

 char *anon = {'A', 'B', '\0'};
 printf("%s", anon);

Obviously anon is of type char*, which is what printf expects for the "%s" format. But what is the initial value?

The standard requires the initializer for a scalar object to be a single expression, optionally enclosed in braces. But for some reason that requirement is under "Semantics", so violating it is not a constraint violation; it is merely undefined behavior. That means the compiler can do anything it likes, and may or may not issue a diagnostic. The authors of gcc apparently decided to issue a warning and ignore all but the first initializer in the list.

After that, it becomes equivalent to:

 char *anon = 'A';
 printf("%s", anon);

The constant 'A' is of type int (for historical reasons it is int rather than char, but the same argument would apply either way). There is no implicit conversion from int to char*, and in fact the above initializer is a constraint violation. That means a compiler must issue a diagnostic (gcc does) and may reject the program (gcc doesn't unless you use -pedantic-errors). Once the diagnostic has been issued, the compiler can do whatever it likes; the behavior is undefined (there is some language-lawyerly disagreement on that point, but it doesn't really matter). gcc chooses to convert the value of 'A' from int to char* (probably for historical reasons, going back to when C was even less strongly typed than it is today), resulting in a garbage pointer with a representation that probably looks like 0x00000041 or 0x0000000000000041.

That garbage pointer is then passed to printf, which tries to use it to access a string at that location in memory. Hilarity ensues.

Two important things to keep in mind:

  • If your compiler prints warnings, pay close attention to them. gcc in particular issues warnings for many things that IMHO should be fatal errors. Never ignore a warning unless you understand what it means thoroughly enough for your knowledge to override that of the compiler's authors.

  • Arrays and pointers are two different things. Several rules of the C language seemingly conspire to make them look as if they were the same. You can get away for a while with assuming that arrays are nothing more than pointers in disguise, but that assumption will eventually come back to bite you. Read section 6 of the comp.lang.c FAQ; it explains the relationship between arrays and pointers better than I can.

+8

(OK ... someone completely redid the question. Redoing the answer.)

The array in #3 contains these hex bytes (we don't know about the 4th one):

48 49 00 xx

When it passes the contents of that array, in the second case, it takes those bytes as the address of the string to print. How those 4 bytes are converted to a pointer depends on your actual CPU hardware, but let's say that "414200FF" is the address (since we assume the 4th byte is 0xFF; we are making that up anyway). We are also assuming that a pointer is 4 bytes long, a particular endianness, and so on. None of this matters for the answer, but others are free to elaborate.

Note: one of the other answers seems to think it takes the 0x48, extends it to an (int) 0x00000048, and calls that a pointer. Maybe. But even if GCC did that, and @KiethThompson did not say that he checked the generated code, it does not mean another C compiler would do the same. The result is the same either way.

That value is passed to the printf() function, which tries to go to that address to fetch some characters to print. (A segfault occurs because that address is probably not present on the machine, and in any case is not assigned to your process for reading.)

In case #2, it knows it is an array and not a pointer, so it passes the address of the memory where the bytes are stored, and printf() can work with that.

See other answers for a more formal language.

One thing worth considering is that at least some C compilers probably don't distinguish a call to printf from a call to any other function. So the compiler takes the "format string" and stores a pointer to it for the call (it just happens to point to a string), and then takes the second parameter and stores whatever it gets according to the function's declaration, be it an int, a char, or a pointer. The function then pulls these out from wherever the caller put them, according to the same declaration. The declaration for the second and later parameters has to be something really generic to be able to accept a pointer, an int, a double, and all the various types that might appear there. (What I am saying is that the compiler probably does not look at the format string when deciding what to do with the second and subsequent parameters.)

It may be interesting to see what happens for:

 printf("%s",{'H','i','\0'});
 printf("%s",(char *)(char[]){'H','i','\0'}); // This works according to @DanielFischer

Predictions?

+3

In each case, the compiler creates an initialized object of type char[3]. In the first case, it treats the object as an array, so it passes a pointer to its first element to the function. In the second case, it treats the object as a pointer, so it passes the value of the object. printf expects a pointer, and the value of the object is invalid when treated as a pointer, so the program crashes at runtime.

+2

The third version should not compile. 'H' is not a valid initializer for a pointer type. GCC gives a warning, but not an error, by default.

-1
