Is unsigned int to unsigned long long well defined?

Question


I wanted to see what happens behind the scenes when an unsigned long long is assigned the value of an unsigned int. I wrote a simple C++ program to try it out and moved everything else out of main():

    #include <iostream>
    #include <stdlib.h>

    void usage() {
        std::cout << "Usage: ./u_to_ull <unsigned int>\n";
        exit(0);
    }

    void atoiWarning(int foo) {
        std::cout << "WARNING: atoi() returned " << foo << " and (unsigned int)foo is " << ((unsigned int)foo) << "\n";
    }

    void result(unsigned long long baz) {
        std::cout << "Result as unsigned long long is " << baz << "\n";
    }

    int main(int argc, char** argv) {
        if (argc != 2) usage();

        int foo = atoi(argv[1]);
        if (foo < 0) atoiWarning(foo);

        // Signed to unsigned
        unsigned int bar = foo;

        // Conversion
        unsigned long long baz = -1;
        baz = bar;

        result(baz);

        return 0;
    }

The resulting assembly for main is:

    0000000000400950 <main>:
      400950: 55                      push   %rbp
      400951: 48 89 e5                mov    %rsp,%rbp
      400954: 48 83 ec 20             sub    $0x20,%rsp
      400958: 89 7d ec                mov    %edi,-0x14(%rbp)
      40095b: 48 89 75 e0             mov    %rsi,-0x20(%rbp)
      40095f: 83 7d ec 02             cmpl   $0x2,-0x14(%rbp)
      400963: 74 05                   je     40096a <main+0x1a>
      400965: e8 3a ff ff ff          callq  4008a4 <_Z5usagev>
      40096a: 48 8b 45 e0             mov    -0x20(%rbp),%rax
      40096e: 48 83 c0 08             add    $0x8,%rax
      400972: 48 8b 00                mov    (%rax),%rax
      400975: 48 89 c7                mov    %rax,%rdi
      400978: e8 0b fe ff ff          callq  400788 <atoi@plt>
      40097d: 89 45 f0                mov    %eax,-0x10(%rbp)
      400980: 83 7d f0 00             cmpl   $0x0,-0x10(%rbp)
      400984: 79 0a                   jns    400990 <main+0x40>
      400986: 8b 45 f0                mov    -0x10(%rbp),%eax
      400989: 89 c7                   mov    %eax,%edi
      40098b: e8 31 ff ff ff          callq  4008c1 <_Z11atoiWarningi>
      400990: 8b 45 f0                mov    -0x10(%rbp),%eax
      400993: 89 45 f4                mov    %eax,-0xc(%rbp)
      400996: 48 c7 45 f8 ff ff ff    movq   $0xffffffffffffffff,-0x8(%rbp)
      40099d: ff
      40099e: 8b 45 f4                mov    -0xc(%rbp),%eax
      4009a1: 48 89 45 f8             mov    %rax,-0x8(%rbp)
      4009a5: 48 8b 45 f8             mov    -0x8(%rbp),%rax
      4009a9: 48 89 c7                mov    %rax,%rdi
      4009ac: e8 66 ff ff ff          callq  400917 <_Z6resulty>
      4009b1: b8 00 00 00 00          mov    $0x0,%eax
      4009b6: c9                      leaveq
      4009b7: c3                      retq

The -1 in the C++ source makes it clear that -0x8(%rbp) corresponds to baz (because of the $0xffffffffffffffff). -0x8(%rbp) is then written from %rax, but the top four bytes of %rax do not appear to have been assigned; only %eax was assigned.

Does this mean that the top 4 bytes of -0x8(%rbp) are undefined?

+5

c ++ assembly x86 x86-64 unsigned-integer

asimes Jan 28 '15 at 15:48


2 answers

From C++98 (and C++11 does not appear to have changed it), section 4.7/2 (integral conversions; no promotions are relevant here), we learn:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type).

This clearly shows that as long as the source and destination are both unsigned and the destination is at least as large as the source, the value is unchanged. If the compiler generated code that failed to make the larger value equal, the compiler would be buggy.
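As a rough illustration of that rule, here is a minimal sketch in C++11. The helper names (widen, to_unsigned, overwrite_all_ones, check_widening) are made up for this example and do not appear in the question; the checks only restate what the quoted paragraph guarantees.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: unsigned int -> unsigned long long through an
// ordinary implicit conversion. The destination is unsigned and at least
// as wide as the source, so the value is preserved.
constexpr unsigned long long widen(unsigned int value) {
    return value;
}

// Hypothetical helper: the signed -> unsigned step from the question
// (`unsigned int bar = foo;`). Negative inputs wrap modulo 2^n.
constexpr unsigned int to_unsigned(int value) {
    return static_cast<unsigned int>(value);
}

// Mirrors the two statements in the question's main().
inline unsigned long long overwrite_all_ones(unsigned int bar) {
    unsigned long long baz = -1;  // all bits set
    baz = bar;                    // the whole object is replaced by bar's value
    return baz;
}

// Compile-time checks of the conversion rule.
static_assert(widen(0u) == 0ull, "zero stays zero");
static_assert(widen(UINT_MAX) == UINT_MAX, "largest value is preserved");
static_assert(to_unsigned(-1) == UINT_MAX, "-1 wraps to 2^n - 1");
static_assert(widen(to_unsigned(-1)) == UINT_MAX, "widening does not sign-extend");

// Runtime version of the same idea, callable from any main().
inline void check_widening() {
    assert(overwrite_all_ones(0u) == 0ull);
    assert(overwrite_all_ones(UINT_MAX) == UINT_MAX);
}
```

If this compiles, the static_assert lines have already confirmed the behaviour for those inputs; no stale upper bits from the earlier -1 can survive the assignment.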

+3

Mark b Jan 28 '15 at 16:16


harold · Accepted Answer · 2015-01-28T16:16:32+0000

The Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1, section 3.4.1.1 (General-Purpose Registers in 64-Bit Mode), says:

32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.

So after mov -0xc(%rbp),%eax, the upper half of rax is defined, and it is zero.

This also applies to the 87 C0 encoding of xchg eax, eax, but not to its 90 encoding (which is defined as nop, overriding the rule above).
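The zero-extension rule can be pictured with a toy model in C++. This is only a sketch: Reg64, write32, write16, load_eax_then_read_rax and check_model are invented names, the code does not touch real registers, and the 16-bit case (upper bits left unmodified) is taken from the same section of the manual rather than from the quote above.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of one x86-64 general-purpose register such as rax.
struct Reg64 {
    std::uint64_t bits;

    // Models a 32-bit write such as `mov -0xc(%rbp),%eax`:
    // the result is zero-extended, so the upper 32 bits become 0.
    void write32(std::uint32_t value) { bits = value; }

    // Models a 16-bit write such as `mov ...,%ax`:
    // the upper 48 bits are left as they were.
    void write16(std::uint16_t value) {
        bits = (bits & ~std::uint64_t{0xFFFF}) | value;
    }
};

// Models the pair of instructions from the question:
//   mov -0xc(%rbp),%eax   ; load bar into eax
//   mov %rax,-0x8(%rbp)   ; store all of rax into baz
inline std::uint64_t load_eax_then_read_rax(std::uint64_t old_rax,
                                            std::uint32_t bar) {
    Reg64 rax{old_rax};
    rax.write32(bar);
    return rax.bits;
}

// Small self-check of the model.
inline void check_model() {
    // Whatever rax held before, the 32-bit load clears the upper half.
    assert(load_eax_then_read_rax(0xFFFFFFFFFFFFFFFFull, 0x12345678u)
           == 0x12345678ull);

    // A 16-bit write, by contrast, keeps the old upper bits.
    Reg64 r{0xFFFFFFFFFFFFFFFFull};
    r.write16(0x1234);
    assert(r.bits == 0xFFFFFFFFFFFF1234ull);
}
```

In this model the value later stored to -0x8(%rbp) depends only on bar, which is why the compiler does not need a separate instruction to clear the upper half of rax.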
