How are numbers greater than 2 ^ 32 handled by a 32-bit machine?

Question

How are numbers greater than 2 ^ 32 handled by a 32-bit machine?

I am trying to understand how calculations with numbers greater than 2 ³² occur on a 32-bit machine.

C code

$ cat size.c #include<stdio.h> #include<math.h> int main() { printf ("max unsigned long long = %llu\n", (unsigned long long)(pow(2, 64) - 1)); } $

Gcc output

 $ gcc size.c -o size $ ./size max unsigned long long = 18446744073709551615 $

Corresponding assembly code

 $ gcc -S size.c -O3 $ cat size.s .file "size.c" .section .rodata.str1.4,"aMS",@progbits,1 .align 4 .LC0: .string "max unsigned long long = %llu\n" .text .p2align 4,,15 .globl main .type main, @function main: pushl %ebp movl %esp, %ebp andl $-16, %esp subl $16, %esp movl $-1, 8(%esp) #1 movl $-1, 12(%esp) #2 movl $.LC0, 4(%esp) #3 movl $1, (%esp) #4 call __printf_chk leave ret .size main, .-main .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3" .section .note.GNU-stack,"",@progbits $

What exactly happens on lines 1 - 4?

Is this some kind of string concatenation at the assembly level?

+7

c gcc x86 32-bit

Lazer Oct 9 '10 at 21:17

source share

7 answers

similar to how we process numbers greater than 9, with numbers 0-9. (using positional digits). assuming the question is conceptual.

+6

alvin Oct 10 '10 at 14:57

source share

In your case, the compiler knows that 2 ^ 64-1 is just 0xffffffffffffffff, so it pushed -1 (low dword) and -1 (high dword) onto the stack as your printf argument. This is just an optimization.

In general, 64-bit numbers (and even large values) can be stored with several words, for example. a unsigned long long uses two dword s. To add two 64-bit numbers, two additions are made: one on the low 32 bits and one on the high 32 bits plus carry:

 ; Add 64-bit number from esi onto edi: mov eax, [esi] ; get low 32 bits of source add [edi], eax ; add to low 32 bits of destination ; That add may have overflowed, and if it did, carry flag = 1. mov eax, [esi+4] ; get high 32 bits of source adc [edi+4], eax ; add to high 32 bits of destination, then add carry.

You can repeat this sequence of add and adc as much as you want to add arbitrarily large numbers. The same thing can be done with subtraction - just use sub and sbb (subtract with borrowing).

Multiplication and division is much more complicated, and the compiler usually creates some helper functions to deal with them whenever you multiply 64-bit numbers together. Packages such as GMP , which support very, very large integers, use SSE / SSE2 to speed things up. Take a look at this Wikipedia article for more information on multiplication algorithms.

+3

wj32 Oct 9 '10 at 21:36

source share

As others pointed out, all 64-bit arithmetic in your example has been optimized. This answer focuses on the int title question.

Basically, we process each 32-bit number as a digit and work in the database 4294967296. Thus, we can work on large arbiters.

Addition and subtraction are the easiest. We work with numbers one at a time, starting with the least significant and moving on to the most significant. Usually, the first digit is executed using the usual add / subtract command, and subsequent digits are executed using the special instructions “add with transfer” or “subtract with borrowing”. The carry flag in the status register is used to transfer carry / borrow bits from one digit to another. Thanks to the double addition signed, both unsigned addition and subtraction are the same.

Multiplication is a bit more complicated, and multiplying two 32-bit digits can produce a 64-bit result. Most 32-bit processors will have instructions that multiply two 32-bit numbers and output the 64-bit result in two registers. Then an appendix will be needed to combine the results into a final answer. Thanks to the double addition, signed and unsigned multiplication, they are the same, provided that the desired size of the result matches the size of the argument. If the result is greater than the arguments, special care is required.

For comparison, we start with the most significant figure. If it is equal, we move on to the next digit until the results are equal.

The industry is too complex to describe in this post, but there are many examples of algorithms. e.g. http://www.hackersdelight.org/hdcodetxt/divDouble.c.txt

Some real examples are from gcc https://godbolt.org/g/NclqXC , assembler is in intel syntax.

First add. adding two 64-bit numbers and getting a 64-bit result. Asm is the same for both signed and unsigned versions.

 int64_t add64(int64_t a, int64_t b) { return a + b; } add64: mov eax, DWORD PTR [esp+12] mov edx, DWORD PTR [esp+16] add eax, DWORD PTR [esp+4] adc edx, DWORD PTR [esp+8] ret

It's pretty simple, load one argument into eax and edx, and then add another using add and then add with the wrap. The result remains in eax and edx to return to the caller.

Now multiply two 64-bit numbers to get a 64-bit result. Again, the code does not change from signed to unsigned. I added a few comments to make it easier to follow.

Before we look at the code, consider the math. a and b are 64-bit numbers. I will use lo () to represent the lower 32-bit numbers of a 64-bit number and hi () to represent the upper 32 bits of a 64-bit number.

(a * b) = (lo (a) * lo (b)) + (hi (a) * lo (b) * 2 ^ 32) + (hi (b) * lo (a) * 2 ^ 32) + (hi (b) * hi (a) * 2 ^ 64)

(a * b) mod 2 ^ 64 = (lo (a) * lo (b)) + (lo (hi (a) * lo (b)) * 2 ^ 32) + (lo (hi (b) * lo (a)) * 2 ^ 32)

lo ((a * b) mod 2 ^ 64) = lo (lo (a) * lo (b))

hi ((a * b) mod 2 ^ 64) = hi (lo (a) * lo (b)) + lo (hi (a) * lo (b)) + lo (hi (b) * lo (a) )

 uint64_t mul64(uint64_t a, uint64_t b) { return a*b; } mul64: push ebx ;save ebx mov eax, DWORD PTR [esp+8] ;load lo(a) into eax mov ebx, DWORD PTR [esp+16] ;load lo(b) into ebx mov ecx, DWORD PTR [esp+12] ;load hi(a) into ecx mov edx, DWORD PTR [esp+20] ;load hi(b) into edx imul ecx, ebx ;ecx = lo(hi(a) * lo(b)) imul edx, eax ;edx = lo(hi(b) * lo(a)) add ecx, edx ;ecx = lo(hi(a) * lo(b)) + lo(hi(b) * lo(a)) mul ebx ;eax = lo(low(a) * lo(b)) ;edx = hi(low(a) * lo(b)) pop ebx ;restore ebx. add edx, ecx ;edx = hi(low(a) * lo(b)) + lo(hi(a) * lo(b)) + lo(hi(b) * lo(a)) ret

Finally, when we try division, we see.

 int64_t div64(int64_t a, int64_t b) { return a/b; } div64: sub esp, 12 push DWORD PTR [esp+28] push DWORD PTR [esp+28] push DWORD PTR [esp+28] push DWORD PTR [esp+28] call __divdi3 add esp, 28 ret

The compiler decided that sharing was too difficult to implement inline and instead called a library procedure.

+2

plugwash Jul 27 '16 at 20:43

source share

The compiler really did a static optimization of your code. lines # 1 # 2 # 3 are parameters for printf ()

+1

Nicolas Viennot Oct 9 '10 at 21:20

source share

As @Pafy mentions, the compiler rated this as a constant.

2 to 64th minus 1 0xffffffffffffffff .

As 2 32-bit integers these are: 0xffffffff and 0xffffffff ,
which, if you take it as a pair of 32-bit signed types, end up as: -1 and -1 .

Thus, for your compiler, the generated code is equivalent:

 printf("max unsigned long long = %llu\n", -1, -1);

In the assembly, it is written like this:

 movl $-1, 8(%esp) #Second -1 parameter movl $-1, 12(%esp) #First -1 parameter movl $.LC0, 4(%esp) #Format string movl $1, (%esp) #A one. Kind of odd, perhaps __printf_chk #in your C library expects this. call __printf_chk

By the way, the best way to calculate degrees 2 is to shift 1 left. For example. (1ULL << 64) - 1 .

+1

asveikau Oct 9 '10 at 21:34

source share

No one in this thread noticed that the OP asked for an explanation of the first 4 lines, not lines 11-14.

The first four lines:

  .file "size.c" .section .rodata.str1.4,"aMS",@progbits,1 .align 4 .LC0:

Here is what happens in the first 4 lines:

 .file "size.c"

This is an assembler directive that says that we are going to run a new logical file called "size.c".

 .section .rodata.str1.4,"aMS",@progbits,1

It is also a directive for read-only strings in a program.

 .align 4

This directive sets the location counter to always a multiple of 4.

 .LC0:

This is the label LC0 , to which, for example, you can jump.

I hope I gave the correct answer to this question, as I have precisely answered the OP question.

0

bodacydo Oct 19 '14 at 3:44

source share

Matthew slattery · Accepted Answer · 2010-10-09T21:37:54+0000

__printf_chk - a wrapper around printf that checks for stack overflow and accepts an additional first parameter, a flag (for example, see here .)

pow(2, 64) - 1 been optimized to 0xffffffffffffffff , since the arguments are constants.

According to the usual calling conventions, the first argument __printf_chk() ( int flag ) is the 32-bit value on the stack (in %esp during the call command). The next argument const char * format is a 32-bit pointer (the next 32-bit word on the stack, i.e. In %esp+4 ). And the 64-bit number that is printed takes up the following two 32-bit words (in %esp+8 and %esp+12 ):

 pushl %ebp ; prologue movl %esp, %ebp ; prologue andl $-16, %esp ; align stack pointer subl $16, %esp ; reserve bytes for stack frame movl $-1, 8(%esp) #1 ; store low half of 64-bit argument (a constant) to stack movl $-1, 12(%esp) #2 ; store high half of 64-bit argument (a constant) to stack movl $.LC0, 4(%esp) #3 ; store address of format string to stack movl $1, (%esp) #4 ; store "flag" argument to __printf_chk to stack call __printf_chk ; call routine leave ; epilogue ret ; epilogue

The compiler effectively rewrote this:

 printf("max unsigned long long = %llu\n", (unsigned long long)(pow(2, 64) - 1));

... in it:

 __printf_chk(1, "max unsigned long long = %llu\n", 0xffffffffffffffffULL);

... and at runtime, the stack structure for the call looks like this (showing the stack as 32-bit words with addresses increasing from the bottom of the diagram up):

  : : : Stack : : : +-----------------+ %esp+12 | 0xffffffff | \ +-----------------+ } <-------------------------------------. %esp+8 | 0xffffffff | / | +-----------------+ | %esp+4 |address of string| <---------------. | +-----------------+ | | %esp | 1 | <--. | | +-----------------+ | | | __printf_chk(1, "max unsigned long long = %llu\n", | 0xffffffffffffffffULL);

How are numbers greater than 2 ^ 32 handled by a 32-bit machine?

More articles: