As others have pointed out, all the 64-bit arithmetic in your example has been optimized away. This answer focuses on the question in the title.
Basically, we treat each 32-bit number as a digit and work in base 4294967296 (2^32). In this way we can work on arbitrarily large numbers.
Addition and subtraction are the easiest. We work through the digits one at a time, starting from the least significant and moving to the most significant. Generally the first digit is handled with a normal add/subtract instruction, and later digits are handled with a special "add with carry" or "subtract with borrow" instruction. The carry flag in the status register is used to pass the carry/borrow bit from one digit to the next. Thanks to two's complement, signed and unsigned addition and subtraction are the same.
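The digit-by-digit carry propagation described above can be sketched in portable C. This is only an illustration, not what a compiler emits: `big_add` and `big_sub` are hypothetical names, the numbers are assumed to be stored least significant digit first, and the carry flag is emulated with unsigned wrap-around checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: result = a + b over n base-2^32 digits,
   least significant digit first. Returns the final carry (0 or 1). */
uint32_t big_add(uint32_t *result, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t sum = a[i] + carry;      /* may wrap around to 0 */
        uint32_t c1 = sum < carry;        /* 1 if adding the carry wrapped */
        sum += b[i];
        uint32_t c2 = sum < b[i];         /* 1 if adding b[i] wrapped */
        result[i] = sum;
        carry = c1 | c2;                  /* both cannot wrap at once */
    }
    return carry;
}

/* Hypothetical helper: result = a - b over n digits.
   Returns the final borrow (0 or 1). */
uint32_t big_sub(uint32_t *result, const uint32_t *a, const uint32_t *b, size_t n)
{
    uint32_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t diff = a[i] - borrow;
        uint32_t b1 = a[i] < borrow;      /* 1 if taking the borrow wrapped */
        uint32_t b2 = diff < b[i];        /* 1 if subtracting b[i] will wrap */
        result[i] = diff - b[i];
        borrow = b1 | b2;
    }
    return borrow;
}
```

With n = 2 this is the same computation as a 64-bit add or subtract on a 32-bit machine; a nonzero return value means the result did not fit in n digits.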
Multiplication is a bit more complicated, because multiplying two 32-bit digits can produce a 64-bit result. Most 32-bit processors have instructions that multiply two 32-bit numbers and deliver the 64-bit result in two registers. Additions are then needed to combine the partial products into a final answer. Thanks to two's complement, signed and unsigned multiplication are the same, provided the desired result size is the same as the argument size. If the result is larger than the arguments, special care is needed.
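As a rough sketch of how the partial products are combined, here is schoolbook multiplication over base-2^32 digits in C. The name `big_mul` is hypothetical, digits are assumed to be stored least significant first, and the `uint64_t` temporary stands in for the two-register result of a 32x32 multiply instruction. It handles unsigned numbers only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: result = a * b, where a and b have n digits each
   and result has room for 2*n digits. result must not overlap a or b. */
void big_mul(uint32_t *result, const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < 2 * n; i++)
        result[i] = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t carry = 0;
        for (size_t j = 0; j < n; j++) {
            /* 32x32 -> 64-bit product, plus the digit already accumulated,
               plus the carry. The maximum is (2^32-1)^2 + 2*(2^32-1)
               = 2^64 - 1, so this never overflows 64 bits. */
            uint64_t t = (uint64_t)a[i] * b[j] + result[i + j] + carry;
            result[i + j] = (uint32_t)t;       /* low half stays in place */
            carry = (uint32_t)(t >> 32);       /* high half moves up a digit */
        }
        result[i + n] = carry;                 /* this digit is still zero here */
    }
}
```

If only the low n digits of the product are wanted, as in a 64x64 to 64-bit multiply, every partial product that would land at digit n or above can simply be skipped.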
For comparison, we start with the most significant digit. If the two digits are equal, we move down to the next digit, and so on until we find a pair that differs.
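A minimal sketch of that comparison for unsigned numbers, assuming digits stored least significant first (`big_cmp` is a hypothetical name):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compares two unsigned n-digit numbers.
   Returns -1 if a < b, 0 if a == b, 1 if a > b. */
int big_cmp(const uint32_t *a, const uint32_t *b, size_t n)
{
    /* Walk from the most significant digit down to the least. */
    for (size_t i = n; i-- > 0; ) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;   /* first differing digit decides */
    }
    return 0;                              /* every digit matched */
}
```

For signed numbers the most significant digit would have to be compared as a signed value, with the remaining digits still compared as unsigned.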
Division is too complex to describe in this post, but there are plenty of examples of algorithms out there, e.g. http://www.hackersdelight.org/hdcodetxt/divDouble.c.txt
Some real-world examples from gcc: https://godbolt.org/g/NclqXC (the assembler is in Intel syntax).
First, addition: adding two 64-bit numbers and producing a 64-bit result. The asm is the same for both the signed and unsigned versions.
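The general case is what the linked code deals with. The one easy special case, dividing a multi-digit number by a single 32-bit digit, is ordinary short division and can be sketched as below. This is not the linked algorithm; `big_div_small` is a hypothetical name, digits are assumed to be stored least significant first, and the numbers are unsigned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: quotient = a / d for an n-digit a and a single
   nonzero digit d. Returns the remainder. quotient may be the same array as a. */
uint32_t big_div_small(uint32_t *quotient, const uint32_t *a, size_t n, uint32_t d)
{
    uint32_t rem = 0;
    /* Work from the most significant digit down, as in long division by hand. */
    for (size_t i = n; i-- > 0; ) {
        /* Bring down the next digit next to the running remainder. */
        uint64_t cur = ((uint64_t)rem << 32) | a[i];
        /* rem < d, so cur / d always fits in 32 bits. */
        quotient[i] = (uint32_t)(cur / d);
        rem = (uint32_t)(cur % d);
    }
    return rem;
}
```

Each step is a 64-bit by 32-bit division whose quotient fits in 32 bits. Dividing by a multi-digit divisor needs quotient-digit estimation and correction, which is where the real complexity lies.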
    int64_t add64(int64_t a, int64_t b) { return a + b; }

    add64:
        mov     eax, DWORD PTR [esp+12]
        mov     edx, DWORD PTR [esp+16]
        add     eax, DWORD PTR [esp+4]
        adc     edx, DWORD PTR [esp+8]
        ret
This is pretty simple: load one argument into eax and edx, then add the other using an add followed by an add with carry. The result is left in eax and edx for return to the caller.
Now a multiplication of two 64-bit numbers giving a 64-bit result. Again, the code is the same for signed and unsigned. I have added some comments to make it easier to follow.
Before we look at the code, let's consider the math. a and b are 64-bit numbers. I will use lo() to represent the lower 32 bits of a 64-bit number and hi() to represent the upper 32 bits of a 64-bit number.
    (a * b) = (lo(a) * lo(b)) + (hi(a) * lo(b) * 2^32) + (hi(b) * lo(a) * 2^32) + (hi(b) * hi(a) * 2^64)
    (a * b) mod 2^64 = ((lo(a) * lo(b)) + (lo(hi(a) * lo(b)) * 2^32) + (lo(hi(b) * lo(a)) * 2^32)) mod 2^64
    lo((a * b) mod 2^64) = lo(lo(a) * lo(b))
    hi((a * b) mod 2^64) = (hi(lo(a) * lo(b)) + lo(hi(a) * lo(b)) + lo(hi(b) * lo(a))) mod 2^32
The hi(b) * hi(a) term vanishes modulo 2^64, and the sums in the last line wrap around modulo 2^32, which is exactly what 32-bit register additions do.
    uint64_t mul64(uint64_t a, uint64_t b) { return a*b; }

    mul64:
        push    ebx                      ;save ebx
        mov     eax, DWORD PTR [esp+8]   ;load lo(a) into eax
        mov     ebx, DWORD PTR [esp+16]  ;load lo(b) into ebx
        mov     ecx, DWORD PTR [esp+12]  ;load hi(a) into ecx
        mov     edx, DWORD PTR [esp+20]  ;load hi(b) into edx
        imul    ecx, ebx                 ;ecx = lo(hi(a) * lo(b))
        imul    edx, eax                 ;edx = lo(hi(b) * lo(a))
        add     ecx, edx                 ;ecx = lo(hi(a) * lo(b)) + lo(hi(b) * lo(a))
        mul     ebx                      ;eax = lo(lo(a) * lo(b))
                                         ;edx = hi(lo(a) * lo(b))
        pop     ebx                      ;restore ebx
        add     edx, ecx                 ;edx = hi(lo(a) * lo(b)) + lo(hi(a) * lo(b)) + lo(hi(b) * lo(a))
        ret
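For readers who find the assembly hard to follow, the same hi()/lo() decomposition can be written in C using only 32-bit halves and a single widening multiply. This is a sketch that mirrors the steps above, not compiler output; `mul64_halves` is a hypothetical name, and it assumes `uint32_t` is at least as wide as `int` so that the 32-bit products wrap instead of being promoted.

```c
#include <stdint.h>

/* Hypothetical helper: 64x64 -> 64-bit multiply built from 32-bit pieces. */
uint64_t mul64_halves(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* The two cross terms only need their low 32 bits (the imul steps).
       Unsigned 32-bit arithmetic wraps modulo 2^32. */
    uint32_t cross = a_hi * b_lo + b_hi * a_lo;

    /* Full 32x32 -> 64-bit product of the low halves (the mul step). */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* High word: hi(lo(a) * lo(b)) plus the cross terms, wrapping mod 2^32. */
    uint32_t hi = (uint32_t)(low >> 32) + cross;

    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

The result should match the native `a * b` for every pair of inputs, since both compute the product modulo 2^64.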
Finally, when we try a division, we see:
    int64_t div64(int64_t a, int64_t b) { return a/b; }

    div64:
        sub     esp, 12
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        push    DWORD PTR [esp+28]
        call    __divdi3
        add     esp, 28
        ret
The compiler has decided that division is too complex to implement inline and instead calls a library routine.