bob.s
.data
variable:
    .word 0,0,0,0
    .word 0,0,0,0
    .word 0,0,0,0
    .word 0,0,0,0
    .word 0,0,0,0
    .word 0,0,0,0
.text
.globl runAssemblyCode
runAssemblyCode:
    mov $0xFFFFFFFF,%eax
start_loop:
    decl variable+0
    decl variable+8
    decl variable+16
    # Uncomment the next line to add the fourth decrement. GNU as on x86
    # uses '#' for comments; ';' only separates statements.
    #decl variable+24
    dec %eax
    jne start_loop
    retq
ted.c
#include <stdio.h>
#include <time.h>

void runAssemblyCode ( void );

int main ( void )
{
    volatile unsigned int ra,rb;

    ra=(unsigned int)time(NULL);
    runAssemblyCode();
    rb=(unsigned int)time(NULL);
    printf("%u\n",rb-ra);
    return(0);
}
gcc -O2 ted.c bob.s -o ted
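time() only resolves whole seconds, so a difference of a few hundred milliseconds between the two builds would not show up in the numbers here. As a rough sketch (not what was used for the timings in this post), the same measurement could be taken with the POSIX monotonic clock. stand_in_work and elapsed_seconds are made-up names for illustration; stand_in_work is only a placeholder so the snippet builds without bob.s.

```c
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for runAssemblyCode() so this sketch builds on its
   own; in the real test you would link bob.s and pass runAssemblyCode. */
volatile unsigned int counter;

void stand_in_work(void)
{
    for (unsigned int i = 0; i < 1000000u; i++)
        counter--;
}

/* Run fn once and return the elapsed wall-clock time in seconds,
   measured with the monotonic clock instead of time(). */
double elapsed_seconds(void (*fn)(void))
{
    struct timespec t0, t1;

    assert(fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Print the elapsed time with millisecond precision. */
void report(void)
{
    printf("%.3f s\n", elapsed_seconds(stand_in_work));
}
```

In ted.c you would call elapsed_seconds(runAssemblyCode) from main and print the result. On older glibc versions (before 2.17) clock_gettime lives in librt, so the link line may need -lrt.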
This is the disassembly with the additional instruction enabled:
00000000004005d4 <runAssemblyCode>:
  4005d4:  b8 ff ff ff ff         mov    $0xffffffff,%eax

00000000004005d9 <start_loop>:
  4005d9:  ff 0c 25 28 10 60 00   decl   0x601028
  4005e0:  ff 0c 25 30 10 60 00   decl   0x601030
  4005e7:  ff 0c 25 38 10 60 00   decl   0x601038
  4005ee:  ff 0c 25 40 10 60 00   decl   0x601040
  4005f5:  ff c8                  dec    %eax
  4005f7:  75 e0                  jne    4005d9 <start_loop>
  4005f9:  c3                     retq
  4005fa:  90                     nop
I don't see a difference. Maybe you can fix my code, or others can try it on their systems to see what they get ...
A decrement on a memory operand is a very painful instruction (it is a read-modify-write), and if you are decrementing anything wider than a byte at an unaligned address, it is painful for the memory system as well. This routine should therefore be sensitive to cache lines, the number of cores, and so on.
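To make the cache-line point concrete, here is a small sketch that works out which cache line each of the four decl targets from the disassembly falls in. The 64-byte line size is an assumption (it is not stated anywhere in this post), and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed cache-line size in bytes; not taken from the post. */
#define ASSUMED_LINE_SIZE 64u

/* Index of the cache line that contains byte address addr. */
uint64_t cache_line_index(uint64_t addr, uint64_t line_size)
{
    assert(line_size != 0);
    return addr / line_size;
}

/* Nonzero if an access of width bytes starting at addr crosses a line boundary. */
int straddles_line(uint64_t addr, uint64_t width, uint64_t line_size)
{
    assert(line_size != 0 && width != 0);
    return cache_line_index(addr, line_size)
        != cache_line_index(addr + width - 1, line_size);
}

/* Report line index, alignment and straddling for the four 4-byte decl targets. */
void print_decl_targets(void)
{
    static const uint64_t targets[] = { 0x601028, 0x601030, 0x601038, 0x601040 };

    for (size_t i = 0; i < sizeof targets / sizeof targets[0]; i++) {
        uint64_t a = targets[i];
        printf("0x%llx: line 0x%llx, %s, %s\n",
               (unsigned long long)a,
               (unsigned long long)cache_line_index(a, ASSUMED_LINE_SIZE),
               a % 4 == 0 ? "4-byte aligned" : "unaligned",
               straddles_line(a, 4, ASSUMED_LINE_SIZE) ? "straddles" : "within one line");
    }
}
```

Under that 64-byte assumption the first three targets share one line and the fourth (0x601040) sits at the start of the next one. All four are 4-byte aligned and none crosses a line boundary, so this particular layout does not exercise the unaligned case.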
AMD Phenom 9950 quad-core processor
About 13 seconds with or without the additional instruction.
Intel(R) Core(TM)2 CPU 6300
About 9-10 seconds with or without the additional instruction.
A two-processor machine: Intel(R) Xeon(TM) CPU
About 13 seconds with or without the additional instruction.
On this one: Intel(R) Core(TM)2 Duo CPU T7500
8 seconds with or without.
All of these were running 64-bit Ubuntu 10.04 or 10.10; there might be an 11.04 in there.
A few more machines, all 64-bit Ubuntu:
Intel(R) Xeon(R) CPU X5450 (8 cores)
6 seconds with or without the additional instruction.
Intel(R) Xeon(R) CPU E5405 (8 cores)
9 seconds with or without.
What is the speed of the DDR/DRAM in your system? Which processor are you running (cat /proc/cpuinfo if on Linux)?
Intel(R) Xeon(R) CPU E5440 (8 cores)
6 seconds with or without.
Ahh, found a single-core machine, a Xeon though: Intel(R) Xeon(TM) CPU
15 seconds with or without the additional instruction.