Slow performance when executing x86 instructions stored in a data segment?

I have a simple program that first writes some native x86 instructions into a declared buffer, then sets a function pointer to that buffer and calls through it. However, I see a severe performance penalty when the buffer is allocated on the stack (as opposed to on the heap, or even in the global data area). I verified that the instruction sequence in the data buffer starts on a 16-byte boundary (I assume that is what the processor requires, or at least prefers). I don't see why it should matter where in the process I execute my instructions from, but in the program below, "GOOD" runs in 4 seconds on my dual-core workstation, while "BAD" takes 6 minutes or so. Is there some kind of alignment / i-cache / prediction issue going on here? My evaluation license for VTune has just ended, so I can't even run an analysis on this :( Thank you.


#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef int (*funcPtrType)(int, int);

int foo(int a, int b) { return a + b; }

void main()
{
  // Instructions in buf are identical to what the compiler generated for "foo".
  char buf[201] = {0x55,
                   0x8b, 0xec,
                   0x8b, 0x45, 0x08,
                   0x03, 0x45, 0x0c,
                   0x5D,
                   0xc3
                  };

  int i;

  funcPtrType ptr;

#ifdef GOOD
  char* heapBuf = (char*)malloc(200);
  printf("Addr of heap buf: %x\n", &heapBuf[0]);
  memcpy(heapBuf, buf, 200);
  ptr = (funcPtrType)(&heapBuf[0]);
#else // BAD
  printf("Addr of local buf: %x\n", &buf[0]);
  ptr = (funcPtrType)(&buf[0]);
#endif

  for (i=0; i < 1000000000; i++)
    ptr(1,2);
}

Build and run output:

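$ cl -DGOOD ne3.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 11.00.7022 for 80x86
Copyright (C) Microsoft Corp 1984-1997. All rights reserved.

ne3.cpp
Microsoft (R) 32-Bit Incremental Linker Version 5.10.7303
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.

/out:ne3.exe
ne3.obj
$ time ./ne3
Addr of heap buf: 410eb0

real 0m 4.33s
user 0m 4.31s
sys 0m 0.01s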
$
$
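$ cl ne3.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 11.00.7022 for 80x86
Copyright (C) Microsoft Corp 1984-1997. All rights reserved.

ne3.cpp
Microsoft (R) 32-Bit Incremental Linker Version 5.10.7303
Copyright (C) Microsoft Corp 1992-1997. All rights reserved.

/out:ne3.exe
ne3.obj
$ time ./ne3
Addr of local buf: 12feb0

real 6m41.19s
user 6m40.46s
sys 0m 0.03s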
$
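
As a rough sanity check on the two addresses printed above (12feb0 and 410eb0), the sketch below does the alignment arithmetic by hand. It assumes 64-byte cache lines and 4 KiB pages, which are guesses about the machine and are not stated in the post; the helper names `align_down`, `same_block` and `describe` are made up for illustration. The point it tries to make is that a 16-byte-aligned start says nothing about what else shares the same line or page as the code, and x86 processors are sensitive to stores that land close to instructions they are executing.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define ASSUMED_LINE 64u    /* assumption: cache line size in bytes */
#define ASSUMED_PAGE 4096u  /* assumption: page size in bytes */

/* Round addr down to a multiple of granule (granule must be a power of two). */
uintptr_t align_down(uintptr_t addr, uintptr_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return addr & ~(granule - 1);
}

/* Nonzero if a and b fall inside the same granule-sized block. */
int same_block(uintptr_t a, uintptr_t b, uintptr_t granule)
{
    return align_down(a, granule) == align_down(b, granule);
}

/* Print an address together with its offset from a 16-byte boundary and
   the base of the (assumed) cache line and page that contain it. */
void describe(const char *label, uintptr_t addr)
{
    printf("%s: 0x%" PRIxPTR "  mod 16 = %u  line base = 0x%" PRIxPTR
           "  page base = 0x%" PRIxPTR "\n",
           label, addr, (unsigned)(addr & 15u),
           align_down(addr, ASSUMED_LINE), align_down(addr, ASSUMED_PAGE));
}
```

Calling `describe("local buf", 0x12feb0)` and `describe("heap buf", 0x410eb0)` shows that both buffers start on a 16-byte boundary, so alignment alone does not separate the two cases. To compare against a real local, one could pass `(uintptr_t)&i` and `(uintptr_t)&buf[0]` to `same_block` inside the original program.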


  • Shasank
+5
2

?

, MMU. , . - . , , , , - SW - .

i-cache?

, . , x86 /, , , . , , (, ), , , , .

CPU DRAM , , , , , , - , "" HW, . , Intel AMD , , , .

+3

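My guess is that, since the variable i is also on the stack, every time you change i in the for loop you dirty the same cache line that the code sits in. Put the code somewhere in the middle of the buffer (and perhaps make the buffer bigger) so that it stays away from the other stack variables.

Also note that executing instructions on the stack is usually a sign of a security exploit in action (a buffer overflow, for example).

Because of that, the OS is often configured to disallow it. Virus scanners may react to it as well. Perhaps your program goes through a security check every time it touches that stack page (although in that case I would expect the sys time to be larger).

If you want to "officially" make a memory page executable, you should probably look into VirtualProtect().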

+2
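
To make the VirtualProtect() suggestion concrete, here is a minimal sketch of copying the instruction bytes into a dedicated region and only then marking it executable, so the code no longer shares memory with stack variables. The function names `make_executable_copy` and `free_executable_copy` are hypothetical, and the non-Windows branch (mmap/mprotect) is an addition for portability that the thread itself does not mention. Treat it as an illustration under those assumptions rather than a definitive implementation.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

/* Copy len bytes of machine code into a fresh page-aligned region, then
   switch that region from read+write to read+execute.
   Returns NULL on failure. */
void *make_executable_copy(const void *code, size_t len)
{
    void *mem;
    assert(code != NULL && len > 0);
#ifdef _WIN32
    DWORD old_protect;
    mem = VirtualAlloc(NULL, len, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (mem == NULL)
        return NULL;
    memcpy(mem, code, len);
    if (!VirtualProtect(mem, len, PAGE_EXECUTE_READ, &old_protect)) {
        VirtualFree(mem, 0, MEM_RELEASE);
        return NULL;
    }
    FlushInstructionCache(GetCurrentProcess(), mem, len);
#else
    mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, len);
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
#endif
    return mem;
}

/* Release a region obtained from make_executable_copy. */
void free_executable_copy(void *mem, size_t len)
{
#ifdef _WIN32
    (void)len;  /* MEM_RELEASE frees the whole allocation */
    VirtualFree(mem, 0, MEM_RELEASE);
#else
    munmap(mem, len);
#endif
}
```

In the program from the question, this would replace the malloc/memcpy pair: `ptr = (funcPtrType)make_executable_copy(buf, 200);`. The bytes in buf are 32-bit x86 code using the stack-based calling convention, so the result should only be called from a 32-bit x86 build.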
