I am trying to understand why our software runs so much slower under virtualization. Most of the statistics I have seen say the penalty should be only about 10% in the worst case, but on a virtualized Windows server the slowdown can be 100-400%. I have been trying to profile the differences, but the profile results do not make much sense to me. Here is what I see when I profile on my 32-bit Vista box with no virtualization:
And here is one run on a Windows 2008 64-bit server with virtualization: 
The slow run spends a very large amount of time in RtlInitializeExceptionChain, which shows as 0.0 in the fast run. Any idea what that function does? Also, when I attach to the process on my machine, there is only a single thread, PulseEvent; however, when I attach on the server, there are two threads, GetDurationFormatEx and RtlInitializeExceptionChain. As far as I know, the code we wrote uses only one thread. For what it's worth, this is a console application written in pure C with no UI.
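For what it's worth, here is roughly how I could double-check the thread count from inside the process itself. This is just a sketch using the Toolhelp snapshot API, not code from our actual application:

    /* Sketch: count the threads belonging to this process using the
     * Toolhelp snapshot API. Not part of our real code. */
    #include <windows.h>
    #include <tlhelp32.h>
    #include <stdio.h>

    static void list_threads(DWORD target_pid)
    {
        HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);
        if (snap == INVALID_HANDLE_VALUE) {
            fprintf(stderr, "CreateToolhelp32Snapshot failed: %lu\n", GetLastError());
            return;
        }

        THREADENTRY32 te;
        te.dwSize = sizeof(te);
        int count = 0;

        if (Thread32First(snap, &te)) {
            do {
                if (te.th32OwnerProcessID == target_pid) {
                    printf("thread id %lu\n", te.th32ThreadID);
                    ++count;
                }
            } while (Thread32Next(snap, &te));
        }

        printf("%d thread(s) in process %lu\n", count, target_pid);
        CloseHandle(snap);
    }

    int main(void)
    {
        /* Count the threads of this process itself. */
        list_threads(GetCurrentProcessId());
        return 0;
    }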
Can anyone shed light on any of this for me? Even just information on what some of these ntdll and kernel32 calls are would help. I am also not sure how much of the difference is related to 64-bit vs. 32-bit and how much is related to virtual vs. non-virtual. Unfortunately, I do not have easy access to other configurations to determine the difference.
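One thing I could check cheaply is whether the 32-bit build is running under the WOW64 layer on the 64-bit server, since that would be a 32/64-bit factor that has nothing to do with the hypervisor. A rough sketch (again, not part of our real code) would be:

    /* Sketch: report whether this process runs under WOW64.
     * IsWow64Process is looked up dynamically because it does not
     * exist on older 32-bit versions of Windows. */
    #include <windows.h>
    #include <stdio.h>

    typedef BOOL (WINAPI *IsWow64Process_t)(HANDLE, PBOOL);

    int main(void)
    {
        BOOL is_wow64 = FALSE;
        IsWow64Process_t fn = (IsWow64Process_t)GetProcAddress(
            GetModuleHandle(TEXT("kernel32")), "IsWow64Process");

        if (fn != NULL && fn(GetCurrentProcess(), &is_wow64)) {
            printf("running under WOW64: %s\n", is_wow64 ? "yes" : "no");
        } else {
            printf("IsWow64Process not available; assuming native 32-bit\n");
        }
        return 0;
    }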
Morinar