Why is the main thread slower than the worker thread in pthread-win32?

    #include <cmath>
    #include <cstdio>
    #include <ctime>
    #include <pthread.h>

    void* worker(void*)
    {
        clock_t clk = clock();
        float val = 0;
        for (int i = 0; i != 100000000; ++i) {
            val += sin(i);
        }
        printf("val: %f\n", val);
        // clock() counts ticks; CLOCKS_PER_SEC is 1000 on Windows, so ticks are ms there
        printf("worker: %d ms\n", (int)(clock() - clk));
        return 0;
    }

    int main()
    {
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);

        clock_t clk = clock();
        float val = 0;
        for (int i = 0; i != 100000000; ++i) {
            val += sin(i);
        }
        printf("val: %f\n", val);
        printf("main: %d ms\n", (int)(clock() - clk));

        pthread_join(tid, 0);
        return 0;
    }

The main thread and the worker thread should run equally fast, but this is the result:

    val: 0.782206
    worker: 5017 ms
    val: 0.782206
    main: 8252 ms

The main thread is much slower, and I do not know why.


The problem is resolved. This is a compiler issue: GCC (MinGW) behaves strangely on Windows. I tested the code in Visual Studio 2012, and there is no difference in speed there.

c++ pthreads-win32
2 answers
  The main thread and the worker thread are supposed to run equally fast, but the result is:

I have never seen a threading system outside of a real-time operating system that provides such guarantees. With Windows threads and every other threading system I have used on desktop systems (POSIX threads, the lightweight threads on Mac OS X, and threads in C#), my understanding is that there are no performance guarantees about how fast one thread will run relative to another.

A possible explanation (speculation) is that, since you are using a modern quad-core processor, it may be raising the clock speed of the primary core. Under mostly single-threaded workloads, modern i5 / i7 / AMD FX systems boost the clock frequency of a single core to a preset level that the stock cooling can still dissipate. Under more parallel loads, all cores get a smaller boost, again preset based on heat dissipation, and when idle all cores are throttled down to minimize energy consumption. It is possible that the background work runs mostly on one core, and that the time the second thread spends on the second core is not enough to justify switching to the mode in which all cores are boosted.

I would try again with 4 threads and a 10x workload. If you have a tool that monitors CPU usage and clock speed, I would watch it while the test runs. With that information you can conclude whether I am right or wrong.
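The retry with more threads could look roughly like the sketch below. It is not the code from the question: it swaps pthreads and clock() for std::thread and std::chrono::steady_clock so that each thread measures its own wall-clock time, and the helper names timed_workload and run_threads are made up for illustration.

```cpp
#include <chrono>
#include <cmath>
#include <thread>
#include <vector>

// Hypothetical helper: runs the sin() loop from the question and returns
// the elapsed wall-clock time in milliseconds.
double timed_workload(int iterations) {
    auto start = std::chrono::steady_clock::now();
    volatile float val = 0;  // volatile so the loop is not optimized away
    for (int i = 0; i != iterations; ++i) {
        val = val + static_cast<float>(std::sin(i));
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: starts `thread_count` threads with the same workload
// and returns the time each thread measured for itself.
std::vector<double> run_threads(int thread_count, int iterations) {
    std::vector<double> elapsed_ms(thread_count, 0.0);
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        // Each thread writes only to its own slot, so no locking is needed.
        threads.emplace_back([&elapsed_ms, t, iterations] {
            elapsed_ms[t] = timed_workload(iterations);
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return elapsed_ms;
}
```

Calling run_threads(4, 1000000000) would correspond to 4 threads with 10 times the original iteration count. If the per-thread times come out close to each other, that points away from a scheduling or clock-speed effect on one particular thread.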

Another option would be to profile the program and see which part of the work takes the time. It is possible that OS calls take longer than your workload.

You can also test your software on another computer with different performance characteristics, such as a constant clock speed or a single-core processor. This will provide additional information.


What may be happening is that execution of the worker thread is interleaved with execution of the main thread, so that some of the worker's execution time is counted against the main thread's time. You can try putting sleep(10) (some time longer than the run time of both the worker and main) at the very beginning of the worker and running it again.
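A minimal sketch of that experiment, assuming std::thread and std::this_thread::sleep_for in place of pthreads and sleep(); the names timed_loop, Timings and measure_without_overlap are hypothetical:

```cpp
#include <chrono>
#include <cmath>
#include <thread>

// Hypothetical helper: times the sin() loop and returns wall-clock ms.
double timed_loop(int iterations) {
    auto start = std::chrono::steady_clock::now();
    volatile float val = 0;  // volatile so the loop is not optimized away
    for (int i = 0; i != iterations; ++i) {
        val = val + static_cast<float>(std::sin(i));
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timings {
    double main_ms;
    double worker_ms;
};

// The worker sleeps before it starts measuring. If `worker_delay` is longer
// than the main thread's run time, the two loops do not overlap.
Timings measure_without_overlap(int iterations,
                                std::chrono::milliseconds worker_delay) {
    Timings result{0.0, 0.0};
    std::thread worker([&result, iterations, worker_delay] {
        std::this_thread::sleep_for(worker_delay);
        result.worker_ms = timed_loop(iterations);
    });
    result.main_ms = timed_loop(iterations);  // runs while the worker sleeps
    worker.join();                            // wait before reading worker_ms
    return result;
}
```

With the iteration count from the question, a delay such as std::chrono::seconds(10) should be comfortably longer than either loop. If the two times are equal once the loops no longer overlap, the difference in the original run came from the threads competing with each other rather than from one thread being inherently slower.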
