How to reduce CPU usage in a program?

I wrote a multi-threaded program that does some calculations with a lot of floating point operations. More specifically, it is a program that compares animation sequences frame by frame. That is, it compares the frame data from animation A with all frames in animation B, for all frames in animation A. I perform this intensive operation for different animations in parallel, so the program can work on pair A-B, pair B-C and pair C-A at the same time. The program uses QtConcurrent and its "map" function, which applies my function to each element of a container of motions. QtConcurrent manages the thread pool for me. I am working on an Intel quad-core processor, so it spawns 4 threads.

Now the problem is that my process is hammering my CPU. Usage is at a constant 100%, and I actually get a blue screen of death if I run my program on a sufficiently large set of motions (page fault in nonpaged area). I suspect that this is because my computer is overclocked. However, could this be because of the way I coded my program? Some very intensive benchmarking tools that I used to test the stability of my machine never crashed my PC. Is there a way to control how my program uses my CPU to reduce the load? Or perhaps I am misunderstanding my problem?

+6
c++ performance parallel-processing qtconcurrent
13 answers

There are great answers here.

I would only add, from the perspective of having done a lot of performance tuning, that unless each thread has been optimized aggressively, chances are it has plenty of room for cycle reduction.

To draw an analogy with a long-distance car rally, there are two ways to win:

  • Make the car go faster.
  • Make fewer stops and side trips.

In my experience, most software, as first written, is quite far from taking the most direct route, especially as the software gets larger.

To find the wasted cycles in your program, as Kenneth Cochran said, never guess. If you fix something without proving that it is a problem, you are investing in a hunch.

A popular way to look for performance issues is to use profilers.

However, I do this a lot, and my method is this: http://www.wikihow.com/Optimize-Your-Program%27s-Performance
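In the spirit of measuring rather than guessing, here is a minimal sketch of timing one piece of work with the standard library. It is neither a profiler nor the method linked above, only a way to check a suspicion before acting on it. The name seconds_taken is made up for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: run `work` once and return the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
double seconds_taken(const std::function<void()>& work) {
    assert(work);  // an empty std::function cannot be called
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the comparison of one animation pair in a call like this, and comparing the result before and after a change, shows whether the change actually helped.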

+5

Overclocking a PC can lead to all sorts of strange problems. If you suspect this is the root cause of your problem, try clocking it within reasonable limits and repeat the tests.

It could also be some rather strange memory bug in which you corrupt your RAM in such a way that Windows (I assume that is the OS, given the BSOD) can no longer recover (very unlikely, but who knows).

Another possibility I can think of is that you have some error in your threading implementation that kills Windows.

But first, I would look at the problem of overclocking ...

+9

The operation you describe is already highly parallelizable. Running several such jobs at once can actually hurt performance. The reason is that the cache of any processor has a limited size, and the more you try to do at the same time, the smaller each thread's share of the cache becomes.

You could also look into using your GPU to absorb part of the load. Modern GPUs are much more efficient for most types of video transformation than a CPU of a similar generation.

+5

I suspect this is because my computer is overclocked.

It is definitely possible. Try setting it to normal speed for a while.

Could this be because of the way I coded my program?

A program running in user mode is unlikely to cause a BSOD.

+4

At a guess, I would say you are not running on a 3-core machine (or a 4-core one, given the 100% usage), and parallelization will actively hurt your performance if you use more threads than cores. Create only one thread per processor core and, whatever you do, never have the same data accessed by different threads at the same time. The cache-coherency mechanisms in most multi-core processors will completely kill your performance. In this case, on an N-core processor handling L-frame animations, I would use thread 1 on frames 0 to L/N, thread 2 on frames L/N to 2*L/N, and so on up to thread N on frames (N-1)*L/N to L. Do the different combinations (A-B, B-C, C-A) one after another so that you don't thrash your cache; it should also be simpler to code.
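The frame split described above can be sketched as follows. This is only an illustration under assumptions: split_frames, compare_frame and compare_all are made-up names, compare_frame is a stand-in for the real frame comparison, and plain std::thread is used here instead of QtConcurrent to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Half-open frame range [first, last) handled by one worker.
using FrameRange = std::pair<std::size_t, std::size_t>;

// Split frame_count frames into `workers` contiguous ranges:
// worker i gets [i*L/N, (i+1)*L/N).
std::vector<FrameRange> split_frames(std::size_t frame_count, std::size_t workers) {
    assert(workers > 0);
    std::vector<FrameRange> ranges;
    ranges.reserve(workers);
    for (std::size_t i = 0; i < workers; ++i) {
        ranges.emplace_back(i * frame_count / workers,
                            (i + 1) * frame_count / workers);
    }
    return ranges;
}

// Hypothetical per-frame work; stands in for the real frame comparison.
double compare_frame(std::size_t frame) {
    return static_cast<double>(frame) * 0.5;
}

// Run one thread per range. Each thread accumulates into a local variable
// and writes only its own slot of `partial`, so no element is shared.
double compare_all(std::size_t frame_count, std::size_t workers) {
    const std::vector<FrameRange> ranges = split_frames(frame_count, workers);
    std::vector<double> partial(ranges.size(), 0.0);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < ranges.size(); ++i) {
        threads.emplace_back([&partial, &ranges, i] {
            double local = 0.0;
            for (std::size_t f = ranges[i].first; f < ranges[i].second; ++f) {
                local += compare_frame(f);
            }
            partial[i] = local;  // single store at the end of the range
        });
    }
    for (std::thread& t : threads) t.join();
    double total = 0.0;
    for (double p : partial) total += p;
    return total;
}
```

For example, split_frames(10, 4) yields the ranges [0,2), [2,5), [5,7) and [7,10), so every frame is covered exactly once even when L is not a multiple of N.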

As a side note: real number crunching like this should use 100% of the processor, which simply means it is running as fast as it can.

+4

Overclocking is the most likely cause of the instability. With any CPU-intensive algorithm, the processor is going to be pushed hard. Overclocking aside, I would get a good performance profiler to find the performance bottlenecks. Never guess where the problem is. You could spend months optimizing something that has no real effect on performance, or worse, performance might even decrease.

+2

It's too easy to blame hardware. I suggest you try running your program on a different system and see how it turns out with the same data.

You probably have a bug.

+1

Look into using SIMD operations. I think you would want SSE in this case. They are often a better first step than parallelization, since they are easier to get right and give a fairly hefty boost to most linear-algebra-style operations.
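As a rough sketch of what that could look like, the function below sums squared differences between two float arrays four lanes at a time with SSE intrinsics. The question does not say what the frame comparison actually computes, so the squared-distance metric and the name squared_distance are assumptions for illustration. A scalar fallback is kept so the code still builds where SSE is not available.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define EXAMPLE_HAVE_SSE 1
#else
#define EXAMPLE_HAVE_SSE 0
#endif

// Sum of squared differences between two equally sized float arrays.
float squared_distance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::size_t i = 0;
    float total = 0.0f;
#if EXAMPLE_HAVE_SSE
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        const __m128 va = _mm_loadu_ps(&a[i]);  // unaligned load of 4 floats
        const __m128 vb = _mm_loadu_ps(&b[i]);
        const __m128 d = _mm_sub_ps(va, vb);
        acc = _mm_add_ps(acc, _mm_mul_ps(d, d));  // acc += d * d, per lane
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    // Scalar tail for the last n % 4 elements (or the whole array when
    // SSE is not available).
    for (; i < n; ++i) {
        const float d = a[i] - b[i];
        total += d * d;
    }
    return total;
}
```

Whether this beats what the compiler already auto-vectorizes depends on the compiler and flags, so it is worth measuring both versions.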

Once you have it using SIMD, then look at parallelizing. It sounds like you are slamming the processor as well, so you might be able to use some sleeps instead of busy waits, and perhaps make sure you are cleaning up or reusing threads correctly.

+1

Without the BSOD error code (useful for searching), it is a little harder to help you with this.

You could try physically reseating your memory (pull it out and put it back in). I and some others I know have worked on several machines where this was necessary. For example, I was once trying to upgrade OS X on a machine and it kept crashing. Finally I pulled the memory out and put it back in, and everything was fine.

0

Sleep(1); will cut CPU usage in half. I ran into the same problem while working with a CPU-intensive algorithm.

0

If your processor has two or more cores, you can open Task Manager, go to the Processes tab, right-click the program name, click Set Affinity and assign fewer cores to the program.

It will take longer to complete the work you ask of it, but it will significantly reduce CPU usage.
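A related approach from inside the program (not the same thing as setting affinity) is to start fewer worker threads than there are cores, leaving some free for the rest of the system. The sketch below only computes that number; worker_count and default_worker_count are made-up names. With QtConcurrent, the result could presumably be passed to QThreadPool::globalInstance()->setMaxThreadCount() before calling map.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Number of worker threads to use when `reserved` hardware threads should
// stay free for the rest of the system. Always returns at least 1.
unsigned worker_count(unsigned hardware_threads, unsigned reserved) {
    assert(hardware_threads > 0);
    if (reserved >= hardware_threads) return 1;
    return hardware_threads - reserved;
}

// Same calculation using what the standard library reports for this
// machine. hardware_concurrency() may return 0 when the value is unknown,
// so fall back to a single hardware thread in that case.
unsigned default_worker_count(unsigned reserved) {
    const unsigned detected = std::thread::hardware_concurrency();
    return worker_count(std::max(detected, 1u), reserved);
}
```

On a quad-core machine, reserving one core this way would give three workers, at the cost of a longer total run time.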

0

I think the blue screen of death is triggered when kernel memory gets corrupted. So using multithreading for parallel operations should not be the cause of it.

Well, if you create several threads, each of which carries out heavy floating-point operations, then, of course, your CPU utilization will reach 100%.

It would be better if you could sleep in each thread so that other processes get a chance to run. You can also try lowering the priority of the threads.
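A minimal sketch of sleeping inside a worker loop, using the portable std::this_thread::sleep_for rather than a Windows-specific call. The function name throttled_sum, the batch size and the pause length are all assumptions for illustration, and the series being summed is just a stand-in for the real floating-point work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Sum 1/(k*k) for k = 1..terms, pausing for `pause` after every `batch`
// iterations so that other processes get a chance to run.
double throttled_sum(std::size_t terms, std::size_t batch,
                     std::chrono::milliseconds pause) {
    assert(batch > 0);
    double total = 0.0;
    for (std::size_t k = 1; k <= terms; ++k) {
        const double x = static_cast<double>(k);
        total += 1.0 / (x * x);
        if (k % batch == 0) {
            std::this_thread::sleep_for(pause);  // give up the CPU briefly
        }
    }
    return total;
}
```

The result is the same as without the pauses; only the elapsed time grows, so the batch size decides how much speed is traded for lower load.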

-1

If this is on Windows, then after some amount of work, make one function call to tell the system that you want to yield the processor to other processes. Call the sleep function like this:

Sleep(0);

-1
