same answer as:
Edit: You asked what your options are. If your heart is set on profiling, then find a profiler.
On the other hand, if you actually have a performance problem to solve, a simple method works as well as or better than nearly every profiler. I say nearly every, because some profilers can give you just what you need to know, which is the time cost attributable to individual instructions, especially call instructions.
The time cost of an instruction is the time that would be saved if the instruction could be removed, and a good estimate of it is the fraction of call stack samples containing it. You do not need to estimate this fraction with high precision. If the instruction is on 5 out of 10 samples, its cost is probably somewhere in the range of 45% to 55%. Either way, if you could get rid of it, you would save what it costs.
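To make the arithmetic concrete, here is a minimal sketch of turning a sample count into an estimated saving. The function name `time_saved` and the 20-second run time are made up for illustration; the only thing taken from the text is that the cost is estimated by the fraction of samples containing the instruction.

```python
def time_saved(total_seconds, samples_containing, samples_taken):
    """Estimate what removing an instruction would save (hypothetical helper).

    The cost is approximated by the fraction of stack samples on which
    the instruction appears.
    """
    fraction = samples_containing / samples_taken
    saved = total_seconds * fraction
    # If a fraction f of the time goes away, the run is 1 / (1 - f) times faster.
    speedup = 1 / (1 - fraction) if fraction < 1 else float("inf")
    return fraction, saved, speedup


# Assumed numbers: a 20-second run, instruction seen on 5 of 10 samples.
fraction, saved, speedup = time_saved(20.0, 5, 10)
print(f"cost ~{fraction:.0%}, saves ~{saved:.1f}s, speedup ~{speedup:.1f}x")
```

Even a rough fraction is enough to decide whether an instruction is worth attacking, since the saving scales directly with it.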
So finding performance problems is not hard. Just take a number of samples of the call stack, collect the set of instructions on those samples, and rank the instructions by the fraction of samples containing them. Among the high-ranking instructions are some that you could optimize, and you do not need to guess where they are.
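The ranking step can be sketched in a few lines. This assumes the samples have already been collected (for example by pausing the program in a debugger and copying the stack) and are represented as lists of call-site strings; the name `rank_call_sites` and the sample data are hypothetical.

```python
from collections import Counter


def rank_call_sites(samples):
    """Rank call sites by the fraction of stack samples that contain them."""
    counts = Counter()
    for stack in samples:
        # Count each site at most once per sample.
        counts.update(set(stack))
    total = len(samples)
    ranked = [(site, n / total) for site, n in counts.items()]
    # Highest fraction first; ties broken by name for a stable order.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical samples: each list is the call stack seen at one pause.
samples = [
    ["main:10", "load:42", "parse:7"],
    ["main:10", "load:42", "parse:7"],
    ["main:10", "render:88"],
    ["main:10", "load:42", "alloc:3"],
]

ranked = rank_call_sites(samples)
for site, fraction in ranked:
    print(f"{site:12s} {fraction:.0%}")
```

Here `load:42` is on 3 of 4 samples, so it is the first place to look for something that can be avoided. `main:10` ranks higher, but it is the program's entry call and cannot be removed.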
I am simplifying a bit, because it is often useful to examine more state information than just the call stack, to see whether some piece of work is really necessary. But I hope this gets the point across.
People doubt that it can work in the presence of recursion, or on large programs. A little thought (and experimentation) shows that these objections do not hold water.
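On the recursion objection, one way to see why it does not matter: the estimate is the fraction of samples on which an instruction appears at all, not how many times it appears. A small sketch with made-up stacks, where `walk:20` is a hypothetical recursive call site:

```python
def fraction_containing(samples, site):
    """Fraction of samples on which `site` appears at least once."""
    if not samples:
        return 0.0
    hits = sum(1 for stack in samples if site in stack)
    return hits / len(samples)


# Hypothetical samples; walk:20 calls itself, so it repeats on a stack.
samples = [
    ["main:5", "walk:20", "walk:20", "walk:20", "visit:31"],
    ["main:5", "walk:20", "walk:20", "visit:31"],
    ["main:5", "report:50"],
    ["main:5", "walk:20", "visit:31"],
]

# Counting every occurrence gives a meaningless "fraction" above 1.
naive = sum(stack.count("walk:20") for stack in samples) / len(samples)
# Counting samples that contain the site stays between 0 and 1.
per_sample = fraction_containing(samples, "walk:20")
print(naive, per_sample)
```

However deep the recursion goes, removing the call would remove at most the samples it appears on, so the per-sample fraction is the figure that estimates the saving.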
Mike Dunlavey