Performance characteristics of fundamental operations for computational estimation of algorithmic complexity

I have written a compiler for a general-purpose programming language. As part of the tool chain, I would like to include a profiler that can estimate the time complexity of an expression. It is simple enough to compute the asymptotic complexity, i.e. under the assumption that all constant-time operations take the same amount of time, but I would also like to approximate the real cost. For that I need information on the relative performance of individual processor instructions, such as inc, add, mul, etc., as well as of some higher-level operations, such as I/O.
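
To make the idea concrete, here is a minimal sketch of the kind of cost model I have in mind. The operation names and weights below are made-up placeholders, not measured values; the real numbers would have to come from vendor documentation or measurement on the target machine.

    /* Sketch of a weighted cost model for the profiler.
       The weights are hypothetical placeholders, normalized so that an
       integer add costs 1.0 unit; real values would come from vendor
       latency tables or from measurement on the target CPU. */
    #include <stdio.h>

    typedef enum { OP_INC, OP_ADD, OP_MUL, OP_DIV, OP_LOAD, OP_IO } op_kind;

    static const double op_weight[] = {
        [OP_INC]  = 1.0,   /* assumed equal to add on most cores      */
        [OP_ADD]  = 1.0,
        [OP_MUL]  = 3.0,   /* assumed: a few cycles of latency        */
        [OP_DIV]  = 25.0,  /* assumed: division is far more expensive */
        [OP_LOAD] = 4.0,   /* assumed: L1 hit; a miss costs far more  */
        [OP_IO]   = 1e4,   /* assumed: a syscall dwarfs any ALU op    */
    };

    /* Estimated cost of an expression: sum of (operation count * weight). */
    static double estimate_cost(const op_kind *ops, const long *counts, int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++)
            total += (double)counts[i] * op_weight[ops[i]];
        return total;
    }

    int main(void) {
        op_kind ops[]    = { OP_ADD, OP_MUL, OP_LOAD };
        long    counts[] = { 1000, 1000, 1000 };
        printf("estimated cost: %.1f units\n", estimate_cost(ops, counts, 3));
        return 0;
    }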

I understand that this depends on both the architecture and the implementation, and that it can give at best only rough results, so this is really a two-part question. First, can anyone point me to good resources on the relative cost of these operations? Second, would studying an open-source implementation of the higher-level operations give me enough information to make a fair estimate of their cost?

+4
4 answers

In most modern processors, the notion of a "cycle time for a particular instruction" is not particularly useful. The pipeline processes several instructions at once, and they compete for various resources inside the CPU, so the performance of an instruction can only be understood in the context of the surrounding instructions. And the details differ significantly even between models of the same processor family.
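
As a rough illustration (a sketch of mine, not taken from any reference), the code below sums the same array twice: once through a single serial dependency chain, and once with four independent accumulators. The instruction mix is nearly identical, yet the second loop is usually noticeably faster because its additions can overlap in the pipeline. Compile with optimizations but without -ffast-math, so the compiler preserves the floating-point evaluation order.

    /* Same arithmetic, different dependency structure.
       Build e.g.: cc -O2 ilp_demo.c */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 10000000

    static double now_sec(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    int main(void) {
        double *a = malloc(N * sizeof *a);
        for (long i = 0; i < N; i++) a[i] = 1.0;

        /* One accumulator: every add must wait for the previous one. */
        double t0 = now_sec();
        double s = 0.0;
        for (long i = 0; i < N; i++) s += a[i];
        double t1 = now_sec();

        /* Four accumulators: independent adds can execute in parallel. */
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        for (long i = 0; i < N; i += 4) {
            s0 += a[i]; s1 += a[i + 1]; s2 += a[i + 2]; s3 += a[i + 3];
        }
        double t2 = now_sec();

        printf("serial chain:   %.3f s (sum %.0f)\n", t1 - t0, s);
        printf("4 accumulators: %.3f s (sum %.0f)\n", t2 - t1, s0 + s1 + s2 + s3);
        free(a);
        return 0;
    }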

In addition, if you are doing anything that touches a significant amount of data, cache behavior is likely to matter as much as instruction execution time.
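
The same kind of quick experiment shows the cache effect: both loops below execute the same number of loads and adds, but the strided version uses only one value from each cache line it fetches and is typically several times slower on arrays that do not fit in cache. Again, this is only a sketch; the exact ratio depends on the machine.

    /* Same instruction count, very different memory behavior.
       Build e.g.: cc -O2 cache_demo.c */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1 << 24)   /* 16M ints (64 MB), larger than typical caches */
    #define STRIDE 16     /* 16 * 4 bytes = one 64-byte cache line        */

    static double now_sec(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    int main(void) {
        int *a = malloc((size_t)N * sizeof *a);
        for (long i = 0; i < N; i++) a[i] = 1;

        long sum = 0;

        /* Sequential pass: each fetched cache line is fully used. */
        double t0 = now_sec();
        for (long i = 0; i < N; i++)
            sum += a[i];
        double t1 = now_sec();

        /* Strided passes: same number of accesses, one int per cache line. */
        for (long s = 0; s < STRIDE; s++)
            for (long i = s; i < N; i += STRIDE)
                sum += a[i];
        double t2 = now_sec();

        printf("sequential: %.3f s\n", t1 - t0);
        printf("strided:    %.3f s (checksum %ld)\n", t2 - t1, sum);
        free(a);
        return 0;
    }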

For x86, see Agner Fog's Software Optimization Resources.

+2

Intel publishes information about its instruction-set implementation in its documentation library. The good documents are fairly dense (for example, one 600-page optimization reference PDF), but they contain a lot of useful information, including tables of approximate instruction latencies. There is also a latency table for their 64-bit architecture, so you could look for a similar one for 32-bit if you need it.

I personally don't know of any comparable information for AMD processors. Google may turn up some results, but I haven't used an AMD machine since my Athlon 3000 days, so I have never needed to look for it.

+2

From what I know:
inc: min O(1), max O(log n)
add, sub: O(log n)
mul, div: O(n)

malloc: O(n * m), where n is the requested size and m is the number of previous allocations.
free: O(1) (sometimes O(log m)).
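
If you want to check figures like these against a particular allocator, a crude measurement sketch such as the one below can help. It times batches of malloc calls as the number of live allocations grows; the absolute numbers depend entirely on the libc allocator and request size.

    /* Crude probe of how malloc cost varies with the number of live
       allocations; results depend on the allocator in use. */
    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define ROUNDS 5
    #define BATCH  100000

    static double now_sec(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    int main(void) {
        void **blocks = malloc((size_t)ROUNDS * BATCH * sizeof *blocks);
        size_t live = 0;

        for (int r = 0; r < ROUNDS; r++) {
            double t0 = now_sec();
            for (int i = 0; i < BATCH; i++)
                blocks[live++] = malloc(64);   /* fixed-size requests */
            double t1 = now_sec();
            printf("round %d: %zu live blocks, %.0f ns per malloc\n",
                   r + 1, live, (t1 - t0) / BATCH * 1e9);
        }

        while (live > 0)
            free(blocks[--live]);
        free(blocks);
        return 0;
    }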

+1

Reinhard Wilhelm's group in Saarbrücken does research on execution-time analysis, including cache behavior.

+1
