Algorithmic complexity analysis: using Knuth's ordinary operations (oops) and memory operations (mems) in practice

When implementing most algorithms (sorting, searching, graph traversal, etc.), there is often a trade-off to be made: memory accesses can be reduced at the cost of extra ordinary operations.

Knuth has a useful method for comparing the complexity of different implementations of an algorithm, abstracting away from specific processors and counting only ordinary operations (oops) and memory operations (mems).
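To show the counting convention concretely, here is a minimal sketch of my own (the word names are hypothetical, and I am assuming the simple convention that each @ or ! costs one mem and every other stack/arithmetic word costs one oop):

```forth
\ Minimal sketch of oops/mems counting; word names are my own.
\ Assumed convention: @ and ! cost 1 mem each, all other words 1 oop each.

variable total

: add-to-total ( n -- )
  total @ + total ! ;   \ 2 mems (@ and !), the rest oops

: cube ( n -- n^3 )
  dup dup * * ;         \ 0 mems, 4 oops
```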

In compiled programs, the compiler usually takes care of scheduling the low-level operations, and one hopes the operating system and hardware will sort out whether the data sits in cache (faster) or in virtual memory (slower). In addition, the exact number and cost of instructions is hidden by the compiler.

With Forth there is no such encapsulation, and one is much closer to the machine, albeit perhaps to a stack machine running on top of a register machine.

Ignoring the influence of the operating system (so there are no memory stalls, etc.) and assuming for the moment a simple processor:

(1) Can anyone advise how ordinary stack operations in Forth (e.g. dup, rot, over, swap, etc.) compare in cost to a memory access via fetch (@) or store (!)?

(2) Is there a rule of thumb I can use to decide how many ordinary operations it is worth trading to save one memory access?

What I'm looking for is something like "a memory access is worth about 50 ordinary operations, or 500 ordinary operations, or 5 ordinary operations"; a ballpark figure is absolutely fine.

I am trying to understand the relative cost of fetch and store vs. rot, swap, dup, drop, over, correct to within an order of magnitude.
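To make the trade-off concrete, here is a hedged sketch of the kind of choice in question (my own word names, standard core words assumed): keeping an intermediate result on the stack with a little extra juggling, versus parking it in a variable at the price of extra @ and ! per iteration.

```forth
\ Sum the first LEN cells starting at ADDR, two ways; words are my own.

\ Version A: running sum stays on the stack.
\ Per element: 1 mem (@) and roughly 5 oops (over i cells + +).
: sum-on-stack ( addr len -- sum )
  0 swap 0 ?do
    over i cells + @ +
  loop
  nip ;

\ Version B: running sum lives in a variable.
\ Per element: 3 mems (@, acc @, acc !) and roughly 7 oops.
variable acc
: sum-in-var ( addr len -- sum )
  0 acc !
  0 ?do
    dup i cells + @  acc @ +  acc !
  loop
  drop acc @ ;
```

On that count the stack version saves two mems per element in exchange for a bit of stack juggling, which is exactly the kind of ratio a rule of thumb would have to settle.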

+4
1 answer

The article How long does it take to fetch a single word from memory? talks about the cost of going out to main memory, with some rule-of-thumb numbers; basically you can execute a great many instructions in the time spent stalled on main memory. As others have said, the numbers vary greatly between systems.

Main memory access is a big area of interest, especially as processors gain more and more cores while memory bandwidth generally does not grow as fast. Some research is under way on compressing data in main memory, so the processor can spend "spare" cycles to get more densely packed cache lines: http://oai.cwi.nl/oai/asset/15564/15564B.pdf

For those really interested in the details, most processor manufacturers publish detailed optimization guides covering memory and so on, aimed mainly at high-performance computing and compiler writers, but worth reading for all 2GL and 3GL programmers.

P.S. Go Forth.

+3
