How much do indirect pointers affect performance?

Is pointer dereferencing noticeably slower than direct access to the value? I suppose my question really is: how fast is the dereference operator?

+7
5 answers

Indirection through a pointer can be much slower because of how a modern processor works, but it has nothing to do with run-time memory as such.

Instead, prediction and caching determine the speed.

Prediction is easy when the pointer is not changed, or when it is changed in predictable ways (for example, incremented or decremented by four in a loop). This lets the CPU run substantially ahead of the actual code execution, work out what the pointer's value will be, and load that address into the cache. Prediction becomes impossible when the pointer value is produced by a complex expression, such as a hash function.
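The contrast above can be sketched in C. Both functions below walk the same array, but in the first the address advances by a constant stride that a hardware prefetcher can predict, while in the second the next index comes from a hash-like expression, so each address looks random to the prefetcher. The function names and the multiplier constant are illustrative choices, not from the original answer.

```c
#include <stddef.h>

/* Predictable: the address grows by sizeof(long) each iteration,
   so the prefetcher can load upcoming cache lines ahead of time. */
long sum_sequential(const long *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Unpredictable: the next index depends on a multiplicative hash,
   so the prefetcher cannot guess which cache line comes next. */
long sum_hashed(const long *a, size_t n) {
    long s = 0;
    size_t i = 0;
    for (size_t k = 0; k < n; ++k) {
        s += a[i];
        i = (i * 2654435761u + 1) % n;   /* hash-like next index */
    }
    return s;
}
```

On large arrays the second loop is typically several times slower even though it executes the same number of additions.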

The cache comes into play because the pointer may point to memory that is not in the cache and has to be fetched. This is minimized when prediction works, but when prediction is impossible, in the worst case you take a double hit: the pointer itself is not in the cache, and the memory it points to is not in the cache either. In that worst case the processor stalls twice.
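A minimal sketch of that double hit: reading through a pointer that is itself stored in memory costs two dependent loads, and the second load cannot even start until the first one has delivered the address. The function name here is hypothetical.

```c
/* Two dependent loads: if neither *pp nor the target it names is
   cached, the CPU can stall on each of them in turn. */
int read_through(int **pp) {
    int *p = *pp;   /* load 1: fetch the pointer value     */
    return *p;      /* load 2: fetch the data it points to */
}
```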

If the pointer points to a function, the processor's branch predictor comes into play. In C++ virtual tables the function addresses are constant, so prediction is easy: the CPU will have the target code ready in the pipeline when execution reaches the indirect jump. But if the function pointer is unpredictable, the performance impact can be severe, because the pipeline must be flushed on every mispredicted jump, costing 20-40 processor cycles each time.
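The indirect-call case can be shown with a plain C function pointer (the C analogue of a vtable slot). When `fp` always holds the same target, the branch predictor learns it and the call is cheap; when it varies unpredictably from call to call, each misprediction can flush the pipeline. All names below are illustrative.

```c
typedef int (*op_fn)(int, int);

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Indirect call: the jump target is known only at run time,
   so the CPU has to predict it to keep the pipeline full. */
int apply(op_fn fp, int a, int b) {
    return fp(a, b);
}
```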

+20

Depends on things like:

  • whether the "directly accessed" value is in a register or on the stack (the latter also involves a pointer indirection)
  • whether the destination address is already in the cache
  • cache architecture, bus architecture, etc.

I.e. there are too many variables to reason about this usefully without narrowing it down.

If you really want to know, benchmark it on your specific hardware.
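A minimal benchmark sketch along those lines (illustrative, not rigorous: it uses `clock()` from the C standard library, and the array size and names are arbitrary choices):

```c
#include <time.h>

#define N 100000
long long values[N];
long long *ptrs[N];

void setup(void) {
    for (long long i = 0; i < N; ++i) {
        values[i] = i;
        ptrs[i] = &values[i];
    }
}

/* Direct access: one load per element. */
long long sum_direct(void) {
    long long s = 0;
    for (long long i = 0; i < N; ++i)
        s += values[i];
    return s;
}

/* Indirect access: an extra load per element to fetch the pointer. */
long long sum_indirect(void) {
    long long s = 0;
    for (long long i = 0; i < N; ++i)
        s += *ptrs[i];
    return s;
}

/* Wall-clock time for one call, in seconds. */
double seconds(long long (*f)(void)) {
    clock_t t0 = clock();
    volatile long long sink = f();   /* keep the result alive */
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Comparing `seconds(sum_direct)` against `seconds(sum_indirect)` over many repetitions gives a rough number for the hardware at hand; a serious measurement would pin the CPU frequency, warm the cache, and repeat enough times to beat timer resolution.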

+3

It requires more memory accesses:

  • read the address stored in the pointer variable
  • read the value at that address

This does not necessarily cost the same as two simple operations, since accessing an address that has not yet been loaded into the cache can take much longer.
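The two bullet points above can be made concrete. Roughly speaking (and without optimization), a compiler emits one load for the direct access and two for the access through the pointer; the instruction sequences in the comments are an approximation, not exact output of any particular compiler. The names are illustrative.

```c
int g = 7;        /* the value itself          */
int *gp = &g;     /* a pointer to that value   */

/* One memory access: read g directly.              */
int via_value(void)   { return g;   }

/* Two memory accesses: read the address stored in
   gp, then read the value at that address.         */
int via_pointer(void) { return *gp; }
```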

+2

That is true: it requires an additional fetch.
When a variable is accessed by value, it is read directly from its memory location.
Accessing the same variable via a pointer adds the overhead of first fetching the variable's address from the pointer, and then reading the value from that memory location.

This assumes, of course, that the variable is not kept in a register, which it would be in some scenarios, such as tight loops. I believe the question is asking about the overhead itself, without assuming such scenarios.
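The tight-loop caveat can be sketched as follows: reading the pointed-to value once into a local lets the compiler keep it in a register, so the per-iteration dereference disappears. (With optimization enabled, compilers often do this automatically when they can prove no aliasing.) The function and parameter names are hypothetical.

```c
#include <stddef.h>

long scale_sum(const long *arr, size_t n, const long *factor) {
    long f = *factor;      /* one dereference; f then lives in a register */
    long s = 0;
    for (size_t i = 0; i < n; ++i)
        s += arr[i] * f;   /* no per-iteration load of *factor */
    return s;
}
```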

+2

Assuming you are dealing with a plain pointer (and not some kind of smart pointer), the dereference operation itself does not consume any (data) memory at all. It does (potentially) involve an extra memory reference: one to load the pointer itself, the other to access the data it points to.

If you are using a pointer in a tight loop, it will usually be loaded into a register for the duration. In that case the cost is mostly in terms of extra register pressure (i.e. if you use a register to store that pointer, you cannot use it to store something else at the same time). If you have an algorithm that would otherwise exactly fill the registers, keeping the pointer in one might force a spill to memory, and that could make a difference. At one time that was a fairly big loss, but with most modern processors (with plenty of registers and on-chip cache) it is rarely much of a problem. The obvious exception would be an embedded CPU with fewer registers and no cache (and no on-chip memory).

The bottom line is that it is usually fairly insignificant, often below the threshold where you can even measure it reliably.

+2
