What is the limit of optimization with SIMD?

I need to optimize some C code that does a lot of physics computation, using the SIMD extensions on the SPEs of the Cell processor. Each vector operation can process 4 floats at a time, so in the most optimistic case I would expect a 4x speedup.

Do you think that using vector operations can give a large speedup?

Thanks

+5
5 answers

The best optimization occurs when you rethink the algorithm. Eliminate unnecessary steps. Find a more direct way to achieve the same result. Compute a solution in the domain that is more relevant to the problem.

, n, , .

+4

, 4 , , SIMD ( , ), . .

- , . , , 4- . , . , , , , CPU .

+4

, . , , , . ...

+3

.

  • - , , , .
  • SIMD , FPU/ALU (, PAVG/PMIN .. SSE2). , .
  • Cell, SIMD , , . .

Cell PPC, 20- (C SSE2) Atom, (16 ).

+2

. x86 ( SSE).

. SSE, .

, , SSE, , . , . , , SSE, .

There is also the opportunity to hint to the memory controller how you want to access memory, for example whether a store should bypass the cache or not. For bandwidth-starved algorithms, that can give you some extra speed.
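As a rough illustration of the cache-bypassing store mentioned above, here is a minimal sketch using the x86 SSE intrinsics from `<xmmintrin.h>`, not Cell SPU code. The function name `scale_stream`, the 16-byte alignment requirement, and the requirement that `n` be a multiple of 4 are assumptions made for this example. On targets without SSE it falls back to a plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: dst[i] = src[i] * k for i in [0, n).
 * Assumptions for this sketch: dst and src are 16-byte aligned
 * and n is a multiple of 4 (one SSE register holds 4 floats). */
void scale_stream(float *dst, const float *src, size_t n, float k)
{
    assert((uintptr_t)dst % 16 == 0);
    assert((uintptr_t)src % 16 == 0);
    assert(n % 4 == 0);

#if defined(__SSE__)
    const __m128 vk = _mm_set1_ps(k);           /* broadcast k to 4 lanes */
    for (size_t i = 0; i < n; i += 4) {
        __m128 v = _mm_load_ps(src + i);        /* aligned load of 4 floats */
        v = _mm_mul_ps(v, vk);                  /* 4 multiplies at once */
        _mm_stream_ps(dst + i, v);              /* non-temporal store: hints
                                                   that dst should bypass
                                                   the cache */
    }
    _mm_sfence();  /* make the streaming stores visible before returning */
#else
    /* Scalar fallback for targets without SSE. */
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
#endif
}
```

Whether the streaming store helps depends on the workload: it tends to pay off only when the output is large and not read again soon, so it is worth measuring against an ordinary `_mm_store_ps` before committing to it.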

+1