It is not clear to me from the Intel documentation whether Hyperthreading processors share a register file between threads or have two separate ones (I would assume they are separate, because otherwise the context switch time between HT threads would be quite high, but this is purely a guess).
As for the speedup, it will depend on your instruction mix and scheduling. Remember that an HT processor does not have additional execution resources (ALUs, load/store units, etc.). The performance improvement comes from making better use of those resources, since typical code, especially on a modern processor, spends a fair amount of time stalled, waiting for memory loads to complete before execution can continue. HT lets you overlap these loads and stores, so that while one thread is stalled on a read, the other can switch in and start using execution resources that would otherwise sit idle.
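To put rough numbers on that, here is a toy model, not anything taken from Intel's documentation. It assumes a single thread keeps the execution units busy for some fraction of the time, that a second HT thread can perfectly fill every idle slot, and that the two threads do not slow each other down through cache or bandwidth contention. The function name `ht_speedup_upper_bound` and the parameter `busy_fraction` are made up for illustration. Under those assumptions, this is an optimistic upper bound on the throughput gain:

```python
def ht_speedup_upper_bound(busy_fraction: float) -> float:
    """Idealized upper bound on the throughput gain from a second HT thread.

    busy_fraction: share of time one thread, running alone, keeps the
    execution units busy (the rest is spent stalled, e.g. on memory).
    Assumption: stalls are perfectly overlapped and there is no contention.
    """
    if not 0.0 < busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be in (0, 1]")
    # Two threads together ask for 2 * busy_fraction of the units,
    # but the core can never deliver more than 100% of them.
    combined = min(1.0, 2.0 * busy_fraction)
    return combined / busy_fraction


for busy in (0.5, 0.75, 0.99):
    print(f"busy {busy:.0%}: at most {ht_speedup_upper_bound(busy):.2f}x")
```

The closer a single thread already gets to saturating the execution units, the less room there is for the second thread to help. Real gains will be lower than this bound.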
I would guess that the performance increase you see from multithreading an SSE program will depend on the ratio of memory operations to arithmetic operations. If, for example, your SSE program loads 4 SSE registers from memory, performs 10,000 SSE operations on them, and then writes the 4 registers back, you are unlikely to see much of the benefit of HT, which works by hiding memory access stalls, because 99% of your program's execution time will be spent in the SIMD ALUs, not in memory access.
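As a quick sanity check on that example, here is the operation mix worked out in a few lines. This is only a sketch: it counts operations rather than cycles (a load that misses the cache can cost far more than one arithmetic operation), and the helper name `operation_mix` is made up for illustration.

```python
def operation_mix(loads: int, stores: int, arithmetic_ops: int) -> tuple:
    """Return (memory share, arithmetic share) of the total operation count."""
    memory_ops = loads + stores
    total = memory_ops + arithmetic_ops
    return memory_ops / total, arithmetic_ops / total


# The example above: load 4 registers, do 10,000 SSE operations, store 4 back.
mem_share, alu_share = operation_mix(loads=4, stores=4, arithmetic_ops=10_000)
print(f"memory: {mem_share:.2%}, arithmetic: {alu_share:.2%}")
```

With only 8 memory operations out of 10,008, there is very little stall time for a second HT thread to hide, even if each of those memory operations is individually slow.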
On the other hand, if your program is this compute-heavy, multithreading it can give a significant performance gain on multi-core processors, potentially much more than a 30% improvement, since in that case your code has access to the full execution resources of several cores simultaneously.