Understanding Tomasulo Algorithm

So, I'm trying to understand the Tomasulo algorithm to execute a command out of order. Here is what I get so far:

  • Instructions are loaded in order and stored in the command queue.

  • Rename registration happens somewhere nearby ...? From what I understand, this is to avoid the dangers of WAR / WAW by providing labels in registers. Let's say you have add r1, r2, r3 (1) add r3, r5, r6 (2) You have a WAR danger and you need to make sure that instruction (1) reads the old value of r3 before adding it to r1. Therefore, I assume that in the command queue (?), The hardware renames the registers, i.e. add r1, r2, r3 # 1 add r3 # 2, r5, r6 Or something like that.

  • Instructions are sent to the booking station. As I understand it, each function block has its own set of booking stations. But does it look like a queue (FIFO) of instructions for this function block to execute when matching labeled operands are available on the shared data bus?

  • Since instructions can end in random order (not in order), and more instructions can continue ... Is there a way when the shared data bus updates the register file before additional instructions appear? I heard that a reorder buffer is used, which basically sorts the instructions in order (this should mean that the instructions have some kind of tag), and then the registry results are transferred back to the register file.

What confuses me is the implementation of register renaming and the structure of booking stations.

Thanks for any help.

+4
source share
2 answers

The Tomasulo algorithm is really not tied to any particular equipment, and in fact, on real machines, register renaming usually happens before instructions are inserted into the command queue. Let me get a couple of highlights. First, the registers expressed in your program are logical registers. Secondly, real hardware registers are called physical registers. The Tomasulo algorithm is just a mechanism for mapping logical registers to physical registers. In real machines, there are usually two tables displaying logical registers for physical registers. One at the rename stage and one at the commit stage. There should also be more physical registers than logical ones. It works essentially like this:

  • For each logical input, find the renaming calendar mapping table to find out which physical register currently has this logical register value.
  • For each logical output, find the unused physical register and map it to it. Refresh the renaming scene display table. If the physical register is not available, wait until it becomes available.
  • At the fixing stage, when the command is completed, rewrite the old logical ones into physical mappings of the logical outputs of the command with new mappings. Free physical registers that were part of old mappings.
  • If there is an incorrect prediction in the pipeline (or any kind of squash), the table at the fixing stage overwrites the table at the renaming stage and everything in the intermediate pipeline is reset.

The part about booking stations is really a concrete implementation of a non-standard conveyor and, in essence, comes with a physical register. There are many cars that do not really have a concept of a booking station.

Hennessy and Patterson's books on computer architecture are a standard textbook on similar materials. I tried to explain this as simply as possible, but there are literally thousands of suggestions for optimizing this material.

+4
source
  • The Tomasulo algorithm has nothing to do with the reordering buffer. The goal of Tomasulo's algorithm is to enable out-of-order execution, while the reordering buffer motivation is to implement an exact interrupt.

  • The usual register renaming scheme provides more physical registers than ISA needs. In this case, before the command was sent to the command queue, the name of its architectural register (for example, r5) was changed to physical (for example, p19). However, the Tomasulo algorithm takes a different approach, where the command register is actually "renamed" to the part number (the part can be a single slot / record in the backup station or in any register).

More information can be found on this slide:

http://www.slideshare.net/onesuper/understanding-tomasolu-algorithm

0
source

All Articles