Memory access conflict condition in vector processors with memory

Hennessy-Patterson's book on computer architecture (5ed quantitative approach) states that in a vector architecture with multiple memory banks, a bank conflict can occur if the following condition is true (Page 5 of 5):

(number of banks) / least total number (number of banks, step) <Bank hours

However, I think that it should be GreatestCommonFactor instead of LCM, because a memory conflict would occur if the number of effective banks you have is less than the time taken. By the effective number of banks, I mean this - let's say you have 8 banks and step 2. Then you have 4 banks, because access to the memory will be built in only four banks (for example, let your calls all be even numbers, starting from 0, then your appeals will be arranged in banks 0,2,4,6).

In fact, this formula even fails for the example below. Suppose we have 8 memory banks with a busy time of 6 clock cycles, with a total memory delay of 12 clock cycles, how long will it take to complete the 64-element vector load in steps of 1? - Here they calculate the time as 12 + 64 = 76 measures. However, a memory bank conflict will occur in accordance with a given condition, so we obviously cannot have one access per cycle (64 in the equation).

Am I really wrong, or did the wrong formula survive in the 5 editions of this book (unlikely)?

+7
vectorization cpu-architecture hardware processor
source share
1 answer

It should include GCD (banks, step); your argument that this is correct.

Try this for several different steps and see what happens for the number of banks = b = 8 .

 # generated with the calc(1) function define f(s) { print s, " | ", lcm(s,8), " | ", gcd(s,8), " | ", 8/lcm(s,8), " | ", 8/gcd(s,8) }` stride | LCM(s,b) | GCF(s,b) | b/LCM(s,b) | b/GCF(s,b) 1 | 8 | 1 | 1 | 8 # 8 < 6 = false: no conflict 2 | 8 | 2 | 1 | 4 # 4 < 6 = true: conflict 3 | 24 | 1 | ~0.333 | 8 # 8 < 6 = false: no conflict 4 | 8 | 4 | 1 | 2 # 2 < 6 = true: conflict 5 | 40 | 1 | 0.2 | 8 6 | 24 | 2 | ~0.333 | 4 7 | 56 | 1 | ~0.143 | 8 8 | 8 | 8 | 1 | 1 9 | 72 | 1 | ~0.111 | 8 x >=8 2^0..3 <=1 1 2 4 or 8 

b / LCM (s, b) is always <= 1, so it always predicts conflicts.

I think that GCF (aka GCD) looks right for the step values โ€‹โ€‹that I have looked so far. You have a problem only if the step does not distribute the circulation among all banks and that is indicated by b / GCF (s, b).


Stride = 8 must be the worst using the same pot every time. gcd (8,8) = lcm (8,8) = 8. Thus, both expressions give 8/8 = 1, which is less than the time of bank occupation / recovery, which allows you to correctly predict conflicts.

Stride = 1, of course, the best case (no conflicts, if there are enough banks to hide the busy time). gcd (8,1) = 1 does not correctly predict conflicts: (8/1 = 8, which is at least 6). lcm (8,1) = 8. ( 8/8 < 6 true) incorrectly predicts conflicts.

+3
source share

All Articles