What is "false sharing"? How to reproduce / avoid it?

Question

Today I found that my professor in the Parallel Programming class and I understand “false sharing” differently. What my professor said made little sense to me, so I pointed it out immediately. She thought that “false sharing” causes a mistake in the program's result.

I said that “false sharing” happens when different memory addresses map to the same cache line, so that writing to one of them causes the other to be evicted from the cache. If processors write to the two falsely shared addresses in turn, neither of them can stay in the cache, so all operations result in DRAM accesses.

That is my opinion so far. In fact, I am not entirely sure about what I said either... If I have misunderstood something, please point it out.

So I have a few questions. Assume the cache has 64-byte lines and is 4-way set-associative.

Is it possible for two addresses separated by more than 64 bytes to be subject to "false sharing"?
Is it possible for a single-threaded program to run into a "false sharing" problem?
What is the best code example for reproducing “false sharing”?
In general, what should programmers keep in mind in order to avoid “false sharing”?

optimization caching parallel-processing computer-architecture false-sharing

Aean Mar 31 '14 at 15:49

1 answer

chrk · Accepted Answer · 2014-11-19T16:29:19+0000

I'll share my point of view on your questions.

Two addresses separated by more bytes than the block size will not reside on the same cache line. Thus, if one core has the first address in its cache and another core requests the second address, the first will not be removed from the cache because of that request, so a false sharing miss will not occur.
I cannot imagine how false sharing would occur without concurrency, since there would be nobody other than the single thread to compete for the cache line.
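The first point can be checked with a small sketch. It assumes the 64-byte line size from the question, and that converting a pointer to `uintptr_t` yields the plain address (true on mainstream platforms, but implementation-defined in C). The names `CACHE_LINE_SIZE`, `cache_line_index`, and `same_cache_line` are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed line size, matching the 64-byte lines stated in the question. */
#define CACHE_LINE_SIZE 64u

/* Index of the cache line that contains addr (address divided by line size). */
static uintptr_t cache_line_index(const void *addr)
{
    return (uintptr_t)addr / CACHE_LINE_SIZE;
}

/* True when both addresses fall inside the same cache line. */
static bool same_cache_line(const void *a, const void *b)
{
    return cache_line_index(a) == cache_line_index(b);
}
```

For a buffer aligned to 64 bytes, offsets 0 and 63 land on the same line, while offsets 0 and 64 do not. Note that the converse of the first point does not hold: offsets 60 and 70 are only 10 bytes apart, yet they sit on different lines.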

Taken from here, using OpenMP, a simple example that reproduces false sharing would be:

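    double sum = 0.0, sum_local[NUM_THREADS];

    #pragma omp parallel num_threads(NUM_THREADS)
    {
        int me = omp_get_thread_num();
        sum_local[me] = 0.0;

        /* Adjacent sum_local[] slots sit on the same cache line,
           so every thread's writes invalidate the others' copies. */
        #pragma omp for
        for (int i = 0; i < N; i++)
            sum_local[me] += x[i] * y[i];

        /* Combine the per-thread partial sums. */
        #pragma omp atomic
        sum += sum_local[me];
    }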

Some general notes that I can think of in order to avoid false sharing would be as follows:
a. Use private data as much as possible.
b. Sometimes you can use padding to align data, to make sure that no other variables reside in the same cache line as the shared data.
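Both notes can be sketched in C with OpenMP. This is only an illustration under stated assumptions: a 64-byte line size, a C11 compiler for `_Alignas`, and the hypothetical names `dot_private`, `padded_double`, and `CACHE_LINE_SIZE`. `NUM_THREADS` is given an arbitrary value of 4 here. If OpenMP is not enabled at compile time, the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

#define CACHE_LINE_SIZE 64   /* assumed line size, as in the question */
#define NUM_THREADS 4        /* hypothetical thread count */

/* (a) Private data: the reduction clause gives every thread its own
   private copy of sum and combines the copies once, after the loop. */
static double dot_private(const double *x, const double *y, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum) num_threads(NUM_THREADS)
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* (b) Padding: aligning the member to the line size rounds the struct
   size up to 64 bytes, so each array element occupies its own line. */
struct padded_double {
    _Alignas(CACHE_LINE_SIZE) double value;
};

/* Per-thread slots that no longer share a cache line with each other. */
static struct padded_double sum_local[NUM_THREADS];
```

The reduction is usually the simpler choice when the per-thread values are only needed for a final sum. Padding trades memory for isolation (64 bytes per slot instead of 8 here), so it is most useful when each thread's slot must stay individually addressable.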

Any corrections or additions are welcome.
