Embedded C - Using "volatile" to Confirm Consistency

Consider the following code:

    // In the interrupt handler file:
    volatile uint32_t gSampleIndex = 0; // declared 'extern'
    void HandleSomeIrq()
    {
        gSampleIndex++;
    }

    // In some other file
    void Process()
    {
        uint32_t localSampleIndex = gSampleIndex; // will this be optimized away?

        PrevSample    = RawSamples[(localSampleIndex + 0) % NUM_RAW_SAMPLE_BUFFERS];
        CurrentSample = RawSamples[(localSampleIndex + 1) % NUM_RAW_SAMPLE_BUFFERS];
        NextSample    = RawSamples[(localSampleIndex + 2) % NUM_RAW_SAMPLE_BUFFERS];
    }

My intention is that PrevSample, CurrentSample and NextSample are consistent, even if gSampleIndex is updated during a call to Process().

Will the assignment to localSampleIndex do the trick, or is it likely to be optimized away even though gSampleIndex is volatile?

+8
c volatile embedded
2 answers

In your function, you access the volatile variable only once (and it is the only volatile one in this function), so you don't have to worry about the code reordering that the compiler might perform (and that volatile prevents). Here is what the standard says about these optimizations in 5.1.2.3:

In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

Note the last part: "... no needed side effects are produced (... accessing a volatile object)."

Simply put, volatile prevents the optimizations a compiler might otherwise apply around this code. To mention a few: no instruction reordering with respect to other volatile variables, no expression removal, no caching, and no value propagation across functions.
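As a minimal sketch of the "no caching" point: the flag below is re-read from memory on every loop iteration because it is volatile. Without the qualifier, the compiler would be allowed to read it once and keep testing the stale copy. The names gDataReady, HandleDataIrq and WaitForData are hypothetical and do not appear in the question.

```c
#include <stdbool.h>

/* Hypothetical flag, set by an interrupt handler and polled by the main code. */
volatile bool gDataReady = false;

/* Stand-in for an interrupt handler; on real hardware the CPU would invoke it. */
void HandleDataIrq(void)
{
    gDataReady = true;
}

/* Polls the flag up to maxSpins times and returns true once it is seen set.
   Each evaluation of gDataReady is a real memory access because the object
   is volatile, so the loop cannot be collapsed into one cached read. */
bool WaitForData(unsigned maxSpins)
{
    for (unsigned i = 0; i < maxSpins; ++i) {
        if (gDataReady) {
            return true;
        }
    }
    return false;
}
```

The bounded spin count is only there so the sketch terminates when no interrupt arrives; real code would pick its own timeout policy.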

BTW, I doubt that any compiler would break your code (with or without volatile). The local stack variable may be elided, but the value will then be kept in a register (it will certainly not re-read the memory location repeatedly). What you need volatile for is value visibility.
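For reference, here is a compilable sketch of the snapshot pattern from the question. The sample type (int32_t), the buffer count and the buffer contents are assumptions made only for illustration; the question does not show them.

```c
#include <stdint.h>

#define NUM_RAW_SAMPLE_BUFFERS 4u   /* assumed buffer count */

volatile uint32_t gSampleIndex = 0;

/* Assumed sample type and contents; not shown in the question. */
int32_t RawSamples[NUM_RAW_SAMPLE_BUFFERS] = {10, 20, 30, 40};
int32_t PrevSample, CurrentSample, NextSample;

/* Stand-in for the interrupt handler. */
void HandleSomeIrq(void)
{
    gSampleIndex++;
}

void Process(void)
{
    /* The only volatile access in this function: one read, taken once.
       All three lookups below use this snapshot, so they stay mutually
       consistent even if the handler runs in the middle of Process(). */
    const uint32_t localSampleIndex = gSampleIndex;

    PrevSample    = RawSamples[(localSampleIndex + 0u) % NUM_RAW_SAMPLE_BUFFERS];
    CurrentSample = RawSamples[(localSampleIndex + 1u) % NUM_RAW_SAMPLE_BUFFERS];
    NextSample    = RawSamples[(localSampleIndex + 2u) % NUM_RAW_SAMPLE_BUFFERS];
}
```

Reading gSampleIndex directly in each of the three index expressions would instead perform three separate volatile reads, which could observe three different values.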

EDIT

I think some clarification is needed.

Let me safely assume that you know what you are doing (you are working with interrupt handlers, so this should not be your first C program): the processor word size matches your variable type, and the memory is properly aligned.

Let me also assume that your interrupt is not reentrant (some magic cli/sti stuff or whatever your processor uses for this), unless you are planning for some hard debugging and tuning.

If these assumptions hold, then you do not need atomic operations. Why? Because localSampleIndex = gSampleIndex is atomic (it is properly aligned, it matches the word size, and it is volatile), and with ++gSampleIndex there is no race condition (HandleSomeIrq will not be called again while it is still executing). More than useless, they are wrong.

You might think: "Well, I might not need atomics, but why can't I use them? Even if those assumptions hold, it would just be *extra*, and it would achieve the same goal." No, it would not. Atomics do not have the same semantics as volatile variables (and volatile is rarely used, and should rarely be used, outside memory-mapped I/O and signal handling). Volatile is (usually) useless with atomics (unless a specific architecture says otherwise), but there is one big difference: visibility. When you update gSampleIndex in HandleSomeIrq, the standard guarantees that the value is immediately visible to all threads (and devices). With atomic_uint, the standard guarantees only that it will be visible within a reasonable amount of time.

To make it short and clear: volatile and atomic are not the same thing. Atomic operations are useful for concurrency; volatile is useful for lower-level things (interrupts, devices). If you still think "hey, they do exactly what I need", please read some useful links picked from the comments: cache coherence and a nice read about atomics.

To summarize: in your case you could use an atomic variable with a lock (to get both atomic access and value visibility), but no one on this earth would put a lock inside an interrupt handler (unless it is absolutely, definitely, undoubtedly needed, and judging from the code you posted, that is not your case).

0

In principle, volatile is not enough to guarantee that Process sees only consistent values of gSampleIndex. In practice, however, you should not run into any problems if uint32_t is directly supported by the hardware. The proper solution would be to use atomic accesses.

Problem

Suppose you are working on a 16-bit architecture, so that the statement

    localSampleIndex = gSampleIndex;

is compiled into two instructions (load the upper half, load the lower half). The interrupt may then fire between the two instructions, and you get half of the old value combined with half of the new value.
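The torn read can be reproduced on any machine with a small simulation. This is only a sketch: the SplitCounter type and both functions are hypothetical, and the "interrupt" is an ordinary callback invoked between the two half-word loads.

```c
#include <stddef.h>
#include <stdint.h>

/* A 32-bit counter stored as two 16-bit halves, the way a 16-bit CPU sees it. */
typedef struct {
    volatile uint16_t lo;
    volatile uint16_t hi;
} SplitCounter;

/* Plays the role of the interrupt handler: increments the 32-bit value. */
void SplitIncrement(SplitCounter *c)
{
    const uint16_t lo = (uint16_t)(c->lo + 1u);
    c->lo = lo;
    if (lo == 0u) {                       /* carry into the upper half */
        c->hi = (uint16_t)(c->hi + 1u);
    }
}

/* Non-atomic read: load the upper half, then the lower half.
   If irq is not NULL it is called between the two loads, simulating
   an interrupt that fires at exactly the wrong moment. */
uint32_t SplitReadTorn(SplitCounter *c, void (*irq)(SplitCounter *))
{
    const uint16_t hi = c->hi;
    if (irq != NULL) {
        irq(c);
    }
    const uint16_t lo = c->lo;
    return ((uint32_t)hi << 16) | lo;
}
```

Starting from 0x0000FFFF, a read interrupted by one increment combines the old upper half (0x0000) with the new lower half (0x0000) and returns 0x00000000, which is neither the old value nor the new value 0x00010000.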

Solution

The solution is to access gSampleIndex using atomic operations only. I know three ways to do this.

C11 atomics

In C11 (supported since GCC 4.9), you declare your variable as atomic:

    #include <stdatomic.h>

    atomic_uint gSampleIndex;

Then you take care to access the variable only through the documented atomic interfaces. In the IRQ handler:

    atomic_fetch_add(&gSampleIndex, 1);

and in the function Process :

    localSampleIndex = atomic_load(&gSampleIndex);

Do not bother with the _explicit variants of the atomic functions unless you are trying to scale your program to a large number of cores.
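Put together, the C11 variant might look like the sketch below. SnapshotSampleIndex and SampleIndexIsAlwaysLockFree are hypothetical helper names. The lock-free check is an addition of this sketch: if the implementation realizes the atomic with a hidden lock, taking that lock inside an interrupt handler could deadlock, so it is worth checking ATOMIC_INT_LOCK_FREE on the actual target.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Static storage duration: zero-initialized to a valid atomic state. */
atomic_uint gSampleIndex;

/* IRQ handler: atomic read-modify-write with sequentially consistent ordering. */
void HandleSomeIrq(void)
{
    atomic_fetch_add(&gSampleIndex, 1u);
}

/* Called from Process(): one atomic load that cannot be torn. */
unsigned SnapshotSampleIndex(void)
{
    return atomic_load(&gSampleIndex);
}

/* True if int-sized atomics never need a lock on this target
   (ATOMIC_INT_LOCK_FREE is 2 in that case). */
bool SampleIndexIsAlwaysLockFree(void)
{
    return ATOMIC_INT_LOCK_FREE == 2;
}
```

Process() would then start with `unsigned localSampleIndex = SnapshotSampleIndex();` and index the buffers from that local copy.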

GCC atomics

Even if your compiler does not yet support C11, it probably has some support for atomic operations. For example, in GCC you can say:

    volatile int gSampleIndex;
    ...
    __atomic_add_fetch(&gSampleIndex, 1, __ATOMIC_SEQ_CST);
    ...
    __atomic_load(&gSampleIndex, &localSampleIndex, __ATOMIC_SEQ_CST);

As above, do not bother with weaker consistency unless you are trying to achieve good scaling behavior.
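A self-contained version of the same idea, as a sketch for GCC or Clang. It uses __atomic_load_n, the value-returning sibling of the __atomic_load builtin, purely to keep the helper short; SnapshotSampleIndex is a hypothetical name.

```c
/* Compiler builtins: no header is required. */
volatile int gSampleIndex;

/* IRQ handler: atomically increments the index. */
void HandleSomeIrq(void)
{
    __atomic_add_fetch(&gSampleIndex, 1, __ATOMIC_SEQ_CST);
}

/* Called from Process(): atomically loads the index and returns the value. */
int SnapshotSampleIndex(void)
{
    return __atomic_load_n(&gSampleIndex, __ATOMIC_SEQ_CST);
}
```

On targets without native support for an object of this size, the compiler may emit a call into a support library instead of inline instructions, so it is worth inspecting the generated code for the actual target.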

Implementing atomic operations yourself

Since you are not trying to protect against simultaneous access from several cores, only against race conditions with an interrupt handler, you can implement a consistency protocol using only standard C primitives. Dekker's algorithm is the oldest known such protocol.
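A much simpler hand-rolled protocol than the algorithm named above is to re-read the counter until two consecutive reads agree. The sketch below is that technique, not a mutual exclusion algorithm. It assumes that the handler only ever increments the counter and that the counter cannot wrap all the way around between two reads. ReadSampleIndexStable is a hypothetical name.

```c
#include <stdint.h>

volatile uint32_t gSampleIndex;

/* Returns a value of gSampleIndex that was read twice in a row.
   Under the stated assumptions, two equal consecutive reads mean the
   upper half did not change in between, so the result is a value the
   counter actually held, even if each read takes several instructions. */
uint32_t ReadSampleIndexStable(void)
{
    uint32_t first;
    uint32_t second = gSampleIndex;

    do {
        first = second;
        second = gSampleIndex;   /* fresh volatile read on every pass */
    } while (first != second);

    return second;
}
```

The loop repeats only when an interrupt lands between the two reads, so it normally finishes after one pass; it could spin for longer if interrupts arrive faster than the loop can complete.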

+8