The one thing the compiler must not do is change the meaning of the program. That meaning is determined by the semantics of its C statements; for example, a volatile counter is a way to express, at the semantic level, interaction with external agents.
volatile alone, however, is useless for thread synchronization; it has only local effects. Thus, you should rely only on what the C standard actually guarantees: if the standard does not specify ordering semantics for a statement or a side effect, then there are none.
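As a sketch of what "only local effects" means (the variable and function names here are mine, for illustration): volatile forces the compiler to keep every access to the qualified object, but it neither orders the surrounding plain accesses nor inserts any fences:

    #include <stdio.h>

    int data;               /* ordinary variable */
    volatile int ready;     /* volatile: every access is a side effect */

    /* Thread 1 */
    void producer(void)
    {
        data = 42;          /* plain store: nothing orders it against 'ready' */
        ready = 1;          /* must be emitted, but implies no memory fence */
    }

    /* Thread 2 */
    void consumer(void)
    {
        while (!ready)      /* volatile read: re-loaded on every iteration */
            ;
        printf("%d\n", data);  /* may still see a stale 'data' on a weakly ordered CPU */
    }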
To apply an optimization, the compiler must prove that it does not change the program's meaning. This is generally a hard (or even undecidable) problem, so optimizations are performed only in simple, provable contexts.
Consider this example:
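(The body of simple is an assumption on my part: any pure function will do, and this one is chosen so that simple(2, 3) == 50 and simple(3, 4) == 98, matching the constants in the disassembly below.)

    #include <stdio.h>

    /* Pure function: the result depends only on the arguments, no side effects. */
    static int simple(int x, int y)
    {
        return 2 * (x + y) * (x + y);   /* simple(2,3) == 50, simple(3,4) == 98 */
    }

    int main(void)
    {
        int a = simple(2, 3);
        int b = simple(3, 4);
        printf("%d %d\n", a, b);
        return 0;
    }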
The C standard guarantees that a = simple(2, 3); is evaluated before b = simple(3, 4); because the end of a full expression is a sequence point.
Here is the code gcc generates with full optimization:
    lea    0x18f0(%rip),%rcx        # 0x100403030, "%d %d\n"
    mov    $0x62,%r8d
    mov    $0x32,%edx
    callq  0x100401110 <printf>
I used Cygwin, so the ABI is the Windows one. This is equivalent to:
    printf("%d %d\n", 50, 98);
This is an ad hoc example: the simple function is pure and receives compile-time constant arguments, so its results are known at compile time. That is what allows gcc to optimize the calls away entirely.
When writing lock-free code, you do not need to worry about compiler optimizations at all if you use the correct semantics (e.g. volatile, which makes reads and writes count as side effects, but only as far as optimization is concerned). What you really need to worry about is memory ordering, as stated in my comment.
C11 finally codifies all of this in its memory model.
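For illustration, the producer/consumer sketch above can be rewritten with C11 atomics (a minimal sketch, again with names of my own) so that the required ordering is explicit rather than accidental:

    #include <stdatomic.h>
    #include <stdio.h>

    int data;
    atomic_int ready;   /* zero-initialized at file scope */

    /* Thread 1 */
    void producer(void)
    {
        data = 42;
        /* release: all earlier writes become visible before the flag does */
        atomic_store_explicit(&ready, 1, memory_order_release);
    }

    /* Thread 2 */
    void consumer(void)
    {
        /* acquire: pairs with the release store above */
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;
        printf("%d\n", data);   /* guaranteed to print 42 */
    }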