Does an expression with undefined behavior that is never actually executed make a program erroneous?

In many discussions about undefined behavior (UB), the view has been put forward that the mere presence in a program of any construct that has UB allows a conforming implementation to do just anything (including nothing at all). My question is whether this should be taken in that sense even in cases where the UB is associated with the execution of code, while the behavior (otherwise) specified in the standard stipulates that the code in question should not be executed (possibly only for specific input to the program; it might not be decidable at compile time).

Phrased more informally: does the smell of UB require a conforming implementation to decide that the whole program stinks, and allow it to refuse to execute correctly even those parts of the program whose behavior is perfectly well defined? An example program would be:

    #include <iostream>

    int main()
    {
        int n = 0;
        if (false)
            n = n++; // Undefined behaviour if it gets executed, which it doesn't
        std::cout << "Hi there.\n";
    }

For clarity, I am assuming that the program is well-formed (so in particular the UB is not associated with preprocessing). In fact, I am willing to restrict the question to UB associated with "evaluations", which clearly are not compile-time entities. The definitions pertinent to the given example are, I think (emphasis mine):

Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread (1.10), which induces a partial order among those evaluations

The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either ... or a value computation using the value of the same scalar object, the behavior is undefined.

It is implicitly clear that the subjects of the final sentence, "side effect" and "value computation", are instances of "evaluation", since that is what the relation "sequenced before" is defined for.

I posit that in the above program, the standard stipulates that no evaluations occur for which the condition in the final sentence is satisfied (unsequenced relative to each other and of the described kind), and that therefore the program does not have UB; it is not erroneous.

In other words, I am convinced that the answer to the question in my title is negative. However, I would appreciate the (motivated) opinions of other people on this matter.

Perhaps an additional question for those who favor an affirmative answer: would that mean the proverbial reformatting of your hard drive may occur when an erroneous program is compiled?

Some related pointers on this site:

  • Observable behavior and undefined behavior - What happens if I don't call a destructor?
  • Comments on this answer https://stackoverflow.com>
  • C++: What is the earliest that undefined behavior can manifest itself?
  • Difference between Undefined Behavior and Ill-formed, no diagnostic message required, and its two answers that represent opposing points of view
+17
c++ undefined-behavior language-lawyer
Jun 12 '14 at 14:15
8 answers

If a side effect on a scalar object is unsequenced relative to etc

Side effects are changes in the state of the execution environment (1.9/12). A change is a change, not an expression that, if evaluated, would potentially produce a change. If there is no change, there is no side effect. If there is no side effect, then no side effect is unsequenced relative to anything else.

This does not mean that any code that is never executed is UB-free (though I'm pretty sure most of it is). Each occurrence of UB in the standard needs to be examined separately. (The struck-out text is probably overly cautious; see below.)

The standard also says that

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

(emphasis mine)

This, as far as I can tell, is the only normative reference that says what the phrase "undefined behavior" means: an undefined operation in a program execution. No execution, no UB.
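
As a small illustration of "no execution, no UB", here is a sketch using a hypothetical helper `positive_at` (the name and the example are not from the standard or from the question). The expression `v[i]` would be an out-of-range access for a bad index, but `&&` short-circuits, so for such an index it is never evaluated and no undefined operation is part of the execution.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: true only when index i is valid and v[i] is positive.
// Because && short-circuits, v[i] is not evaluated when i >= v.size(),
// so the out-of-range access never becomes part of any execution.
bool positive_at(const std::vector<int>& v, std::size_t i) {
    return i < v.size() && v[i] > 0;
}
```

Calling `positive_at` with an index past the end simply returns `false`; the potentially undefined subexpression exists in the source but is not evaluated.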

+8
Jun 12 '14 at 15:02

No. Example:

    struct T {
        void f() { }
    };

    int main() {
        T *t = nullptr;
        if (t) {
            t->f(); // UB if t == nullptr, but the code tests against that
        }
    }
+6
Jun 12 '14 at 15:44

Deciding whether a program will perform an integer division by 0 (which is UB) is in general equivalent to the halting problem. There is no way a compiler can determine that in general. Thus the mere presence of possible UB cannot logically affect the rest of the program: a requirement to that effect in the standard would require every compiler vendor to ship a halting-problem solver in the compiler.
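
To make this concrete, here is a hedged sketch in which whether the division by zero is reached depends on the outcome of a loop. The functions `collatz_steps` and `hundred_over_offset` are hypothetical names chosen for illustration; the point is only that the divisor is the result of an input-dependent computation that a compiler cannot predict in general.

```cpp
#include <cstdint>

// Hypothetical helper: number of Collatz steps needed to reach 1 from n (n >= 1).
// Unsigned arithmetic wraps on overflow, so the arithmetic itself has no UB.
unsigned collatz_steps(std::uint64_t n) {
    unsigned steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}

// Divides 100 by (steps - 8). The division has undefined behavior only
// for those inputs whose step count is exactly 8 (for example n == 6).
int hundred_over_offset(std::uint64_t n) {
    int d = static_cast<int>(collatz_steps(n)) - 8;
    return 100 / d;
}
```

For an input such as `n == 1` the call is perfectly well defined; only executions that actually reach the division with `d == 0` contain an undefined operation.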

Even simpler, the following program has UB only if the user inputs 0:

    #include <iostream>
    using namespace std;

    auto main() -> int
    {
        int x;
        if( cin >> x ) cout << 100/x << endl;
    }

It would be absurd to claim that this program in itself has UB.
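
For comparison, a minimal sketch of the same computation with the problematic input excluded explicitly. `hundred_over` is a hypothetical helper, not part of the program above, and it assumes C++17 for `std::optional`. With the guard in place, no input leads to an execution that contains the division by zero.

```cpp
#include <optional>

// Hypothetical helper: returns 100 / x, or std::nullopt when x == 0,
// so the division is only ever evaluated with a non-zero divisor.
std::optional<int> hundred_over(int x) {
    if (x == 0) return std::nullopt;
    return 100 / x;
}
```

The division expression is still present in the source, exactly as in the unguarded program; the only difference is which executions can reach it.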

Once undefined behavior does occur, however, anything can happen: the further execution of code in the program is then compromised (for example, the stack might have been corrupted).

+5
Jun 12 '14 at 18:40

In the general case, the best we can say here is that it depends.

One case where the answer is no arises when dealing with indeterminate values. The latest draft explicitly makes it undefined behavior to produce an indeterminate value during an evaluation, with some exceptions, but the code example clearly shows how subtle this can be:

[Example:

    int f(bool b) {
        unsigned char c;
        unsigned char d = c; // OK, d has an indeterminate value
        int e = d;           // undefined behavior
        return b ? d : 0;    // undefined behavior if b is true
    }

- end example]

so this line of code:

 return b ? d : 0; 

is only undefined if b is true. This seems to be the intuitive approach, and it also seems to be how John Regehr sees it, if we read It's Time to Get Serious About Exploiting Undefined Behavior.

In the following case the answer is yes: the code is erroneous even though we never call the code that invokes undefined behavior:

    constexpr const char *str = "Hello World";

    constexpr char access()
    {
        return str[100];
    }

    int main()
    {
    }

clang chooses to make access an error even though it is never invoked.
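
A related sketch, using a hypothetical `div100` function that is not taken from the answer: an expression that would have undefined behavior cannot be a constant expression, so the compiler has to detect it whenever compile-time evaluation is required. Unlike `access()` above, `div100` has arguments for which it is a valid constant expression, so the function itself is fine and only the offending use is rejected.

```cpp
// Hypothetical helper: well-formed, because some arguments (any non-zero x)
// yield a valid constant expression.
constexpr int div100(int x) { return 100 / x; }

// Evaluated at compile time; no undefined operation is involved.
static_assert(div100(4) == 25, "100 / 4 is a constant expression");

// Uncommenting the next line should be rejected by the compiler, since
// division by zero is not permitted in a constant expression:
// constexpr int bad = div100(0);
```

Here the language requires the evaluation to happen during translation, which is why the problem surfaces as a compile-time error rather than as run-time UB.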

+3
Jun 12 '14 at 16:31

There is a clear divide between inherent undefined behaviour, such as n = n++, and code that can have defined or undefined behaviour depending on the program state at runtime, such as x / y for ints. In the latter case the program is required to work unless y is 0, but in the former case the compiler is being asked to generate code that is totally illegitimate. It is within its rights to refuse to compile, or it may simply not be "bullet-proofed" against such code, so that its optimiser state (register allocations, records of which values may have been modified since they were read, etc.) gets corrupted, resulting in bogus machine code for that and the surrounding source code. It may be that early analysis recognised an "a = b++" situation and generated code for the preceding if to jump over a two-byte instruction, but when n = n++ is encountered no instruction is output, so that the if statement jumps somewhere into the following opcodes. Either way, it is simply game over. Putting an "if" in front, or even wrapping it in a different function, is not documented as "containing" the undefined behaviour. Bits of code are not tainted by undefined behaviour; the Standard consistently says "the program has undefined behaviour".

+3
Jun 13 '14 at 1:15

It should be, if not "shall".

Behavior, by definition from ISO C (no corresponding definition is found in ISO C++, but it should still apply somehow):

3.4

1 behavior

external appearance or action

And UB:

WG21 / N4527

1.3.25 [defns.undefined]

undefined behavior

behavior for which this International Standard imposes no requirements [Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. -end note]

Despite the "behaving during translation" above, the word "behavior" as used by ISO C++ is mainly about the execution of programs.

WG21 / N4527

1.9 Program Execution [intro.execution]

1 The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. 5)

2 Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects. 6) Such documentation shall define the instance of the abstract machine that corresponds to that implementation (referred to as the "corresponding instance" below).

3 Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, evaluation of expressions in a new-initializer if the allocation function fails to allocate memory (5.3.4)). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine. An instance of the abstract machine can thus have more than one possible execution for a given program and a given input.

4 Certain other operations are described in this International Standard as undefined (for example, the effect of attempting to modify a const object). [Note: This International Standard imposes no requirements on the behavior of programs that contain undefined behavior. -end note]

5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

5) This provision is sometimes called the "as-if" rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.

6) This documentation also includes conditionally-supported constructs and locale-specific behavior. See 1.4.

It is clear that undefined behavior is caused by a specific language construct being used incorrectly or in a non-portable way (one that does not conform to the standard). However, the standard says nothing about which specific portion of code in a program causes it. In other words, "having undefined behavior" is a property (concerning conformance) of the whole program being executed, not of any smaller part of it.

The standard could give a stronger guarantee that behavior is well defined as long as some specific code is not executed, but only if there were a way to map C++ code precisely to the corresponding behavior. That is hard (if not impossible) without a detailed semantic model of execution. In short, the operational semantics given by the abstract machine model above is not enough to achieve the stronger guarantee. In any case, ISO C++ will never be JVMS or ECMA-335, and I do not expect there will ever be a complete formal semantics describing the language.

A key problem here is the meaning of "execution". Some people think that "executing a program" means making the program run. This is not quite true. Note that the representation of the program executed in the abstract machine is not specified. (Also note that "this International Standard places no requirement on the structure of conforming implementations".) The code being executed may literally be C++ code, not necessarily machine code or some other form of intermediate code, none of which the standard specifies at all. This effectively allows the core language to be implemented as an interpreter, an online partial evaluator, or some other monster that translates C++ code on the fly. As a result, there is no way to separate the phases of translation (defined by ISO C++ [lex.phases]) completely from the process of execution without knowledge of the specific implementation. It is therefore necessary to allow UB to occur during translation when it is too difficult to specify portable, well-defined behavior.

Besides the problems above, for most ordinary users one (non-technical) reason is perhaps sufficient: there is simply no need to provide the stronger guarantee, which would tolerate bad code and defeat one of the (probably most important) aspects of the usefulness of UB itself: to encourage quickly throwing away (unnecessarily) non-portable, smelly code instead of spending effort to "fix" it, which would ultimately be in vain.

Additional notes:

Some words are copied and reconstructed from one of my replies to this comment.

+1
Sep 07 '15 at 8:52

A C compiler is allowed to do anything it likes as soon as a program enters a state from which there is no defined sequence of events that would allow the program to avoid invoking Undefined Behavior at some point in the future (note that any loop that has no side effects, and no exit condition that a compiler would be required to recognize, invokes Undefined Behavior in and of itself). The compiler's behavior in such cases is bound by the laws of neither time nor causality. In situations where Undefined Behavior occurs in an expression whose result is never used, some compilers will not generate any code for the expression (so it will never "execute"), but that will not prevent compilers from using the Undefined Behavior to draw other inferences about program behavior.

For example:

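    /* External functions, assumed to be defined elsewhere. */
    int should_launch_missiles(void);
    void arm_missiles(void);
    void launch_missiles(void);
    void disarm_missiles(void);

    void maybe_launch_missiles(void)
    {
        if (should_launch_missiles())
        {
            arm_missiles();
            if (should_launch_missiles())
                launch_missiles();
        }
        disarm_missiles();
    }

    int foo(int x)
    {
        maybe_launch_missiles();
        return x<<1;
    }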

Under the present C standard, if the compiler could determine that disarm_missiles() would always return without terminating, but that the three other external functions called above might terminate, the most efficient standard-compliant replacement for the statement foo(-1); (return value ignored) would be should_launch_missiles(); arm_missiles(); should_launch_missiles(); launch_missiles();.

Program behavior will only be defined if either call to should_launch_missiles() terminates without returning, if the first call returns non-zero and arm_missiles() terminates without returning, or if both calls return non-zero and launch_missiles() terminates without returning. A program that works correctly in those cases will abide by the standard regardless of what it does in any other situation. If returning from maybe_launch_missiles() would cause Undefined Behavior, the compiler is not required to recognize the possibility that either call to should_launch_missiles() could return zero.

As a consequence, on some modern compilers the effect of left-shifting a negative number may be worse than anything that could be caused by any kind of Undefined Behavior on a typical C99 compiler on platforms that separate code and data spaces and trap stack overflow. Even if code engaged in Undefined Behavior that could cause random control transfers, there would be no means by which it could cause arm_missiles() and launch_missiles() to be called consecutively without an intervening call to disarm_missiles(), unless at least one call to should_launch_missiles() returned a non-zero value. A hyper-modern compiler, however, may negate such protections.
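
One way to keep the shift in `foo` out of undefined territory is to check the operand first. The sketch below uses a hypothetical helper `checked_double` (not from the answer), written in C++17 to match the rest of the thread. It assumes the rule the answer relies on: in C99 and C11, and in C++ before C++20, left-shifting a negative signed value, or shifting so that the result does not fit, is undefined.

```cpp
#include <climits>
#include <optional>

// Hypothetical helper: returns x << 1 only when that shift is well defined
// under the pre-C++20 rules (x non-negative and the result fits in int);
// otherwise returns std::nullopt without evaluating the shift.
std::optional<int> checked_double(int x) {
    if (x < 0 || x > INT_MAX / 2) return std::nullopt;
    return x << 1;
}
```

With such a check, a call like `checked_double(-1)` never evaluates the shift, so the compiler has no undefined operation from which to draw the kind of inferences described above.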

0
Apr 19 '15 at 19:04

In the context of a safety-critical embedded system, the posted code would be considered defective:

  • The code should not pass code review and/or standards compliance checks (MISRA, etc.).
  • Static analysis (lint, cppcheck, etc.) should flag it as a defect.
  • Some compilers may flag it with a warning (also implying a defect).
-2
12 . '14 15:23