Do the C and C++ standards require that a special value in the address space exist solely to represent null pointers?

Following the discussion of this question about null pointers in C and C++, I would like to have the final question separated out here.

If it can be inferred from the C and C++ standards (answers can target both standards) that dereferencing a pointer variable whose value equals nullptr (or (void *)0) is undefined behavior, does this imply that these languages require a special value in the address space to be dead, meaning that it is unusable except for the role of representing nullptr? What if the system has a really useful function or data structure at the same address as nullptr? Should this never happen, because it is the compiler's responsibility to figure out a non-conflicting null pointer value for each system it compiles for? Or should a programmer who needs to access such a function or data structure be content to program in "undefined behavior mode" to achieve his goals?

This feels like blurring the lines between the roles of the compiler and the computer system. I would ask whether that is right, but I guess there is no room for that here.

This blog post digs into how such a problematic situation can be handled.

0
4 answers

Does this mean that these languages require a special value in the address space to be dead, meaning that it is unusable except for the role of representing nullptr?

No.

The compiler needs a special value to represent a null pointer, and it must take care not to place any object or function at that address, because all pointers to objects and functions are required to compare unequal to the null pointer. The standard library must take similar precautions in its implementation of malloc and friends.
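The guarantee described above can be checked from ordinary code. The following is only a minimal sketch; the names some_object, some_function and null_is_distinct are made up for illustration and do not come from either standard.

```c
#include <assert.h>
#include <stddef.h>

static int some_object;                 /* any object with static storage */
static void some_function(void) {}      /* any function */

/* Comparisons are done through parameters so the compiler treats
   them as ordinary run-time comparisons. */
static int object_ptr_is_non_null(const void *p) { return p != NULL; }
static int function_ptr_is_non_null(void (*f)(void)) { return f != NULL; }

/* Returns 1 if every pointer tested compares unequal to the null pointer,
   which a conforming implementation guarantees. */
int null_is_distinct(void)
{
    int local = 0;                      /* an object with automatic storage */
    return object_ptr_is_non_null(&some_object)
        && object_ptr_is_non_null(&local)
        && function_ptr_is_non_null(some_function);
}

/* Example use: abort if the guarantee were ever violated. */
void check_null_is_distinct(void) { assert(null_is_distinct()); }
```

On a conforming implementation null_is_distinct() returns 1 no matter which bit pattern the implementation picked for the null pointer.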

However, if there is already something at that address, something that no strictly conforming program can access, then implementations are allowed to support dereferencing the null pointer to access it. Dereferencing the null pointer is undefined in standard C, so an implementation can make it do anything it likes, including the obvious.

Both the C and C++ standards understand the concept of the as-if rule, which basically means that if, for valid input, an implementation is indistinguishable from one that conforms to the standard, then it does conform to the standard. The C standard uses a trivial example:

5.1.2.3 Program execution

10 EXAMPLE 2 In executing the fragment

    char c1, c2;
    /* ... */
    c1 = c1 + c2;

the "integer promotions" require that the abstract machine promote the value of each variable to int size and then add the two ints and truncate the sum. Provided the addition of two chars can be done without overflow, or with overflow wrapping silently to produce the correct result, the actual execution need only produce the same result, possibly omitting the promotions.

Now, if the values of c1 and c2 come from registers, and it is possible to force values outside the range of char into those registers (for example, with inline assembly), then the fact that the implementation optimizes away the integer promotions might be observable. However, since the only way to observe this is through undefined behavior or implementation extensions, no standard code can be affected by it, and an implementation is allowed to do it.
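The as-if reasoning for this example can be sketched in code. This is an illustration under stated assumptions, not what any particular compiler emits: unsigned char is used instead of char so that the truncation is fully defined, and the function names are hypothetical. add_narrow models a narrow hardware adder bit by bit, because every arithmetic expression written directly in C is itself subject to the promotions.

```c
#include <assert.h>
#include <limits.h>

/* What the abstract machine describes: promote to int, add, truncate. */
unsigned char add_with_promotion(unsigned char c1, unsigned char c2)
{
    int wide = (int)c1 + (int)c2;       /* integer promotions made explicit */
    return (unsigned char)wide;         /* truncate back to the narrow type */
}

/* What an implementation might do instead: a CHAR_BIT-wide ripple-carry
   addition whose final carry is silently discarded. */
unsigned char add_narrow(unsigned char c1, unsigned char c2)
{
    unsigned char result = 0;
    unsigned carry = 0;
    for (int bit = 0; bit < CHAR_BIT; ++bit) {
        unsigned a = (c1 >> bit) & 1u;
        unsigned b = (c2 >> bit) & 1u;
        unsigned s = a ^ b ^ carry;
        carry = (a & b) | (carry & (a ^ b));
        result = (unsigned char)(result | (s << bit));
    }
    return result;                      /* carry out of the top bit is dropped */
}

/* Returns 1 if both strategies agree for every pair of inputs. */
int results_agree(void)
{
    for (unsigned a = 0; a <= UCHAR_MAX; ++a)
        for (unsigned b = 0; b <= UCHAR_MAX; ++b)
            if (add_with_promotion((unsigned char)a, (unsigned char)b)
                != add_narrow((unsigned char)a, (unsigned char)b))
                return 0;
    return 1;
}

/* Example use: abort if the two strategies ever disagreed. */
void check_results_agree(void) { assert(results_agree()); }
```

Since no valid input tells the two functions apart, an implementation is free to use either strategy.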

This is the same logic that applies to getting useful results when dereferencing null pointers: there are only two ways to see, from code, that there is something meaningful at that specific address: getting a null pointer from an evaluation that is guaranteed to produce a pointer to an object, or just trying it. The former is what I mentioned the compiler and standard library have to take care of. The latter is not something that can affect a valid standard program.


A well-known example is the interrupt vector table on DOS implementations, which resides at address zero. It is typically accessed simply by dereferencing a null pointer. The C and C++ standards do not, should not and cannot cover access to the interrupt vector table. They do not define such behavior, but they do not restrict access to it either. Implementations should be, and are, allowed to provide extensions to access it.

+2

It depends on what is meant by the phrase "address space". The C standard uses the phrase informally, but does not define what it means.

For each pointer type, there must be a value (the null pointer) that compares unequal to a pointer to any object or function. This means, for example, that if a pointer type is 32 bits wide, then there can be at most 2^32 - 1 valid non-null values of that type. There could be fewer if some addresses have more than one representation, or if not all representations correspond to valid addresses.
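The counting argument can be written out as a small calculation. This is only a sketch of the arithmetic; the function name max_non_null_values is hypothetical, and the result is an upper bound, since an implementation may have fewer usable values.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the number of distinct non-null values of a pointer type
   that is width_bits wide (1..64): all 2^width_bits bit patterns minus the
   one that has to be reserved for the null pointer. */
uint64_t max_non_null_values(unsigned width_bits)
{
    assert(width_bits >= 1 && width_bits <= 64);
    if (width_bits == 64)
        return UINT64_MAX;                       /* 2^64 - 1 */
    return (UINT64_C(1) << width_bits) - 1;      /* 2^width_bits - 1 */
}
```

For a 32-bit pointer type this gives 4294967295, the 2^32 - 1 figure from the paragraph above.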

So, if you define the "address space" to cover 2^N distinct addresses, where N is the width of a pointer in bits, then yes, one of those values must be reserved as the null pointer value.

On the other hand, if the "address space" is narrower than that (for example, typical 64-bit systems cannot actually access 2^64 distinct memory locations), then the value reserved as the null pointer can easily lie outside the "address space".

Some notes:

  • The representation of a null pointer may or may not be all-bits-zero.
  • Not all pointer types are necessarily the same size.
  • Not all pointer types necessarily use the same representation for a null pointer.

In most modern implementations, all pointer types are the same size, and all of them represent a null pointer as all-bits-zero, but there are valid reasons, for example, to make function pointers wider than object pointers, or to make void* wider than int*, or to use a representation other than all-bits-zero for a null pointer.

This answer is based on the C standard. Most of it also applies to C++. (One difference is that C++ has pointer-to-member types, which are typically wider than ordinary pointers.)
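These properties can be inspected on a given implementation. The following is a minimal sketch with hypothetical function names. The first two results are implementation-specific and either answer is conforming; only the third is guaranteed by the language.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the object representation of a null void * is
   all-bits-zero on this implementation, 0 otherwise. */
int null_is_all_bits_zero(void)
{
    void *p = NULL;
    unsigned char zeros[sizeof p] = {0};
    return memcmp(&p, zeros, sizeof p) == 0;
}

/* Returns 1 if function pointers, void * and int * all have the same size here. */
int pointer_sizes_match(void)
{
    return sizeof(void (*)(void)) == sizeof(void *)
        && sizeof(void *) == sizeof(int *);
}

/* Always returns 1: the constant 0 in pointer context denotes the null
   pointer, whatever bit pattern the implementation uses for it. */
int null_equals_zero_constant(void)
{
    void *p = NULL;
    return p == 0;
}

/* Example use: only the language-level guarantee is asserted. */
void check_null_constant(void) { assert(null_equals_zero_constant()); }
```

The contrast between the first and the third function is the point: the 0 in source code is a null pointer constant, not a statement about the bits stored in memory.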

+5

Yes, that’s exactly what it means.

[C++11: 4.10/1]: [..] A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type and is distinguishable from every other value of object pointer or function pointer type. [..]

The null pointer value need not be 0x00000000, but it must be unique; there is no other way to make this rule work.

This, of course, is not the only rule of the abstract machine that implicitly imposes strict restrictions on practical implementations.

What if the OS creates a really useful function or data structure at the same address as nullptr?

The OS will not do this, but it could.

+2

Does this mean that these languages require a special value in the address space to be dead, meaning that it is unusable except for the role of representing nullptr?

Yes.

C has requirements for the null pointer that make it distinct from object pointers:

(C11, 6.3.2.3p3) "[...] If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function."

What if the system has a really useful function or data structure at the same address as nullptr? Should this never happen because it is the compiler's responsibility to figure out a non-conflicting null pointer value for each system it compiles for?

The New C Standard by Derek M. Jones provides the following implementation commentary:

All bits zero is a convenient execution-time representation of the null pointer constant for many implementations because it is invariably the lowest address in storage. (The INMOS Transputer [632] had a signed address space, which placed zero in the middle.) Although there may be program bootstrap information at this location, it is unlikely that any objects or functions will be placed there. Many operating systems leave this storage location unused because experience has shown that program faults sometimes cause values to be written to the location specified by the null pointer constant (the more developer-oriented environments try to raise an exception when that location is accessed).

Another implementation technique, when the host environment does not include address zero as part of the address space, is to create an object (sometimes called __null) as part of the standard library. All references to the null pointer constant refer to this object, whose address will compare unequal to any other object or function.
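The reserved-object technique from the quotation can be imitated at user level. This is only a sketch: my_null_object and MY_NULL are hypothetical names, and MY_NULL is not the language's null pointer. A real implementation would do the equivalent inside the compiler and standard library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reserved object: nothing else can share its address,
   so its address can serve as a "no object" marker. */
static char my_null_object;
#define MY_NULL ((void *)&my_null_object)

static int ordinary_object;

/* Returns 1 if p differs from the reserved marker value. */
int differs_from_my_null(const void *p) { return p != MY_NULL; }

/* Returns 1 if ordinary objects compare unequal to the marker and
   the marker compares equal to itself. */
int sentinel_demo(void)
{
    int local = 0;
    return differs_from_my_null(&ordinary_object)
        && differs_from_my_null(&local)
        && !differs_from_my_null(MY_NULL);
}

/* Example use: abort if the marker ever collided with another object. */
void check_sentinel(void) { assert(sentinel_demo()); }
```

Because distinct objects have distinct addresses, the marker cannot collide with any other object, which is the property the quoted technique relies on.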

+2

Source: https://habr.com/ru/post/1213574/