Is conversion from pointer to type to pointer to an array of type safe?

A few days ago I came across some code that makes extensive use of converting a pointer to a type into a pointer to an array of that type, in order to get a two-dimensional view of a linear vector in memory. The following is a simple example of the technique:

    #include <stdio.h>
    #include <stdlib.h>

    void print_matrix(const unsigned int nrows, const unsigned int ncols, double (*A)[ncols])
    {
        // Here I can access memory using A[ii][jj]
        // instead of A[ii*ncols + jj]
        for (int ii = 0; ii < nrows; ii++) {
            for (int jj = 0; jj < ncols; jj++)
                printf("%4.4g", A[ii][jj]);
            printf("\n");
        }
    }

    int main()
    {
        const unsigned int nrows = 10;
        const unsigned int ncols = 20;

        // Here I allocate a portion of memory to which I could access
        // using linear indexing, i.e. A[ii]
        double *A = NULL;
        A = malloc(sizeof(double) * nrows * ncols);

        for (int ii = 0; ii < ncols * nrows; ii++)
            A[ii] = ii;

        print_matrix(nrows, ncols, A);
        printf("\n");
        print_matrix(ncols, nrows, A);

        free(A);
        return 0;
    }

Given that a pointer to a type is not compatible with a pointer to an array of that type, I would like to ask whether there are risks associated with this cast, or whether I can assume that it will work as intended on any platform.

+7
4 answers

A multidimensional array T arr[M][N] is guaranteed to have the same memory layout as a one-dimensional array with the same total number of elements, T arr[M * N]. The layout is the same because arrays are contiguous (6.2.5p20), and because sizeof array / sizeof array[0] is guaranteed to return the number of elements in an array (6.5.3.4p7).
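The size relationships behind this claim can be checked with sizeof alone. The following is a minimal sketch; the dimensions M and N and the function name same_layout_sizes are illustrative assumptions, not taken from the question.

```c
#include <assert.h>
#include <stddef.h>

enum { M = 3, N = 4 };  /* hypothetical dimensions, chosen only for illustration */

/* Returns 1 if a double[M][N] occupies exactly as many bytes as a
   double[M * N] and the sizeof ratios give the expected element counts. */
int same_layout_sizes(void)
{
    double two_d[M][N];
    double one_d[M * N];

    size_t rows = sizeof two_d / sizeof two_d[0];        /* outer bound */
    size_t cols = sizeof two_d[0] / sizeof two_d[0][0];  /* inner bound */

    return sizeof two_d == sizeof one_d
        && rows == (size_t)M
        && cols == (size_t)N;
}
```

This only confirms the sizes; it says nothing about whether a pointer cast between the two types is permitted, which is the subject of the rest of the answer.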

However, it does not follow that it is safe to cast a pointer to a type to a pointer to an array of that type, or vice versa. First, there is the question of alignment: although an array of a type with fundamental alignment must also have fundamental alignment (by 6.2.8p2), it is not guaranteed that the two alignments are the same. Since an array contains objects of the base type, the alignment of the array type must be at least as strict as that of the base type, but it could be stricter (not that I have ever seen such a case). This is irrelevant for allocated memory, however, since malloc is guaranteed to return a pointer suitably aligned for any fundamental alignment (7.22.3p1). It does mean that you cannot safely cast a pointer to automatic or static memory to an array pointer, although the reverse is permitted:

    int a[100];

    void f()
    {
        int b[100];
        static int c[100];
        int *d = malloc(sizeof(int[100]));
        int (*p)[10] = (int (*)[10]) a; // possibly incorrectly aligned
        int (*q)[10] = (int (*)[10]) b; // possibly incorrectly aligned
        int (*r)[10] = (int (*)[10]) c; // possibly incorrectly aligned
        int (*s)[10] = (int (*)[10]) d; // OK
    }

    int A[10][10];

    void g()
    {
        int B[10][10];
        static int C[10][10];
        int (*D)[10] = (int (*)[10]) malloc(sizeof(int[10][10]));
        int *p = (int *) A; // OK
        int *q = (int *) B; // OK
        int *r = (int *) C; // OK
        int *s = (int *) D; // OK
    }

Next, it is not guaranteed that casting between array and non-array types actually yields a pointer to the correct location, since the conversion rules (6.3.2.3p7) do not cover this usage. It is highly unlikely that the result would be anything other than a pointer to the correct location, though, and a cast via char * does have guaranteed semantics. When going from a pointer to an array type to a pointer to the base type, it is better to just dereference the pointer:

    void f(int (*p)[10])
    {
        int *q = *p;                            // OK
        assert((int (*)[10]) q == p);           // not guaranteed
        assert((int (*)[10]) (char *) q == p);  // OK
    }

What are the semantics of array subscripting? As is well known, the [] operator is just syntactic sugar for addition and indirection, so the semantics are those of the + operator; as 6.5.6p8 describes, the pointer operand must point to an element of an array that is large enough for the result to fall within the array or one past its end. This is a problem for casts in both directions: when casting to a pointer to array type, the addition is invalid because no multidimensional array exists at that location; and when casting to a pointer to the base type, the array at that location only has the size of the inner array bound:

    int a[100];
    ((int (*)[10]) a) + 3;  // invalid - no int[10][N] array

    int b[10][10];
    (*b) + 3;               // OK
    (*b) + 23;              // invalid - out of bounds of int[10] array

Here we start to see actual problems on common implementations, not just theory. Since the optimizer is entitled to assume that undefined behavior does not occur, an access to a multidimensional array through a pointer to the base object can be assumed not to alias any elements other than those in the first inner array:

    int a[10][10];

    void f(int n)
    {
        for (int i = 0; i < n; ++i)
            (*a)[i] = 2 * a[2][3];
    }

The optimizer may assume that the access to a[2][3] does not alias (*a)[i] and hoist it out of the loop:

    int a[10][10];

    void f_optimised(int n)
    {
        int intermediate_result = 2 * a[2][3];
        for (int i = 0; i < n; ++i)
            (*a)[i] = intermediate_result;
    }

This, of course, will give unexpected results if f is called with n = 50, because the write at i = 23 lands on the storage of a[2][3].
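A loop with the same intent can stay inside the bounds of each inner array by splitting the flat index into a row and a column. The following is a minimal sketch under that assumption; ROWS, COLS, and f_defined are illustrative names, not part of the original answer.

```c
#include <assert.h>

enum { ROWS = 10, COLS = 10 };  /* same shape as the array in the answer */

int a[ROWS][COLS];

/* Writes 2 * a[2][3] into the first n elements in row-major order.
   Every access uses a[row][col] with both indices in range, so the
   compiler has to assume that a write may change a[2][3]. */
void f_defined(int n)
{
    assert(n >= 0 && n <= ROWS * COLS);
    for (int i = 0; i < n; ++i)
        a[i / COLS][i % COLS] = 2 * a[2][3];
}
```

With a[2][3] set to 1 and n = 50, the elements before flat index 23 receive 2, a[2][3] itself becomes 2, and the elements after it receive 4, which is the result a reader of the original loop would probably expect.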

Finally, it is worth asking whether this applies to allocated memory. 7.22.3p1 states that the pointer returned by malloc "may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated"; there is nothing about further casting the returned pointer to another object type, so the conclusion is that the type of the allocated memory is fixed by the first pointer type to which the returned void pointer is converted. If you cast to double *, you cannot then cast to double (*)[n], and if you cast to double (*)[n], you can only use double * to access the first n elements.

So I would say that if you want to be absolutely safe, you should not cast between pointers and pointers to array types, even with the same base type. The fact that the layout is the same is irrelevant except for memcpy and other accesses through a char pointer.
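One way to follow this advice is to give the allocation its two-dimensional type from the start and never view it through a double *. The following is a minimal sketch, assuming the compiler supports variable length array types (as the code in the question already does); fill_matrix and element_after_fill are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

/* Fills an nrows x ncols matrix with its row-major index,
   using only A[ii][jj] accesses. */
void fill_matrix(size_t nrows, size_t ncols, double (*A)[ncols])
{
    for (size_t ii = 0; ii < nrows; ii++)
        for (size_t jj = 0; jj < ncols; jj++)
            A[ii][jj] = (double)(ii * ncols + jj);
}

/* Allocates the matrix directly as a pointer to rows, fills it, and
   reads one element back. Returns -1.0 if the allocation fails. */
double element_after_fill(size_t nrows, size_t ncols, size_t ii, size_t jj)
{
    assert(ii < nrows && jj < ncols);

    /* The void * from malloc is converted once, straight to the array type. */
    double (*A)[ncols] = malloc(sizeof(double[nrows][ncols]));
    if (A == NULL)
        return -1.0;

    fill_matrix(nrows, ncols, A);
    double value = A[ii][jj];
    free(A);
    return value;
}
```

For example, element_after_fill(10, 20, 3, 5) reads the element at row 3, column 5 of a 10 x 20 matrix, whose row-major index is 3 * 20 + 5 = 65.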

+1

UPDATE: the struck-through part is true, but it does not matter.

As I wrote in a comment, the question is really whether, in a two-dimensional array, the subarrays (rows) contain internal padding. Each row itself cannot contain padding, since the standard defines arrays to be contiguous. The outer array cannot introduce padding either. In fact, scanning through the C standard, I find no mention of padding in the context of arrays, so I interpret "contiguous" to mean that there is never any padding at the end of a subarray inside a multidimensional array. Since sizeof(array) / sizeof(array[0]) is guaranteed to return the number of elements in the array, there can be no such padding.

This means that the layout of a multidimensional array of nrows rows and ncols columns must be the same as that of a one-dimensional array of nrows * ncols elements. So, to avoid the incompatible type error, you could do

    void *A = malloc(sizeof(double[nrows][ncols]));
    // check for NULL
    double *T = A;
    for (size_t i = 0; i < nrows * ncols; i++)
        T[i] = 0;

then pass A to print_matrix. This should avoid the potential pitfall of pointer aliasing: pointers of different types are not permitted to point into the same array unless at least one of them has type void*, char* or unsigned char*.
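The suggestion above can be put together into a small sketch. It assumes variable length array support, and sum_row and demo_void_pointer are hypothetical names used only for illustration. Whether this also satisfies the stricter reading of the standard given in the first answer is not something the sketch settles.

```c
#include <assert.h>
#include <stdlib.h>

/* Sums one row of a matrix that is viewed as rows of ncols doubles. */
double sum_row(size_t ncols, double (*A)[ncols], size_t row)
{
    double total = 0.0;
    for (size_t jj = 0; jj < ncols; jj++)
        total += A[row][jj];
    return total;
}

/* Allocates through void *, initialises through double *, then passes
   the void * to a function expecting double (*)[ncols].
   Returns the sum of row 1, or -1.0 if the allocation fails. */
double demo_void_pointer(size_t nrows, size_t ncols)
{
    assert(nrows > 1 && ncols > 0);

    void *A = malloc(sizeof(double[nrows][ncols]));
    if (A == NULL)
        return -1.0;

    double *T = A;                        /* linear view for initialisation */
    for (size_t i = 0; i < nrows * ncols; i++)
        T[i] = (double)i;

    double result = sum_row(ncols, A, 1); /* void * converts without a cast */
    free(A);
    return result;
}
```

For a 3 x 4 matrix filled with 0 through 11, row 1 holds 4, 5, 6, 7, so demo_void_pointer(3, 4) returns 22.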

+2

Standard C allows you to convert a pointer to an object (or incomplete) type to a pointer to a different object (or incomplete) type.

There are a few caveats:

  • if the resulting pointer is incorrectly aligned, the behavior is undefined. The standard does not guarantee correct alignment in this case. In practice, misalignment is unlikely.

  • the standard specifies only one valid use of the resulting pointer: converting it back to the original pointer type. In that case, the standard guarantees that the result (the resulting pointer, converted back to the original pointer type) will compare equal to the original pointer. Any other use of the resulting pointer is not covered by the standard.

  • the standard requires an explicit cast when performing such conversions, which is missing from the calls to print_matrix in the code you posted.

So, by the letter of the standard, the usage in the sample code is outside its scope. In practice, however, it will probably work fine on most platforms, assuming the compiler accepts it.
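The one use that the caveats above do cover, an explicit cast followed by a conversion back to the original type, can be sketched as follows. The function name round_trip_preserved is a hypothetical helper, and the sketch assumes variable length array support.

```c
#include <assert.h>
#include <stdlib.h>

/* Casts a double * to double (*)[ncols] explicitly and converts it back.
   Returns 1 if the round trip yields the original pointer, 0 if it does
   not, and -1 if the allocation fails. */
int round_trip_preserved(size_t nrows, size_t ncols)
{
    assert(nrows > 0 && ncols > 0);

    double *A = malloc(sizeof(double) * nrows * ncols);
    if (A == NULL)
        return -1;

    /* Explicit cast; memory from malloc is suitably aligned for this type. */
    double (*view)[ncols] = (double (*)[ncols])A;

    /* Converting back to the original pointer type. */
    double *back = (double *)view;

    int same = (back == A);
    free(A);
    return same;
}
```

The sketch deliberately stops at the comparison: indexing through view is exactly the use that, according to this answer, the standard leaves unspecified.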

+1

My first thought is that C itself uses this layout when it creates a 2D array, that is, it lays the memory out linearly:

 [11, 12, 13, 14, 15, 21, 22, 23, 24, 25....] // This is known as ROW-MAJOR form 

which is the same way it is allocated in your code:

 A = malloc(rows*columns); 

As such, I see no harm in this, since A is a pointer to double, and C internally treats A[][] as a pointer to double as well (NOTE: this is not true for a pointer to pointers! *), so there is no difference.

 * A = malloc ( rows ); for_each_Ai ( Ai = malloc (columns) ); 

^ all of the code above is obviously pseudocode
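For contrast, the pointer-to-pointers case from the footnote might look like the sketch below in real C. The names alloc_rows and free_rows are hypothetical. Each row is a separate allocation, so nothing guarantees that the rows are contiguous, and the memory cannot be treated as one linear block.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the first nrows rows and then the table of row pointers. */
void free_rows(double **A, size_t nrows)
{
    for (size_t i = 0; i < nrows; i++)
        free(A[i]);
    free(A);
}

/* Allocates nrows separate rows of ncols doubles each and fills them
   with their row-major index. Returns NULL if any allocation fails. */
double **alloc_rows(size_t nrows, size_t ncols)
{
    double **A = malloc(nrows * sizeof *A);
    if (A == NULL)
        return NULL;

    for (size_t i = 0; i < nrows; i++) {
        A[i] = malloc(ncols * sizeof **A);
        if (A[i] == NULL) {
            free_rows(A, i);  /* release the rows allocated so far */
            return NULL;
        }
        for (size_t j = 0; j < ncols; j++)
            A[i][j] = (double)(i * ncols + j);
    }
    return A;
}
```

Here A[i][j] still works as an expression, but it follows a stored row pointer instead of computing an offset into a single block.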

As for platform independence, this code should be fine. However, if the code also does other nasty things with pointers, be careful.

0
