How does C compute the right array offset for an array of strings?

I am doing something for a class where I want to use a different format string based on certain conditions. I defined it like this:

const char *fmts[] = {"this one is a little long", "this one is short"}; 

Later I can use

 printf(fmts[0]); 

or

 printf(fmts[1]); 

and it works.

Does the compiler do something for us? My guess is that it takes the longest string and stores all of them at that length, but I would like to hear from someone who knows. Thanks.

+4
6 answers

It does it the same way as for any other data type. The array of "strings" is actually an array of pointers to characters, and all of those pointers are the same size. Thus, to get the correct address for a given pointer, the compiler multiplies the index by the size of an individual element and adds the result to the base address of the array.

Your array will look like this:

         <same-size>
         +---------+
 fmts:   | fmts[0] | ------+
         +---------+       |
         | fmts[1] | ------|--------------------------+
         +---------+       |                          |
                           V                          V
                           this one is a little long\0this one is short\0

The characters of the strings themselves are not stored in the array; they exist elsewhere. String literals like yours are usually stored in read-only memory, although the pointers could just as well point to memory you malloc, or to a modifiable array of characters defined with something like:

 char f0[] = "you can modify me without invoking undefined behaviour"; 

You can see this in action with the following code:

 #include <stdio.h>

 const char *fmts[] = {
     "This one is a little long",
     "Shorter",
     "Urk!"
 };

 int main (void) {
     printf ("Address of fmts[0] is %p\n", (void*)(&(fmts[0])));
     printf ("Address of fmts[1] is %p\n", (void*)(&(fmts[1])));
     printf ("Address of fmts[2] is %p\n", (void*)(&(fmts[2])));

     printf ("\n");

     printf ("Content of fmts[0] (%p) is %c%c%c...\n",
         (void*)(fmts[0]), *(fmts[0]+0), *(fmts[0]+1), *(fmts[0]+2));
     printf ("Content of fmts[1] (%p) is %c%c%c...\n",
         (void*)(fmts[1]), *(fmts[1]+0), *(fmts[1]+1), *(fmts[1]+2));
     printf ("Content of fmts[2] (%p) is %c%c%c...\n",
         (void*)(fmts[2]), *(fmts[2]+0), *(fmts[2]+1), *(fmts[2]+2));

     return 0;
 }

which outputs:

 Address of fmts[0] is 0x40200c
 Address of fmts[1] is 0x402010
 Address of fmts[2] is 0x402014

 Content of fmts[0] (0x4020a0) is Thi...
 Content of fmts[1] (0x4020ba) is Sho...
 Content of fmts[2] (0x4020c2) is Urk...

Here you can see that the actual addresses of the array elements are equidistant: 0x40200c + 4 = 0x402010, 0x402010 + 4 = 0x402014. The step of 4 is the size of a pointer on the system that produced this output.

However, the values are not, since they refer to strings of different sizes. The strings happen to sit in a single block of memory here (although that is by no means required), as shown below, with the * character marking the start and end of each string:

          | +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +a +b +c +d +e +f +0123456789abcdef
 ---------+------------------------------------------------------------------
 0x04020a0|*54 68 69 73 20 6f 6e 65 20 69 73 20 61 20 6c 69  This one is a li
 0x04020b0| 74 74 6c 65 20 6c 6f 6e 67 00*53 68 6f 72 74 65  ttle long.Shorte
 0x04020c0| 72 00*55 72 6b 21 00*                            r.Urk!.
+16

fmts holds pointers to char. It does not hold the strings themselves.

In other words: the difference between the addresses of fmts[0] and fmts[1] is the size of the char * type.

+3

You do not have an array of strings. You have an array of pointers to strings, or rather, an array of pointers to the first characters of strings. All pointers are the same size, so the problem of determining the offset just does not occur.

If you really wanted to have an array of strings, you would declare something like this:

 const char fmts[][64] = { "this one is a little long", "this one is short" }; 

i.e. you would have to declare an array of arrays. In that case you must specify a sufficient fixed size for each string in your real array of strings (64 in my example), and this value determines the fixed offset from one string to the next in the array.

As you correctly noted in your question, the minimum size that you can specify in this example is determined by the longest string in the array. However, the compiler will not calculate it for you. You must explicitly specify it yourself.

+3

The answer is that you do not have an array of strings as such; you have an array of pointers to char. All pointers are the same size, and printf() simply follows the pointer it is given to the characters.

+2

Yes, the compiler will make the first pointer point to the first character of the first string, and the second pointer point to the first character of the second string.

Since this is an "array of pointers to a character", each pointer can point anywhere; the strings do not need to be the same length or anything like that.

+2

You have not declared an array of strings. You have declared an array of pointers to strings. An array of strings will look like this:

 char fmts[][40] = {"this one is a little long", "this one is short"}; 

and, as you can see, you have to specify the maximum length as the second dimension of the array (only the first dimension of a multidimensional array can be left implicit in C).

+2
