Does a pointer or reference to an array or object affect its size?

I know that if I have an array int A[512], the name A can refer to the first element, and with pointer arithmetic an element's address is A + index.

But if I am not mistaken, the pointer / reference also occupies a machine word of memory. Assuming int takes a machine word, does this mean that the 512 integers of this array occupy 513 words of memory?

Is the same true for objects and their data members in C++ or C#?

Update: Wow, you guys are fast. To clarify, I'm interested in how C++ and C# differ in how they deal with this, and how I can size objects to fit a cache line (if possible).

Update: I have been made aware of the difference between pointers and arrays. I understand that arrays are not pointers and that the pointer arithmetic mentioned above is only valid after the array has been converted to a pointer. I do not think this difference is relevant to the general question. I want to know how arrays and other objects are stored in memory in both C++ and C#.

+4
7 answers

Note that when you talk about fitting data into a cache line, the variable holding the reference and the actual data it refers to will generally not be close together. The reference will end up in a register (eventually), but it was probably originally stored as part of another object somewhere else in memory, or as a local variable on the stack. The array contents themselves can still fit into cache lines while they are being operated on, regardless of whatever other overhead is associated with the "object". If you're curious about how this works in C#, Visual Studio has a Disassembly view that shows the x86 or x64 assembly generated for your code.
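On the C++ side, sizing an object to a cache line could be sketched roughly like this. The 64-byte line size is an assumption (common on x86-64 but not guaranteed), and Counters, kCacheLine and the helper functions are names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Check your target CPU before relying on it.
constexpr std::size_t kCacheLine = 64;

// alignas makes every instance start on a cache-line boundary; the compiler
// pads sizeof(Counters) up to a multiple of that alignment.
struct alignas(kCacheLine) Counters {
    std::uint64_t hits;
    std::uint64_t misses;
};

static_assert(alignof(Counters) == kCacheLine, "aligned to a cache line");
static_assert(sizeof(Counters) == kCacheLine, "occupies exactly one line");

// Returns true if the address is a multiple of the assumed line size.
inline bool starts_on_cache_line(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kCacheLine == 0;
}

// A local instance lives on the stack and is still aligned as requested.
inline void check_local_instance() {
    Counters c{};
    assert(starts_on_cache_line(&c));
}
```

The static_asserts fail at compile time if a later change makes the struct grow past one line, which is usually the more useful check.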

Array references have special support at the IL (intermediate language) level, so you will find that the way memory is loaded and used is essentially the same as with an array in C++. Under the hood, indexing into an array is exactly the same operation. Where you start to notice differences is when you iterate over arrays with "foreach" or have to "unbox" references because the array is an array of object types.

Note that one difference in memory locality between C++ and C# can show up when objects are created locally in a method. C++ allows array instances to be created on the stack, which creates a special case where the array's memory is actually stored right next to the "reference" and the other local variables. In C#, the contents of a (managed) array are always allocated on the heap.

On the other hand, when it comes to heap-allocated objects, C# can sometimes have better memory locality than C++, especially for short-lived objects. This is because the GC groups objects by their "generation" (how long they have been alive) and because of the heap compaction it performs. Short-lived objects are allocated quickly on a growing heap; during a collection the heap is also compacted, which prevents the "fragmentation" that can cause subsequent allocations in a non-compacted heap to be scattered across memory.

You can get similar locality benefits in C++ by using object pooling (or by avoiding frequent small short-lived objects), but this requires a bit of extra work and design. The cost on the C# side, of course, is that the GC has to run: it suspends threads, promotes generations, compacts the heap and reassigns references, which causes measurable overhead at somewhat unpredictable times. In practice this overhead is rarely a problem, especially for Gen0 collections, which are highly optimized for the usage pattern of frequently allocated short-lived objects.
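A minimal sketch of the object pooling idea in C++, assuming a fixed capacity and a trivially constructible payload. Particle and ParticlePool are hypothetical names for this illustration only, and a real pool would also need to think about construction, destruction and thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload type used only for this illustration.
struct Particle {
    float x, y, z;
    int ttl;
};

// Fixed-capacity pool: all objects live in one contiguous block, and freed
// slots are recycled instead of going back to the general-purpose heap.
class ParticlePool {
public:
    explicit ParticlePool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = capacity; i > 0; --i) {
            free_.push_back(&slots_[i - 1]);  // lowest address is handed out first
        }
    }

    // Copying would leave free_ pointing into the other pool's storage.
    ParticlePool(const ParticlePool&) = delete;
    ParticlePool& operator=(const ParticlePool&) = delete;

    // Returns nullptr when the pool is exhausted.
    Particle* acquire() {
        if (free_.empty()) return nullptr;
        Particle* p = free_.back();
        free_.pop_back();
        return p;
    }

    // p must have been returned by acquire() on this pool.
    void release(Particle* p) { free_.push_back(p); }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<Particle> slots_;   // contiguous storage, allocated once
    std::vector<Particle*> free_;   // stack of currently unused slots
};
```

Because slots_ is never resized after construction, pointers handed out by acquire() stay valid for the lifetime of the pool, and consecutive acquisitions land next to each other in memory.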

+1

You have a misunderstanding regarding arrays and pointers in C++.

Array

 int A[512]; 

With this declaration you get an array of 512 ints. Nothing more. There is no pointer, nothing. Just an array of ints. The size of the array will be 512 * sizeof(int).
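The size claim can be checked by the compiler itself. The following is only a sketch; element_count and check_array_layout are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>

int A[512];

// The array object is exactly its elements: no hidden pointer is stored with it.
static_assert(sizeof(A) == 512 * sizeof(int), "no extra word for a pointer");

// The element count can be recovered from the array type alone.
constexpr std::size_t element_count = sizeof(A) / sizeof(A[0]);
static_assert(element_count == 512, "512 elements");

inline void check_array_layout() {
    // The first element sits at the very start of the array object...
    assert(static_cast<void*>(&A) == static_cast<void*>(&A[0]));
    // ...and the last element ends exactly where the array object ends.
    assert(reinterpret_cast<char*>(&A[511]) + sizeof(int) ==
           reinterpret_cast<char*>(&A) + sizeof(A));
}
```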

Name

The name A refers to this array. It does not have pointer type; it has array type. It is a name, and it denotes an array. Names are simply compile-time constructs that tell the compiler which object you are talking about. Names do not exist at run time.

Conversion

There is a conversion called the array-to-pointer conversion, which may take place in some circumstances. The conversion takes an expression of array type (for example, the simple expression A) and converts it into a pointer to the array's first element. That is, in some situations the expression A (which denotes the array) can be converted to an int* (which points to the first element of the array).

Pointer

The pointer created by the array-to-pointer conversion exists only for the duration of the expression it is part of. It is just a temporary object that appears in those specific circumstances.

Circumstances

The array-to-pointer conversion is a standard conversion, and the circumstances in which it can happen include:

  • When casting from an array to a pointer, for example (int*)A.

  • When initializing an object of pointer type, for example int* p = A; .

  • Whenever a glvalue referring to an array appears as an operand of an expression that expects a prvalue.

    This is what happens when you index an array, for example A[20]. The subscript operator expects a prvalue of pointer type, so A undergoes the array-to-pointer conversion.
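The cases listed above can be sketched with type traits. This is only an illustration: check_conversions and the variable names p and q are invented here, and unary plus is used simply as a convenient operator that expects a prvalue operand.

```cpp
#include <cassert>
#include <type_traits>

int A[512];

// The name A has array type, not pointer type.
static_assert(std::is_same<decltype(A), int[512]>::value, "A is an array");

// Where a prvalue is expected, A converts to int*. Unary plus triggers the
// conversion; std::decay models the same conversion at the type level.
static_assert(std::is_same<decltype(+A), int*>::value, "+A is a pointer");
static_assert(std::is_same<std::decay<decltype(A)>::type, int*>::value,
              "int[512] decays to int*");

inline void check_conversions() {
    int* p = A;        // initializing a pointer: the conversion applies
    int* q = (int*)A;  // explicit cast: the same conversion
    assert(p == &A[0]);
    assert(q == p);
    assert(&A[20] == p + 20);  // indexing goes through the converted pointer
}
```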

+1

No, objects in the CLR are not laid out according to the "simple" (I imagine C++) memory map you are referring to. Remember that you can inspect objects in the CLR using reflection, which means that each object must carry additional information (a manifest) with it. That already takes more memory than just the contents of the object. Add to that a pointer used for lock management in a multi-threaded environment, and you end up far from the memory footprint you expected for a CLR object.

Also remember that pointer size differs between 32-bit and 64-bit machines.

0

An array int A[512] takes 512 * sizeof(int) bytes (plus any padding the compiler decides to add; in this particular case there will most likely be none).

The fact that the array A can be converted to a pointer to int and used as A + index relies on the fact that A[index] is defined as *(A + index), so both forms almost always compile to the same instructions. The conversion to a pointer happens in both cases, because to get to A[index] we need to take the address of the start of the array A and add index times sizeof(int) to it. Whether you write that as A[index] or *(A + index) does not matter. In both cases, A refers to the first address of the array and index is the number of elements to advance from it.
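The equivalence of the two spellings can be checked directly. A small sketch, with check_indexing and the index value 20 chosen arbitrarily for the example:

```cpp
#include <cassert>
#include <cstddef>

inline void check_indexing() {
    int A[512] = {};
    const std::size_t index = 20;
    A[index] = 7;

    // A[index] is defined as *(A + index): same address, same value.
    assert(&A[index] == A + index);
    assert(*(A + index) == 7);

    // In bytes, the element sits index * sizeof(int) past the array start.
    const char* base = reinterpret_cast<const char*>(&A[0]);
    const char* elem = reinterpret_cast<const char*>(&A[index]);
    assert(static_cast<std::size_t>(elem - base) == index * sizeof(int));
}
```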

No extra space is used.

The above applies to both C and C++.

In C# and other languages that use "managed memory", there is additional overhead for tracking each variable. This does not affect the size of the variable A itself, but the tracking data has to be stored somewhere. So every variable, whether it is a single integer or a very large array, has some overhead stored somewhere, including the size of the variable and some kind of liveness tracking (loosely speaking, a "reference count": in how many places the variable is still used, and whether it can be freed).

0

I think you are confusing an array with a pointer in C++.

An array of int is just that: an array of memory locations, each of size sizeof(int), in which you can store N ints (indexed 0 to N-1).

A pointer is a type that can point to a memory location and typically occupies one processor word of memory, so on a 32-bit machine an int* will be 32 bits (sizeof(int*) is 4).

If you want to have a pointer to your array, you do this: int * ptr = &A[0]; . This points to the first element of the array. Now your pointer takes up memory (one processor word), and you also have an array of ints.

When you pass an array to a function in C or C++, it decays into a pointer to the first element of the array. This does not mean that a pointer is an array; it only means that the array decays into a pointer.

In C#, your array is a reference type and you do not deal with pointers, so you don't have to worry about that. It just takes the size of your array.
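The decay on a function call can be observed with sizeof. This is only a sketch with invented function names; passing the array by reference is shown as one way to keep the full array type.

```cpp
#include <cassert>
#include <cstddef>

// A parameter written as `int a[512]` would have exactly this type: int*.
// The array decays at the call site and its length is lost.
inline std::size_t size_via_pointer(int* a) {
    return sizeof(a);  // size of a pointer, not of the array
}

// Taking the array by reference keeps the complete type, so nothing decays.
template <std::size_t N>
std::size_t size_via_reference(int (&a)[N]) {
    return sizeof(a);  // N * sizeof(int)
}

inline void check_decay() {
    int A[512] = {};
    assert(size_via_pointer(A) == sizeof(int*));
    assert(size_via_reference(A) == 512 * sizeof(int));
}
```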

0

Regarding native C++:

But if I am not mistaken, the pointer / reference also occupies a machine word of memory

A reference does not necessarily take up space in memory. From paragraph 8.3.2/4 of the C++11 standard:

It is unspecified whether or not a reference requires storage (3.7).

In this case, you can use A like a pointer, and indeed it decays to a pointer when necessary (for example, when it is passed as an argument to a function), but the type of A is int[512], not int*: therefore A is not a pointer. For example, you cannot do this:

 int A[512]; int B; A = &B; 

No memory is needed to store A itself (that is, to store the address where the array starts), so your compiler will most likely not allocate any extra bytes to hold the address of A.
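Since the assignment above does not compile, the same point can be expressed with a type trait instead. A rough sketch, where check_pointer_is_separate_object is a name made up for the example:

```cpp
#include <cassert>
#include <type_traits>

int A[512];
int B;

// An array cannot be assigned a pointer: `A = &B;` is ill-formed.
static_assert(!std::is_assignable<int (&)[512], int*>::value,
              "A = &B does not compile");

// A pointer variable, by contrast, is an object of its own and can be reseated.
static_assert(std::is_assignable<int*&, int*>::value, "p = &B is fine");

inline void check_pointer_is_separate_object() {
    int* p = A;  // p is a separate object holding the array's address
    assert(p == &A[0]);
    p = &B;      // reseating p does not touch the array
    assert(p == &B);
    assert(sizeof(A) == 512 * sizeof(int));
}
```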

0

There are several different cases here, given that we even have several languages under discussion.

Let's start with a simple example, a plain array in C++:

 int array[512]; 

What happens here in terms of memory allocation? 512 words of memory are allocated on the stack for the array. No heap memory is allocated. There is no overhead of any kind: no pointer to the array, nothing, just 512 words of memory.

Here is an alternative way to create an array in C++:

 int * array = new int[512]; 

Here we create an array on the heap. It allocates 512 words of memory on the heap, with no additional memory for array overhead. Once that is done, the address of the start of the array is placed in a variable on the stack, taking up one additional word of memory. If you look at the total memory for the whole application, yes, it will be 513 words, but it is worth noting that one of them is on the stack and the rest are on the heap (stack memory is much cheaper to allocate and does not cause fragmentation, but if you overuse or misuse it you can easily run out of it).
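The accounting for the heap case could be sketched as follows. The "513 words" figure assumes int and int* have the same size, which is not true on typical 64-bit platforms where int is 4 bytes and a pointer is 8, so the sketch counts bytes instead. Any bookkeeping the allocator keeps internally is not counted, and total_bytes_for_heap_array is a name invented for the example.

```cpp
#include <cassert>
#include <cstddef>

inline std::size_t total_bytes_for_heap_array() {
    int* array = new int[512];                      // 512 ints on the heap
    const std::size_t on_stack = sizeof(array);     // the pointer variable itself
    const std::size_t on_heap = 512 * sizeof(int);  // the elements it points to

    array[0] = 42;  // use the memory so the allocation is not purely decorative
    assert(array[0] == 42);

    delete[] array;  // heap memory must be released explicitly
    return on_stack + on_heap;
}
```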

Now for C#. In C# there are not two different syntaxes; all you have is:

 int[] array = new int[512]; 

This creates a new array object on the heap. It contains 512 words of memory for the data in the array, plus some additional memory for the overhead of the array object. It needs 4 bytes to hold the array's element count, a synchronization object, and a few other bits of overhead that we don't really need to think about. This overhead is small and does not depend on the size of the array.

There will also be a pointer (or "reference", which is the more appropriate term in C#) to this array, placed on the stack and occupying one word of memory. As in C++, stack memory can be allocated and deallocated very quickly and without fragmenting memory, so when considering the memory footprint of your program it often makes sense to account for it separately.

0
