GC.Collect on only generation 2 and the large object heap

My application has a specific point in time when several large objects are released at once. At that point I would like to run a garbage collection, specifically on the large object heap (LOH).

I know that to do this I must call GC.Collect(2), because the LOH is only collected when the GC performs a generation 2 collection. However, I read in the documentation that calling GC.Collect(2) will still run the GC on generations 1 and 0 as well.

Is it possible to force the GC to collect only gen 2 and not include gen 1 or gen 0?

If this is not possible, is there a reason for the GC to be designed this way?

+7
garbage-collection large-object-heap
4 answers

It's not possible. The GC is designed so that a generation 2 collection always collects generations 0 and 1 as well.
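
One way to observe this is to compare the per-generation collection counters before and after a forced gen 2 collection. The following is a minimal sketch, not a benchmark; the class and method names (`Gen2CollectDemo`, `CountsForForcedGen2`) are made up for illustration, and it relies only on `GC.CollectionCount`, `GC.MaxGeneration` and `GC.Collect(int, GCCollectionMode)`.

```csharp
using System;

static class Gen2CollectDemo
{
    // Returns, per generation, how many collections were recorded
    // while a single forced gen 2 collection ran.
    public static int[] CountsForForcedGen2()
    {
        int[] before = new int[GC.MaxGeneration + 1];
        for (int g = 0; g <= GC.MaxGeneration; g++)
            before[g] = GC.CollectionCount(g);

        // Ask for generation 2; the younger generations come along with it.
        GC.Collect(2, GCCollectionMode.Forced);

        int[] delta = new int[GC.MaxGeneration + 1];
        for (int g = 0; g <= GC.MaxGeneration; g++)
            delta[g] = GC.CollectionCount(g) - before[g];
        return delta;
    }

    static void Main()
    {
        int[] delta = CountsForForcedGen2();
        for (int g = 0; g < delta.Length; g++)
            Console.WriteLine($"gen {g}: +{delta[g]} collection(s)");
    }
}
```

Every counter should go up by at least one, which matches the statement that there is no way to request gen 2 on its own.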

Edit: I found a source for you on the GC developers' blog:

A Gen2 GC requires a full collection (Gen0, Gen1, Gen2 and the LOH! Large objects are collected on every Gen2 GC, even when the GC was not triggered by a lack of space in the LOH. Note that there is no GC that collects only large objects.), which takes much longer than younger-generation collections.

Edit 2: From the same blog's Using GC Efficiently Part 1 and Part 2, Gen0 and Gen1 collections are evidently fast compared to a Gen2 collection, so it seems reasonable to me that running only Gen2 would not bring much benefit. There may be a more fundamental reason, but I'm not sure. Perhaps the answer is in some article on that blog.

+12

Since all new allocations (other than large objects) always go to Gen0, the GC is designed to collect the specified generation and everything below it. When you call GC.Collect(2), you are telling the GC to collect Gen0, Gen1, and Gen2.

If you are sure that you are dealing with a lot of large objects (objects big enough to be placed on the LOH at allocation time), the best option is to make sure you set them to null (Nothing in VB) when you are done with them. LOH allocation tries to be smart and reuse blocks. For example, if you allocate a 1 MB object on the LOH, then dispose of it and set it to null, you will be left with a 1 MB hole. The next time you allocate something of 1 MB or less on the LOH, it will fill that hole (and keep filling it until the next allocation is too large to fit in the remaining space, at which point it will allocate a new block).
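
As a rough illustration of "large enough to be placed on the LOH": the documented default threshold is 85,000 bytes, and `GC.GetGeneration` reports LOH objects as generation 2 from the moment they are allocated. The sketch below assumes that default threshold; the names `LohDemo`, `GenerationOfLargeArray` and `GenerationOfSmallArray` are illustrative only.

```csharp
using System;

static class LohDemo
{
    // 100,000 bytes is comfortably above the default 85,000-byte LOH threshold.
    public static int GenerationOfLargeArray()
    {
        byte[] big = new byte[100_000];
        return GC.GetGeneration(big);
    }

    // A small array starts out in the young generation (normally Gen0).
    public static int GenerationOfSmallArray()
    {
        byte[] small = new byte[1_000];
        return GC.GetGeneration(small);
    }

    static void Main()
    {
        Console.WriteLine($"large array generation: {GenerationOfLargeArray()}");
        Console.WriteLine($"small array generation: {GenerationOfSmallArray()}");

        byte[] buffer = new byte[1_000_000]; // roughly 1 MB on the LOH
        Console.WriteLine($"buffer length: {buffer.Length}");
        buffer = null;                       // drop the last reference when done
        GC.Collect(2);                       // the freed block can now be reclaimed or reused
    }
}
```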

Keep in mind that generations in .NET are not physical things but logical divisions that help improve GC performance. Since all new allocations go to Gen0, it is always the first generation to be collected. On each collection cycle, anything in a lower generation that survives the collection is "promoted" to the next higher generation (until it reaches Gen2).
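
Promotion can be watched with `GC.GetGeneration`. This is a small sketch under the assumption that a blocking `GC.Collect()` promotes survivors by one generation, which is the usual behavior but not something the runtime guarantees on every call; `PromotionDemo` and `TrackGenerations` are hypothetical names.

```csharp
using System;

static class PromotionDemo
{
    // Records the generation of one surviving object before any collection,
    // after one collection, and after two collections.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // keep the object reachable through both collections
        return seen;
    }

    static void Main()
    {
        int[] seen = TrackGenerations();
        Console.WriteLine($"generations observed: {seen[0]}, {seen[1]}, {seen[2]}");
    }
}
```

The observed generation should never decrease, and it stops changing once the object has reached `GC.MaxGeneration`.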

In most cases, the GC does not need to go beyond collecting Gen0. The current GC implementation is able to collect Gen0 and Gen1 at the same time, but cannot collect Gen2 while Gen0 or Gen1 is being collected. (.NET 4.0 greatly eases this limitation: for the most part, the GC can collect Gen2 while Gen0 or Gen1 is also being collected.)

+6

To answer the "why" question: physically there are no such things as Gen0, Gen1, or Gen2. They all share the same block of memory in the virtual address space. The distinction between them is really only made by moving an imaginary boundary.

Every (small) object is allocated from the Gen0 area of the heap. If it survives a collection, it is moved "down" into the area of the managed heap block that has just been freed of garbage. This is done by compacting the heap. After the collection completes, a new "boundary" for Gen1 is set at the position immediately after the surviving objects.

So if you went and tried to clear Gen0 and/or Gen1, you would open up holes in the heap that must be closed by compacting the "full" heap, including the objects in Gen0. Obviously this would not make sense, since most of those objects would be garbage anyway, so there is no point in moving them. Nor does it make sense to create large holes in the heap and leave them there (that is, skip compaction).

0

Whenever the system collects garbage in a particular generation, it must examine every object that might contain a reference to any object of that generation. In many cases, old objects will only contain references to other old objects; if the system is collecting Gen0, it can ignore any objects that contain only references to Gen1 and/or Gen2 objects. Similarly, if it is performing a Gen1 collection, it can ignore any objects that contain only references to Gen2 objects. Since examining and marking objects accounts for most of the time needed for garbage collection, being able to skip old objects entirely is a significant time saving.

By the way, if you are wondering how the system "knows" whether an object might contain references to newer objects: the system has special code that sets a pair of bits in each object's descriptor whenever the object is written. The first bit is reset at every garbage collection, and if it is still reset at the next garbage collection, the system knows the object cannot contain references to Gen0 objects (since any objects that existed when the object was last written, and were not cleared by the previous collection, will now be Gen1 or Gen2). The second bit is reset at every Gen1 garbage collection, and if it is still reset at the next Gen1 garbage collection, the system knows the object cannot contain references to Gen0 or Gen1 objects (any objects it references will be Gen2). Note that the system neither knows nor cares whether the data written to the object actually included a Gen0 or Gen1 reference. The trap required when writing to an unmarked object is expensive and would hurt performance badly if it had to be handled on every write. To avoid this, an object is marked as soon as any write occurs, so further writes can proceed without interruption until the next garbage collection.
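
The "mark on write, reset on collection" idea can be sketched with a toy model. This is deliberately simplified to a single bit per object and says nothing about how the CLR actually stores this bookkeeping; every type and member name here (`ToyObject`, `ToyWriteBarrier`, `Store`, `OldObjectsToScan`, `ResetMarks`) is invented for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: not the CLR's real data structures.
sealed class ToyObject
{
    public int Generation;            // 0 = newest, 2 = oldest
    public bool WrittenSinceLastGc;   // the "marked on write" bit
    public readonly List<ToyObject> References = new List<ToyObject>();
}

static class ToyWriteBarrier
{
    // Every reference store goes through here. The holder is marked no matter
    // which generation the target belongs to, mirroring "does not know or care".
    public static void Store(ToyObject holder, ToyObject target)
    {
        holder.References.Add(target);
        holder.WrittenSinceLastGc = true;
    }

    // A Gen0 collection only has to scan older objects that were written to.
    public static List<ToyObject> OldObjectsToScan(IEnumerable<ToyObject> heap)
    {
        var result = new List<ToyObject>();
        foreach (ToyObject o in heap)
            if (o.Generation > 0 && o.WrittenSinceLastGc)
                result.Add(o);
        return result;
    }

    // A collection resets the bit so untouched objects can be skipped next time.
    public static void ResetMarks(IEnumerable<ToyObject> heap)
    {
        foreach (ToyObject o in heap)
            o.WrittenSinceLastGc = false;
    }

    static void Main()
    {
        var oldA = new ToyObject { Generation = 2 };
        var oldB = new ToyObject { Generation = 2 };
        var young = new ToyObject { Generation = 0 };
        var heap = new List<ToyObject> { oldA, oldB, young };

        Store(oldA, young);                               // only oldA gets marked
        Console.WriteLine(OldObjectsToScan(heap).Count);  // prints 1
        ResetMarks(heap);                                 // simulated end of a collection
        Console.WriteLine(OldObjectsToScan(heap).Count);  // prints 0
    }
}
```

In the model, `oldB` is never scanned during a young collection because nothing was written to it, which is the time saving the answer describes.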

0