Allocating an extremely large heap in .NET can be insanely fast, and the number of blocking collections will not prevent it from being that fast. The problems you are observing arise because you are not just allocating: you also have code that causes dependency reorganization and actual garbage collection while the allocation is going on.
There are several methods:
try using GCSettings.LatencyMode ( http://msdn.microsoft.com/en-us/library/system.runtime.gcsettings.latencymode(v=vs.110).aspx ): set it to LowLatency while you are actively loading data (see the comments on this answer as well, and the first sketch after this list);
use multiple threads
do not populate cross-references to newly allocated objects while actively loading; first go through the active allocation phase using only integer indices for cross-references, not object references; then force a full GC a couple of times so that everything ends up in Gen2, and only then populate your advanced data structures; you may need to rethink your deserialization logic for this to happen (second sketch after this list);
try to force your biggest root collections (arrays of objects, strings) into the second generation as early as possible; do this by preallocating them and forcing a full GC twice before you start populating the data (loading millions of small objects); if you use some flavor of generic Dictionary, do not forget to preallocate its capacity up front to avoid reorganizations (third sketch after this list);
any big array of references is a major source of GC overhead until both the array and the referenced objects reach Gen2; the bigger the array, the bigger the overhead; prefer arrays of indices to arrays of references, especially for temporary processing needs;
avoid having many utility or temporary objects deallocated or promoted while in the active loading phase, on any thread; carefully review your code for string concatenation, boxing and foreach iterators (the last may be automatically optimized into for loops);
if you have an array of references and a hierarchy of function calls with some long-running tight loops, avoid introducing local variables that cache the reference value taken from some position in the array; instead, cache the offset value and keep using constructs like myArrayOfObjects[offset] at all levels of your function calls; it helped me a lot with processing pre-populated, Gen2-resident big data structures; my personal theory is that this helps the GC manage the temporal dependencies on your thread-local data structures, thus improving concurrency (last sketch below).
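For the LatencyMode suggestion above, a minimal sketch; LoadAllData is a placeholder for your own loading routine:

```csharp
using System.Runtime;

class Loader
{
    // Placeholder for the actual bulk-load logic (hypothetical).
    static void LoadAllData() { }

    static void Main()
    {
        GCLatencyMode oldMode = GCSettings.LatencyMode;
        try
        {
            // Suppress full blocking collections as much as possible while loading.
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;
            LoadAllData();
        }
        finally
        {
            // Always restore the previous mode.
            GCSettings.LatencyMode = oldMode;
        }
    }
}
```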
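For the index-instead-of-reference approach, a sketch of how the phases could be split; the Node type and its fields are invented for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record: during loading, the cross-reference is stored as a
// plain integer index, which the GC does not have to track.
class Node
{
    public long Payload;
    public int ParentIndex = -1; // index-based cross-reference
    public Node Parent;          // populated only after everything is in Gen2
}

class IndexFirstLoad
{
    static void Main()
    {
        // Phase 1: allocate everything, wiring cross-references as indices only.
        var nodes = new List<Node>(1000000);
        // ... deserialize: nodes.Add(new Node { Payload = ..., ParentIndex = ... });

        // Phase 2: force a couple of full collections so the data settles in Gen2.
        GC.Collect();
        GC.Collect();

        // Phase 3: only now resolve the indices into real object references.
        foreach (var node in nodes)
            if (node.ParentIndex >= 0)
                node.Parent = nodes[node.ParentIndex];
    }
}
```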
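For pre-promoting the biggest root collections, something along these lines; the ExpectedCount figure is assumed and should be sized to your data set:

```csharp
using System;
using System.Collections.Generic;

class PrePromoteRoots
{
    const int ExpectedCount = 1000000; // assumed; use your real item count

    static void Main()
    {
        // Pre-allocate the big root collections up front...
        var items = new string[ExpectedCount];
        // ...and give the dictionary its final capacity so it never rehashes mid-load.
        var index = new Dictionary<long, string>(ExpectedCount);

        // Two full blocking collections push the (still mostly empty) roots into Gen2.
        GC.Collect();
        GC.Collect();

        // Only now start the allocation-heavy load of millions of small objects.
        // ... populate items[] and index here ...
        GC.KeepAlive(items);
        GC.KeepAlive(index);
    }
}
```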
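A simplified illustration of the offset-caching trick; since this rests on the personal theory above, measure before committing to it:

```csharp
using System;

class OffsetCachingDemo
{
    class Node { public long Payload; }

    // Caching the reference in a local and passing it down the call chain
    // creates one more reference-typed slot for the GC to track per frame.
    static long ViaCachedReference(Node[] nodes, int offset)
    {
        Node n = nodes[offset];
        return Inner(n);
    }
    static long Inner(Node n) => n.Payload * 2;

    // Passing the array plus the offset and indexing at each level instead.
    static long ViaOffset(Node[] nodes, int offset) => InnerByOffset(nodes, offset);
    static long InnerByOffset(Node[] nodes, int offset) => nodes[offset].Payload * 2;

    static void Main()
    {
        var nodes = new[] { new Node { Payload = 21 } };
        Console.WriteLine(ViaCachedReference(nodes, 0)); // 42
        Console.WriteLine(ViaOffset(nodes, 0));          // 42
    }
}
```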
Here are the reasons for this behavior, as far as I could work out while filling up to 100 GB of RAM at application start, with multiple threads:
when the GC moves data from one generation to another, it actually copies it and thus modifies all the references; therefore, the fewer cross-references you have during the active loading phase, the better;
the GC maintains many internal data structures that manage references; if you make massive modifications to the references themselves - or if you have a lot of references that have to be modified during a GC - it causes substantial CPU and memory overhead during both blocking and concurrent GC; sometimes I observed the GC constantly consuming 30-80% of the CPU without any collections going on - just doing some processing, which looks weird until you realize that any time you put a reference into some array or some temporary variable in a tight loop, the GC has to modify, and sometimes reorganize, its dependency-tracking data structures;
the server GC uses thread-affinitized Gen0 segments and is capable of promoting an entire segment to the next generation (without actually copying the data - not sure about this one, though); keep this in mind when designing a multithreaded data-loading process (a quick configuration check is shown after this list);
ConcurrentDictionary, while being a great API, does not scale well in extreme multi-core scenarios once the number of objects goes beyond a few million (consider using an unmanaged hash table optimized for concurrent insertion, such as the one that ships with Intel TBB);
if possible or applicable, consider using a native pooled allocator (Intel TBB, again).
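To verify which GC flavor and latency mode the process actually runs under (server GC is switched on through the gcServer element in App.config):

```csharp
using System;
using System.Runtime;

class GcModeCheck
{
    static void Main()
    {
        // Server GC is enabled in App.config:
        //   <configuration><runtime><gcServer enabled="true" /></runtime></configuration>
        Console.WriteLine("Server GC: " + GCSettings.IsServerGC);
        Console.WriteLine("Latency mode: " + GCSettings.LatencyMode);
    }
}
```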
BTW, the latest update to .NET 4.5 has defragmentation (compaction) support for the large object heap. Another great reason to upgrade to it.
.NET 4.6 also adds an API for requesting that no GC happen at all (GC.TryStartNoGCRegion), provided certain conditions are met: https://msdn.microsoft.com/en-us/library/dn906202(v=vs.110).aspx
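A minimal usage sketch; the 64 MB budget is an assumed figure, and it must cover everything allocated inside the region:

```csharp
using System;
using System.Runtime;

class NoGcRegionDemo
{
    // Placeholder for the allocation-heavy critical section (hypothetical).
    static void LoadCriticalData() { }

    static void Main()
    {
        // The call fails if the runtime cannot commit the budget up front,
        // and the region is broken if the code inside allocates more than that.
        if (GC.TryStartNoGCRegion(64 * 1024 * 1024))
        {
            try
            {
                LoadCriticalData();
            }
            finally
            {
                // The region may already have ended (e.g. budget exceeded),
                // so check before calling EndNoGCRegion.
                if (GCSettings.LatencyMode == GCLatencyMode.NoGCRegion)
                    GC.EndNoGCRegion();
            }
        }
    }
}
```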
Also see the related post by Maoni Stephens: https://blogs.msdn.microsoft.com/maoni/2017/04/02/no-gcs-for-your-allocations/