According to the profiling tools, it is sgen (the GC) that calls semaphore_wait_trap.

SGen is documented as stopping all other threads while it collects:
"Before doing a collection (minor or major), the collector must stop all running threads so that it can have a stable view of the current state of the heap, without the other threads changing it."
In other words, when your code tries to allocate memory and a GC is required, the time the collection takes shows up under semaphore_wait_trap, because that is where your application thread sits waiting. I suspect the Mono profiler does not profile the GC thread itself, so you do not see the time in the collection code.
The relevant output, then, is the GC summary:
GC summary
    GC resizes: 0
    Max heap size: 0
    Object moves: 1002691
    Gen0 collections: 123, max time: 14187us, total time: 354803us, average: 2884us
    Gen1 collections: 3, max time: 41336us, total time: 60281us, average: 20093us
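To put those numbers in perspective, it helps to add up the total time per generation. Below is a minimal sketch in Python (not part of Mono; the names `parse_gc_summary` and `total_pause_ms` are made up for illustration) that parses the two collection lines quoted above and sums them. It assumes the reported "total time" roughly corresponds to the time application threads were stopped.

```python
import re

# The two collection lines copied from the GC summary quoted above.
SUMMARY = """\
Gen0 collections: 123, max time: 14187us, total time: 354803us, average: 2884us
Gen1 collections: 3, max time: 41336us, total time: 60281us, average: 20093us
"""

# Matches one "GenN collections: ..." line and captures count and total time.
LINE_RE = re.compile(
    r"(?P<gen>Gen\d+) collections: (?P<count>\d+), "
    r"max time: (?P<max>\d+)us, total time: (?P<total>\d+)us"
)


def parse_gc_summary(text):
    """Return {generation: (collection count, total time in microseconds)}."""
    return {
        m.group("gen"): (int(m.group("count")), int(m.group("total")))
        for m in LINE_RE.finditer(text)
    }


def total_pause_ms(stats):
    """Sum the total collection time over all generations, in milliseconds."""
    return sum(total for _, total in stats.values()) / 1000.0


stats = parse_gc_summary(SUMMARY)
print(f"total GC time: {total_pause_ms(stats):.1f} ms")
```

For this run that comes to about 415 ms spent in 126 collections, almost all of it in the 123 Gen0 (nursery) collections.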
If you want your code to run faster, do not collect so often, which usually means allocating fewer objects.
The actual cost of collection can be measured with dtrace, since sgen has dtrace probes.