Very high number of GC threads in ServerGC application

TL;DR: An application with server GC enabled shows dozens and dozens of special GC threads over time. What can explain this?


I have been stuck for a few days on a strange multithreading / contention issue in a .NET service. The symptoms are as follows:

  • The application freezes for long periods of time (several minutes at a stretch).
  • The number of threads is abnormally high.
  • A spike in lock contention appears exactly when the application stops responding (see the graph below).
  • The same application is deployed on several servers, and some of them show no problem at all (same hardware / OS / CLR).

[Graph: contention spike]

I first suspected a problem in our code that was causing the managed thread pool to spin up a huge number of threads over time, all contending for one or more shared resources. It turned out that our use of the ThreadPool is very limited and well controlled.

I managed to get a dump file of a service that was not yet hanging but already had a lot of threads (more than 100, whereas in the normal state there are about 20).

Using WinDbg + SOS, I found that the ThreadPool size was fine:

    0:000> !threadpool
    CPU utilization: 0%
    Worker Thread: Total: 8 Running: 1 Idle: 7 MaxLimit: 32767 MinLimit: 32
    Work Request in Queue: 0
    --------------------------------------
    Number of Timers: 1
    --------------------------------------
    Completion Port Thread:Total: 1 Free: 1 MaxFree: 64 CurrentLimit: 1 MaxLimit: 1000 MinLimit: 32

Only 8 worker threads... So I dumped the stacks of all the threads and found many that I could not identify. Here is an example:

    0:000> !eestack
    (...)
    Thread  94
    Current frame: ntdll!NtWaitForSingleObject+0xa
    Child-SP         RetAddr          Caller, Callee
    0000008e25b2f770 000007f8f5a210ea KERNELBASE!WaitForSingleObjectEx+0x92, calling ntdll!NtWaitForSingleObject
    0000008e25b2f810 000007f8ece549bf clr!CLREventBase::WaitEx+0x16c, calling kernel32!WaitForSingleObjectEx
    0000008e25b2f820 000007f8f5a2152c KERNELBASE!SetEvent+0xc, calling ntdll!NtSetEvent
    0000008e25b2f850 000007f8ece54977 clr!CLREventBase::WaitEx+0x103, calling clr!CLREventBase::WaitEx+0x134
    0000008e25b2f8b0 000007f8ece548f8 clr!CLREventBase::WaitEx+0x70, calling clr!CLREventBase::WaitEx+0xe4
    0000008e25b2f8e0 000007f8ed06526d clr!SVR::gc_heap::gc1+0x323, calling clr!SVR::GCStatistics::Enabled
    0000008e25b2f940 000007f8ecfbe0b3 clr!SVR::gc_heap::bgc_thread_function+0x83, calling clr!CLREventBase::Wait
    0000008e25b2f980 000007f8ecf3d5b6 clr!Thread::intermediateThreadProc+0x7d
    0000008e25b2fd00 000007f8ecf3d59f clr!Thread::intermediateThreadProc+0x66, calling clr!_chkstk
    0000008e25b2fd40 000007f8f8281832 kernel32!BaseThreadInitThunk+0x1a
    0000008e25b2fd70 000007f8f8aad609 ntdll!RtlUserThreadStart+0x1d
    (...)

Using the !threads -special command, I finally discovered that these threads were special GC threads:

    0:000> !threads -special
    ThreadCount: 81
    UnstartedThread: 0
    BackgroundThread: 49
    PendingThread: 0
    DeadThread: 21
    Hosted Runtime: no
    (...)
          OSID Special thread type
      1    804 DbgHelper
      2    f48 GC
      3    3f8 GC
      4   1380 GC
      5    af4 GC
      6   1234 GC
      7    fac GC
      8   12e4 GC
      9   17fc GC
     10    644 GC
     11   16e0 GC
     12    6cc GC
     13    9d4 GC
     14    f7c GC
     15    d5c GC
     16    d74 GC
     17    8d0 GC
     18   1574 GC
     19    8e0 GC
     20    5bc GC
     21    82c GC
     22    e4c GC
     23   129c GC
     24    e28 GC
     25    45c GC
     26    340 GC
     27   15c0 GC
     28   16d4 GC
     29    f4c GC
     30   10e8 GC
     31   1350 GC
     32    164 GC
     33   1620 GC
     34   1444 Finalizer
     35    c2c ProfilingAPIAttach
     62     50 Timer
     64   14a8 GC
     65   145c GC
     66    cdc GC
     67    af8 GC
     68   12e8 GC
     69   1398 GC
     70    e80 GC
     71    a60 GC
     72    834 GC
     73    1b0 GC
     74    2ac GC
     75    eb8 GC
     76    ec4 GC
     77    ea8 GC
     78     28 GC
     79   11d0 GC
     80   1700 GC
     81   1434 GC
     82   1510 GC
     83     9c GC
     84    c64 GC
     85   11c0 GC
     86   1714 GC
     87   1360 GC
     88   1610 GC
     89    6c4 GC
     90    cf0 GC
     91   13d0 GC
     92   1050 GC
     93   1600 GC
     94   16c4 GC
     95   1558 GC
     96   1b74 IOCompletion
     97    ce4 ThreadpoolWorker
     98   19a4 ThreadpoolWorker
     99   1a00 ThreadpoolWorker
    100   1b64 ThreadpoolWorker
    101   1b38 ThreadpoolWorker
    102   1844 ThreadpoolWorker
    103   1b90 ThreadpoolWorker
    104   1a10 ThreadpoolWorker
    105   1894 Gate

More than 60 "GC" threads... So I checked the configuration of my different service instances and realized that the problematic ones had server GC enabled, while the others did not.
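
For anyone comparing instances the same way, here is a minimal C# sketch (my own, not from the original post) that logs the GC mode the runtime actually picked up at startup, which makes configuration drift between servers easy to spot:

    using System;
    using System.Runtime;

    // Hypothetical diagnostic snippet: report the effective GC configuration.
    static class GcModeCheck
    {
        static void Main()
        {
            Console.WriteLine("Server GC enabled  : " + GCSettings.IsServerGC);
            Console.WriteLine("GC latency mode    : " + GCSettings.LatencyMode);
            Console.WriteLine("Logical processors : " + Environment.ProcessorCount);
            // Server GC is normally switched on in app.config:
            //   <configuration><runtime><gcServer enabled="true" /></runtime></configuration>
            // Background GC (on by default in 4.5) is controlled by <gcConcurrent enabled="..." />.
        }
    }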

Additional Information:

  • We are using .NET 4.5
  • We use Windows Server 2012 on all machines.
  • We run on dual octa-core servers (2 CPUs, 16 physical cores, 32 logical processors).

What I'm trying to do now:

  • I am trying to get more dumps (when there are even more threads, when the process is actually frozen, etc.).
  • I will try disabling server GC on the problematic instances, but it may take a while for the problem to show up again.

So here are my questions:

  • Is it normal for a .NET application configured with server GC to have that many GC threads? I thought server GC used only one GC thread per processor.
  • Could this be related to the problem I see in these services, i.e. hundreds of threads accumulating over time and long process freezes caused by contention?
garbage-collection multithreading clr windbg
2 answers

With server GC there is one GC thread per logical core (each one affinitized to its core), so in your case there should be at least 32 GC threads. If background GC is enabled as well, there can be additional background GC worker threads, one per heap (link).

Also keep in mind that these GC threads run at THREAD_PRIORITY_HIGHEST, which can easily starve any threads the GC has not yet suspended (link).
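
As an illustration of that point (my own sketch, not part of the answer), you can group the native threads of a process by priority level from C# and see how many sit at the highest level, the way GC threads do:

    using System;
    using System.Diagnostics;
    using System.Linq;

    // Hypothetical sketch: count native threads per priority level.
    // Inspecting another process requires sufficient rights, and threads that
    // exit mid-enumeration may throw; this is a diagnostic aid, not production code.
    static class ThreadPriorityReport
    {
        static void Main(string[] args)
        {
            int pid = args.Length > 0 ? int.Parse(args[0]) : Process.GetCurrentProcess().Id;

            var byPriority = Process.GetProcessById(pid).Threads
                                    .Cast<ProcessThread>()
                                    .GroupBy(t => t.PriorityLevel)
                                    .OrderByDescending(g => g.Key);

            foreach (var group in byPriority)
                Console.WriteLine("{0,-12}: {1} thread(s)", group.Key, group.Count());
        }
    }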

As for your other threads, having 500+ threads in a process is going to create a lot of contention regardless of the garbage collector, so figuring out what those threads are will be important for your investigation.


What to look at

  • Check whether background GC is enabled and, if so, try running without it (background GC combined with server GC is supported as of .NET 4.5).
  • Try reducing the maximum number of threads in the thread pool (32767 is an unhealthy maximum); a sketch follows this list.
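
As a concrete illustration of the second bullet (my own sketch with illustrative numbers, not code from the answer), the pool ceilings can be capped explicitly, and the background GC switch lives in app.config:

    using System;
    using System.Threading;

    // Minimal sketch: cap the worker / IO completion thread maximums.
    static class ThreadPoolCap
    {
        static void Main()
        {
            int worker, iocp;
            ThreadPool.GetMaxThreads(out worker, out iocp);
            Console.WriteLine("Before: worker={0}, iocp={1}", worker, iocp);

            // 100 is an arbitrary ceiling; SetMaxThreads returns false if the values
            // are below the processor count or the current minimums.
            if (!ThreadPool.SetMaxThreads(100, 100))
                Console.WriteLine("SetMaxThreads rejected the requested values.");

            ThreadPool.GetMaxThreads(out worker, out iocp);
            Console.WriteLine("After : worker={0}, iocp={1}", worker, iocp);

            // Background GC, if it turns out to be involved, can be disabled in app.config:
            //   <runtime><gcConcurrent enabled="false" /></runtime>
        }
    }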

You can also use procdump.exe to capture memory dumps automatically when performance degrades.
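
For example, an invocation along these lines (check procdump's own help for the exact flags in your version; the process name and dump folder are placeholders) writes up to three full dumps when CPU usage stays above 90% for 10 seconds:

    procdump -ma -c 90 -s 10 -n 3 MyService.exe C:\dumps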


I had similar problems on a NUMA server. Things that helped me:

  • Limit the thread pool size.
  • Limit the processor affinity mask of the hanging process (see the sketch after this list). It sounds strange, but reducing the number of processors available to a process can sometimes make it run faster under a highly concurrent load; I suspect spin-waiting in locks.
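
A hypothetical C# sketch of the affinity idea: the 0xFF mask below is just an example that pins a process to logical CPUs 0..7 (roughly one NUMA node on the hardware described above); adjust the mask and the target process to your own setup.

    using System;
    using System.Diagnostics;

    // Hypothetical sketch: restrict a process to a subset of logical CPUs.
    static class AffinityLimiter
    {
        static void Main(string[] args)
        {
            Process target = args.Length > 0
                ? Process.GetProcessById(int.Parse(args[0]))
                : Process.GetCurrentProcess();

            target.ProcessorAffinity = (IntPtr)0xFF;   // bits 0..7 => logical CPUs 0..7

            Console.WriteLine("Affinity of {0} (pid {1}) set to 0x{2:X}",
                              target.ProcessName, target.Id, (long)target.ProcessorAffinity);
        }
    }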
