What causes the JVM to do a major garbage collection?

I have a Java application that shows different GC behavior in different environments. In one environment, the heap usage graph is a slow sawtooth with a major GC every 10 hours or so, only when the heap is more than 90% full. In another environment, the JVM does a major GC every hour on the dot (the heap is normally between 10% and 30% full at these times).

My question is: what are the factors that cause the JVM to decide to do a major GC?

Obviously it collects when the heap is nearly full, but some other cause is at play, which I am guessing is related to an hourly scheduled task in my application (although there is no spike in memory usage at these times).

I assume that GC behavior depends heavily on the JVM; I am using:

  • Java HotSpot(TM) 64-Bit Server VM 1.7.0_21, Oracle Corporation
  • No specific GC options are given, so the defaults for a 64-bit server apply (PS MarkSweep and PS Scavenge).

Additional Information:

  • This is a web application running in Tomcat 6.
  • In both environments, perm gen usage hovers at about 10%.
  • The sawtooth environment has a maximum heap of 7 GB, the other one 14 GB.

No guesses please. The JVM must have rules for deciding when to perform a major GC, and those rules must be encoded somewhere deep in the source. If anyone knows what they are or where they are documented, please share!

+7
java garbage-collection jvm jvm-hotspot
4 answers

Garbage collection is a fairly complicated topic, and although you could study all of its details, I think what is happening in your case is quite simple.

Sun's Garbage Collection Tuning Guide, in the "Explicit Garbage Collection" section, warns:

applications can interact with garbage collection ... by invoking full garbage collections explicitly ... This can force a major collection to be done when it may not be necessary ... One of the most commonly encountered uses of explicit garbage collection occurs with RMI ... RMI forces full collections periodically

That guide says the default time between these collections is one minute, but the sun.rmi properties reference, under sun.rmi.dgc.server.gcInterval, says:

The default value is 3600000 milliseconds (one hour).

If you are seeing major collections every hour in one application but not in the other, it is probably because that application uses RMI, possibly only internally, and you have not added -XX:+DisableExplicitGC to its startup flags.

Either disable explicit GC, or test this hypothesis by setting -Dsun.rmi.dgc.server.gcInterval=7200000 and observing whether the GC then happens every two hours instead.
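To watch whether the interval actually changes, one option is to sample the collectors' cumulative counts from inside the application. The following is only a minimal sketch using the standard java.lang.management API; the class name GcSnapshot is hypothetical, and the collector names printed depend on the JVM and its settings (PS MarkSweep and PS Scavenge are what the question reports).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class GcSnapshot {

    /** Maps each collector name to its cumulative collection count (-1 if the JVM does not report it). */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<String, Long>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        // The property is only visible here if it was passed with -D on the command line;
        // null means no override was given.
        System.out.println("sun.rmi.dgc.server.gcInterval = "
                + System.getProperty("sun.rmi.dgc.server.gcInterval"));
        for (Map.Entry<String, Long> e : collectionCounts().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " collections so far");
        }
    }
}
```

Logging these counts periodically (for example from the hourly task) and comparing consecutive samples would show when the old-generation collector's count increases.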

+5

I found four conditions that can cause a major GC (given my JVM configuration):

  • The old gen area is full (even if it can still be grown, a major GC is run first)
  • The perm gen area is full (even if it can still be grown, a major GC is run first)
  • Someone calls System.gc() explicitly: a badly behaved library or something related to RMI (see links 1, 2 and 3)
  • The young gen areas are all full and nothing is ready to be moved into old gen (see 1)

As others have noted, cases 1 and 2 can be improved by allocating plenty of heap and perm gen, and by setting -Xms and -Xmx to the same value (along with the perm gen equivalents) to avoid dynamic heap resizing.

Case 3 can be avoided by using the -XX:+DisableExplicitGC flag.

Case 4 requires more involved tuning, for example -XX:NewRatio=N (see Oracle's tuning guide).
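As a rough way to confirm which of these settings a running JVM was actually started with, the launch arguments can be inspected through RuntimeMXBean.getInputArguments(). This is only a sketch: the class GcFlagCheck and its methods are hypothetical, and the heap check compares the flag values textually, so it would treat "7g" and "7168m" as different.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for inspecting JVM launch flags.
final class GcFlagCheck {

    /** True if explicit System.gc() calls were disabled on the command line. */
    static boolean explicitGcDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+DisableExplicitGC");
    }

    /** Returns the text after the given prefix, keeping the last match, or null if absent. */
    static String valueOf(List<String> jvmArgs, String prefix) {
        String found = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                found = arg.substring(prefix.length());
            }
        }
        return found;
    }

    /** True if -Xms and -Xmx are both present with the same textual value. */
    static boolean fixedHeap(List<String> jvmArgs) {
        String xms = valueOf(jvmArgs, "-Xms");
        String xmx = valueOf(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("explicit GC disabled: " + explicitGcDisabled(jvmArgs));
        System.out.println("heap fixed (-Xms == -Xmx): " + fixedHeap(jvmArgs));
        System.out.println("NewRatio: " + valueOf(jvmArgs, "-XX:NewRatio="));
    }
}
```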

+3

It depends on your configuration, since HotSpot configures itself differently in different Java environments. For example, on a server with more than 2 GB of memory and two processors, some JVMs default to "-server" mode instead of "-client" mode, which sizes the memory spaces (generations) differently and so affects when garbage collection happens.

A full GC can happen automatically, but also if you invoke the garbage collector from your code (for example, by calling System.gc()). When it happens automatically depends on how the minor collections behave.

At least two algorithms are used. With the defaults, a copying algorithm is used for minor collections and a mark-sweep algorithm for major collections.

A copying algorithm copies the memory still in use from one block to another and then clears the space that held the blocks nothing refers to any more. The JVM's copying collector uses one large area for newly created objects (called Eden) and two smaller ones (called survivor spaces). Surviving objects are copied once out of Eden and then several times between the survivor spaces, once per minor collection, until they become tenured and are copied to another space (the tenured space), where they can only be removed by a major collection.

Most objects in Eden die quickly, so the first collection copies the surviving objects into a survivor space (which is much smaller by default). There are two survivor spaces, s1 and s2. Each time Eden fills up, the surviving objects from Eden and s1 are copied to s2, and Eden and s1 are cleared. The next time, survivors from Eden and s2 are copied back to s1. Objects keep being copied from s1 to s2 to s1 until they have been copied a certain number of times, or the block is too large to fit, or some other criterion is met. The surviving block of memory is then copied to the tenured generation.
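The copying scheme described above can be illustrated with a toy model. This is not how HotSpot is implemented; it is only a sketch under simplifying assumptions: liveness is a flag set by hand rather than found by tracing references, space is counted in objects rather than bytes, and all names (ToyGenerationalHeap, survivorCapacity, tenuringThreshold) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Eden, two survivor spaces and a tenured space.
final class ToyGenerationalHeap {

    static final class Obj {
        final String name;
        boolean live = true; // stands in for "still referenced"
        int age = 0;         // number of minor collections survived
        Obj(String name) { this.name = name; }
    }

    final int survivorCapacity;  // max objects one survivor space holds
    final int tenuringThreshold; // copies survived before promotion
    final List<Obj> eden = new ArrayList<Obj>();
    List<Obj> from = new ArrayList<Obj>(); // survivor space holding objects
    List<Obj> to = new ArrayList<Obj>();   // empty survivor space, target of the next copy
    final List<Obj> tenured = new ArrayList<Obj>();

    ToyGenerationalHeap(int survivorCapacity, int tenuringThreshold) {
        this.survivorCapacity = survivorCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of Eden and the occupied survivor space, then swaps the survivors. */
    void minorCollection() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            // Promote when old enough, or when the target survivor space has no room left.
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                tenured.add(o);
            } else {
                to.add(o);
            }
        }
    }
}
```

In this model, an object that stays live reaches the tenured list after tenuringThreshold minor collections, or earlier if the survivor space overflows, which mirrors the early promotion of objects that do not fit.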

Tenured objects are not affected by minor collections. They accumulate until the area fills up (or the garbage collector is invoked explicitly). The JVM then runs the mark-sweep algorithm in a major collection, which keeps only the objects that are still referenced.

If you have larger objects that do not fit in the survivor spaces, they may be copied straight into the tenured space, which then fills up faster, and you get major collections more often.

In addition, the sizes of the survivor spaces, the number of copies between s1 and s2, the size of Eden relative to s1 and s2, and the size of the tenured generation can all be set automatically, and differently in different environments, by JVM ergonomics, which may also select -server or -client behavior automatically. You could try running both JVMs as -server or both as -client and check whether they still behave differently.

+2

Even if this earns me a downvote ... my best guess (you would have to verify it) is that the heap needs to expand, and when that happens a full GC is triggered. Not all of the memory is allocated to the JVM up front.

You can verify this by setting -Xms and -Xmx to the same value, e.g. 7 GB each.
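To see whether the heap still has room to expand in a given environment, the committed and maximum heap sizes can be compared at runtime. The sketch below uses the standard MemoryMXBean; the class HeapHeadroom and the method growableBytes are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for comparing committed and maximum heap size.
final class HeapHeadroom {

    /** Current heap usage as reported by the JVM. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /** Bytes the heap may still grow by, or -1 if the JVM reports no maximum. */
    static long growableBytes(MemoryUsage usage) {
        if (usage.getMax() < 0) {
            return -1L;
        }
        return usage.getMax() - usage.getCommitted();
    }

    public static void main(String[] args) {
        MemoryUsage usage = heap();
        System.out.println("committed: " + usage.getCommitted() + " bytes");
        System.out.println("max:       " + usage.getMax() + " bytes");
        System.out.println("growable:  " + growableBytes(usage) + " bytes");
    }
}
```

A growable value near zero suggests the heap is already fully committed, so heap expansion would not be the cause of the hourly collections in that environment.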

+1