-XX:+UseParallelOldGC should speed up full GCs on a multi-core machine.
You can also profile various NewRatio values. Your cached objects will live on in the tenured (old) generation, so profile it with -XX:NewRatio=7, and then again with some higher and lower values.
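To get a feel for what a given NewRatio means in megabytes, here is a rough sketch. It assumes a 4 GB heap and relies only on the definition of NewRatio as the ratio of old to young generation size; the class and method names are made up for illustration, and the real JVM rounds and aligns these sizes, so treat the numbers as estimates.

```java
// Rough generation sizing for a given heap size and -XX:NewRatio value.
// Ignores JVM alignment/rounding, so the results are estimates only.
public class NewRatioSizing {

    // NewRatio = old / young, so young = heap / (NewRatio + 1).
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    // Whatever is not young generation is old generation.
    static long oldMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    public static void main(String[] args) {
        long heapMb = 4096; // assumed 4 GB heap
        for (int ratio : new int[] {3, 7, 11}) {
            System.out.printf("NewRatio=%d: young ~%d MB, old ~%d MB%n",
                    ratio, youngMb(heapMb, ratio), oldMb(heapMb, ratio));
        }
    }
}
```

With NewRatio=7 on that heap, the young generation (eden plus survivor spaces) gets roughly one eighth of the heap and the old generation the rest.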
You may not be able to accurately reproduce realistic usage while profiling, so make sure you monitor GC when the application is in real-life use. Then you can make minor changes (for example, to the survivor space size) and see what effect they have.
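One low-overhead way to watch GC from inside a running application is the standard java.lang.management API. The following is only a minimal sketch: the class name and the one-second polling interval are arbitrary choices, and in production you would more likely log these figures periodically or rely on GC logging instead.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSnapshot {

    // One line per collector: name, collections so far, accumulated collection time.
    static String snapshot() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Print three snapshots one second apart (interval chosen arbitrarily).
        for (int i = 0; i < 3; i++) {
            System.out.print(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

Comparing the counts and times between snapshots shows how often each collector runs and how much time it takes under real load.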
The old advice was not to use AggressiveHeap together with Xms and Xmx; I'm not sure whether that is still true.
Edit: Please let us know which OS / hardware platform you are deploying on.
Full collections every 30 minutes suggest that the old generation is getting quite full. A higher value for NewRatio will give it more space at the expense of the young generation. Can you give the JVM more than 4g, or are you limited to this?
It would also be helpful to know what your goals / non-functional requirements are. Do you want to avoid these 6/7 second pauses at the risk of lower throughput, or are these pauses an acceptable trade-off for the maximum possible throughput?
If you want to minimize pauses, try the CMS collector by removing both
-XX:+UseParallelGC -XX:+UseParallelOldGC
and adding
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
Profile this with various NewRatio values and see how you get on.
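When you are swapping flags between profiling runs, it is easy to lose track of what a given JVM was actually started with. As a small sanity check, something like the sketch below can print the GC-related options at startup. The class name and the filtering rule (any -XX option mentioning "GC" or "NewRatio") are assumptions made for this example, not an official list of GC flags.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcFlagCheck {

    // Keeps only the -XX options that mention GC or NewRatio (illustrative filter).
    static List<String> gcFlags(List<String> jvmArgs) {
        List<String> out = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:") && (arg.contains("GC") || arg.contains("NewRatio"))) {
                out.add(arg);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM on the command line.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("GC-related flags: " + gcFlags(jvmArgs));
    }
}
```

Logging this once at startup makes it easier to match each set of profiling results to the flags that produced them.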
One drawback of the CMS collector is that, unlike the parallel old and serial collectors, it does not compact the old generation. If the old generation becomes too fragmented and a minor collection needs to promote many objects into the old gen at once, a full serial collection can be triggered, which can mean a long pause. (I saw this once in prod, but with an IBM JVM, which ran out of memory instead of invoking a compacting collection!)
This may not be a problem for you, depending on the nature of the application, but you can insure against it by restarting nightly or weekly.