Java GC Threading Bottleneck in practice?

How well optimized is Java's parallel collecting GC for multithreaded environments? I have written some multithreaded Jython code that spends most of its time calling Java libraries. Depending on which options I run the program with, the library calls either do tons of allocations under the hood or virtually none. When I use the options that require tons of heap allocations, I can't get the code to scale past 6 cores. When I use the options that don't require many allocations, it scales to at least 20. How likely is it that this is a GC bottleneck, given that I'm using the stock Sun VM, the stock parallel GC, and Jython as my glue language?

Edit: Just to clarify, I won't necessarily think of things that are obvious to Java veterans, because I almost never use Java or other JVM languages. I do most of my programming in D and in CPython, the flagship implementation of Python. I'm using the JVM and Jython for a small one-off project because I need access to a Java library.

+4
4 answers

For me, the problems with GC and multithreading are very real. I'm not saying that the JVM is bad, just that the problem itself is very hard.

In one of our projects, we had two applications running in the same JVM (an application server). When each was stress-tested individually, everything was fine, but when both were under load, performance degraded in strange ways. We finally split the applications into two JVMs, and performance returned to normal (slower, of course, than when only one application was in use, but reasonable).

Tuning the GC is very hard. Things can improve for 5 minutes, and then a major collection blocks everything, and so on. You have to decide whether you want high throughput or low latency. High throughput suits batch processing; interactive applications need low latency. In the end, the JVM's default parameters were the ones that gave us the best results!

This is not really an answer, more a report from experience, but yes, in my view GC and multithreading can be a problem.

+2

Since your question is about GC bottlenecks: you can settle it by enabling GC logging and checking the log. If there are a large number of GC events with long pauses, that confirms the theory; if not, you can rule it out. (However, in the scenario you describe, I would guess that it is not a GC problem.)
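On HotSpot, GC logging is typically switched on with `-verbose:gc` (plus `-XX:+PrintGCDetails` up to Java 8, or `-Xlog:gc*` from Java 9 on). As a complement to reading the log, here is a minimal sketch of measuring GC overhead from inside the process with the standard `java.lang.management` API. The class name `GcOverhead` and the `allocateShortLived` workload are made up for illustration; you would substitute the real library calls. Note that the reported collection time is whatever the JVM accumulates per collector, which is not necessarily pure pause time on every collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    /** Sums collection counts over all collectors; undefined values (-1) are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sums the approximate accumulated collection time in milliseconds. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Hypothetical stand-in for an allocation-heavy library call. */
    static long allocateShortLived(int iterations) {
        byte[][] ring = new byte[16][]; // keeps a few chunks reachable for a short while
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] chunk = new byte[1024];
            chunk[i % chunk.length] = 1;
            ring[i & 15] = chunk;
            checksum += chunk[i % chunk.length];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long countBefore = totalGcCount();
        long timeBefore = totalGcTimeMillis();
        long start = System.nanoTime();

        long checksum = allocateShortLived(5_000_000);

        long wallMillis = Math.max(1, (System.nanoTime() - start) / 1_000_000);
        long collections = totalGcCount() - countBefore;
        long gcMillis = totalGcTimeMillis() - timeBefore;
        System.out.printf("checksum=%d, collections=%d, gc=%d ms, wall=%d ms (%.1f%% in GC)%n",
                checksum, collections, gcMillis, wallMillis, 100.0 * gcMillis / wallMillis);
    }
}
```

If the GC share stays small even in the allocation-heavy configuration, the bottleneck is probably somewhere else.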

+3

Java's GC is generational. The first (young) generation collection is designed to take care of short-lived objects and is expected to run often. Short collections running several times per second are expected behavior if there are many short-lived allocations. (This should be a comment, not an answer, but I don't have the reputation, sorry.)

In addition, depending on which VM you are using, you can choose between GC algorithms. The available options vary with the version and vendor of the VM.
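To see which collectors a given JVM is actually running with, something like the following sketch could help. It only uses the standard `java.lang.management` API; the class name `ListCollectors` is made up for illustration. The names printed depend on the VM and its flags (on HotSpot with the parallel collector they are, as far as I know, "PS Scavenge" and "PS MarkSweep").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListCollectors {
    /** Returns one line per collector: its name and the memory pools it manages. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " -> " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Running it once with the default settings and once with an explicit collector flag shows whether the flag took effect.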

Some (old) information is here: http://java.sun.com/developer/technicalArticles/Programming/turbo/#The_new_GC

+1

Threaded performance can vary from one JDK version to another. In my experience, on jdk6u18, the parallel GC enabled with -XX:+UseParallelGC (not the concurrent mark-sweep collector) works very well on a quad-core machine with hundreds of very active threads. I find it unlikely that it would not scale beyond 6 cores.
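One way to check this on your own machine is a small scaling test. The sketch below is only an illustration: `AllocationScaling` and `allocatingTask` are made-up names, and the task is a stand-in for the real allocation-heavy library calls. Each thread does a fixed amount of work, so with ideal scaling the elapsed time stays roughly constant as the thread count grows; a steady increase points to contention somewhere (GC or otherwise).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AllocationScaling {
    /** Hypothetical allocation-heavy task: one small array per iteration. */
    static long allocatingTask(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            int[] scratch = new int[256];
            scratch[i & 255] = i;
            sum += scratch[i & 255];
        }
        return sum;
    }

    /** Runs one copy of the task per thread and returns the elapsed time in ms. */
    static long timeWithThreads(int threads, final int iterationsPerThread) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
            for (int t = 0; t < threads; t++) {
                tasks.add(() -> allocatingTask(iterationsPerThread));
            }
            long start = System.nanoTime();
            for (Future<Long> result : pool.invokeAll(tasks)) {
                result.get(); // propagate any failure from the worker
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int maxThreads = Runtime.getRuntime().availableProcessors();
        for (int threads = 1; threads <= maxThreads; threads *= 2) {
            long millis = timeWithThreads(threads, 2_000_000);
            System.out.println(threads + " threads: " + millis + " ms");
        }
    }
}
```

Compile it with `javac` and run it as `java -XX:+UseParallelGC AllocationScaling`, then repeat with another collector to compare. It is a rough timing loop without warm-up, so treat the numbers as indicative only.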

The fact that Sun's hardware is based on processors with a large number of cores explains why they have put a lot of effort into new garbage collectors in recent years.

The parallel GC is not enabled by default because its single-threaded performance is not as good as that of the default GC.

0