Poor performance due to the Java garbage collector? Need help

I will try to briefly explain the problem. I work in a supply chain domain where we deal with goods / products and SKUs.

Let's say my whole problem set is 1 million SKUs that I run an algorithm over. The JVM heap size is currently set to 4 GB.

I can’t process all the SKUs in one shot, as I will need a lot more memory. So, I divide the problem into smaller batches. Each batch will have all the associated SKUs that need to be processed together.

Now I run several iterations to process the entire data set. Say each batch holds approx. 5000 SKUs; that gives me 200 iterations / loops. All the data for those 5000 SKUs is needed until the batch finishes processing. But once the next batch begins, the previous batch's data is no longer required and can therefore be garbage collected. A rough sketch of the loop is below.
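To make the allocation pattern concrete, the loop is shaped roughly like this (class and method names are made up for illustration, not the real code):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchRunner {

    // Placeholder for whatever data a SKU carries in the real system.
    static class Sku {
        String id;
    }

    // The overall shape of the processing loop: ~200 iterations, each one
    // loading and holding ~5000 SKUs until process() finishes.
    void processAll(List<List<String>> batches) {
        for (List<String> skuIds : batches) {
            List<Sku> batchData = loadBatch(skuIds); // all 5000 SKUs loaded up front
            process(batchData);                      // takes ~2-3 seconds
            // batchData becomes unreachable here, but by now most of it has
            // already been promoted to the old generation
        }
    }

    private List<Sku> loadBatch(List<String> skuIds) {
        return new ArrayList<>(); // stand-in for the real data loading
    }

    private void process(List<Sku> batch) {
        // stand-in for the real algorithm
    }
}
```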

This is the problem. I have now run into a specific performance issue caused by the GC. Each batch takes about 2-3 seconds to complete, and during that time the GC cannot release any objects, since all the data is needed until the end of that particular batch. So the GC promotes all of these objects to the old gen (if I look at the profiler, almost nothing stays in the new gen). As a result the old gen grows quickly and a full GC is needed, which makes my program very slow. Is there a way to configure the GC for this case, or should I change my code to allocate memory differently?

PS - if each batch is very small, I do not see this problem. I believe this is because the GC is able to release objects quickly, since the batch finishes faster and the GC therefore does not need to promote its objects to the old gen.

+5
5 answers

Google’s first hit points out that you can use -XX:NewRatio to set a larger young generation relative to the old generation.

+3

You need to tune -XX:NewRatio as indicated in the other answer.

You can start by setting -XX:NewRatio=1, which means that the old generation and the young generation split the available heap memory equally.

More on how this flag interacts with the other memory-sizing flags: https://docs.oracle.com/cd/E19900-01/819-4742/abeik/index.html
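For example, assuming the 4 GB heap from the question (the jar name is a placeholder, and the value is a starting point to measure against, not a recommendation):

```
java -Xms4g -Xmx4g -XX:NewRatio=1 -jar my-app.jar
```

With -XX:NewRatio=1 the young and old generations each get roughly half of the 4 GB, which gives the short-lived batch data more room before it is promoted.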

+1

Consider using the object pool pattern.

i.e. pre-create a pool of 5,000 SKU objects, then re-initialize each of these objects with new data for each batch. This way you will not have problems with the GC, since the pool already holds everything you need to allocate. A rough sketch is below.
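A minimal sketch of the idea, assuming a simple fixed-size pool (the fields and reset logic are placeholders; a real pool would match your data model):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkuPool {

    // The pooled object; the fields here are placeholders for whatever a
    // real SKU record holds.
    static class PooledSku {
        long id;
        double[] weeklyDemand = new double[52];

        void reset() {
            id = -1;
            Arrays.fill(weeklyDemand, -1);
        }
    }

    private final List<PooledSku> pool;

    // Allocate all objects once, up front.
    SkuPool(int size) {
        pool = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            pool.add(new PooledSku());
        }
    }

    // Hand out the i-th pooled object, cleared and ready to be filled with
    // the next batch's data.
    PooledSku borrow(int i) {
        PooledSku sku = pool.get(i);
        sku.reset();
        return sku;
    }
}
```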

+1

A few tips:

  • Check for memory leaks with profiling tools like VisualVM or Eclipse MAT.
  • If you do not have memory leaks, check whether the current heap size is sufficient. If it is not, allocate more memory.
  • From your problem statement, the old gen grows and triggers full GCs. You did not mention which garbage collector you are using. Since you are using >= 4 GB of memory, you should try the G1GC algorithm. With G1GC you can keep most of the defaults, apart from setting a few key parameters such as the pause-time goal, region size, parallel GC threads etc. (see the example command line after this list).
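A hedged example command line, assuming the 4 GB heap from the question (the jar name and the flag values are placeholders to tune against your own pause-time measurements):

```
java -Xms4g -Xmx4g \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -XX:G1HeapRegionSize=8m \
     -XX:ParallelGCThreads=4 \
     -jar my-app.jar
```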

Refer to this SE question for more information:

Java 7 (JDK 7) garbage collection and documentation on G1

0

I know it's a little late, but still ...

I played around a lot with the JVM GC options, which helped to some extent. On the plus side, I learned a lot more about GC in the process :)

In the end, I implemented an object pool. Since the task is processed in batches, and each batch has approximately the same size and uses the same number of objects, I created a pool of objects that is reused for each batch, instead of creating and destroying objects in every batch. At the end of each batch I simply reset the objects (arrays back to -1, etc.), and at the beginning of the next batch I reuse them by re-initializing them. In addition, for the multi-threaded case these pools are ThreadLocals, to avoid synchronization overhead. Roughly like the sketch below.
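Something along these lines (a simplified sketch; the class name, array sizes, and fields are illustrative, the real workspace holds the application's own working data):

```java
import java.util.Arrays;

class BatchWorkspace {

    // One workspace per worker thread, created lazily and then reused for
    // every batch that thread processes. This avoids per-batch allocation
    // and any synchronization on a shared pool.
    private static final ThreadLocal<BatchWorkspace> LOCAL =
            ThreadLocal.withInitial(BatchWorkspace::new);

    // Placeholder working arrays, sized for one batch.
    final long[] skuIds = new long[5000];
    final double[] quantities = new double[5000];

    static BatchWorkspace current() {
        return LOCAL.get();
    }

    // Called at the end of a batch: restore the sentinel values so the
    // workspace is clean for the next batch.
    void reset() {
        Arrays.fill(skuIds, -1L);
        Arrays.fill(quantities, -1.0);
    }
}
```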

0
