UPD 11/21/2017: bug fixed in JDK, see comment from Vicente Romero
Summary:
If the enhanced for statement is used over any Iterable implementation, the collection remains in heap memory until the end of the current scope (method, statement body) and is not garbage collected, even if you have no other references to the collection and the application needs to allocate new memory.
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8175883
https://bugs.openjdk.java.net/browse/JDK-8175883
Example:
If I have the following code, which allocates a list of large strings with random content:
import java.util.ArrayList;

public class IteratorAndGc {

    // number of strings and the size of every string
    static final int N = 7500;

    public static void main(String[] args) {
        System.gc();

        gcInMethod();

        System.gc();
        showMemoryUsage("GC after the method body");

        ArrayList<String> strings2 = generateLargeStringsArray(N);
        showMemoryUsage("Third allocation outside the method is always successful");
    }

    // main testable method
    public static void gcInMethod() {
        showMemoryUsage("Before first memory allocating");
        ArrayList<String> strings = generateLargeStringsArray(N);
        showMemoryUsage("After first memory allocation");

        // this is only one difference - after the iterator created, memory won't be collected till end of this function
        for (String string : strings);
        showMemoryUsage("After iteration");

        strings = null; // discard the reference to the array

        // one says this doesn't guarantee garbage collection,
        // Oracle says "the Java Virtual Machine has made a best effort to reclaim space from all discarded objects".
        // but no matter - the program behavior remains the same with or without this line. You may skip it and test.
        System.gc();
        showMemoryUsage("After force GC in the method body");

        try {
            System.out.println("Try to allocate memory in the method body again:");
            ArrayList<String> strings2 = generateLargeStringsArray(N);
            showMemoryUsage("After secondary memory allocation");
        } catch (OutOfMemoryError e) {
            showMemoryUsage("!!!! Out of memory error !!!!");
            System.out.println();
        }
    }

    // function to allocate and return a reference to a lot of memory
    private static ArrayList<String> generateLargeStringsArray(int N) {
        ArrayList<String> strings = new ArrayList<>(N);
        for (int i = 0; i < N; i++) {
            StringBuilder sb = new StringBuilder(N);
            for (int j = 0; j < N; j++) {
                sb.append((char)Math.round(Math.random() * 0xFFFF));
            }
            strings.add(sb.toString());
        }
        return strings;
    }

    // helper method to display current memory status
    public static void showMemoryUsage(String action) {
        long free = Runtime.getRuntime().freeMemory();
        long total = Runtime.getRuntime().totalMemory();
        long max = Runtime.getRuntime().maxMemory();
        long used = total - free;
        System.out.printf("\t%40s: %10dk of max %10dk%n", action, used / 1024, max / 1024);
    }
}
compile and run it with limited memory, for example 180 MB:
javac IteratorAndGc.java && java -Xms180m -Xmx180m IteratorAndGc
and at runtime I get:
Before first memory allocating: 1251k of max 176640k
After first memory allocation: 131426k of max 176640k
After iteration: 131426k of max 176640k
After force GC in the method body: 110682k of max 176640k (almost nothing collected)
Try to allocate memory in the method body again:
!!!! Out of memory error !!!!: 168948k of max 176640k
GC after the method body: 459k of max 176640k (the garbage is collected!)
Third allocation outside the method is always successful: 117740k of max 163840k
So, inside gcInMethod(), I tried to allocate the list, iterate over it, discard the reference to the list, (optionally) force garbage collection and allocate a similar list again. But I can't allocate the second list because of lack of memory.
At the same time, outside the function body, I can successfully force garbage collection (optional) and allocate a list of the same size again!
To avoid this OutOfMemoryError inside the function body, it is enough to remove or comment out just one line:
for (String string : strings); <- this is evil !!!
and then the output is as follows:
Before first memory allocating: 1251k of max 176640k
After first memory allocation: 131409k of max 176640k
After iteration: 131409k of max 176640k
After force GC in the method body: 497k of max 176640k (the garbage is collected!)
Try to allocate memory in the method body again:
After secondary memory allocation: 115541k of max 163840k
GC after the method body: 493k of max 163840k (the garbage is collected!)
Third allocation outside the method is always successful: 121300k of max 163840k
So, without the for iteration, the garbage is successfully collected after the reference to the strings is discarded, and the memory is allocated a second time (inside the function body) and a third time (outside the method).
My supposition:
the for-each construct is compiled into
Iterator iter = strings.iterator(); while (iter.hasNext()) { iter.next(); }
(and I checked this by disassembling the class with javap -c IteratorAndGc.class)
And it looks like this iter reference stays in scope until the end of the method. You have no access to the reference to nullify it, so the GC can't collect the list.
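If that guess is right, one way to sidestep the problem (a sketch only, I have not checked it on every JVM) is to move the loop into its own method, so that the hidden iterator lives in a stack frame that is already gone when the caller needs the memory back. The class name LoopInOwnFrame and the helper iterate are made-up names for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class LoopInOwnFrame {

    // Hypothetical helper: the compiler-generated iterator is a local of THIS
    // method, so it disappears together with this frame when the method returns.
    static int iterate(Iterable<String> items) {
        int count = 0;
        for (String item : items) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        strings.add("a");
        strings.add("b");
        strings.add("c");

        int seen = iterate(strings); // the loop runs in a separate frame
        strings = null;              // after this, main() holds no reference to the list

        System.out.println("Iterated over " + seen + " strings");
    }
}
```

In the test class above this would mean replacing the bare for line in gcInMethod() with a call to such a helper.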
Perhaps this is normal behavior (maybe it is even specified for javac, but I haven't found it), but IMHO, if the compiler creates some instances, it should take care to discard them from the scope after use.
This is how I would expect the for statement to be implemented:
Iterator iter = strings.iterator(); while (iter.hasNext()) { iter.next(); } iter = null;
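Written out by hand as a complete class, the expansion I have in mind looks roughly like the sketch below. ManualForEach and totalLength are made-up names; the only point is that the iterator variable is cleared as soon as the loop finishes, which is the step I would expect javac to add for its hidden variable:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ManualForEach {

    // Hand-written equivalent of: for (String s : strings) { total += s.length(); }
    static int totalLength(List<String> strings) {
        int total = 0;
        Iterator<String> iter = strings.iterator();
        while (iter.hasNext()) {
            String s = iter.next();
            total += s.length();
        }
        iter = null; // drop the iterator (which may itself reference the collection) right after the loop
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("ab", "cde"))); // prints 5
    }
}
```

With the explicit form the iterator variable is visible in the source, so the programmer (or the compiler) can clear it, unlike the hidden variable of the for-each form.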
Java compiler and runtime versions used:
javac 1.8.0_111
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
Note:
the question is not about programming style, best practices, conventions, etc.; it is about the efficiency of the Java platform.
the question is not about the behavior of System.gc() (you can remove all gc calls from the example): while allocating the second list of strings, the JVM is forced to reclaim the discarded memory anyway.
Link to the Java test class, online compiler for testing (but this resource has only 50 MB of heap, so use N = 5000)