"Too many open files on system" crash when displaying recursive directory structure

I implemented (in Java) a fairly simple iterator that returns the names of the files in a recursive directory structure. After about 2300 files it failed with "Too many open files in the system". (The error was actually raised during an attempt to load a class, but I assume the directory listing was the culprit.)

The data structure backing the iterator is a stack that holds the contents of the directories open at each level.

The actual logic is pretty simple:

    private static class DirectoryIterator implements Iterator<String> {

        private Stack<File[]> directories;   // listing of each directory on the current path
        private FilenameFilter filter;
        private Stack<Integer> positions = new Stack<Integer>();   // next index into each listing
        private boolean recurse;
        private String next = null;
        private int count = 0;   // progress counter; its declaration was not shown in the original snippet

        public DirectoryIterator(Stack<File[]> directories, boolean recurse, FilenameFilter filter) {
            this.directories = directories;
            this.recurse = recurse;
            this.filter = filter;
            positions.push(0);
            advance();
        }

        public boolean hasNext() {
            return next != null;
        }

        public String next() {
            String s = next;
            advance();
            return s;
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }

        private void advance() {
            if (directories.isEmpty()) {
                next = null;
            } else {
                File[] files = directories.peek();
                // Pop exhausted directories until one with remaining entries is found.
                while (positions.peek() >= files.length) {
                    directories.pop();
                    positions.pop();
                    if (directories.isEmpty()) {
                        next = null;
                        return;
                    }
                    files = directories.peek();
                }
                File nextFile = files[positions.peek()];
                if (nextFile.isDirectory()) {
                    int p = positions.pop() + 1;
                    positions.push(p);
                    if (recurse) {
                        directories.push(nextFile.listFiles(filter));
                        positions.push(0);
                        advance();
                    } else {
                        advance();
                    }
                } else {
                    next = nextFile.toURI().toString();
                    count++;
                    if (count % 100 == 0) {
                        System.err.println(count + " " + next);
                    }
                    int p = positions.pop() + 1;
                    positions.push(p);
                }
            }
        }
    }

I would like to understand how many "open files" this requires. Under what circumstances does this algorithm "open" a file, and when is it closed again?

I saw some neat code using Java 7 or Java 8, but I'm limited to Java 6.

java directory-structure
2 answers

When you call nextFile.listFiles(), an underlying file descriptor is opened to read the directory. There is no way to close this handle explicitly, so you are relying on garbage collection. As your code descends into a deep tree, it effectively builds up a stack of nextFile instances that cannot be garbage collected.

Step 1: Set nextFile = null before calling advance(). This makes the object eligible for garbage collection.

Step 2: You may need to call System.gc() after nulling nextFile to encourage garbage collection sooner. Unfortunately, System.gc() is only a request; there is no way to force a GC.

Step 3: You may need to increase the open file limit of your operating system. On Linux, this can be done with ulimit(1).
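To check whether descriptors really accumulate during the traversal, one option is to sample the JVM's open descriptor count as you go. This is a minimal sketch, assuming a Unix-like platform and a Sun/Oracle or OpenJDK JVM where com.sun.management.UnixOperatingSystemMXBean is available. The OpenFileProbe class and its method names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of the code in the question.
class OpenFileProbe {

    /** Returns the number of descriptors this JVM has open, or -1 if the platform does not expose it. */
    static long openDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1L;
    }

    /** Returns the per-process descriptor limit, or -1 if the platform does not expose it. */
    static long maxDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("open=" + openDescriptors() + " max=" + maxDescriptors());
    }
}
```

Printing openDescriptors() next to the progress counter in the iterator would show whether the count climbs with the number of directories visited or with the number of files handed to the caller.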

If you can upgrade to Java 7 or later, DirectoryStream will solve your problem. Instead of nextFile.listFiles(), use Files.newDirectoryStream(nextFile.toPath()) to obtain a DirectoryStream. You can then iterate over the stream and call close() on it to release the operating system resources. Each returned Path can be converted back to a File with toFile(). However, you may prefer to refactor the code to use Path throughout instead of File.
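A minimal sketch of the DirectoryStream approach, assuming Java 7 or later. The DirectoryStreamSketch class and its methods are hypothetical, the FilenameFilter from the question is left out for brevity, and files are returned as a list rather than through an iterator. Note that it visits the files of a directory before its subdirectories, which differs from the order in the question's iterator.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not a drop-in replacement for DirectoryIterator.
class DirectoryStreamSketch {

    /** Collects the URIs of all regular entries under dir, descending into subdirectories if recurse is true. */
    static List<String> listFileUris(File dir, boolean recurse) throws IOException {
        List<String> result = new ArrayList<String>();
        collect(dir.toPath(), recurse, result);
        return result;
    }

    private static void collect(Path dir, boolean recurse, List<String> out) throws IOException {
        List<Path> subdirs = new ArrayList<Path>();
        // try-with-resources closes the directory handle as soon as the loop finishes.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isDirectory(entry)) {
                    subdirs.add(entry);
                } else {
                    out.add(entry.toFile().toURI().toString());
                }
            }
        }
        // Descend only after the parent stream is closed, so one handle is open at a time.
        if (recurse) {
            for (Path sub : subdirs) {
                collect(sub, recurse, out);
            }
        }
    }
}
```

Remembering the subdirectories and descending after the stream is closed keeps the number of open directory handles at one, regardless of how deep the tree is.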


Thank you all for your help and advice. I found that the problem lies in what happens to the files after the iterator returns them: the "client" code opens the files as they are delivered and does not clean up correctly. This is complicated by the fact that the returned files are actually processed in parallel.
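A minimal Java 6 sketch of the kind of cleanup that was missing, assuming the client reads each file URI that the iterator returns. The ClientCleanupSketch class and its firstLine method are hypothetical and are not the actual client code.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;

// Hypothetical client-side helper, for illustration only.
class ClientCleanupSketch {

    /** Reads the first line of the file named by a file: URI and always releases the descriptor. */
    static String firstLine(String fileUri) throws IOException {
        File file = new File(URI.create(fileUri));
        FileInputStream in = new FileInputStream(file);
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
            return reader.readLine();
        } finally {
            // Runs on both normal return and exception, so the descriptor is not left to the GC.
            in.close();
        }
    }
}
```

With parallel processing, each worker needs its own try/finally like this; otherwise the number of open descriptors grows with the number of files handed out rather than with the number of workers.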

I also rewrote the DirectoryIterator, which I am sharing here in case anyone is interested:

    private static class DirectoryIterator implements Iterator<String> {

        private Stack<Iterator<File>> directories;
        private FilenameFilter filter;
        private boolean recurse;
        private String next = null;

        public DirectoryIterator(Stack<Iterator<File>> directories, boolean recurse, FilenameFilter filter) {
            this.directories = directories;
            this.recurse = recurse;
            this.filter = filter;
            advance();
        }

        public boolean hasNext() {
            return next != null;
        }

        public String next() {
            String s = next;
            advance();
            return s;
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }

        private void advance() {
            if (directories.isEmpty()) {
                next = null;
            } else {
                Iterator<File> files = directories.peek();
                while (!files.hasNext()) {
                    directories.pop();
                    if (directories.isEmpty()) {
                        next = null;
                        return;
                    }
                    files = directories.peek();
                }
                File nextFile = files.next();
                if (nextFile.isDirectory()) {
                    if (recurse) {
                        directories.push(Arrays.asList(nextFile.listFiles(filter)).iterator());
                    }
                    advance();
                } else {
                    next = nextFile.toURI().toString();
                }
            }
        }
    }