I implemented (in Java) a fairly simple iterator that returns the file names in a recursive directory structure, and after about 2300 files it failed with "Too many open files in the system" (the failure actually occurred during an attempt to load a class, but I assume the directory listing was the culprit).
The data structure backing the iterator is a stack holding the contents of the directories that are open at each level.
The actual logic is pretty simple:
    // Imports needed by the enclosing source file:
    // java.io.File, java.io.FilenameFilter, java.util.Iterator, java.util.Stack

    private static class DirectoryIterator implements Iterator<String> {

        // One File[] listing per directory level currently being walked.
        private Stack<File[]> directories;
        private FilenameFilter filter;
        // Index of the next entry to visit in each listing on the stack.
        private Stack<Integer> positions = new Stack<Integer>();
        private boolean recurse;
        private String next = null;

        public DirectoryIterator(Stack<File[]> directories, boolean recurse, FilenameFilter filter) {
            this.directories = directories;
            this.recurse = recurse;
            this.filter = filter;
            positions.push(0);
            advance();
        }

        public boolean hasNext() {
            return next != null;
        }

        public String next() {
            String s = next;
            advance();
            return s;
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }

        private void advance() {
            if (directories.isEmpty()) {
                next = null;
            } else {
                File[] files = directories.peek();
                // Pop every exhausted level until one with entries left is found.
                while (positions.peek() >= files.length) {
                    directories.pop();
                    positions.pop();
                    if (directories.isEmpty()) {
                        next = null;
                        return;
                    }
                    files = directories.peek();
                }
                File nextFile = files[positions.peek()];
                if (nextFile.isDirectory()) {
                    int p = positions.pop() + 1;
                    positions.push(p);
                    if (recurse) {
                        // Note: listFiles() returns null if the listing fails.
                        directories.push(nextFile.listFiles(filter));
                        positions.push(0);
                        advance();
                    } else {
                        advance();
                    }
                } else {
                    next = nextFile.toURI().toString();
                    // count is not declared in this class; presumably a static
                    // field of the enclosing class, used for progress output.
                    count++;
                    if (count % 100 == 0) {
                        System.err.println(count + " " + next);
                    }
                    int p = positions.pop() + 1;
                    positions.push(p);
                }
            }
        }
    }
I would like to understand how many "open files" are required for this. Under what circumstances does this algorithm "open" a file, and when is it closed again?
I saw some neat code using Java 7 or Java 8, but I'm limited to Java 6.
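One way the open-files question could be checked empirically is to compare the JVM's open descriptor count before and after a traversal that uses only File.listFiles(). The following is a minimal, Java 6 compatible sketch, not part of the iterator above. The class name FdProbe and its two methods are hypothetical. It assumes a Unix-like system and a HotSpot-style JVM whose platform bean implements com.sun.management.UnixOperatingSystemMXBean; on other JVMs that class may not be available at all.

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical diagnostic helper, for illustration only.
class FdProbe {

    // Returns the number of descriptors the JVM currently has open, or -1
    // when the platform bean is not the Unix-specific one (assumption:
    // the com.sun.management classes are present, as on HotSpot).
    static long openDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    // Counts plain files below root using only File.listFiles(), the same
    // call the iterator relies on. Iterative, so deep trees do not grow the
    // call stack. Does not guard against symbolic-link cycles.
    static int countFiles(File root) {
        Deque<File> pending = new ArrayDeque<File>();
        pending.push(root);
        int files = 0;
        while (!pending.isEmpty()) {
            File[] children = pending.pop().listFiles();
            if (children == null) {
                continue; // not a directory, or the listing failed
            }
            for (File child : children) {
                if (child.isDirectory()) {
                    pending.push(child);
                } else {
                    files++;
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        long before = openDescriptors();
        int files = countFiles(root);
        long after = openDescriptors();
        System.err.println(files + " files; descriptors before=" + before
                + ", after=" + after);
    }
}
```

If the two counts stay close to each other even for a large tree, the directory listing itself is probably not what holds the descriptors, and the leak would have to be looked for elsewhere (for example in whatever consumes the returned names).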
java directory-structure
Michael kay