Search for 3 last modified files in a long list of files

I have a list of files that I want to sort and extract the last 3 last modified.

Limitation: I cannot use Java 7 due to compatibility issues in downstream applications

My current settings

Solution 1

File[] files = directory.listFiles(); Arrays.sort(files, new Comparator<File>(){ public int compare(File f1, File f2) { return Long.valueOf(f1.lastModified()).compareTo(f2.lastModified()); } }); 

Decision 2

 public static void sortFilesDesc(File[] files) { Arrays.sort(files, new Comparator() { public int compare(Object o1, Object o2) { if ((File)o1).lastModified().compareTo((File)o2).lastModified()) { return -1; } else if (((File) o1).lastModified() < ((File) o2).lastModified()) { return +1; } else { return 0; } } }); } 

Problem

The solution above takes more time to execute and memory. My file list consists of 300 tar files with a size of 200 MB. therefore, it consumes more time and memory.

Is there a way to deal with this effectively?

Any comparison operation uses a file object with high memory, is there a way to free up memory and process it efficiently?

+4
source share
4 answers

You can do it a lot faster.

Arrays.sort (...) uses "quick sort", which performs ~ n * ln (n) operations.

This example only performs one iteration through the entire array, which is ~ n .

 public static void sortFilesDesc(File[] files) { File firstMostRecent = null; File secondMostRecent = null; File thirdMostRecent = null; for (File file : files) { if ((firstMostRecent == null) || (firstMostRecent.lastModified() < file.lastModified())) { thirdMostRecent = secondMostRecent; secondMostRecent = firstMostRecent; firstMostRecent = file; } else if ((secondMostRecent == null) || (secondMostRecent.lastModified() < file.lastModified())) { thirdMostRecent = secondMostRecent; secondMostRecent = file; } else if ((thirdMostRecent == null) || (thirdMostRecent.lastModified() < file.lastModified())) { thirdMostRecent = file; } } } 

On small amounts of files you will not see much difference, but even for dozens of files the difference will be significant, for large numbers - dramatic.

Code for checking the algorithm (specify the correct file structure):

 package com.hk.basicjava.clasload.tests2; import java.io.File; import java.util.Date; class MyFile extends File { private long time = 0; public MyFile(String name, long timeMills) { super(name); time = timeMills; } @Override public long lastModified() { return time; } } public class Files { /** * @param args */ public static void main(String[] args) { File[] files = new File[5]; files[0] = new MyFile("File1", new Date(2013,1,15, 7,0).getTime()); files[1] = new MyFile("File2", new Date(2013,1,15, 7,40).getTime()); files[2] = new MyFile("File3", new Date(2013,1,15, 5,0).getTime()); files[3] = new MyFile("File4", new Date(2013,1,15, 10,0).getTime()); files[4] = new MyFile("File5", new Date(2013,1,15, 4,0).getTime()); sortFilesDesc(files); } public static void sortFilesDesc(File[] files) { File firstMostRecent = null; File secondMostRecent = null; File thirdMostRecent = null; for (File file : files) { if ((firstMostRecent == null) || (firstMostRecent.lastModified() < file.lastModified())) { thirdMostRecent = secondMostRecent; secondMostRecent = firstMostRecent; firstMostRecent = file; } else if ((secondMostRecent == null) || (secondMostRecent.lastModified() < file.lastModified())) { thirdMostRecent = secondMostRecent; secondMostRecent = file; } else if ((thirdMostRecent == null) || (thirdMostRecent.lastModified() < file.lastModified())) { thirdMostRecent = file; } } System.out.println("firstMostRecent : " + firstMostRecent.getName()); System.out.println("secondMostRecent : " + secondMostRecent.getName()); System.out.println("thirdMostRecent : " + thirdMostRecent.getName()); } } 
+3
source

You must check the lastModified of each file, you cannot change it. What you do not need to do is sort all the elements to get the top 3. If you can use Guava, you can use Ordering.greatestOf (which uses a good algorithm):

 Ordering<File> ordering = Ordering.from( new Comparator(){ public int compare(File f1, File f2) { return Long.valueOf(f1.lastModified()).compareTo(f2.lastModified()); }); List<File> max3 = ordering.greatestOf(Arrays.asList(directory.listFiles()), 3); 
+3
source

I'm for solution 1, with some improvements

 Arrays.sort(files, new Comparator<File>() { public int compare(File f1, File f2) { long d1 = f1.lastModified(); long d2 = f2.lastModified(); return d1 > d2 ? 1 : d1 < d2 ? -1 : 0; } }); 

to avoid creating unnecessary objects due to Long.valueOf (long).

File does not contain / does not read the file data, but only the path to the file, there is no performance / memory problem in it. The only time-consuming operation here is the modification time of reading from the file system, which cannot be avoided.

0
source

Your problem is that getting the last modified date is a relatively expensive operation because it includes the operating system logic. Therefore, if you do not mind getting the latest updated values, you can wrap the files in a comparable class.

 public class LastModifiedFile implements Comparable<LastModifiedFile> { private final File file; private final Date lastModified; public LastModifiedFile(File file) { this.file = file; lastModified = file.lastModified(); } public int compareTo(LastModifiedFile other) { return lastModified.compareTo(other.lastModified); } } 

Note that changing the last modified date during sorting will result in undefined behavior for many sorting algorithms. Implementing Tim 7 Java 7s implementations throws an exception if the last modified date changes, and therefore the comparison leads to different values.

0
source

All Articles