I have a non-performant method, how can I improve its efficiency?

I have a simple method that compares an array of FileInfo objects against a list of file names to check which files have already been processed. The list of unprocessed files is then returned.

The loop in this method iterates over around 250,000 FileInfo objects. It takes an indecent amount of time to complete.

The inefficiency is, presumably, the call to the Contains method on the collection of processed files.

First, how can I check that my suspicion about the cause is correct, and second, how can I change the method to speed up the process?

public static List<FileInfo> GetUnprocessedFiles(FileInfo[] allFiles, List<string> processedFiles)
{
    List<FileInfo> unprocessedFiles = new List<FileInfo>();
    foreach (FileInfo fileInfo in allFiles)
    {
        if (!processedFiles.Contains(fileInfo.Name))
        {
            unprocessedFiles.Add(fileInfo);
        }
    }
    return unprocessedFiles;
}
+5
6 answers

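A List<T>'s Contains method runs in linear time, since it may have to enumerate the whole list to prove that an item is or is not present. I would suggest using a HashSet<string> or similar instead. A HashSet<T>'s Contains method is designed to run in constant O(1) time, so it should not depend on the number of items in the set.

This small change should make the whole method run in linear time: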

public static List<FileInfo> GetUnprocessedFiles(FileInfo[] allFiles, 
                                         List<string> processedFiles)
{
   List<FileInfo> unprocessedFiles = new List<FileInfo>();
   HashSet<string> processedFileSet = new HashSet<string>(processedFiles);

   foreach (FileInfo fileInfo in allFiles)
   {
       if (!processedFileSet.Contains(fileInfo.Name))
       {
           unprocessedFiles.Add(fileInfo);
       }
   }

   return unprocessedFiles;
}

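I would suggest 3 improvements, if possible:

  • For extra efficiency, store the processed files in a set at the source, so that this method takes an ISet<T> as a parameter. That way, the set does not have to be rebuilt on every call.
  • Try not to mix different representations of the same entity (string and FileInfo) like this. Pick one and stick with it.
  • You could also consider HashSet<T>.ExceptWith instead of writing the loop yourself. Bear in mind that it mutates the collection it is called on.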
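
A minimal sketch of the ISet<T> and ExceptWith ideas, assuming everything is represented as plain file-name strings and that the caller keeps the processed names in a set. The class and method names here are hypothetical, not part of the original code:

```csharp
using System;
using System.Collections.Generic;

public static class UnprocessedFilesSketch
{
    // Hypothetical variant: the caller already holds the processed names
    // in a set, so nothing has to be rebuilt on each call.
    public static List<string> GetUnprocessedNames(
        IEnumerable<string> allNames, ISet<string> processedNames)
    {
        var unprocessed = new List<string>();
        foreach (string name in allNames)
        {
            if (!processedNames.Contains(name))
            {
                unprocessed.Add(name);
            }
        }
        return unprocessed;
    }

    // ExceptWith variant: the input is copied first because ExceptWith
    // mutates the set it is called on.
    public static HashSet<string> GetUnprocessedSet(
        IEnumerable<string> allNames, IEnumerable<string> processedNames)
    {
        var remaining = new HashSet<string>(allNames);
        remaining.ExceptWith(processedNames);
        return remaining;
    }
}
```

The set-based variant drops duplicate names and does not promise to keep the input order, so it only fits if neither matters to the caller.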

If you can use LINQ, and can afford to build a set on every call, here is another way:

public static IEnumerable<string> GetUnprocessedFiles
 (IEnumerable<string> allFiles, IEnumerable<string> processedFiles)
{
  // null-checks here
  return allFiles.Except(processedFiles);     
}
+14

Use a HashSet for the processed files. Searching a list means scanning it item by item, whereas a HashSet lookup is O(1).
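
The question also asks how to confirm that Contains is the bottleneck. One rough way, sketched below, is to time the same lookups against a List<string> and a HashSet<string> with Stopwatch. The names CountMissing and RunDemo, the synthetic file names, and the 20,000-item size are all assumptions made for illustration. A profiler would give a more reliable answer than a hand-rolled timing loop.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ContainsTimingSketch
{
    // Counts how many probes are absent from the collection and reports the
    // elapsed time. ICollection<string> covers both List and HashSet.
    public static int CountMissing(
        ICollection<string> processed, IEnumerable<string> probes, out TimeSpan elapsed)
    {
        var stopwatch = Stopwatch.StartNew();
        int missing = 0;
        foreach (string probe in probes)
        {
            if (!processed.Contains(probe))
            {
                missing++;
            }
        }
        stopwatch.Stop();
        elapsed = stopwatch.Elapsed;
        return missing;
    }

    // Hypothetical driver with synthetic names. The size is kept well below
    // the 250,000 items in the question so the list version finishes quickly.
    public static void RunDemo()
    {
        var processedList = new List<string>();
        var probes = new List<string>();
        for (int i = 0; i < 20000; i++)
        {
            processedList.Add("file" + i + ".dat");
            probes.Add("file" + (i * 2) + ".dat");
        }
        var processedSet = new HashSet<string>(processedList);

        TimeSpan listTime, setTime;
        int missingFromList = CountMissing(processedList, probes, out listTime);
        int missingFromSet = CountMissing(processedSet, probes, out setTime);

        Console.WriteLine("List<string>:    {0} missing, {1} ms",
            missingFromList, listTime.TotalMilliseconds);
        Console.WriteLine("HashSet<string>: {0} missing, {1} ms",
            missingFromSet, setTime.TotalMilliseconds);
    }
}
```

If the list timing grows roughly with the square of the input size while the set timing grows roughly linearly, that supports the suspicion about Contains.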

+3

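Convert the list of processed files to a dictionary/hashtable-like class, so that each lookup goes through the key instead of scanning the whole list.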

+1
  • Sort the array of processed file names.
  • Search it with Array.BinarySearch<T>(). Each lookup is then O(log N).
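
A minimal sketch of the binary-search approach, assuming file names are compared ordinally and case-sensitively. The class and method names are hypothetical. The array has to be sorted with the same comparer that is later passed to Array.BinarySearch<T>():

```csharp
using System;
using System.Collections.Generic;

public static class BinarySearchSketch
{
    public static List<string> GetUnprocessedNames(
        IEnumerable<string> allNames, IEnumerable<string> processedNames)
    {
        // Copy and sort once; the search below relies on this exact ordering.
        string[] sorted = new List<string>(processedNames).ToArray();
        StringComparer comparer = StringComparer.Ordinal;
        Array.Sort<string>(sorted, comparer);

        var unprocessed = new List<string>();
        foreach (string name in allNames)
        {
            // BinarySearch returns a negative number when the value is absent.
            if (Array.BinarySearch<string>(sorted, name, comparer) < 0)
            {
                unprocessed.Add(name);
            }
        }
        return unprocessed;
    }
}
```

With n files and m processed names this costs about O(m log m) for the sort plus O(n log m) for the lookups, which is still far better than the O(n * m) of the original loop.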
0


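If both inputs are already sorted by name (which may be the case for the FileInfo array, depending on how it was obtained), you can walk the two lists in parallel. That is a merge-style pass costing O(n + m), and it avoids building a hashset with 250k entries and the GC pressure that comes with it.

Something like this: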

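// Both lists must already be sorted in the order defined by comparer.
public static IEnumerable<string> GetMismatches(IList<string> fileNames, IList<string> processedFileNames, StringComparer comparer)
{
    var filesIndex = 0;
    var procFilesIndex = 0;

    while (filesIndex < fileNames.Count)
    {
        if (procFilesIndex >= processedFileNames.Count)
        {
            // No processed names left: everything remaining is unprocessed.
            yield return fileNames[filesIndex++];
        }
        else
        {
            var rc = comparer.Compare(fileNames[filesIndex], processedFileNames[procFilesIndex]);
            if (rc != 0)
            {
                if (rc < 0)
                {
                    // The file name sorts before the current processed name,
                    // so it cannot appear later in the processed list.
                    yield return fileNames[filesIndex++];
                }
                else
                {
                    procFilesIndex++;
                }
            }
            else
            {
                // Match: the file was processed, so skip it in both lists.
                filesIndex++;
                procFilesIndex++;
            }
        }
    }

    yield break;
}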

This is untested, so check it for off-by-one errors before relying on it...

0
