Search in text files up to a specific line

I am writing a program that searches text files, each of which contains a specific marker line. The goal is to ignore everything after that line. My current code reads each file in full and returns an IEnumerable<string> of the names of the files in which the term was found.

var searchResults = files.Where(file => File.ReadAllText(file.FullName).Contains(searchTerm)).Select(file => file.FullName); 

Can I make the search ignore all lines after this particular line? Performance is important because there are thousands of files.

2 answers

You can change your query to:

 var searchResults = files.Where(file => File.ReadLines(file.FullName).Any(line => line.Contains(searchTerm))).Select(file => file.FullName);

Instead of File.ReadAllText, you can use File.ReadLines, which is lazily evaluated; combined with Any, it stops reading a file as soon as a line matches.

https://msdn.microsoft.com/en-us/library/vstudio/dd383503(v=vs.100).aspx
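The query above stops at the first match, but it does not yet ignore lines after the marker line the question describes. A minimal sketch of one way to add that, assuming the marker is a line whose full text is known in advance: TakeWhile ends the enumeration at the marker, so later lines are never examined. The names MarkerSearch, FindBeforeMarker and markerLine are hypothetical and not part of the original question or answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MarkerSearch
{
    // Returns the paths of files that contain searchTerm before the marker line.
    // markerLine is a hypothetical parameter: the exact text of the cut-off line.
    public static IEnumerable<string> FindBeforeMarker(
        IEnumerable<string> paths, string searchTerm, string markerLine)
    {
        return paths.Where(path =>
            File.ReadLines(path)
                // Stop enumerating at the marker; later lines are never examined.
                .TakeWhile(line => line != markerLine)
                // Any() short-circuits on the first line that contains the term.
                .Any(line => line.Contains(searchTerm)));
    }
}
```

With the FileInfo-style objects from the question, this could be called as MarkerSearch.FindBeforeMarker(files.Select(f => f.FullName), searchTerm, marker). If the marker is only part of a line, the TakeWhile predicate would need Contains or StartsWith instead of an exact comparison.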

To make it faster, you can also use Parallel LINQ:

 var searchResults = files.AsParallel().Where(file => File.ReadLines(file.FullName).Any(line => line.Contains(searchTerm))).Select(file => file.FullName);

You can read the file line by line and close it if the value is found:

  // Requires: using System.Collections.Generic; using System.IO;
  static string[] SearchFiles(string[] filesSrc, string searchTerm)
  {
      List<string> result = new List<string>();
      foreach (string path in filesSrc)
      {
          // using disposes every reader, even when the loop breaks early
          using (StreamReader reader = new StreamReader(path))
          {
              string line;
              while ((line = reader.ReadLine()) != null)
              {
                  if (line.Contains(searchTerm))
                  {
                      result.Add(path);
                      break; // stop reading this file at the first match
                  }
              }
          }
      }
      return result.ToArray();
  }

And use it like this, where yourFiles is a string[] of file paths: string[] files = SearchFiles(yourFiles, "searchTerm");

Depending on what you need, you could pass a File[] to this method instead and read the full path from each element, but you have not shown your File class, and it is difficult to implement without knowing what that class really looks like.

P.S. Using LINQ is another good solution (not to mention that it is just one or two lines of code).

An improvised benchmark showed that LINQ is 10-20% slower in this case, so it is probably better to stick with the explicit loop.
