Reading a specific line from a text file

I have a text file with over 3000 lines. I find the number of lines using

    string[] lines = File.ReadAllLines(myPath);
    var lineCount = lines.Length;

Then I create a random number

    Random rand = new Random();
    // Zero-based index; the upper bound of Next is exclusive
    var lineToRead = rand.Next(0, lineCount);

Now I need to read the specific line selected by the random number. I can do it using

    string requiredLine = lines[lineToRead];

Since my file is large, I don't think creating such a large array is efficient. Is there a more efficient or easier way to do this?

c#
6 answers

Here is a solution that iterates over the file twice (the first time to count the lines, the second time to select a line). The advantage is that you do not need to create an array of 3,000 strings in memory. But, as mentioned above, this will probably be slower. Why only "probably"? Because File.ReadAllLines builds a list of strings internally, and that list is resized many times while it is filled with 3,000 items. (The initial capacity is 4. Each time the internal array is completely full, a new array of double the size is created and all the strings are copied into it.)

So the solution uses the File.ReadLines method, which returns an IEnumerable<string> of the lines, and skips the lines you do not need:

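    // Requires: using System.Linq;
    var rand = new Random();
    IEnumerable<string> lines = File.ReadLines(myPath);
    // The upper bound of Next is exclusive, so add 1 to make the last line selectable
    var lineToRead = rand.Next(1, lines.Count() + 1);
    var line = lines.Skip(lineToRead - 1).First();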

BTW, internally File.ReadLines uses a StreamReader, which reads the file line by line.


What you can do is scan the file once to find the byte offset at which each line starts, and then later jump back to a specific line using Stream.Position to get its content. With this method you do not need to keep the file's contents in memory (only one offset per line), and it is fast enough. I tested this on a file that is 20 thousand lines long and 1 MB in size. It took 7 ms to index the file and 0.3 ms to get the line.

    // Parse the file: record the byte offset at which each line starts
    var indexes = new List<long>();
    using (var fs = File.OpenRead("text.txt"))
    {
        indexes.Add(fs.Position);
        int chr;
        while ((chr = fs.ReadByte()) != -1)
        {
            // Ignore the offset after a trailing newline: no line starts there
            if (chr == '\n' && fs.Position < fs.Length)
            {
                indexes.Add(fs.Position);
            }
        }
    }

    int lineCount = indexes.Count;
    // The upper bound of Next is exclusive, so this yields 0 .. lineCount - 1
    int randLineNum = new Random().Next(0, lineCount);
    string lineContent = "";

    // Read the random line
    using (var fs = File.OpenRead("text.txt"))
    {
        fs.Position = indexes[randLineNum];
        using (var sr = new StreamReader(fs))
        {
            lineContent = sr.ReadLine();
        }
    }

You can wrap the stream in a StreamReader and call ReadLine as many times as necessary to advance to your target line. That way, you do not need to hold the contents of the entire file in memory.

However, this is only worthwhile if you do it rarely and the file is quite large.
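
A minimal sketch of this idea, assuming a zero-based line index. The names `LineReader` and `ReadLineAt` are hypothetical and chosen only for illustration; the helper returns null when the file has fewer lines than requested.

```csharp
using System;
using System.IO;

static class LineReader
{
    // Returns the line at the given zero-based index,
    // or null if the file has fewer lines than that.
    public static string ReadLineAt(string path, int lineIndex)
    {
        if (lineIndex < 0)
            throw new ArgumentOutOfRangeException(nameof(lineIndex));

        using (var reader = new StreamReader(path))
        {
            string line = null;
            // Read and discard lines until the target line is reached.
            for (int i = 0; i <= lineIndex; i++)
            {
                line = reader.ReadLine();
                if (line == null)
                    return null; // reached end of file early
            }
            return line;
        }
    }
}
```

Only one line is held in memory at a time, but every call reads the file from the beginning, so the cost grows with the index of the requested line.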


Use reservoir sampling to solve this problem in one pass

If you want to randomly select one or more elements from a list of elements whose length is not known in advance, you can use reservoir sampling.

We can take advantage of this, along with the File.ReadLines() method (which avoids buffering all the lines in memory), to write a single-pass solution that reads each line only once, without buffering.

The code example below shows a generic solution that lets you randomly select any number of lines. For your case, N = 1.

The sample code also includes a test program to demonstrate that the lines are selected randomly with a uniform distribution.

(To find out how this code works, see the Wikipedia article on reservoir sampling.)

    using System;
    using System.IO;
    using System.Collections.Generic;

    namespace Demo
    {
        internal class Program
        {
            public static List<string> RandomlyChooseLinesFromFile(string filename, int n, Random rng)
            {
                var result = new List<string>(n);
                int index = 0;

                foreach (var line in File.ReadLines(filename))
                {
                    if (index < n)
                    {
                        result.Add(line);
                    }
                    else
                    {
                        int r = rng.Next(0, index + 1);

                        if (r < n)
                            result[r] = line;
                    }

                    ++index;
                }

                return result;
            }

            // Test RandomlyChooseLinesFromFile()
            private static void Main(string[] args)
            {
                Directory.CreateDirectory("C:\\TEST");
                string testfile = "C:\\TEST\\TESTFILE.TXT";
                File.WriteAllText(testfile, "0\n1\n2\n3\n4\n5\n6\n7\n8\n9");

                var rng = new Random();
                int trials = 100000;
                var counts = new int[10];

                for (int i = 0; i < trials; ++i)
                {
                    string line = RandomlyChooseLinesFromFile(testfile, 1, rng)[0];
                    int index = int.Parse(line);
                    ++counts[index];
                }

                // If this algorithm is correct, each line should be chosen
                // approximately 10% of the times.
                Console.WriteLine("% times each line was chosen:\n");

                for (int i = 0; i < 10; ++i)
                {
                    Console.WriteLine("{0} = {1}%", i, 100*counts[i]/(double)trials);
                }
            }
        }
    }

The link below covers reading a specific line of a file.

http://social.msdn.microsoft.com/Forums/en-US/csharpgeneral/thread/4dbd68f6-61f5-4d36-bfa0-5c909101874b

Code snippet

    using System;
    using System.IO;

    namespace ReadLine
    {
        class Program
        {
            static void Main(string[] args)
            {
                // How many lines should be loaded?
                int NumberOfLines = 15;

                // Slot 0 is left unused so that ListLines[n] holds the n-th line
                string[] ListLines = new string[NumberOfLines + 1];

                // Load our text file; "using" closes the reader when done
                using (TextReader tr = new StreamReader("\\test.txt"))
                {
                    // Read the lines and put them in the array
                    // (ReadLine returns null once the file runs out)
                    for (int i = 1; i <= NumberOfLines; i++)
                    {
                        ListLines[i] = tr.ReadLine();
                    }
                }

                // This will write the 5th line into the console
                Console.WriteLine(ListLines[5]);

                // This will write the 1st line into the console
                Console.WriteLine(ListLines[1]);

                Console.ReadLine();
            }
        }
    }

These may also be helpful:

http://www.tek-tips.com/viewthread.cfm?qid=1460456

How to read the specified line in a text file?

And, for editing:

Edit a specific line of a text file in C#

Hope this helps ...


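You can try the code below. It does not create a large array, but still gets the specific line:

    // Requires: using System.IO; using System.Linq;
    string path = "D:\\Software.txt";
    // Count the lines lazily; File.ReadAllLines would load the whole file into an array
    int lines = File.ReadLines(path).Count();
    Random rand = new Random();
    // The upper bound of Next is exclusive, so add 1 to make the last line selectable
    var lineToRead = rand.Next(1, lines + 1);
    var requiredLine = File.ReadLines(path).Skip(lineToRead - 1).First();
    Console.WriteLine(requiredLine);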
