C++: reading random lines from a txt file?

I am writing C++ code where I need to import data from a txt file. The text file contains 10,000 lines. Each row contains n columns of binary data.

The code should loop 100,000 times; on each iteration it has to randomly select a row from the txt file and assign the binary values in the columns to some variables.

What is the most efficient way to write this code? Should I load the file into memory first, or should I open the file and seek to a random line number each time?

How can I implement this in C++?

+7
4 answers

Random access to a line in a text file requires that all lines have the same byte length. If they do not, you have to read through the file line by line until you reach the one you want. Since that will be quite slow for this many accesses, it is better to just load the file into a std::vector of std::string, where each element is a single line (this is easy to do with std::getline). Or, since you want to assign values from different columns, you can use a std::vector of your own structure, e.g.

    struct MyValues {
        double d;
        int i;
        // whatever you have / need
    };

    std::vector<MyValues> vec;

This may be better than parsing a string on every access.

Using std::vector you get random access, and you only have to read through the file once.
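A minimal sketch of the load-once approach, assuming each line holds whitespace-separated 0/1 values. The `Row` struct and the `loadRows` function are hypothetical names chosen for illustration, not something from the question:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: one parsed row of 0/1 column values.
struct Row {
    std::vector<int> bits;
};

// Reads every line from `in` once and parses it into a Row.
// Assumes the columns are separated by whitespace.
std::vector<Row> loadRows(std::istream& in) {
    std::vector<Row> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Row row;
        int bit;
        while (fields >> bit) {
            row.bits.push_back(bit);
        }
        if (!row.bits.empty()) {  // skip blank lines
            rows.push_back(row);
        }
    }
    return rows;
}
```

You would pass it a `std::ifstream` opened on your file and then index the returned vector directly, so the parsing cost is paid once per line instead of once per access.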

+3

10K lines is a pretty small file. If you have, say, 100 characters per line, it will use a "HUGE" 1 MB of your RAM.

Load it into a vector and access it any way you want.

+1

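Perhaps not the most efficient, but you can try the following:

    #include <cstdlib>
    #include <ctime>
    #include <fstream>
    #include <string>

    int main()
    {
        // use ifstream to read
        std::ifstream in("yourfile.txt");
        // string to store the line
        std::string line;
        // byte length of one line including '\n'; assumes every line has the same length
        const std::streamoff lineLength = 101;  // hypothetical value, adjust for your file
        // random number generator
        std::srand(std::time(NULL));
        for (int i = 0; i < 100000; i++)
        {
            // seekg takes a byte offset, so convert the line number to an offset
            in.seekg((std::rand() % 10000) * lineLength);
            std::getline(in, line);
            // do what you want with the line here...
        }
    }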

I'm too lazy right now, but you need to make sure that you check your ifstream for errors like end of file, out-of-bounds index, etc.

0

Since you take 100,000 samples from a total of 10,000 lines, most lines will be sampled. Read the entire file into an array data structure, and then randomly sample the array. This lets you avoid file seeks entirely.
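As a rough sketch of the sampling step, assuming the lines are already in a `std::vector<std::string>`. The function name `countSamples` and the hit counter are my own additions, there only to show how evenly the draws cover the lines:

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Draws `draws` random lines (with replacement) from `lines` and returns
// how many times each line index was selected. `lines` must not be empty.
std::vector<int> countSamples(const std::vector<std::string>& lines,
                              int draws, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::size_t> pick(0, lines.size() - 1);
    std::vector<int> hits(lines.size(), 0);
    for (int i = 0; i < draws; ++i) {
        const std::size_t index = pick(gen);
        const std::string& line = lines[index];  // parse or use this line here
        (void)line;
        ++hits[index];
    }
    return hits;
}
```

With 10,000 lines and 100,000 draws each line is expected to be picked about 10 times, so very few entries of the returned vector should be zero.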

A more common case is to sample only a small subset of the file's data. To do this, assuming the lines are of different lengths, seek to random points in the file, skip to the next newline (for example, cin.ignore(numeric_limits<streamsize>::max(), '\n')), and then parse the subsequent text.

0
