The fastest way to read data from a CSV file in C++

I have a large CSV (approximately 75 MB) of this kind:

    1,2,4
    5,2,0
    1,6,3
    8,3,1
    ...

I read the data into three vectors with this code:

    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    int main()
    {
        char c; // to eat the commas
        int x, y, z;
        std::vector<int> xv, yv, zv;

        std::ifstream file("data.csv");
        std::string line;

        while (std::getline(file, line)) {
            std::istringstream ss(line);
            ss >> x >> c >> y >> c >> z;
            xv.push_back(x);
            yv.push_back(y);
            zv.push_back(z);
        }
        return 0;
    }

On this large CSV (~75 MB), that took:

    real    0m7.389s
    user    0m7.232s
    sys     0m0.132s

That is a lot!

Recently, using a Sublime Text snippet, I found another way to read the file:

    #include <iostream>
    #include <vector>
    #include <cstdio>

    int main()
    {
        std::vector<char> v;

        if (FILE *fp = fopen("data.csv", "r")) {
            char buf[1024];
            while (size_t len = fread(buf, 1, sizeof(buf), fp))
                v.insert(v.end(), buf, buf + len);
            fclose(fp);
        }
    }

On the same CSV (~75 MB), and without parsing the data, that took:

    real    0m0.118s
    user    0m0.036s
    sys     0m0.080s

This is a huge time difference!

My question is: how can I get the data from the character vector into the three vectors faster? I do not know how to do that any faster than the first version.

Thank you very much! ^^

+7
2 answers

Of course the second version is much faster: it simply reads the file into memory without parsing the values in it. An equivalent of the first version using C-style I/O would be something along the lines of

    if (FILE *fp = fopen("data.csv", "r")) {
        while (fscanf(fp, "%d,%d,%d", &x, &y, &z) == 3) {
            xv.push_back(x);
            yv.push_back(y);
            zv.push_back(z);
        }
        fclose(fp);
    }

which, for me, is about three times faster than the C++-style version. But a C++ version without the intermediate stringstream

    while (file >> x >> c >> y >> c >> z) {
        xv.push_back(x);
        yv.push_back(y);
        zv.push_back(z);
    }

is almost as fast.
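The question also asks how to get from the character vector to the three integer vectors. One possible approach, which neither version above uses, is to parse the buffer in place with `std::from_chars` from C++17's `<charconv>`. The following is only a sketch under stated assumptions: the helper name `parse_triples` is made up for illustration, the input is assumed to contain exactly three comma-separated integers per line with no spaces, and no timing was measured for it here.

```cpp
#include <charconv>
#include <cstddef>
#include <system_error>
#include <vector>

// Hypothetical helper (not from the original post): parses "x,y,z" records
// separated by line breaks from an in-memory buffer into three vectors.
// Returns the number of complete records parsed and stops at the first
// malformed record.
std::size_t parse_triples(const std::vector<char>& buf,
                          std::vector<int>& xv,
                          std::vector<int>& yv,
                          std::vector<int>& zv)
{
    const char* p = buf.data();
    const char* end = p + buf.size();
    std::size_t count = 0;

    while (p < end) {
        // Skip line terminators ("\n", "\r\n") and blank lines.
        if (*p == '\n' || *p == '\r') {
            ++p;
            continue;
        }

        int v[3];
        bool ok = true;
        for (int i = 0; i < 3 && ok; ++i) {
            // from_chars does not skip whitespace and does not allocate.
            auto res = std::from_chars(p, end, v[i]);
            if (res.ec != std::errc()) {
                ok = false;
                break;
            }
            p = res.ptr;
            if (i < 2) {
                // The first two fields must be followed by a comma.
                if (p < end && *p == ',')
                    ++p;
                else
                    ok = false;
            }
        }
        if (!ok)
            break;

        xv.push_back(v[0]);
        yv.push_back(v[1]);
        zv.push_back(v[2]);
        ++count;
    }
    return count;
}
```

With the `std::vector<char> v` filled by the `fread` loop from the question, a call would look like `parse_triples(v, xv, yv, zv)`. Whether this beats `fscanf` or plain stream extraction depends on the compiler and standard library, so it is worth timing on the real file.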

+6

Store in the file how many numbers it contains. Then, when loading, size the vectors up front (resize them, or call `reserve` if you keep using `push_back`). This can shorten the time a bit.
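As a rough illustration of this suggestion, the sketch below assumes a modified file layout that the original post does not define: the first line holds the number of records, followed by the usual `x,y,z` rows. The names `Columns` and `read_with_count` are hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical container for the three columns.
struct Columns {
    std::vector<int> x, y, z;
};

// Assumed layout (not from the original post): first line is the record
// count, every following line is "x,y,z".
Columns read_with_count(const char* path)
{
    Columns c;
    std::ifstream file(path);

    std::size_t n = 0;
    if (file >> n) {
        // Allocate once instead of growing the vectors repeatedly.
        // For untrusted files, n should be sanity-checked first.
        c.x.reserve(n);
        c.y.reserve(n);
        c.z.reserve(n);
    }

    char comma; // to eat the commas
    int x, y, z;
    while (file >> x >> comma >> y >> comma >> z) {
        c.x.push_back(x);
        c.y.push_back(y);
        c.z.push_back(z);
    }
    return c;
}
```

`reserve` is used rather than `resize` here so that `push_back` still appends from index 0; with `resize`, the loop would have to assign by index instead.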

0
