Working with CSV data using C++

Sorry to ask a question that many may have already asked.

I have a very long CSV data file (data.csv) with 5 columns. I have another short CSV file (filter.csv) with 1 column.

Now I only need to extract the rows from data.csv where column 1 matches a value in column 1 of filter.csv.

I usually do this in Bash using sed/awk. However, for other reasons, I need to do this in a C++ program. Can you suggest an efficient way to do this?

Sample data:

data.csv

    ID,Name,CountryCode,District,Population
    3793,NewYork,USA,NewYork,8008278
    3794,LosAngeles,USA,California,3694820
    3795,Chicago,USA,Illinois,2896016
    3796,Houston,USA,Texas,1953631
    3797,Philadelphia,USA,Pennsylvania,1517550
    3798,Phoenix,USA,Arizona,1321045
    3799,SanDiego,USA,California,1223400
    3800,Dallas,USA,Texas,1188580
    3801,SanAntonio,USA,Texas,1144646

filter.csv

    3793
    3797
    3798
2 answers

Here are some suggestions:

  • The stream you read the data from should skip over commas, so it should classify the comma as whitespace using a std::ctype<char> facet imbued into its locale. Here is an example of modifying the classification table:

     #include <locale>
     #include <vector>

     struct ctype : std::ctype<char>
     {
     private:
         static mask* get_table()
         {
             // Start from the classic "C" table, then mark ',' as whitespace.
             static std::vector<mask> v(classic_table(), classic_table() + table_size);
             v[','] |= space;
             return &v[0];
         }
     public:
         ctype() : std::ctype<char>(get_table()) { }
     };
  • Read the first CSV file line by line (using std::getline()). Extract the first word of each line and compare it with the values extracted from the second CSV file. Continue until you reach the end of the first file:

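     #include <algorithm>
     #include <fstream>
     #include <iterator>
     #include <locale>
     #include <sstream>
     #include <string>
     #include <vector>

     // Manipulator: imbue the stream with a locale that uses the custom ctype.
     std::istream& comma_whitespace(std::istream& is)
     {
         is.imbue(std::locale(is.getloc(), new ctype));
         return is;
     }

     int main()
     {
         std::ifstream in1("test1.csv"); // the data file
         std::ifstream in2("test2.csv"); // the filter file

         typedef std::istream_iterator<std::string> It;

         in2 >> comma_whitespace;

         // The extra parentheses avoid the "most vexing parse".
         std::vector<std::string> in2_content((It(in2)), It());
         std::vector<std::string> matches;

         std::string line;
         while (std::getline(in1, line))
         {
             // The per-line stream, not in1, is the one that must split on commas.
             std::istringstream iss(line);
             iss >> comma_whitespace;
             It beg(iss);

             if (beg != It() &&
                 std::find(in2_content.begin(), in2_content.end(), *beg) != in2_content.end())
             {
                 matches.push_back(line);
             }
         }
     }
     // After the above, the vector matches should hold all the rows that
     // have the same ID number as in the second csv file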

    comma_whitespace is a manipulator that imbues the stream with a locale using the custom ctype facet defined above.

    Disclaimer: I have not tested this code.


This .csv collation library can help:

http://www.partow.net/programming/dsvfilter/index.html

You can combine the columns of both tables into one large table and then query the new table for matches (rows where column 1 of table A equals column 1 of table B). Or the library may already have functions for comparing tables.
