Parsing a comma-delimited std::string

If I have a std::string containing a comma-separated list of numbers, what is the simplest way to parse out the numbers and put them in an integer array?

I do not want to generalize this to parsing anything else. Just a simple string of comma-separated integers, such as "1,1,1,1,2,1,1,1,0,0".

+118
c++ string parsing stl csv
Dec 12 '09 at 22:21
18 answers

Read one number at a time, and check whether the next character is a comma. If so, discard it.

    #include <vector>
    #include <string>
    #include <sstream>
    #include <iostream>

    int main()
    {
        std::string str = "1,2,3,4,5,6";
        std::vector<int> vect;

        std::stringstream ss(str);

        for (int i; ss >> i;) {
            vect.push_back(i);
            if (ss.peek() == ',')
                ss.ignore();
        }

        for (std::size_t i = 0; i < vect.size(); i++)
            std::cout << vect[i] << std::endl;
    }
+136
Dec 12 '09 at 22:47

Something less verbose that uses only the standard library and accepts anything separated by commas.

    stringstream ss( "1,1,1,1, or something else ,1,1,1,0" );
    vector<string> result;

    while( ss.good() )
    {
        string substr;
        getline( ss, substr, ',' );
        result.push_back( substr );
    }
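Since the question asks for integers rather than strings, the same getline loop can convert each field as it goes. This is only a minimal sketch, assuming C++11 for std::stoi; the function name split_to_ints is made up for illustration and is not part of the answer above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits on ',' and converts every field with std::stoi.
std::vector<int> split_to_ints(const std::string& text) {
    std::vector<int> values;
    std::stringstream ss(text);
    std::string field;
    while (std::getline(ss, field, ',')) {
        // std::stoi throws std::invalid_argument if the field holds no number.
        values.push_back(std::stoi(field));
    }
    return values;
}
```

Unlike the loop above, this version throws on a field such as " or something else ", so it only suits input that is known to contain numbers.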
+96
Jun 02 2018-12-12T00:

Another, quite different approach: use a special locale that treats commas as whitespace:

    #include <locale>
    #include <vector>

    struct csv_reader: std::ctype<char> {
        csv_reader(): std::ctype<char>(get_table()) {}
        static std::ctype_base::mask const* get_table() {
            static std::vector<std::ctype_base::mask> rc(table_size, std::ctype_base::mask());

            rc[','] = std::ctype_base::space;
            rc['\n'] = std::ctype_base::space;
            rc[' '] = std::ctype_base::space;
            return &rc[0];
        }
    };

To use this, you imbue() a stream with a locale that includes this facet. Once you have done that, you can read numbers as if the commas were not there at all. For example, we will read comma-delimited numbers from input and write them out one per line on standard output:

    #include <algorithm>
    #include <iterator>
    #include <iostream>

    int main() {
        std::cin.imbue(std::locale(std::locale(), new csv_reader()));
        std::copy(std::istream_iterator<int>(std::cin),
                  std::istream_iterator<int>(),
                  std::ostream_iterator<int>(std::cout, "\n"));
        return 0;
    }
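The same facet should work on a string stream too, which is closer to the question than reading from std::cin. A sketch under that assumption: the csv_reader facet from above is repeated so the block compiles on its own, and parse_with_locale is a hypothetical helper name.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Same facet as above: commas, newlines and spaces all count as whitespace.
struct csv_reader : std::ctype<char> {
    csv_reader() : std::ctype<char>(get_table()) {}
    static const std::ctype_base::mask* get_table() {
        static std::vector<std::ctype_base::mask> rc(table_size, std::ctype_base::mask());
        rc[','] = std::ctype_base::space;
        rc['\n'] = std::ctype_base::space;
        rc[' '] = std::ctype_base::space;
        return &rc[0];
    }
};

// Hypothetical helper: reads every int from the string into a vector.
std::vector<int> parse_with_locale(const std::string& text) {
    std::istringstream in(text);
    // The locale takes ownership of the facet allocated with new.
    in.imbue(std::locale(std::locale(), new csv_reader()));
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

Reading stops at the first field that is not an integer, so malformed input silently yields a shorter vector.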
+59
Dec 13 '09 at 4:49

The C++ String Toolkit Library (StrTk) has the following solution to your problem:

    #include <string>
    #include <deque>
    #include <vector>
    #include "strtk.hpp"

    int main()
    {
        std::string int_string = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15";
        std::vector<int> int_list;
        strtk::parse(int_string,",",int_list);

        std::string double_string = "123.456|789.012|345.678|901.234|567.890";
        std::deque<double> double_list;
        strtk::parse(double_string,"|",double_list);

        return 0;
    }

More examples can be found here.

+44
Jul 02 '10 at 20:05

An alternative solution using generic algorithms and Boost.Tokenizer:

    struct ToInt
    {
        int operator()(string const &str) { return atoi(str.c_str()); }
    };

    string values = "1,2,3,4,5,9,8,7,6";

    vector<int> ints;
    tokenizer<> tok(values);

    transform(tok.begin(), tok.end(), back_inserter(ints), ToInt());
+17
Dec 12 '09 at 23:07

You can also use the following function.

    void tokenize(const string& str, vector<string>& tokens, const string& delimiters = ",")
    {
        // Skip delimiters at the beginning.
        string::size_type lastPos = str.find_first_not_of(delimiters, 0);

        // Find the first delimiter after that, i.e. the end of the first token.
        string::size_type pos = str.find_first_of(delimiters, lastPos);

        while (string::npos != pos || string::npos != lastPos) {
            // Found a token, add it to the vector.
            tokens.push_back(str.substr(lastPos, pos - lastPos));

            // Skip delimiters.
            lastPos = str.find_first_not_of(delimiters, pos);

            // Find the end of the next token.
            pos = str.find_first_of(delimiters, lastPos);
        }
    }
+6
Sep 19 '12 at 22:21
    std::string input="1,1,1,1,2,1,1,1,0";
    std::vector<long> output;
    for(std::string::size_type p0=0,p1=input.find(',');
        p1!=std::string::npos || p0!=std::string::npos;
        (p0=(p1==std::string::npos)?p1:++p1),p1=input.find(',',p0) )
        output.push_back( strtol(input.c_str()+p0,NULL,0) );

It would be a good idea to check for conversion errors in strtol(), of course. The code may benefit from some other error checks as well.
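One way those checks might look is sketched below. It is an assumption about what "checking" should mean here, not the answer's own code: the helper name parse_longs_checked is hypothetical, it uses base 10 rather than base 0, and it rejects empty fields and trailing commas.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: returns false at the first malformed or out-of-range
// field; values parsed before the failure stay in 'output'.
bool parse_longs_checked(const std::string& input, std::vector<long>& output) {
    const char* p = input.c_str();
    while (true) {
        char* end = nullptr;
        errno = 0;
        const long value = std::strtol(p, &end, 10);
        if (end == p) return false;         // no digits were consumed
        if (errno == ERANGE) return false;  // value does not fit in a long
        output.push_back(value);
        if (*end == '\0') return true;      // reached the end of the string
        if (*end != ',') return false;      // unexpected character after a number
        p = end + 1;                        // step over the comma
    }
}
```

strtol reports "nothing parsed" by leaving the end pointer equal to the start pointer, and overflow by setting errno to ERANGE, so both have to be tested separately.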

+5
Dec 12 '09 at 22:37

There are a lot of terrible answers here, so I will add mine (including a test program):

    #include <string>
    #include <iostream>
    #include <cstddef>

    template<typename StringFunction>
    void splitString(const std::string &str, char delimiter, StringFunction f) {
        std::size_t from = 0;
        for (std::size_t i = 0; i < str.size(); ++i) {
            if (str[i] == delimiter) {
                f(str, from, i);
                from = i + 1;
            }
        }
        if (from <= str.size())
            f(str, from, str.size());
    }

    int main(int argc, char* argv[]) {
        if (argc != 2)
            return 1;

        splitString(argv[1], ',', [](const std::string &s, std::size_t from, std::size_t to) {
            std::cout << "'" << s.substr(from, to - from) << "'\n";
        });

        return 0;
    }

Good properties:

  • No dependencies (e.g. Boost)
  • Not an insane one-liner
  • Easy to understand (hopefully)
  • Handles spaces perfectly fine
  • Does not allocate the splits if you do not want it to; for example, you can process them with a lambda, as shown.
  • Does not append characters one at a time, so it should be fast.
  • If you use C++17, you can change it to use std::string_view and then it will not make any allocations and should be very fast.
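The C++17 variant mentioned in the last point could look roughly like this. It is a sketch, not the answer's code: split_view and split_to_views are hypothetical names, and the callback receives a std::string_view instead of the string plus two offsets.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Calls f once per field with a view into 'str'; no substrings are copied.
template <typename F>
void split_view(std::string_view str, char delimiter, F f) {
    std::size_t from = 0;
    for (std::size_t i = 0; i < str.size(); ++i) {
        if (str[i] == delimiter) {
            f(str.substr(from, i - from));
            from = i + 1;
        }
    }
    f(str.substr(from));  // last field; empty if the input ends with a delimiter
}

// Convenience wrapper for illustration. The views point into the caller's
// buffer, so that buffer must outlive the returned vector.
inline std::vector<std::string_view> split_to_views(std::string_view str, char delimiter) {
    std::vector<std::string_view> parts;
    split_view(str, delimiter, [&parts](std::string_view part) { parts.push_back(part); });
    return parts;
}
```

The vector in the wrapper does allocate; calling split_view directly with a lambda avoids that.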

Some design choices you may wish to change:

  • Empty entries are not ignored.
  • An empty string will call f() once.

Examples of inputs and outputs:

    ""      -> {""}
    ","     -> {"", ""}
    "1,"    -> {"1", ""}
    "1"     -> {"1"}
    " "     -> {" "}
    "1, 2," -> {"1", " 2", ""}
    " ,, "  -> {" ", "", " "}
+4
Feb 15 '18 at 12:27
    #include <sstream>
    #include <vector>

    const char *input = "1,1,1,1,2,1,1,1,0";

    int main() {
        std::stringstream ss(input);
        std::vector<int> output;
        int i;
        while (ss >> i) {
            output.push_back(i);
            ss.ignore(1);
        }
    }

Bad input (e.g. consecutive separators) will break this, but you did say simple.

+2
Dec 12 '09 at 22:47

I am surprised that no one has suggested a solution using std::regex:

    #include <string>
    #include <algorithm>
    #include <vector>
    #include <regex>

    void parse_csint( const std::string& str, std::vector<int>& result ) {

        typedef std::regex_iterator<std::string::const_iterator> re_iterator;
        typedef re_iterator::value_type re_iterated;

        std::regex re("(\\d+)");

        re_iterator rit( str.begin(), str.end(), re );
        re_iterator rend;

        std::transform( rit, rend, std::back_inserter(result),
            []( const re_iterated& it ){ return std::stoi(it[1]); } );
    }

This function appends all integers to the back of the input vector. You can tweak the regular expression to include negative integers, floating-point numbers, and so on.
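As an illustration of such a tweak, the sketch below accepts an optional leading minus sign. The function name parse_signed_ints and the exact pattern are assumptions made for this example, not part of the answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant: matches an optional '-' followed by one or more digits.
std::vector<int> parse_signed_ints(const std::string& str) {
    static const std::regex re("-?\\d+");
    std::vector<int> result;
    for (std::sregex_iterator it(str.begin(), str.end(), re), end; it != end; ++it) {
        result.push_back(std::stoi(it->str()));
    }
    return result;
}
```

Because the pattern only looks for digit runs, anything between the matches is ignored, and an input such as "3-4" is read as 3 and -4.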

+2
Apr 04 '17 at 13:49
    string exp = "token1 token2 token3";
    char delimiter = ' ';
    vector<string> str;
    string acc = "";
    for(string::size_type i = 0; i < exp.size(); i++)
    {
        if(exp[i] == delimiter)
        {
            str.push_back(acc);
            acc = "";
        }
        else
            acc += exp[i];
    }
    // The last token has no trailing delimiter, so add it after the loop.
    str.push_back(acc);
+1
Nov 03 '13 at 10:07

I can't comment yet (just getting started on the site), but I added a more generic version of Jerry Coffin's excellent ctype-derived class to his post.

Thanks Jerry for the super idea.

(Because the edit needs to be peer reviewed, I am temporarily adding it here too.)

    struct SeparatorReader: std::ctype<char>
    {
        template<typename T>
        SeparatorReader(const T &seps): std::ctype<char>(get_table(seps), true) {}

        // static: it runs before the base class is constructed, so it must not touch *this.
        template<typename T>
        static std::ctype_base::mask const *get_table(const T &seps)
        {
            // Zero-initialised table; the base class deletes it because of the 'true' flag above.
            auto *rc = new std::ctype_base::mask[std::ctype<char>::table_size]();
            for(auto &&sep: seps)
                rc[static_cast<unsigned char>(sep)] = std::ctype_base::space;
            return rc;
        }
    };
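A possible way to use it, sketched under a few assumptions: the facet is repeated here with get_table declared static so the block compiles on its own, the separators are passed as a std::string, and parse_with_separators is a hypothetical helper name.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

struct SeparatorReader : std::ctype<char> {
    template <typename T>
    explicit SeparatorReader(const T& seps)
        : std::ctype<char>(get_table(seps), true) {}  // true: the base class deletes the table

    template <typename T>
    static const std::ctype_base::mask* get_table(const T& seps) {
        // Zero-initialised table in which only the separators count as whitespace.
        auto* rc = new std::ctype_base::mask[std::ctype<char>::table_size]();
        for (auto&& sep : seps)
            rc[static_cast<unsigned char>(sep)] = std::ctype_base::space;
        return rc;
    }
};

// Hypothetical helper: reads ints separated by any character in 'seps'.
std::vector<int> parse_with_separators(const std::string& text, const std::string& seps) {
    std::istringstream in(text);
    in.imbue(std::locale(std::locale(), new SeparatorReader(seps)));
    return std::vector<int>(std::istream_iterator<int>(in),
                            std::istream_iterator<int>());
}
```

Note that ordinary spaces are no longer skipped unless the space character is listed among the separators.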
+1
Dec 05 '15 at 7:47
    bool GetList (const std::string& src, std::vector<int>& res)
    {
        using boost::lexical_cast;
        using boost::bad_lexical_cast;
        bool success = true;
        typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
        boost::char_separator<char> sepa(",");
        tokenizer tokens(src, sepa);
        for (tokenizer::iterator tok_iter = tokens.begin();
             tok_iter != tokens.end(); ++tok_iter) {
            try {
                res.push_back(lexical_cast<int>(*tok_iter));
            }
            catch (bad_lexical_cast &) {
                success = false;
            }
        }
        return success;
    }
0
Dec 12 '09 at 23:09

Simple structure, easily adaptable, easy to maintain.

    std::string stringIn = "my,csv,,is 10233478,separated,by commas";
    std::vector<std::string> commaSeparated(1);
    int commaCounter = 0;
    for (std::size_t i = 0; i < stringIn.size(); i++) {
        if (stringIn[i] == ',') {   // compare against a char literal, not a string literal
            commaSeparated.push_back("");
            commaCounter++;
        } else {
            commaSeparated.at(commaCounter) += stringIn[i];
        }
    }

At the end you will have a vector of strings holding every element of the input, split at the commas. Empty strings are saved as separate items.

0
Jul 25 '14 at 22:10

Simple copy/paste function, based on the Boost tokenizer.

    void strToIntArray(std::string string, int* array, int array_len) {
        boost::tokenizer<> tok(string);
        int i = 0;
        for(boost::tokenizer<>::iterator beg=tok.begin(); beg!=tok.end();++beg){
            if(i < array_len)
                array[i] = atoi(beg->c_str());
            i++;
        }
    }
0
May 30 '15 at 15:35

This is the simplest way, and one I have used a lot. It works for any single-character delimiter.

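    #include <iostream>
    #include <sstream>
    #include <string>
    #include <vector>
    using namespace std;

    int main()
    {
        string str;
        cin >> str;

        int temp;
        char ch;
        vector<int> result;
        stringstream ss(str);

        // Read a number, then consume one delimiter character of any kind.
        while (ss >> temp) {
            result.push_back(temp);
            if (!(ss >> ch))
                break;
        }

        for (size_t i = 0; i < result.size(); i++)
            cout << result[i] << endl;
        return 0;
    }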
0
Aug 31 '18 at 16:50
    void ExplodeString( const std::string& string, const char separator, std::list<int>& result ) {
        if( string.size() ) {
            std::string::const_iterator last = string.begin();
            for( std::string::const_iterator i=string.begin(); i!=string.end(); ++i ) {
                if( *i == separator ) {
                    const std::string str(last,i);
                    int id = atoi(str.c_str());
                    result.push_back(id);
                    last = i;
                    ++ last;
                }
            }
            if( last != string.end() ) result.push_back( atoi(&*last) );
        }
    }
-1
Dec 12 '09 at 22:27
    #include <sstream>
    #include <vector>
    #include <algorithm>
    #include <iterator>
    #include <iostream>

    const char *input = ",,29870,1,abc,2,1,1,1,0";

    int main()
    {
        std::stringstream ss(input);
        std::vector<int> output;
        int i;
        while ( !ss.eof() )
        {
            // Skip anything that is not a digit, one character at a time.
            int c = ss.peek() ;
            if ( c < '0' || c > '9' )
            {
                ss.ignore(1);
                continue;
            }

            if (ss >> i)
            {
                output.push_back(i);
            }
        }

        std::copy(output.begin(), output.end(), std::ostream_iterator<int> (std::cout, " ") );
        return 0;
    }
-4
Feb 24 '12 at 23:30