You only copy 4 (or 8, depending on your platform's pointer width) characters, because `sizeof(str.c_str())` is the size of a pointer, not the length of the string. This truncates numbers of 4+ digits, feeding garbage to `atoi`.

`sizeof(str.c_str())`

should be

`str.length() + 1`

or the terminating null character will not be copied.
Pure STL:

(for make_testdata() see all the way at the bottom)
Why aren't you using streams...?
```cpp
#include <sstream>
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::vector<int> data = make_testdata();

    std::ostringstream oss;
    std::copy(data.begin(), data.end(), std::ostream_iterator<int>(oss, "\t"));

    std::stringstream iss(oss.str());

    std::vector<int> clone;
    std::copy(std::istream_iterator<int>(iss), std::istream_iterator<int>(),
              std::back_inserter(clone));

    // verify that clone now contains the original random data:
    // bool ok = std::equal(data.begin(), data.end(), clone.begin());

    return 0;
}
```
You could probably do it much faster in plain C with atoi/itoa and some tweaking, but I contend you should be using binary transmission (see e.g. Boost Spirit Karma and protobuf for good libraries) if you need speed.
Boost Karma / Qi:
```cpp
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <iterator>
#include <string>
#include <vector>

namespace qi    = ::boost::spirit::qi;
namespace karma = ::boost::spirit::karma;

static const char delimiter = '\0';

int main()
{
    std::vector<int> data = make_testdata();

    std::string astext;
    // astext.reserve(3 * sizeof(data[0]) * data.size()); // heuristic pre-alloc
    std::back_insert_iterator<std::string> out(astext);

    {
        using namespace karma;
        generate(out, delimit(delimiter)[*int_], data);
        // generate_delimited(out, *int_, delimiter, data); // equivalent
        // generate(out, int_ % delimiter, data);           // somehow much slower!
    }

    std::string::const_iterator begin(astext.begin()), end(astext.end());

    std::vector<int> clone;
    qi::parse(begin, end, qi::int_ % delimiter, clone);

    // verify that clone now contains the original random data:
    // bool ok = std::equal(data.begin(), data.end(), clone.begin());

    return 0;
}
```
If you wanted architecture-independent binary serialization, you would use this tiny adaptation, which makes things a whole lot faster (see the benchmarks below...):
```cpp
karma::generate(out, *karma::big_dword, data);
// ...
qi::parse(begin, end, *qi::big_dword, clone);
```
Speed-Up
Best performance can be achieved by using Boost Serialization in binary mode:
```cpp
#include <sstream>
#include <vector>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/vector.hpp>

int main()
{
    std::vector<int> data = make_testdata();

    std::stringstream ss;
    {
        boost::archive::binary_oarchive oa(ss);
        oa << data;
    }

    std::vector<int> clone;
    {
        boost::archive::binary_iarchive ia(ss);
        ia >> clone;
    }

    // verify that clone now contains the original random data:
    // bool ok = std::equal(data.begin(), data.end(), clone.begin());

    return 0;
}
```
Test data
(common to all versions above)
```cpp
#include <boost/random.hpp>
#include <algorithm>
#include <vector>

// generates a deterministic pseudo-random vector of 32 Mio ints
std::vector<int> make_testdata()
{
    std::vector<int> testdata;
    testdata.resize(2 << 24);
    std::generate(testdata.begin(), testdata.end(), boost::mt19937(0));
    return testdata;
}
```
Benchmarks
I compared the versions

- using an input of 2<<24 (33554432) random integers
- with output suppressed (we don't want to measure our terminal's scrolling performance)

Rough timings were:

- the pure STL version is not bad at all at 12.6s
- the Karma/Qi text version clocks in at 5.1s, down from 18s thanks to Arlen's hint to use generate_delimited :)
- the binary Karma/Qi version (big_dword) takes only 1.4s (roughly 3-4x faster than the text version, about 9x faster than pure STL)
- Boost Serialization takes the cake at 0.8s (substituting text archives for the binary ones brings it to roughly 13s)