Insert Vector Conversion

I previously posted a question the best way to combine two std::vector s, where you first need to convert one vector. Although the obvious solution using std::transform might not be the theoretically optimal solution, the cost of multiple-throughput checks is unlikely to be significant.

But if we consider the more general task of inserting one vector, which should be transformed into another, now there is potential significant overhead information.

What is the best way to do this?

+4
c ++ vector c ++ 11 stl
source share
3 answers

Here I see several options:

Method 1: Temporary Conversion

 std::vector<T1> temp {}; temp.reserve(vec.size()); std::transform(vec2.begin(), vec2.end(), std::back_inserter(temp), op); vec1.insert(vec1.begin() + pos, std::make_move_iterator(temp.begin()), std::make_move_iterator(temp.end())); 

But now the overhead is not a trivial test of bandwidth, but the additional cost is the allocation / release of size_of(T1) * vec2.size() for temp . This can be very significant if vec2 is large.

Method 2: insert a loop

 vec1.reserve(vec1.size() + vec2.size()); std::size_t i {pos}; for (const auto& p : vec2) { vec1.insert(vec1.begin() + i, op); ++i; } 

This avoids the additional distribution / release of method 1, but there is another serious problem with this solution: each of the elements n = vec1.size() - pos in vec1 must be shifted n times, and O(n^2) .

Method 3: Shift and Copy

 vec1.resize(vec1.size() + pair_vec.size()); std::move(vec1.begin() + pos, vec1.end(), vec1.begin() + pos + vec2.size()); std::transform(vec2.begin(), vec2.end(), vec1.begin() + pos, op); 

This is similar to what we want, we only pay for the "extra" n default constructors.

EDIT

My shift and copy method (3) was wrong, it should be:

 auto v1_size = vec1.size(); vec1.resize(vec1.size() + vec2.size()); std::move(vec1.begin() + pos, vec1.begin() + v1_size, vec1.begin() + pos + vec2.size()); std::transform(vec2.begin(), vec2.end(), vec1.begin() + pos, op); 

TESTS (updated using @VaughnCato and @ViktorSehr methods)

I tested methods 1 and 3 (method 2 obviously will not perform well - easy to check), as well as the methods suggested by @VaughnCato and @ViktorSehr . Here is the complete code:

 #include <iostream> #include <vector> #include <iterator> #include <algorithm> #include <cstdint> #include <chrono> #include <numeric> #include <random> #include <boost/iterator/transform_iterator.hpp> #include <boost/range/adaptor/transformed.hpp> using std::size_t; std::vector<int> generate_random_ints(size_t n) { std::default_random_engine generator; auto seed1 = std::chrono::system_clock::now().time_since_epoch().count(); generator.seed((unsigned) seed1); std::uniform_int_distribution<int> uniform {}; std::vector<int> v(n); std::generate_n(v.begin(), n, [&] () { return uniform(generator); }); return v; } std::vector<std::string> generate_random_strings(size_t n) { std::default_random_engine generator; auto seed1 = std::chrono::system_clock::now().time_since_epoch().count(); generator.seed((unsigned) seed1); std::uniform_int_distribution<int> uniform {}; std::vector<std::string> v(n); std::generate_n(v.begin(), n, [&] () { return std::to_string(uniform(generator)); }); return v; } template <typename D=std::chrono::nanoseconds, typename F> D benchmark(F f, unsigned num_tests) { D total {0}; for (unsigned i = 0; i < num_tests; ++i) { auto start = std::chrono::system_clock::now(); f(); auto end = std::chrono::system_clock::now(); total += std::chrono::duration_cast<D>(end - start); } return D {total / num_tests}; } template <typename T1, typename T2, typename UnaryOperation> void temp_transform(std::vector<T1> vec1, const std::vector<T2> &vec2, size_t pos, UnaryOperation op) { std::vector<T1> temp {}; temp.reserve(vec2.size()); std::transform(vec2.begin(), vec2.end(), std::back_inserter(temp), op); vec1.insert(vec1.begin() + pos, std::make_move_iterator(temp.begin()), std::make_move_iterator(temp.end())); } template <typename T1, typename T2, typename UnaryOperation> void shift_copy(std::vector<T1> vec1, const std::vector<T2> &vec2, size_t pos, UnaryOperation op) { auto v1_size = vec1.size(); vec1.resize(vec1.size() + vec2.size()); std::move(vec1.begin() + pos, vec1.begin() + v1_size, vec1.begin() + pos + vec2.size()); std::transform(vec2.begin(), vec2.end(), vec1.begin() + pos, op); } template <typename T1, typename T2, typename UnaryOperation> void boost_transform(std::vector<T1> vec1, const std::vector<T2> &vec2, size_t pos, UnaryOperation op) { auto v2_begin = boost::make_transform_iterator(vec2.begin(), op); auto v2_end = boost::make_transform_iterator(vec2.end(), op); vec1.insert(vec1.begin() + pos, v2_begin, v2_end); } template <typename T1, typename T2, typename UnaryOperation> void boost_adapter(std::vector<T1> vec1, const std::vector<T2> &vec2, size_t pos, UnaryOperation op) { auto transformed_range = vec2 | boost::adaptors::transformed(op); vec1.insert(vec1.begin() + pos, transformed_range.begin(), transformed_range.end()); } int main(int argc, char **argv) { unsigned num_tests {1000}; size_t vec1_size {1000000}; size_t vec2_size {1000000}; size_t insert_pos {vec1_size / 2}; // Switch the variable names to inverse test auto vec2 = generate_random_ints(vec1_size); auto vec1 = generate_random_strings(vec2_size); //auto op = [] (const std::string& str) { return std::stoi(str); }; auto op = [] (int i) { return std::to_string(i); }; auto f_temp_transform_insert = [&vec1, &vec2, &insert_pos, &op] () { temp_transform(vec1, vec2, insert_pos, op); }; auto f_shift_copy_insert = [&vec1, &vec2, &insert_pos, &op] () { shift_copy(vec1, vec2, insert_pos, op); }; auto f_boost_transform_insert = [&vec1, &vec2, &insert_pos, &op] () { boost_transform(vec1, vec2, insert_pos, op); }; auto f_boost_adapter_insert = [&vec1, &vec2, &insert_pos, &op] () { boost_adapter(vec1, vec2, insert_pos, op); }; auto temp_transform_insert_time = benchmark<std::chrono::milliseconds>(f_temp_transform_insert, num_tests).count(); auto shift_copy_insert_time = benchmark<std::chrono::milliseconds>(f_shift_copy_insert, num_tests).count(); auto boost_transform_insert_time = benchmark<std::chrono::milliseconds>(f_boost_transform_insert, num_tests).count(); auto boost_adapter_insert_time = benchmark<std::chrono::milliseconds>(f_boost_adapter_insert, num_tests).count(); std::cout << "temp_transform: " << temp_transform_insert_time << "ms" << std::endl; std::cout << "shift_copy: " << shift_copy_insert_time << "ms" << std::endl; std::cout << "boost_transform: " << boost_transform_insert_time << "ms" << std::endl; std::cout << "boost_adapter: " << boost_adapter_insert_time << "ms" << std::endl; return 0; } 

results

Compiled with

 g++ vector_insert.cpp -std=c++11 -O3 -o vector_insert_test 

Average user wait time:

 | Method | std::string -> int (ms) | int -> std::string (ms) | |:-----------:|:-----------------------:|:-----------------------:| | 1 | 68 | 220 | | 3 | 67 | 222 | | @VaughnCato | 71 | 239 | | @ViktorSehr | 72 | 236 | 

TL; DR

  • boost methods do not execute as fast as std::transform methods.
  • The std::transform methods are almost equally good - although it is difficult to interpret the discrepancy between the results intstd::string and std::stringint .
+1
source share

The @VaughnCato approach of using boost::transform_iterator for your other question should work well for that.

 auto vec1begin = boost::make_transform_iterator(vec1.begin(), f); auto vec1end = boost::make_transform_iterator(vec1.end(), f); vec2.insert(middle, vec1begin, vec1end); 
+1
source share

What about using boost :: range :: adapters :: transform ?

 std::vector<...> first_vector; auto transform_functor = [](...){...}; auto transformed_range = first_vector | boost::adaptors::transformed(transform_functor): some_vector.insert(some_vector.end(), transformed_range.begin(), transformed_range.end()); 

Given that boost :: transform and the vector :: insert function are implemented as smart as possible, this should be able to ignore any undefined health checks.

0
source share

All Articles