I compared the performance of std::none_of with three different manual implementations using i) a for loop, ii) a range based for loop, and iii) iterators. To my surprise, I found that although all three manual implementations take about the same time, std::none_of much faster. My question is: why is this so?
I used the Google test library and compiled it with -std=c++14 -O3 . When starting the test, I limited the proximity of the process to one processor. I get the following result with GCC 6.2:
Benchmark Time CPU Iterations -------------------------------------------------------- benchmarkSTL 28813 ns 28780 ns 24283 benchmarkManual 46203 ns 46191 ns 15063 benchmarkRange 48368 ns 48243 ns 16245 benchmarkIterator 44732 ns 44710 ns 15698
In Clang 3.9, std::none_of also faster than a manual for loop, although the difference in speed is less. Here is the test code (for brevity only, for guidance):
#include <algorithm> #include <array> #include <benchmark/benchmark.h> #include <functional> #include <random> const size_t N = 100000; const unsigned value = 31415926; template<size_t N> std::array<unsigned, N> generateData() { std::mt19937 randomEngine(0); std::array<unsigned, N> data; std::generate(data.begin(), data.end(), randomEngine); return data; } void benchmarkSTL(benchmark::State & state) { auto data = generateData<N>(); while (state.KeepRunning()) { bool result = std::none_of( data.begin(), data.end(), std::bind(std::equal_to<unsigned>(), std::placeholders::_1, value)); assert(result); } } void benchmarkManual(benchmark::State & state) { auto data = generateData<N>(); while (state.KeepRunning()) { bool result = true; for (size_t i = 0; i < N; i++) { if (data[i] == value) { result = false; break; } } assert(result); } } BENCHMARK(benchmarkSTL); BENCHMARK(benchmarkManual); BENCHMARK_MAIN();
Note that generating data using a random number generator does not matter. I get the same result when I just set the i -th element to i and check if the value is N + 1 .
source share