Value semantics vs. output parameters with large data structures

Question


2013 Keynote: Chandler Carruth: Optimizing the Emergent Structures of C++

42:45
You don't need output parameters, we have value semantics in C++. ... Anytime you see someone arguing "no, no, I'm not going to return by value because the copy would cost too much", someone working on an optimizer says they're wrong. All right? I have never yet seen a piece of code where that argument was correct. ... People don't realize how important value semantics are to the optimizer, because they completely clarify the aliasing scenarios.

Can anyone put this in the context of this Stack Overflow answer:

I hear that being repeated over and over but, to me, a function returning something is a source. Output parameters by reference take that characteristic out of a function, and removing such a hard-coded characteristic from a function lets the caller manage how the output is stored and reused.

My question is: even in the context of that SO answer, is there a way to show, by restructuring the code in some other equivalent way, "OK, now see, value semantics done this way does not lose to the output-parameter version"? Or were Chandler's comments specific to some contrived situations? I have even seen Andrei Alexandrescu argue this in a talk, saying that you cannot avoid by-reference output if you want the best performance.

For another take on Andrei's comments, see Eric Niebler: Out Parameters, Move Semantics, and Stateful Algorithms.

+10

c++ optimization c++11

pepper_chico Feb 10 '14 at 4:34


3 answers

The problem with output parameters as described in the linked question is that they usually make the common calling case (i.e. you don't already have vector storage to reuse) much more verbose than it needs to be. For example, if you used return by value:

 auto a = split(s, r);

whereas if you used output parameters:

 std::vector<std::string> a;
 split(s, r, a);

The second looks much less attractive to my eyes. Also, as Chandler noted, the optimizer can do much more with the first than the second, depending on the rest of your code.
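To make the aliasing point concrete, here is a minimal sketch. The functions `sum_into`, `sum` and `aliasing_demo` are hypothetical names made up for illustration; they are not from the talk or the linked question. With the out-parameter version the compiler has to allow for `out` referring to an element of `in`, so it generally cannot keep the running total in a register without first proving otherwise. The value-returning version has no such question to answer.

```cpp
#include <cassert>
#include <vector>

// Out-parameter version: `out` may refer to an element of `in`, so every
// write to `out` can change what a later read of `in` sees.
void sum_into(const std::vector<int>& in, int& out) {
    out = 0;
    for (int x : in) out += x;
}

// Value-returning version: the accumulator is a local that nothing else can
// refer to, so no aliasing analysis is needed.
int sum(const std::vector<int>& in) {
    int total = 0;
    for (int x : in) total += x;
    return total;
}

// Usage: the two versions agree unless the output aliases the input.
void aliasing_demo() {
    std::vector<int> v{1, 2, 3};
    assert(sum(v) == 6);
    sum_into(v, v[0]);  // `out` aliases in[0]
    assert(v[0] == 5);  // 0 + 2 + 3: the first element was zeroed before it was read
}
```

The aliased call is legal C++, which is exactly why the optimizer cannot assume it away.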

Is there a way to get the best of both worlds? Emphatically yes, using move semantics:

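 #include <regex>
 #include <string>
 #include <vector>

 // Splits s on the regex r. The optional v supplies storage to reuse.
 std::vector<std::string> split(const std::string &s, const std::regex &r,
                                std::vector<std::string> v = {})
 {
     auto rit = std::sregex_token_iterator(s.begin(), s.end(), r, -1);
     auto rend = std::sregex_token_iterator();
     v.clear();  // drop the old elements but keep the capacity
     while (rit != rend) {
         v.push_back(*rit);
         ++rit;
     }
     return v;
 }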

Now, in the common case, we can call split as usual (i.e. the first example), and it will allocate new vector storage for us. In the important but rare case where we have to split repeatedly and want to reuse the same storage, we can simply move in storage that persists between calls:

 #include <utility>  // std::move

 int main()
 {
     const std::regex r(" +");
     std::vector<std::string> a;
     for (auto i = 0; i < 1000000; ++i)
         a = split("abc", r, std::move(a));  // hand the old storage back in
     return 0;
 }

This runs as fast as the output-argument method, and what is going on is quite clear. You don't have to make your function hard to use all the time just to get good performance some of the time.
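If you want to convince yourself that the storage really is recycled, a small check along these lines should do. `iota_vec` is a hypothetical stand-in for split that avoids the regex machinery, and the pointer comparison assumes the default allocator and a typical standard library, where moving a vector transfers its buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for split(): fills v with 0..n-1 and returns it.
std::vector<int> iota_vec(std::size_t n, std::vector<int> v = {}) {
    v.clear();  // keeps whatever capacity was moved in
    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i));
    return v;   // a by-value parameter is moved out, not copied
}

// Checks that repeated calls keep using the same heap buffer.
void reuse_demo() {
    std::vector<int> a = iota_vec(100);   // first call allocates
    const int* const buffer = a.data();
    for (int i = 0; i < 10; ++i) {
        a = iota_vec(100, std::move(a));  // later calls recycle a's storage
        assert(a.data() == buffer);       // same buffer, no new allocation
        assert(a.size() == 100);
    }
}
```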

+1

rmcclellan Feb 11 '14 at 21:35


I was going to implement a solution using string_view and ranges , and then found the following:

std::split(): An algorithm for splitting strings

It endorses ~~value semantics~~ output by return, and I accept it as prettier than the current choice in the referred SO answer. Such a design also takes the characteristic of being a source out of the function, even though it returns. A simple illustration: one could have an external, reserved vector that is repeatedly refilled from the returned range.
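A rough sketch of that illustration, assuming C++17 and using only the standard library. The names `token_iterator`, `token_range` and `split_demo` are hypothetical, and this `split` breaks on single spaces rather than on a regex, so it is not the proposed std::split interface. The function returns a lazy range by value, and the caller decides where the tokens are stored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string_view>
#include <vector>

// Hypothetical input iterator over the space-separated tokens of a string_view.
class token_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = std::string_view;
    using difference_type = std::ptrdiff_t;
    using pointer = const std::string_view*;
    using reference = const std::string_view&;

    token_iterator() = default;  // a default-constructed iterator marks the end
    explicit token_iterator(std::string_view text) : rest_(text), done_(false) { advance(); }

    reference operator*() const { return current_; }
    token_iterator& operator++() { advance(); return *this; }
    token_iterator operator++(int) { token_iterator old = *this; advance(); return old; }

    friend bool operator==(const token_iterator& a, const token_iterator& b) {
        return a.done_ == b.done_ && (a.done_ || a.current_.data() == b.current_.data());
    }
    friend bool operator!=(const token_iterator& a, const token_iterator& b) { return !(a == b); }

private:
    // Skips leading spaces, then cuts the next token off the front of rest_.
    void advance() {
        const std::size_t start = rest_.find_first_not_of(' ');
        if (start == std::string_view::npos) { done_ = true; return; }
        rest_.remove_prefix(start);
        const std::size_t len = std::min(rest_.find(' '), rest_.size());
        current_ = rest_.substr(0, len);
        rest_.remove_prefix(len);
    }

    std::string_view rest_;
    std::string_view current_;
    bool done_ = true;
};

struct token_range {
    token_iterator first, last;
    token_iterator begin() const { return first; }
    token_iterator end() const { return last; }
};

// Returns a lazy range by value; no storage is allocated here.
token_range split(std::string_view text) {
    return {token_iterator(text), token_iterator()};
}

// Usage: an external, reserved vector refilled from the returned range.
void split_demo() {
    std::vector<std::string_view> tokens;
    tokens.reserve(8);
    const std::string_view* const buffer = tokens.data();
    for (int i = 0; i < 3; ++i) {
        const token_range range = split("a bb  ccc");
        tokens.assign(range.begin(), range.end());
        assert(tokens.size() == 3 && tokens[1] == "bb");
        assert(tokens.data() == buffer);  // the reserved storage is reused
    }
}
```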

In any case, I'm not sure whether such a version of split would help the optimizer in any sense (I'm speaking in the context of Chandler's talk here).

Note

The output-parameter version forces a named variable to exist at the call site, which may be ugly to the eye but can make the call site easier to debug.

Solution example

Until std::split arrives, I have implemented the ~~value semantics~~ output by return version as follows:

 #include <string>
 #include <string_view>
 #include <boost/regex.hpp>
 #include <boost/range/iterator_range.hpp>
 #include <boost/iterator/transform_iterator.hpp>

 using namespace std;
 using namespace std::experimental;
 using namespace boost;

 string_view stringfier(const cregex_token_iterator::value_type &match) {
     return {match.first, static_cast<size_t>(match.length())};
 }

 using string_view_iterator =
     transform_iterator<decltype(&stringfier), cregex_token_iterator>;

 iterator_range<string_view_iterator> split(string_view s, const regex &r) {
     return {
         string_view_iterator(
             cregex_token_iterator(s.begin(), s.end(), r, -1),
             stringfier
         ),
         string_view_iterator()
     };
 }

 int main() {
     const regex r(" +");
     for (size_t i = 0; i < 1000000; ++i) {
         split("abc", r);
     }
 }

I used Marshall Clow's string_view implementation for libc++, found at https://github.com/mclow/string_view .

I have posted the timings at the bottom of the referred answer.

0

pepper_chico Feb 11 '14 at 17:02


justin · Accepted Answer · 2014-02-11 17:33

It's either an overstatement, a generalization, or a joke, or else Chandler's idea of "Perfectly Reasonable Performance" (using modern C++ toolchains/libs) is unacceptable for my programs.

I find it a fairly narrow scope of optimizations. Penalties exist beyond that scope which cannot be ignored, because of the actual complexities and designs found in programs. Heap allocation was one example, in the getline case. The specific optimizations may or may not be applicable to the program in question, despite your attempts to reduce those penalties. Real-world structures will reference memory which may alias. You can reduce that, but it is not practical to believe that you can eliminate aliasing (from an optimizer's perspective).

Of course, return by value (RBV) can be a great thing; it just isn't appropriate for all cases. Even the link you referenced pointed out how one can avoid a ton of allocations and frees. Real programs and the data structures found in them are far more complex.

Later in the talk, he goes on to criticize the use of member functions (ref: S::compute()). Sure, there is a point to take away, but is it really reasonable to avoid these language features entirely because doing so makes the optimizer's job easier? No. Will it always result in more readable programs? No. Will these code transformations always result in measurably faster programs? No. Are the changes required to transform your codebase worth the time you invest? Sometimes. Can you take away some points and make better-informed decisions that affect your existing or future codebase? Yes.

Sometimes it helps to break down exactly how your program would execute, or what it would look like in C.

The optimizer won't solve all performance problems. You shouldn't rewrite programs on the assumption that the programs you are dealing with are "completely brain-dead and broken designs", nor should you believe that using RBV will always result in "Perfectly Reasonable Performance". You can use new language features and make the optimizer's job easier, and there is much to gain, but there are often more important optimizations to invest your time in.

It's fine to consider the proposed changes; ideally you would measure the real-world execution times and the impact on your source code before adopting these suggestions.
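As a starting point for measuring, something like the following sketch could be adapted. The names `make_by_value`, `make_into`, `time_us` and `compare_styles` are hypothetical, and a micro-benchmark like this only hints at what happens in a real program, so treat the numbers as a prompt for profiling rather than a verdict.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Two hypothetical producers of the same data, one per style.
std::vector<int> make_by_value(std::size_t n) { return std::vector<int>(n, 1); }
void make_into(std::size_t n, std::vector<int>& out) { out.assign(n, 1); }

// Runs work() `iterations` times and returns the elapsed time in microseconds.
template <typename Work>
long long time_us(int iterations, Work work) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

struct timings {
    long long by_value_us;
    long long out_param_us;
    long long checksum;  // consumed so the work is harder to optimize away
};

timings compare_styles(int iterations, std::size_t n) {
    assert(n > 0);  // back() below needs at least one element
    long long checksum = 0;
    const long long by_value = time_us(iterations, [&] {
        const std::vector<int> v = make_by_value(n);  // fresh storage each time
        checksum += v.back();
    });
    std::vector<int> buffer;  // storage reused across iterations
    const long long out_param = time_us(iterations, [&] {
        make_into(n, buffer);
        checksum += buffer.back();
    });
    return {by_value, out_param, checksum};
}
```

Run it with optimizations enabled and with sizes taken from your actual workload; the ranking can change with either.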

For your example: even copying and assigning large structures by value can have significant costs. Beyond the cost of running constructors and destructors (along with the creation and cleanup of the resources they acquire and own, as pointed out in the link you reference), even something as simple as eliminating unnecessary structure copies can save a lot of CPU time if you use references (where appropriate). The structure copy may be as simple as a memcpy. These aren't contrived problems; they appear in actual programs, and the cost can grow considerably with your program's complexity. Is reducing the aliasing of some memory, along with other optimizations, worth those costs, and does it result in "Perfectly Reasonable Performance"? Not always.
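To make the copy cost visible, a small counting sketch can help. `Big`, `first_by_value`, `first_by_ref` and `copy_demo` are hypothetical names, and the code assumes C++17 for the inline static member. Passing an existing object to a by-value parameter must copy it, whereas a const reference does not.

```cpp
#include <array>
#include <cassert>

// Hypothetical large structure that counts how often it is copy-constructed.
struct Big {
    static inline int copies = 0;       // C++17 inline variable
    std::array<double, 1024> data{};    // 8 KiB payload on typical platforms
    Big() = default;
    Big(const Big& other) : data(other.data) { ++copies; }
    Big& operator=(const Big&) = default;
};

double first_by_value(Big b) { return b.data[0]; }       // copies lvalue arguments
double first_by_ref(const Big& b) { return b.data[0]; }  // no copy

void copy_demo() {
    Big b;
    Big::copies = 0;
    first_by_ref(b);
    assert(Big::copies == 0);  // the reference parameter made no copy
    first_by_value(b);
    assert(Big::copies == 1);  // passing an lvalue by value copied the whole payload
}
```

The same counter can be dropped into a real structure temporarily to find copies you did not expect.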
