C++: a faster way to concatenate strings?

I find standard string concatenation very slow, so I am looking for some tips or hacks that could speed up some code I have.

My code is basically structured as follows:

    #include <sstream>
    #include <string>
    using namespace std;

    inline void add_to_string(string data, string &added_data) {
        if (added_data.length() < 1) added_data = added_data + "{";
        added_data = added_data + data;
    }

    int main() {
        int some_int = 100;
        float some_float = 100.0;
        string some_string = "test";
        string added_data;
        added_data.reserve(1000 * 64);
        for (int ii = 0; ii < 1000; ii++) {
            // variables manipulated here
            some_int = ii;
            some_float += ii;
            some_string.assign(ii % 20, 'A');
            // then we concatenate the strings!
            stringstream fragment;
            fragment << some_int << "," << some_float << "," << some_string;
            add_to_string(fragment.str(), added_data);
        }
        return 0;
    }

While doing basic profiling, I found that a ton of time is spent in the for loop. Is there anything I can do that will speed this up significantly? Would it help to use C strings instead of C++ strings?

+6
7 answers

You can save a lot of string operations if you don't call add_to_string in your loop.

I believe this does the same thing (although I'm not a C++ expert and I don't know exactly what stringstream does):

    stringstream fragment;
    for (int ii = 0; ii < 1000; ii++) {
        // variables manipulated here
        some_int = ii;
        some_float += ii;
        some_string.assign(ii % 20, 'A');
        // then we concatenate the strings!
        fragment << some_int << "," << some_float << "," << some_string;
    }
    // inlined add_to_string call without the if-statement ;)
    added_data = "{" + fragment.str();
+4

String concatenation is not the problem you are facing. std::stringstream is known to be slow due to its design. On each iteration of your for loop, the stringstream is responsible for at least 2 allocations and 2 deallocations. The cost of each of these 4 operations is probably higher than the cost of the string concatenation itself.

Profile the following and measure the difference:

    std::string stringBuffer;
    for (int ii = 0; ii < 1000; ii++) {
        // variables manipulated here
        some_int = ii;
        some_float += ii;
        some_string.assign(ii % 20, 'A');
        // then we concatenate the strings!
        char buffer[128];
        sprintf(buffer, "%i,%f,%s", some_int, some_float, some_string.c_str());
        stringBuffer = buffer;
        add_to_string(stringBuffer, added_data);
    }

Ideally, replace sprintf with _snprintf or the equivalent supported by your compiler.

Typically, use stringstream for formatting by default and switch to faster but less safe functions like sprintf, itoa, etc. when performance matters.

Edit: this, and what deierc said: added_data += data;
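
To make the two suggestions above concrete, here is a minimal sketch that combines a bounded snprintf with an in-place append. The helper names append_record and build_with_snprintf are made up for illustration, count is assumed to be non-negative, and std::snprintf assumes a C++11 compiler. Note that "%f" prints six decimals, so the output is not identical to what operator<< produces for a float.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats one record into a stack buffer and appends it.
void append_record(std::string &out, int some_int, float some_float,
                   const std::string &some_string) {
    char buffer[128];
    // snprintf writes at most sizeof(buffer) bytes, including the '\0'.
    const int written = std::snprintf(buffer, sizeof(buffer), "%d,%f,%s",
                                      some_int, some_float, some_string.c_str());
    if (written < 0) return;  // formatting error: append nothing
    std::size_t len = static_cast<std::size_t>(written);
    if (len >= sizeof(buffer)) len = sizeof(buffer) - 1;  // output was truncated
    out.append(buffer, len);  // in-place append, no temporary std::string
}

// Hypothetical driver mirroring the loop in the question (count >= 0 assumed).
std::string build_with_snprintf(int count) {
    std::string added_data;
    added_data.reserve(static_cast<std::size_t>(count) * 64);
    added_data += '{';
    int some_int = 0;
    float some_float = 100.0f;
    std::string some_string;
    for (int ii = 0; ii < count; ii++) {
        some_int = ii;
        some_float += ii;
        some_string.assign(static_cast<std::size_t>(ii % 20), 'A');
        append_record(added_data, some_int, some_float, some_string);
    }
    return added_data;
}
```

Whether this actually beats the stringstream version depends on your compiler and standard library, so profile both.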

+5

I see that you used the reserve method on added_data, which should help by avoiding multiple reallocations of the string as it grows.

You should also use the string's += operator where possible:

 added_data += data; 

I think the above should save some significant time by avoiding unnecessary copies of added_data back and forth through a temporary string when doing the concatenation.

This += operator is a simpler version of the string::append method; it just copies data directly onto the end of added_data. Since you made a reserve, this operation should be very fast (almost equivalent to strcpy).

But why do all this when you are already using a stringstream to process your input? Keep it all in there to begin with!
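
A rough sketch of the reserve plus += pattern, under the assumption that the total size can be estimated up front. The function name join_in_place is invented for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: joins `count` copies of `piece` after an opening brace.
std::string join_in_place(const std::string &piece, std::size_t count) {
    std::string added_data;
    // Request enough capacity up front so the appends below need not reallocate.
    added_data.reserve(1 + piece.size() * count);
    added_data += '{';
    for (std::size_t i = 0; i < count; ++i) {
        // In-place append: unlike `added_data = added_data + piece`,
        // this does not build a temporary copy of the whole string.
        added_data += piece;
    }
    return added_data;
}
```

For instance, join_in_place("ab", 3) yields "{ababab".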

The stringstream class is really not very efficient.

You can look at the stringstream class documentation for more information on how to use it if necessary, but your decision to use a string as the buffer seems to avoid that class's speed problem.

In any case, avoid trying to rewrite speed-critical code in pure C unless you really know what you are doing. Some other SO posts support the idea of doing this, but I think it's best (read: safer) to rely on the standard library as much as possible, which will improve over time and take care of many corner cases that you (or I) would not have thought of. If your input format is set in stone, you can start thinking about going that route, but otherwise it's premature optimization.

+3

If you initialize added_data with "{", you can remove the if from your add_to_string function: the if fires exactly once, when the string is empty, so you might as well make it non-empty from the start.

In addition, your add_to_string makes a copy of data; that is unnecessary because data is never modified. Taking data by reference to const should speed things up for you.

Finally, changing added_data from a string to a stringstream should let you append to it directly in the loop, without the intermediate stringstream that is created, copied, and discarded on each iteration.
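
The last suggestion might look roughly like this. It is only a sketch: build_with_one_stream is a made-up name, and it uses std::ostringstream since the stream is only written to. Seeding the stream with '{' also removes the need for the if, and with no add_to_string call there is no by-value copy left to worry about.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical rewrite of the question's loop around a single output stream.
std::string build_with_one_stream(int count) {
    std::ostringstream added_data;
    added_data << '{';  // seeded once, so no emptiness check in the loop
    int some_int = 0;
    float some_float = 100.0f;
    std::string some_string;
    for (int ii = 0; ii < count; ii++) {
        some_int = ii;
        some_float += ii;
        some_string.assign(static_cast<std::size_t>(ii % 20), 'A');
        // Write straight into the one stream; no per-iteration stringstream.
        added_data << some_int << ',' << some_float << ',' << some_string;
    }
    return added_data.str();  // one copy out at the very end
}
```

The formatting is the same as in the question because the same operator<< calls are used; only the number of stream objects changes.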

+2

Please see Twine used in LLVM.

A Twine is a kind of rope: it represents a concatenated string using a binary tree, where the string is the preorder traversal of the nodes. Since the Twine can be efficiently rendered into a buffer when its result is used, it avoids the cost of generating temporary values for intermediate string results, particularly in cases where the Twine result is never required. By explicitly tracking the type of leaf nodes, we can also avoid creating temporary strings for conversion operations (for example, appending an integer to a string).

This may help in solving your problem.
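
To give a feel for the deferred-rendering idea without pulling in LLVM, here is a toy sketch. It is not LLVM's Twine and does not use its API: the class name DeferredConcat is invented, it keeps a flat list of pieces rather than a binary tree, and it assumes C++17 for std::string_view. The caller must keep every added piece alive until str() is called.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical illustration only: NOT LLVM's Twine and not its API.
class DeferredConcat {
public:
    // Records a view of the piece; no characters are copied at this point.
    DeferredConcat &add(std::string_view piece) {
        pieces_.push_back(piece);
        total_ += piece.size();
        return *this;
    }

    // Renders all recorded pieces into one buffer, sized once up front.
    std::string str() const {
        std::string out;
        out.reserve(total_);
        for (std::string_view piece : pieces_) out.append(piece);
        return out;
    }

private:
    std::vector<std::string_view> pieces_;  // views into caller-owned storage
    std::size_t total_ = 0;                 // running sum of piece lengths
};
```

The point of the design is that no intermediate string is built while pieces are being collected; the only character copying happens in str().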

+2

How about this approach?

This is a DevPartner report for MSVC 2010.

[image: DevPartner profiling report]

0

string newstring = stringA and stringB;

I don't think strings are slow; it's the conversions that can slow things down, and perhaps your compiler, which may be checking variable types for mismatches.

-4
