Speed difference for single-line string concatenation

So, I used to believe that using the "+" operator to append strings on a single line was just as efficient as using a StringBuilder (and definitely much nicer on the eyes). Today, though, I was having some speed issues with a Logger that was appending variables and strings using the "+" operator. So I made a quick test case and, to my surprise, found that using a StringBuilder was faster!

Basically, I took the average of 20 runs for each number of appends, with 4 different methods (shown below).

Results, times (in milliseconds)

                                          # of Appends
                           10^1    10^2    10^3    10^4     10^5      10^6      10^7
StringBuilder (capacity)   0.65    1.25    2       11.7     117.65    1213.25   11570
StringBuilder ()           0.7     1.2     2.4     12.15    122       1253.7    12274.6
"+" operator               0.75    0.95    2.35    12.35    127.2     1276.5    12483.4
String.format              4.25    13.1    13.25   71.45    730.6     7217.15   -

[Chart: % difference in String timings, measured from the fastest algorithm]
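The chart itself is not reproduced here, but the percentages can be recomputed from the table. The sketch below assumes "percent difference from the fastest" means (t - fastest) / fastest * 100; the class and method names are made up for illustration, and the timings are the 10^7 column from the table above.

```java
public class PercentDiff {
    // Assumed definition: how much slower a timing is than the fastest one, in percent.
    public static double percentSlower(double timeMs, double fastestMs) {
        return (timeMs - fastestMs) / fastestMs * 100.0;
    }

    public static void main(String[] args) {
        // Timings (ms) for 10^7 appends, copied from the results table.
        String[] names = {"StringBuilder (capacity)", "StringBuilder ()", "\"+\" operator"};
        double[] timesMs = {11570.0, 12274.6, 12483.4};

        // Find the fastest timing to use as the baseline.
        double fastest = Double.MAX_VALUE;
        for (double t : timesMs) {
            fastest = Math.min(fastest, t);
        }

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-26s %6.2f%%%n", names[i], percentSlower(timesMs[i], fastest));
        }
    }
}
```

Under that definition, at 10^7 appends the default StringBuilder and the "+" operator come out roughly 6% and 8% slower than the pre-sized StringBuilder.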

I checked the bytecode, and it is different for each string concatenation method.

Here is what I am using for the methods, and you can see the whole test class here.

    public static String stringSpeed1(float a, float b, float c, float x, float y, float z){
        StringBuilder sb = new StringBuilder(72).append("[").append(a).append(",").append(b).append(",").append(c).append("][").
            append(x).append(",").append(y).append(",").append(z).append("]");
        return sb.toString();
    }

    public static String stringSpeed2(float a, float b, float c, float x, float y, float z){
        StringBuilder sb = new StringBuilder().append("[").append(a).append(",").append(b).append(",").append(c).append("][").
            append(x).append(",").append(y).append(",").append(z).append("]");
        return sb.toString();
    }

    public static String stringSpeed3(float a, float b, float c, float x, float y, float z){
        return "["+a+","+b+","+c+"]["+x+","+y+","+z+"]";
    }

    public static String stringSpeed4(float a, float b, float c, float x, float y, float z){
        return String.format("[%f,%f,%f][%f,%f,%f]", a,b,c,x,y,z);
    }

I have now tried this with floats, ints, and strings. All of them show more or less the same time difference.

Questions

  • The "+" operator is clearly not compiling to the same bytecode, and its timing is quite different from the optimal case. So what gives?
  • The behavior of the algorithms between 100 and 10,000 appends is very strange to me, so does anyone have an explanation?
2 answers

The Java Language Specification does not specify how string concatenation is implemented, but I doubt your compiler does anything other than the equivalent of:

    new StringBuilder("[").
        append(a).
        append(",").
        append(b).
        append(",").
        append(c).
        append("][").
        append(x).
        append(",").
        append(y).
        append(",").
        append(z).
        append("]").
        toString();

You can use "javap -c ..." to disassemble your class file and verify this.
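Alongside the bytecode check, a quick sanity test can confirm that the two forms at least produce identical strings. This is only a sketch with made-up class and method names, and it says nothing about what the compiler emits; newer JDKs (9 and later) typically compile "+" to an invokedynamic call rather than a StringBuilder chain, so the javap output may differ from the expansion shown above.

```java
public class ConcatCheck {
    // The "+" form from the question.
    public static String viaPlus(float a, float b, float c, float x, float y, float z) {
        return "[" + a + "," + b + "," + c + "][" + x + "," + y + "," + z + "]";
    }

    // The hand-written StringBuilder chain that "+" is expected to be equivalent to.
    public static String viaBuilder(float a, float b, float c, float x, float y, float z) {
        return new StringBuilder("[")
            .append(a).append(",").append(b).append(",").append(c)
            .append("][")
            .append(x).append(",").append(y).append(",").append(z)
            .append("]")
            .toString();
    }

    public static void main(String[] args) {
        String plus = viaPlus(1.5f, 2f, 3f, 4f, 5f, 6f);
        String builder = viaBuilder(1.5f, 2f, 3f, 4f, 5f, 6f);
        System.out.println(plus);                 // [1.5,2.0,3.0][4.0,5.0,6.0]
        System.out.println(plus.equals(builder)); // true
    }
}
```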

If you are measuring a significant and repeatable run-time difference between your methods, I would sooner attribute it to the garbage collector running at different times than to a real, significant difference in performance. Of course, creating the StringBuilder with a different initial capacity may have some effect, but it should be negligible compared to the effort required, for example, to format the floats.
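To see what the initial capacity changes, the sketch below (class and method names are illustrative) fills a default-sized and a pre-sized StringBuilder with the same pieces and prints their capacities. The default constructor starts at 16 characters, so it has to grow its internal buffer for this 26-character result, while the 72-character one never does.

```java
public class CapacityDemo {
    // Appends the same kind of pieces as the question's methods and returns the builder.
    public static StringBuilder fill(StringBuilder sb) {
        return sb.append("[").append(1.5f).append(",").append(2f).append(",").append(3f)
                 .append("][").append(4f).append(",").append(5f).append(",").append(6f)
                 .append("]");
    }

    public static void main(String[] args) {
        StringBuilder defaultSized = new StringBuilder();   // default capacity: 16 chars
        StringBuilder preSized = new StringBuilder(72);     // capacity requested up front

        System.out.println("initial capacity: " + defaultSized.capacity() + " vs " + preSized.capacity());
        fill(defaultSized);
        fill(preSized);
        System.out.println("length:           " + defaultSized.length() + " vs " + preSized.length());
        System.out.println("final capacity:   " + defaultSized.capacity() + " vs " + preSized.capacity());
    }
}
```

The extra buffer copy is real work, but it is small next to the cost of converting six floats to text, which is consistent with the few-percent gap in the question's table.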


I did not like two things about your test case. First, you ran all the tests within the same process. When you are dealing with "large" inputs (ambiguous, I know), or with anything where the way your process interacts with memory is your main concern, you should always benchmark in a separate run. Just the fact that garbage collection has already kicked in can affect the results of the other runs. Second, the way you factored your results confused me. What I did was run each method in its own run and knock a zero off the number of times I ran it. I also let it run for a number of "reps", timing each rep, and then printed the number of milliseconds each rep took. Here is my code:

    import java.util.Random;

    public class blah {
        public static void main(String[] args){
            stringComp();
        }

        private static void stringComp() {
            int SIZE = 1000000;
            int NUM_REPS = 5;
            // Time each rep separately and print its duration in milliseconds.
            for(int j = 0; j < NUM_REPS; j++) {
                Random r = new Random();
                float f;
                long start = System.currentTimeMillis();
                for (int i=0;i<SIZE;i++){
                    f = r.nextFloat();
                    // Swap in stringSpeed1 or stringSpeed2 to time the other methods.
                    stringSpeed3(f,f,f,f,f,f);
                }
                System.out.print((System.currentTimeMillis() - start));
                System.out.print(", ");
            }
        }

        public static String stringSpeed1(float a, float b, float c, float x, float y, float z){
            StringBuilder sb = new StringBuilder(72).append("[").append(a).append(",").append(b).append(",").append(c).append("][").
                append(x).append(",").append(y).append(",").append(z).append("]");
            return sb.toString();
        }

        public static String stringSpeed2(float a, float b, float c, float x, float y, float z){
            StringBuilder sb = new StringBuilder().append("[").append(a).append(",").append(b).append(",").append(c).append("][").
                append(x).append(",").append(y).append(",").append(z).append("]");
            return sb.toString();
        }

        public static String stringSpeed3(float a, float b, float c, float x, float y, float z){
            return "["+a+","+b+","+c+"]["+x+","+y+","+z+"]";
        }

        public static String stringSpeed4(float a, float b, float c, float x, float y, float z){
            return String.format("[%f,%f,%f][%f,%f,%f]", a,b,c,x,y,z);
        }
    }

Now my results:

    stringSpeed1(SIZE = 10000000): 11548, 11305, 11362, 11275, 11279
    stringSpeed2(SIZE = 10000000): 12386, 12217, 12242, 12237, 12156
    stringSpeed3(SIZE = 10000000): 12313, 12016, 12073, 12127, 12038

    stringSpeed1(SIZE = 1000000): 1292, 1164, 1170, 1168, 1172
    stringSpeed2(SIZE = 1000000): 1364, 1228, 1230, 1224, 1223
    stringSpeed3(SIZE = 1000000): 1370, 1229, 1227, 1229, 1230

    stringSpeed1(SIZE = 100000): 246, 115, 115, 116, 113
    stringSpeed2(SIZE = 100000): 255, 122, 123, 123, 121
    stringSpeed3(SIZE = 100000): 257, 123, 129, 124, 125

    stringSpeed1(SIZE = 10000): 113, 25, 14, 13, 13
    stringSpeed2(SIZE = 10000): 118, 23, 24, 16, 14
    stringSpeed3(SIZE = 10000): 120, 24, 16, 17, 14

    //This run SIZE is very interesting.
    stringSpeed1(SIZE = 1000): 55, 22, 8, 6, 4
    stringSpeed2(SIZE = 1000): 54, 23, 7, 4, 3
    stringSpeed3(SIZE = 1000): 58, 23, 7, 4, 4

    stringSpeed1(SIZE = 100): 6, 6, 6, 6, 6
    stringSpeed2(SIZE = 100): 6, 6, 5, 6, 6
    stringSpeed3(SIZE = 100): 8, 6, 7, 6, 6

As you can see from my results, for values in the "middle ranges" each consecutive rep gets faster. This, I believe, is explained by the JVM getting up and running and grabbing hold of the memory it needs. As the "size" increases, this effect cannot take over, because there is too much memory for the garbage collector to release and for the process to latch back onto.

In addition, when you run a "repetitive" benchmark like this, where most of your process can live in the lower levels of cache rather than in RAM, your process is even more sensitive to branch predictors. These are very smart and will catch on to what your process is doing, and I imagine the JVM amplifies this. That also helps explain why the values in the initial loops are slower, and why the way you approached this benchmark was a poor choice. This is why I think your results for values that are not "large" are skewed and seem strange. Then, as the memory footprint of your benchmark grew, branch prediction had less effect (percentage-wise) than the large strings you were appending being shifted around in RAM.
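One common way to reduce the "first reps are slower" effect is to run some untimed warm-up reps before measuring. The following is a minimal sketch of that idea, not the test class from the question; the class name, the rep counts, the fixed random seed, and the `sink` field are all assumptions made for illustration.

```java
import java.util.Random;

public class WarmupSketch {
    // Written after each rep so the JIT cannot treat the concatenations as dead code.
    static volatile long sink;

    static String concat(float a, float b, float c, float x, float y, float z) {
        return "[" + a + "," + b + "," + c + "][" + x + "," + y + "," + z + "]";
    }

    // Runs `iterations` concatenations and returns the elapsed time in nanoseconds.
    public static long timeNanos(int iterations) {
        Random r = new Random(42);
        long chars = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            float f = r.nextFloat();
            chars += concat(f, f, f, f, f, f).length();
        }
        long elapsed = System.nanoTime() - start;
        sink = chars;
        return elapsed;
    }

    public static void main(String[] args) {
        // Warm-up reps: results are discarded so class loading and JIT
        // compilation mostly happen before the measured reps.
        for (int i = 0; i < 5; i++) {
            timeNanos(10_000);
        }
        // Measured reps, each timed separately.
        for (int rep = 0; rep < 5; rep++) {
            System.out.println("rep " + rep + ": " + timeNanos(10_000) / 1_000 + " us");
        }
    }
}
```

A hand-rolled loop like this still leaves garbage collection pauses and other noise in the numbers, so a dedicated harness such as JMH is usually a safer choice for serious measurements.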

A simplified conclusion: your results for the "large" runs are reasonably valid and look similar to mine (I still do not completely understand how you got your numbers, but the percentages line up well in comparison). However, your results for the smaller runs are not valid, because of the nature of your test.
