Best way to modify an existing string? StringBuilder, or convert to a char array and back to a String?

I am learning Java and wondering what the best way to modify strings is (both for performance and to learn the preferred approach in Java). Suppose you iterate over a string and check each character, or perform some action at each index in the string.

Should I use the StringBuilder class, or convert the string to a char array, make my modifications, and then convert the char array back to a String?

Example using StringBuilder:

    StringBuilder newString = new StringBuilder(oldString);
    for (int i = 0; i < oldString.length(); i++) {
        newString.setCharAt(i, 'X');
    }

Example converting to a char array:

    char[] newStringArray = oldString.toCharArray();
    for (int i = 0; i < newStringArray.length; i++) {
        newStringArray[i] = 'X';
    }
    String newString = String.valueOf(newStringArray);

What are the pros and cons of each approach?

I believe StringBuilder will be more efficient, since converting to a char array makes copies of the array every time you update an index.

+7
java string
4 answers

I say, do whatever is most readable / maintainable until you find out that String “modification” is slowing you down. For me, this is the most readable:

    String s = "foo";
    s += "bar";
    s += "baz";

If that is too slow, I would use StringBuilder. You can contrast it with StringBuffer: if performance matters and synchronization does not, StringBuilder should be faster. If synchronization is required, you should use StringBuffer.

It is also important to know that these strings are not actually being modified. In Java, String is immutable.
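To make the immutability point concrete, here is a minimal sketch. The class and method names are made up for illustration. It shows that += rebinds a variable to a new String and leaves the original object untouched.

```java
// Illustrative sketch; ImmutabilityDemo and append are hypothetical names.
class ImmutabilityDemo {

    // Returns a new String; the String passed in is never modified.
    static String append(String original, String suffix) {
        String result = original;
        result += suffix; // rebinds the local variable to a newly created String
        return result;
    }

    public static void main(String[] args) {
        String s = "foo";
        String t = append(s, "bar");
        System.out.println(s); // foo
        System.out.println(t); // foobar
    }
}
```

So every "modification" of a String really produces a new object, which is why the mutable StringBuilder exists.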


It all depends on the context. If you optimize this code and it makes no noticeable difference (which is usually the case), then you have just spent longer thinking about it than you needed to, and you have probably made your code harder to understand. Optimize when you need to, not because you can. And before you do, make sure that the code you are optimizing is actually the cause of your performance problem.

+3

“What are the pros and cons of each approach? I believe StringBuilder will be more efficient, since converting to a char array makes copies of the array every time you update an index.”

As written, the code in the second example creates only two arrays: one when you call toCharArray(), and another when you call String.valueOf() (a String keeps its data in its own private array, so it has to copy). The element manipulations you perform should not trigger any object allocations. No copy of the array is made when you read or write an element.
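A small sketch of the two copies described above. The class name CopyDemo and the method maskAll are hypothetical; only toCharArray() and String.valueOf() come from the question.

```java
// Illustrative sketch; CopyDemo and maskAll are hypothetical names.
class CopyDemo {

    // Replaces every character with 'X' using exactly two array copies.
    static String maskAll(String input) {
        char[] chars = input.toCharArray();  // copy 1: String -> fresh char[]
        for (int i = 0; i < chars.length; i++) {
            chars[i] = 'X';                  // plain in-place write, no allocation
        }
        return String.valueOf(chars);        // copy 2: char[] -> new String
    }

    public static void main(String[] args) {
        String original = "hello";
        String masked = maskAll(original);
        System.out.println(original); // hello
        System.out.println(masked);   // XXXXX
    }
}
```

Because toCharArray() hands back a copy, writing into the array cannot affect the original String.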

If you intend to perform any String manipulation, StringBuilder is the recommended tool. If you are writing very performance-sensitive code, and your transformation does not change the length of the string, then it may be worth manipulating the array directly. But since you are learning Java as a new language, I'm going to assume that you are not working in high-frequency trading or in any other environment where latency is critical. Therefore, you are probably better off using StringBuilder.

If you are doing any kind of transformation that can produce a string of a different length than the original, you should almost certainly use StringBuilder; it will resize its internal buffer as needed.

On a related note, if you perform simple string concatenation (for example, s = "a" + someObject + "c"), the compiler actually converts these operations into a chain of StringBuilder.append() calls, so you can use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you build a string across multiple statements, you should create a single StringBuilder.
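As an illustration of a length-changing transformation, here is a sketch that replaces each '&' with the word "and". The class and method names are invented for this example, and the initial capacity passed to the constructor is only a starting hint; StringBuilder grows its buffer by itself when the output gets longer.

```java
// Illustrative sketch; ResizeDemo and expandAmpersands are hypothetical names.
class ResizeDemo {

    // Replaces each '&' with "and"; the result may be longer than the input.
    static String expandAmpersands(String input) {
        // The capacity is just an initial guess; the builder grows when needed.
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '&') {
                sb.append("and");
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(expandAmpersands("salt & pepper")); // salt and pepper
    }
}
```

Doing the same with a raw char array would mean computing the final length up front or reallocating by hand.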

For example:

    public String toString() {
        return "{field1 =" + this.field1
            + ", field2 =" + this.field2
            + ...
            + ", field50 =" + this.field50
            + "}";
    }

Here we have one long expression that includes many concatenations. You don't need to optimize this by hand, because the compiler will use a single StringBuilder and just call append() on it repeatedly.

    String s = ...;
    if (someCondition) {
        s += someValue;
    }
    s += additionalValue;
    return s;

Here you get two StringBuilders created under the covers, but unless this is a very hot code path in a latency-critical application, you really should not worry. Given similar code with many more separate concatenations, it might be worth optimizing. The same applies if you know the strings can be very large. But don't just guess: measure! Demonstrate that there is a performance problem before trying to fix it. (Note: this is just a general rule for “micro-optimization”; there is rarely a downside to using StringBuilder explicitly. But do not assume that it will make a noticeable difference: if you are concerned about it, you should actually measure.)

    String s = "";
    for (final Object item : items) {
        s += item + "\n";
    }

Here we perform a separate concatenation on every iteration of the loop, which means a new StringBuilder is allocated on each pass. In this case it is probably worth using a single StringBuilder, since you may not know how large the collection will be. I would treat this as an exception to the “prove there is a performance problem before optimizing” rule: if the cost of an operation can blow up depending on the input, err on the side of caution.
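A sketch of the loop above rewritten around a single StringBuilder. The class and method names are hypothetical, and items is assumed to be any Iterable.

```java
import java.util.Arrays;

// Illustrative sketch; JoinDemo and joinLines are hypothetical names.
class JoinDemo {

    // Appends every item followed by a newline, reusing one builder throughout.
    static String joinLines(Iterable<?> items) {
        StringBuilder sb = new StringBuilder();
        for (final Object item : items) {
            sb.append(item).append('\n'); // append(Object) uses String.valueOf(item)
        }
        return sb.toString();             // one final copy into a String
    }

    public static void main(String[] args) {
        System.out.print(joinLines(Arrays.asList("a", "b", "c")));
    }
}
```

The builder's buffer is grown occasionally instead of a fresh String being built on every pass, so the work stays roughly proportional to the total output length.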

+1

Which option will perform best is not an easy question.

I ran a benchmark using Caliper:

    Method            Runtime (ns)
    array                       88
    builder                    126
    builderTillEnd              76
    concat                    3435

The methods compared:

    public static String array(String input) {
        char[] result = input.toCharArray(); // COPYING
        for (int i = 0; i < input.length(); i++) {
            result[i] = 'X';
        }
        return String.valueOf(result); // COPYING
    }

    public static String builder(String input) {
        StringBuilder result = new StringBuilder(input); // COPYING
        for (int i = 0; i < input.length(); i++) {
            result.setCharAt(i, 'X');
        }
        return result.toString(); // COPYING
    }

    public static StringBuilder builderTillEnd(String input) {
        StringBuilder result = new StringBuilder(input); // COPYING
        for (int i = 0; i < input.length(); i++) {
            result.setCharAt(i, 'X');
        }
        return result;
    }

    public static String concat(String input) {
        String result = "";
        for (int i = 0; i < input.length(); i++) {
            result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
            // result = new StringBuilder(result).append('X').toString();
        }
        return result;
    }

Notes

  • If we want to change a String, we need to make at least one copy of the input String, because strings in Java are immutable.

  • java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:

        public void setCharAt(int index, char ch) {
            if ((index < 0) || (index >= count))
                throw new StringIndexOutOfBoundsException(index);
            value[index] = ch;
        }

    AbstractStringBuilder internally uses a plain char array: char value[]. Thus result[i] = 'X' is very similar to result.setCharAt(i, 'X'); however, the second one calls a virtual method (which the JVM will probably inline) and performs a bounds check in the if, so it will be a bit slower.
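One detail of that bounds check is easy to miss: it compares against the current length (count), not against the capacity of the internal array. The sketch below, with hypothetical names, probes this by catching IndexOutOfBoundsException, which is the type the setCharAt documentation declares.

```java
// Illustrative sketch; BoundsDemo and setCharAtFails are hypothetical names.
class BoundsDemo {

    // Returns true if setCharAt rejects the index, false if the write succeeds.
    static boolean setCharAtFails(StringBuilder sb, int index) {
        try {
            sb.setCharAt(index, 'X');
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Capacity 16 but length 0: there is no character at index 0 yet.
        System.out.println(setCharAtFails(new StringBuilder(16), 0));    // true
        // Length 3: index 2 is the last valid position.
        System.out.println(setCharAtFails(new StringBuilder("abc"), 2)); // false
    }
}
```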

Conclusions

  • If you can work with a StringBuilder all the way to the end (you do not need a String), do it. This is the preferred approach as well as the fastest. Simply the best.

  • If you need a String at the end and this is the bottleneck of your program, you might consider using a char array. In my benchmark the char array was about 25% faster than StringBuilder. Remember to measure the runtime of your program properly before and after optimizing, because there is no guarantee you will see that 25%.

  • Never concatenate strings in a loop with + or += unless you really know what you are doing. It is usually best to use an explicit StringBuilder and append().
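To illustrate the first conclusion, here is a sketch of several edits applied to the same StringBuilder with a single conversion at the end. The class, the two helper steps, and process are all made up for this example.

```java
// Illustrative sketch; PipelineDemo and its methods are hypothetical names.
class PipelineDemo {

    // Overwrites every digit with '#', in place.
    static void maskDigits(StringBuilder sb) {
        for (int i = 0; i < sb.length(); i++) {
            if (Character.isDigit(sb.charAt(i))) {
                sb.setCharAt(i, '#');
            }
        }
    }

    // Upper-cases the first character, in place.
    static void upperCaseFirst(StringBuilder sb) {
        if (sb.length() > 0) {
            sb.setCharAt(0, Character.toUpperCase(sb.charAt(0)));
        }
    }

    static String process(String input) {
        StringBuilder sb = new StringBuilder(input); // the only copy in
        maskDigits(sb);
        upperCaseFirst(sb);
        return sb.toString();                        // the only copy out
    }

    public static void main(String[] args) {
        System.out.println(process("room 42")); // Room ##
    }
}
```

If each step took and returned a String instead, every step would pay for its own pair of copies.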

+1

I would prefer to use the StringBuilder class when the original string needs to be modified.

For string manipulation, I like the StringUtils class. You will need to add the Apache Commons dependency.

0