Efficient way to create a String from char[], start, length in Java

We use Java SAX to parse large XML files. Our implementation of characters is as follows:

    @Override
    public void characters(char ch[], int start, int length) throws SAXException {
        String value = String.copyValueOf(ch, start, length);
        ...
    }

(The ch[] arrays passed by SAX tend to be quite long.)

We recently ran into performance issues, and the profiler shows that more than 20% of our CPU time is spent in String.copyValueOf (which calls new String(ch, start, length) under the hood).

Is there a more efficient way to get a String from a char array, a start index, and a length than String.copyValueOf(ch, start, length) or new String(ch, start, length)?

+7
3 answers

Good question, but I'm fairly sure the answer is no.

Every way of constructing a String object copies the array. A String cannot be built directly on top of an existing array, because String objects must be immutable, and the internal character array is encapsulated so that it cannot be changed from outside.

In addition, in your case you are dealing with a fragment of a larger array. There is no way to construct a String object over a fragment of another array.
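The copy can be observed directly. Below is a minimal sketch, not taken from the question: the class name StringCopyDemo and the helper copyThenMutate are made up for illustration. It builds a String from a slice of a buffer, then overwrites the buffer the way a parser might when it reuses it.

```java
import java.util.Arrays;

public class StringCopyDemo {

    /** Builds a String from a slice of ch, then overwrites the whole source array. */
    static String copyThenMutate(char[] ch, int start, int length) {
        String value = new String(ch, start, length); // copies the slice into the String
        Arrays.fill(ch, 'x');                         // simulate the buffer being reused
        return value;                                 // still holds the original characters
    }

    public static void main(String[] args) {
        char[] buffer = "<a>hello</a>".toCharArray();
        String value = copyThenMutate(buffer, 3, 5);
        System.out.println(value);              // the slice as it was before the overwrite
        System.out.println(new String(buffer)); // the buffer after the overwrite
    }
}
```

If the String shared storage with the buffer, the first line printed would change along with the buffer. It does not, which is exactly why the copy cannot be avoided.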

+4

As @Andremoniy pointed out, if you want a String object, it always has to be created and the contents copied into it.

The only way to speed up your parser is to keep the number of new String objects to a minimum.

I assume that each element of your XML structure contains character data between its start and end tags.

Therefore, I suggest creating strings only when you are inside an element whose data is actually of interest. I would also suggest narrowing down the candidate elements further, for example by hierarchy level or by parent, to reduce the number of strings created. How far you can go depends on the XML structure.

    protected boolean readChars = false;
    protected int level = -1;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        ++level;
        if (level == 4) {
            if (qName.equalsIgnoreCase("TextElement")) {
                readChars = true;
            }
        }
    }

    @Override
    public void characters(char ch[], int start, int length) throws SAXException {
        if (readChars) {
            String value = String.copyValueOf(ch, start, length);
            ...
            readChars = false;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        --level;
    }

+1

Perhaps, given that characters can be called more than once within the same element, it would be acceptable to keep a StringBuilder at the element level and append each chunk to it. The append still copies the characters, using System.arraycopy.
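A rough sketch of that idea follows. The class name TextCollector, the parse helper, and the sample XML are assumptions made for illustration; the element name TextElement is borrowed from the previous answer, and the sketch assumes such elements are not nested. It creates one String per element of interest, in endElement, rather than one per characters call. The characters are still copied into the builder and again by toString, so this reduces the number of String objects but does not eliminate copying.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder(); // reused for every element
    private final List<String> values = new ArrayList<>();
    private boolean readChars = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equalsIgnoreCase("TextElement")) {
            text.setLength(0); // reset the builder without reallocating its storage
            readChars = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (readChars) {
            text.append(ch, start, length); // may run several times per element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (readChars && qName.equalsIgnoreCase("TextElement")) {
            values.add(text.toString()); // the only String created for this element
            readChars = false;
        }
    }

    public List<String> getValues() {
        return values;
    }

    /** Hypothetical convenience helper: parses an XML string and returns the collected text. */
    public static List<String> parse(String xml) {
        TextCollector handler = new TextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getValues();
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "<root><TextElement>a &amp; b</TextElement><Other>skip</Other></root>"));
    }
}
```

Resetting the builder with setLength(0) keeps its internal array, so after the first few elements the appends usually do not allocate at all.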

+1
