In Java, what is the fastest way to “build” and use a string, character by character?

I have a Java socket connection that periodically receives data. The number of bytes received with each packet varies, and the data may or may not be delimited by a known character (such as CR or LF).

I am trying to build a string from each data packet. What is the fastest way (in terms of speed, not memory) to build a string that will later be parsed?

I started by using a byte array to store incoming bytes, and then converted them to String with each packet, for example:

    byte[] message = new byte[1024];
    ...
    message[i] = //byte from socket
    i++;
    ...
    String messageStr = new String(message);
    ...
    //parse the string here

The obvious disadvantage of this is that some packets may be longer than 1024 bytes. I do not want to arbitrarily create a larger byte array (what if my packet is larger still?).

What is the best way to do this? Should I create a StringBuilder object and call append()? That way I would not need to convert from StringBuilder to String (since the former has most of the methods I need).

Again, execution speed is my biggest concern.

TIA.

+4
8 answers

I would probably use an InputStreamReader wrapped around a BufferedInputStream, which in turn wraps the socket's input stream, and write code that processes one message at a time, potentially blocking on input. If the input is bursty, I might start a background thread and use a concurrent queue to hold the messages.

Reading a buffer at a time and converting it to characters is exactly what BufferedInputStream/InputStreamReader does. And it does so while paying attention to the encoding, which (as other people have noted) your solution does not.

I don't know why you are focused on speed, but you will find that the time spent processing data coming out of the socket is much shorter than the time it takes to transfer the data through that socket.
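As a minimal sketch of that stacking, assuming messages are LF-terminated and UTF-8 encoded (the question fixes neither, so both are assumptions), something like the following could work. The class and method names are hypothetical; with a real connection the argument would be the stream returned by socket.getInputStream().

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class MessageReader {
    // Reads the stream to its end and splits it into LF-terminated messages.
    // Assumptions: UTF-8 encoding and '\n' as the delimiter.
    static List<String> readMessages(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(
                new BufferedInputStream(in), StandardCharsets.UTF_8);
        List<String> messages = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            if (c == '\n') {
                messages.add(current.toString());
                current.setLength(0); // reuse the same builder for the next message
            } else {
                current.append((char) c);
            }
        }
        if (current.length() > 0) {
            messages.add(current.toString()); // trailing message without a delimiter
        }
        return messages;
    }
}
```

The reader decodes bytes to characters as they arrive, so a multi-byte character split across two packets is still decoded correctly.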

+12

Please note that when data is being transferred through the network layers, the conversion speed may not be the bottleneck. It would be wise to measure it if you think it matters.

Note (also) that you do not specify a character encoding when converting from bytes to a String (via characters). I would specify one explicitly; otherwise your client/server communication may become corrupted if/when your client and server run in different environments. You can enforce the encoding with JVM runtime arguments, but that is not a particularly safe option.

Given the above, you can consider StringBuilder(int capacity) to pre-size the builder appropriately so that it does not need to reallocate on the fly.
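A rough sketch of both points together, an explicit charset and a pre-sized builder, might look like this. The class name, the method name, and the expectedChars parameter are hypothetical. It assumes a single-byte encoding (US-ASCII here), because decoding packet by packet would break a multi-byte character that straddles two packets.

```java
import java.nio.charset.StandardCharsets;

class PresizedBuilder {
    // Appends each packet to one builder whose capacity is set up front.
    // Assumption: the data is US-ASCII, so each packet can be decoded on its own.
    static String join(byte[][] packets, int expectedChars) {
        StringBuilder sb = new StringBuilder(expectedChars); // capacity hint, not a limit
        for (byte[] packet : packets) {
            sb.append(new String(packet, 0, packet.length, StandardCharsets.US_ASCII));
        }
        return sb.toString();
    }
}
```

If the estimate is too low the builder still grows as needed; the hint only avoids reallocations when it is close to the real size.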

+8

First of all, you are making a lot of assumptions about the character encoding you receive from your client. Is it US-ASCII, ISO-8859-1, UTF-8?

Since a Java String is not a sequence of bytes, you have to make explicit decisions about character encoding when writing portable String serialization code. For this reason, you should NEVER use StringBuilder to convert bytes to a String. If you look at the StringBuilder interface, you will notice that it does not even have an append(byte) method, and that is not because the designers simply overlooked it.

In your case, you should definitely use ByteArrayOutputStream. The only drawback of using ByteArrayOutputStream directly is that its toByteArray() method returns a copy of the array held internally by the object. For this reason, you can create your own subclass of ByteArrayOutputStream and provide direct access to the protected buf member.

Note that if you are not using the default implementation, be sure to specify the boundaries of the byte array in your String constructor. Your code should look something like this:

    MyByteArrayOutputStream message = new MyByteArrayOutputStream( 1024 );
    ...
    message.write( /* byte from socket */ );
    ...
    String messageStr = new String(message.buf, 0, message.size(), "ISO-8859-1");

Replace ISO-8859-1 with a character set suitable for your needs.
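The answer does not show the subclass itself, so here is one possible sketch of it. Because buf is a protected field, code that is neither in java.io nor inside the subclass cannot read it directly; this version therefore exposes it through a hypothetical getBuffer() accessor instead of a public field.

```java
import java.io.ByteArrayOutputStream;

class MyByteArrayOutputStream extends ByteArrayOutputStream {
    MyByteArrayOutputStream(int initialSize) {
        super(initialSize);
    }

    // Returns the internal array itself, not a copy. Only the first size()
    // bytes are valid, and the array may be replaced when the stream grows.
    byte[] getBuffer() {
        return buf;
    }
}
```

A caller would then decode with explicit bounds, for example new String(message.getBuffer(), 0, message.size(), StandardCharsets.ISO_8859_1), so that unused trailing bytes of the buffer are never included.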

+4

StringBuilder is your friend. Append as many characters as needed, then call toString() to get the string.

+2

I would create a "small" array of characters and add characters to it. When the array is full (or the transfer completes), use the StringBuilder.append(char[]) method to add the contents of the array to your string.

As for the "small" size of the array, you will need to try different sizes and see which one is fastest in your production environment (performance may depend on the JVM, OS, processor type and speed, etc.).
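A small sketch of this chunked approach, assuming the characters come from a Reader (the class name, method name, and chunkSize parameter are hypothetical). It uses the three-argument append(char[], int, int) overload rather than append(char[]), since the last chunk is usually only partly filled.

```java
import java.io.IOException;
import java.io.Reader;

class ChunkedAppend {
    // Reads characters in chunks of chunkSize and appends each chunk to a builder.
    // Assumption: chunkSize is greater than zero.
    static String readAll(Reader in, int chunkSize) throws IOException {
        char[] chunk = new char[chunkSize];
        StringBuilder sb = new StringBuilder();
        int n;
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            sb.append(chunk, 0, n); // append only the characters actually read
        }
        return sb.toString();
    }
}
```

Trying a few values of chunkSize under realistic load is the simplest way to find the one that suits a given environment.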

EDIT: Other people mentioned ByteArrayOutputStream , I agree that this is another option.

+2

You might want to look at ByteArrayOutputStream depending on whether you work with bytes instead of characters.

I usually use ByteArrayOutputStream to assemble the message, and then use toString / toByteArray to retrieve it once the message is complete.

Edit: ByteArrayOutputStream can handle different character encodings through its toString call, which accepts a charset name.
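For illustration, a minimal sketch of assembling packets and decoding once at the end, using the toString(String charsetName) overload. The class and method names are hypothetical, and the packets are assumed to arrive as separate byte arrays.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;

class AssembleWithBaos {
    // Collects all packets as raw bytes and decodes them in a single step.
    static String assemble(byte[][] packets, String charsetName)
            throws UnsupportedEncodingException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] packet : packets) {
            out.write(packet, 0, packet.length);
        }
        return out.toString(charsetName); // e.g. "UTF-8" or "ISO-8859-1"
    }
}
```

Decoding only after the whole message is assembled means a multi-byte character that is split across two packets is still decoded correctly.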

+2

Personally, regardless of the language, I would send all the characters to an in-memory stream, and as soon as I need a string, I would read all the characters from that stream into a string. Alternatively, you can use a dynamic array, making it larger when you need to add more characters. Even better, keep track of the actual length and grow the array in blocks rather than one character at a time. So you start with 1 character in an array of 1000 characters. Once you receive the 1001st character, the array should be resized to 2000, then 3000, 4000, and so on.

Fortunately, several languages, including Java, have a class built specifically for this: the string builder classes. Which method they use internally is not that important; they are designed for performance, so they should be your fastest option.

0

Take a look at the Text class. It is faster (for operations) and more deterministic than StringBuilder.

Note: The project containing the class targets the RTSJ VM, but it also works well in standard SE / EE environments.

0
