Problems reading a string from a TCP socket in Node.js

I have implemented a client/server pair that communicates over a TCP socket. The data I write to the socket is stringified JSON. Initially everything works as expected; however, as I increase the write speed, I eventually run into JSON parsing errors where the client receives the beginning of a new record appended to the end of the old one.

Here is the server code:

    var data = {};
    data.type = 'req';
    data.id = 1;
    data.size = 2;

    var string = JSON.stringify(data);
    client.write(string, callback); // Pass the callback itself; callback() would invoke it immediately

This is how I receive the data on the client side:

    client.on('data', function (req) {
        var data = req.toString();
        try {
            json = JSON.parse(data);
        } catch (err) {
            console.log("JSON parse error: " + err);
        }
    });

The error I start getting as the write speed increases is:

 SyntaxError: Unexpected token { 

It looks like the beginning of the next request is being appended to the end of the current one.

I tried using a ';' as a delimiter at the end of each JSON request, and then reading it with:

  var data = req.toString().substring(0,req.toString().indexOf(';')); 

However, instead of JSON parsing errors, this approach seems to cause some requests to go missing entirely on the client side once I increase the write speed past roughly 300 writes per second.

Are there any recommendations or more effective ways to delimit incoming requests over a TCP socket?

Thanks!

+6
4 answers

Thanks everyone for the explanations; they helped me better understand how data is sent and received over TCP sockets. Below is a brief overview of the code I ended up using:

 var chunk = ""; client.on('data', function(data) { chunk += data.toString(); // Add string on the end of the variable 'chunk' d_index = chunk.indexOf(';'); // Find the delimiter // While loop to keep going until no delimiter can be found while (d_index > -1) { try { string = chunk.substring(0,d_index); // Create string up until the delimiter json = JSON.parse(string); // Parse the current string process(json); // Function that does something with the current chunk of valid json. } chunk = chunk.substring(d_index+1); // Cuts off the processed chunk d_index = chunk.indexOf(';'); // Find the new delimiter } }); 

Comments are welcome ...

+23

You are on the right track by using a delimiter. However, you can't just extract the material before the delimiter, process it, and then discard whatever came after it. You have to buffer whatever you received after the delimiter and prepend it to whatever arrives next. This means that from any given data event you may end up with any number (including 0) of complete JSON "pieces".

Basically, you keep a buffer, which you initialize to "". On each data event, you concatenate everything you receive onto the end of the buffer, and then split the buffer on the delimiter. The result will be one or more entries, but the last one may be incomplete, so you need to check that the buffer ends with the delimiter. If it doesn't, you pop the last result off and set your buffer to it. Then you process all the remaining results (of which there may be none).
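A minimal sketch of that approach (handleMessage is a placeholder for whatever you do with each parsed object, not something from the original code):

    var buffer = '';

    client.on('data', function (data) {
        buffer += data.toString();     // Append the new data to the buffer

        var parts = buffer.split(';'); // Split on the delimiter
        buffer = parts.pop();          // The last part is incomplete (or ''), keep it as the new buffer

        // Everything left in 'parts' is a complete record (there may be none)
        parts.forEach(function (part) {
            try {
                handleMessage(JSON.parse(part)); // Placeholder for your own processing
            } catch (err) {
                console.log('JSON parse error: ' + err);
            }
        });
    });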

+5

Remember that TCP makes no guarantees about where it splits the chunks of data you receive. All it guarantees is that all the bytes you send will be received, in order, unless the connection fails entirely.

I believe Node data events fire whenever the socket reports that it has data for you. Technically you could get a separate data event for each byte of the JSON data, and that would still be within what the OS is allowed to do. Nobody does that, but to be reliable your code must be written as if it could suddenly start happening at any time. It is up to you to combine data events and then re-split the data stream along boundaries that make sense to you.

To do this, you need to buffer any data that is not "complete", including data appended to the end of a "complete" chunk. If you are using a delimiter, never discard any data after the delimiter - always keep it around as a prefix until you see more data and, eventually, either another delimiter or the end event.

Another common option is to prefix all data with a length field. Say you use a fixed 64-bit binary value. Then you always wait for 8 bytes, plus however many more the value in those bytes indicates, to arrive. Say you had a ten-byte chunk of data. You might get 2 bytes in one event, then 5, then 4 - at this point you can parse the length and know you need 7 more, since the last 3 bytes of the third chunk were payload. If the next event actually contains 25 bytes, you'd take the first 7 along with the 3 from before and parse that, then look for another length field in bytes 8-16.
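As an illustration, a length-prefixed reader in Node.js might look like the sketch below, assuming the sender writes an 8-byte big-endian length followed by the payload (readBigUInt64BE needs a reasonably recent Node version, and handleMessage is again a placeholder):

    var pending = Buffer.alloc(0);

    client.on('data', function (data) {
        pending = Buffer.concat([pending, data]);

        // Keep extracting messages while we have a full header and a full payload
        while (pending.length >= 8) {
            var length = Number(pending.readBigUInt64BE(0)); // 64-bit big-endian length field
            if (pending.length < 8 + length) break;          // Payload has not fully arrived yet

            var payload = pending.slice(8, 8 + length);
            pending = pending.slice(8 + length);             // Drop the consumed bytes

            handleMessage(JSON.parse(payload.toString()));   // Placeholder for your own processing
        }
    });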

This is a contrived example, but keep in mind that at low traffic rates the network layer will usually send your data in whatever chunks you give it, so this kind of thing only really starts showing up as the load increases. Once the OS starts building packets out of several writes at once, it will start splitting at whatever granularity is convenient for the network rather than for you, and you will have to deal with that.

+2

Try using the end event instead of data:

    var data = '';

    client.on('data', function (chunk) {
        data += chunk.toString();
    });

    client.on('end', function () {
        // Use try/catch here: if someone sends you something else for fun, your server can crash
        data = JSON.parse(data);
        // Note: 'end' fires only once, when the peer closes the connection,
        // so this handles one message per connection
    });

Hope this helps.

-3

Source: https://habr.com/ru/post/927635/

