Serial byte alignment

I'm trying to define a protocol for serial communication. I want to be able to send 4-byte words to the device, but I'm not sure how to make sure that the device starts picking them up at the right byte.

For example, if I want to send

0x1234abcd 0xabcd3f56 ... 

How can I make sure that the device does not start reading in the wrong place and receive this as the first word:

 0xabcdabcd 

Is there a smart way to do this? I thought of using a marker to start the message, but what if I want to send the number I chose as the marker as data?

2 answers

Why not send a start-of-message byte followed by a length-of-data byte if you know how big the data will be?

Alternatively, follow other binary protocols and send only fixed packet sizes with a fixed header. Let's say that you send only 4 bytes, then you know that before the actual content of the data you will have one or more bytes of the header.

Edit: I think you misunderstood me. I mean that the client should always treat a byte as header or data based not on its value, but on its position in the stream. Say you send four bytes of data; then one byte in front of them will be the header byte.

 +-+-+-+-+-+
 |H|D|D|D|D|
 +-+-+-+-+-+

Then the client will be a fairly simple state machine, which will look like this:

 int state = READ_HEADER;
 int nDataBytesRead = 0;
 while (true) {
     byte read = readInput();
     if (state == READ_HEADER) {
         // process the byte as a header byte
         state = READ_DATA;
         nDataBytesRead = 0;
     } else {
         // Process the byte as incoming data
         ++nDataBytesRead;
         if (nDataBytesRead == 4) {
             state = READ_HEADER;
         }
     }
 }

The point of this setup is that what determines whether a byte is a header byte is not its value but its position in the stream. If you want a variable number of data bytes, add another byte to the header that gives the number of data bytes following it. That way it doesn't matter if the data contains the same value as the header, because your client will never interpret it as anything other than data.
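Here is a minimal sketch of the variable-length variant described above: one header byte, one length byte, then that many data bytes. The names `receiver_t`, `receiver_init`, `receiver_feed` and the limit `MAX_PAYLOAD` are made up for illustration and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 16 /* hypothetical upper bound on data bytes per frame */

typedef enum { READ_HEADER, READ_LENGTH, READ_DATA } rx_state_t;

typedef struct {
    rx_state_t state;
    uint8_t header;               /* last header byte seen */
    uint8_t expected;             /* data bytes announced by the length byte */
    uint8_t received;             /* data bytes stored so far */
    uint8_t payload[MAX_PAYLOAD];
} receiver_t;

void receiver_init(receiver_t *rx)
{
    rx->state = READ_HEADER;
    rx->header = 0;
    rx->expected = 0;
    rx->received = 0;
}

/* Feed one received byte. Returns 1 when a complete frame is available
 * in rx->payload (rx->expected bytes long), otherwise 0. */
int receiver_feed(receiver_t *rx, uint8_t byte)
{
    switch (rx->state) {
    case READ_HEADER:
        rx->header = byte;        /* a header because of its position, not its value */
        rx->state = READ_LENGTH;
        return 0;
    case READ_LENGTH:
        if (byte > MAX_PAYLOAD) { /* implausible length: start over */
            rx->state = READ_HEADER;
            return 0;
        }
        rx->expected = byte;
        rx->received = 0;
        if (byte == 0) {          /* empty frame is complete right away */
            rx->state = READ_HEADER;
            return 1;
        }
        rx->state = READ_DATA;
        return 0;
    case READ_DATA:
        rx->payload[rx->received++] = byte; /* data, whatever its value */
        if (rx->received == rx->expected) {
            rx->state = READ_HEADER;
            return 1;
        }
        return 0;
    }
    return 0;
}
```

Feeding the bytes 0xAA, 4, 0xAA, 0xAA, 0x12, 0x34 one at a time yields a complete frame on the last byte, even though the data repeats the header value 0xAA. Note that this sketch assumes the receiver starts at a frame boundary; on its own it does not recover from starting mid-frame.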


netstring

The relatively simple netstring format is one possibility for this application.

For example, the text "hello world!" is encoded as:

 12:hello world!, 

An empty string is encoded as three characters:

 0:, 

which can be represented as a series of bytes

 '0' ':' ',' 

The word 0x1234abcd in one netstring (using network byte order), followed by the word 0xabcd3f56 in another netstring, is encoded as the series of bytes

 '\n' '4' ':' 0x12 0x34 0xab 0xcd ',' '\n' '\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n' 

(the newline character '\n' before and after each netstring is optional, but makes testing and debugging easier).
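The encoding above can be sketched in a few lines of C. The function name `netstring_encode` and its signature are assumptions made for this example; the optional newlines are left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write data[0..len-1] as a netstring ("<len>:<data>,") into out.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t netstring_encode(const unsigned char *data, size_t len,
                        unsigned char *out, size_t out_size)
{
    char header[32];
    /* The length is written as ASCII decimal digits followed by ':'. */
    int header_len = snprintf(header, sizeof header, "%zu:", len);
    if (header_len < 0 || (size_t)header_len >= sizeof header)
        return 0;

    size_t total = (size_t)header_len + len + 1; /* +1 for the trailing ',' */
    if (total > out_size)
        return 0;

    memcpy(out, header, (size_t)header_len);
    if (len > 0)
        memcpy(out + header_len, data, len); /* raw bytes, copied as is */
    out[total - 1] = ',';
    return total;
}
```

Encoding the four bytes 0x12 0x34 0xab 0xcd this way produces the seven bytes '4' ':' 0x12 0x34 0xab 0xcd ','.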

frame synchronization

how can I make sure that the device does not start reading in the wrong place.

A common solution to the frame synchronization problem is to read into a temporary buffer, hoping we started reading in the right place. Later we run some consistency checks on the message in the buffer. If the message fails a check, something went wrong, so we throw away the data in the buffer and start all over again. (If it was an important message, we hope the transmitter will re-send it.)

For example, if the serial cable is plugged in in the middle of the first netstring, the receiver sees the byte string:

 0xab 0xcd ',' '\n' '\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n' 

Since the receiver is smart enough to wait for the ":" before expecting the following bytes to be valid data, it will ignore the first partial message and then correctly receive the second message.

In some cases you know in advance what the valid message length(s) will be, which makes it even easier for the receiver to detect that it began reading in the wrong place.

sending the start-of-message marker as data

I thought of using a marker to start the message, but what if I want to send the number I chose as the marker as data?

After sending the netstring header, the transmitter sends the raw data as is - even if it looks like a message start marker.

In the normal case, the receiver already has frame synchronization. The netstring parser has already read the "length" and ":" of the header, so it puts the raw data bytes in the right place in the buffer, even if those data bytes happen to look like the header byte ":" or the footer byte ",".

pseudo code

 // netstring parser for receiver
 // WARNING: untested pseudocode
 // 2012-06-23: David Cary releases this pseudocode as public domain.
 // inWaiting() and read_serial() stand for whatever the platform's serial library provides.

 #include <stdio.h>

 enum { max_message_length = 9 };
 enum { WAITING_FOR_LENGTH, WAITING_FOR_COLON, WAITING_FOR_DATA,
        WAITING_FOR_COMMA, NEW_VALID_MESSAGE };

 unsigned char buffer[1 + max_message_length]; // do we need room for a trailing NULL ?
 long int latest_commanded_speed = 0;
 int data_bytes_read = 0;
 int bytes_read = 0;
 int state = WAITING_FOR_LENGTH;

 void reset_buffer(void) {
     bytes_read = 0; // reset buffer index to start-of-buffer
     state = WAITING_FOR_LENGTH;
 }

 void check_for_incoming_byte(void) {
     if( !inWaiting() ) return; // Has a new byte come into the UART? If not, nothing to do.
     // If so, then deal with this new byte.
     if( NEW_VALID_MESSAGE == state ) {
         // oh dear. We had an unhandled valid message,
         // and now another byte has come in.
         reset_buffer();
     }
     unsigned char newbyte = read_serial(1); // pull out 1 new byte.
     buffer[ bytes_read++ ] = newbyte; // and store it in the buffer.
     if( max_message_length < bytes_read ) {
         reset_buffer(); // reset: avoid buffer overflow
         return;
     }
     switch( state ) {
     case WAITING_FOR_LENGTH:
         // FIXME: currently only handles messages of 4 data bytes
         if( '4' != newbyte )
             reset_buffer(); // doesn't look like a valid header.
         else // otherwise, it looks good -- move to next state
             state = WAITING_FOR_COLON;
         break;
     case WAITING_FOR_COLON:
         if( ':' != newbyte ) {
             reset_buffer(); // doesn't look like a valid header.
         } else { // otherwise, it looks good -- move to next state
             state = WAITING_FOR_DATA;
             data_bytes_read = 0;
         }
         break;
     case WAITING_FOR_DATA:
         // FIXME: currently only handles messages of 4 data bytes
         data_bytes_read++;
         if( 4 <= data_bytes_read ) // all 4 data bytes are in
             state = WAITING_FOR_COMMA;
         break;
     case WAITING_FOR_COMMA:
         if( ',' != newbyte )
             reset_buffer(); // doesn't look like a valid message.
         else // otherwise, it looks good -- move to next state
             state = NEW_VALID_MESSAGE;
         break;
     }
 }

 void handle_message(void) {
     // FIXME: currently only handles messages of 4 data bytes
     // buffer[0] is '4', buffer[1] is ':', buffer[2..5] are the data bytes
     long int temp = 0;
     temp = (temp << 8) | buffer[2];
     temp = (temp << 8) | buffer[3];
     temp = (temp << 8) | buffer[4];
     temp = (temp << 8) | buffer[5];
     reset_buffer();
     latest_commanded_speed = temp;
     printf( "commanded speed has been set to: %ld\n", latest_commanded_speed );
 }

 void loop(void) {
     // main loop, repeated forever
     // check to see if a byte has arrived yet
     check_for_incoming_byte();
     if( NEW_VALID_MESSAGE == state )
         handle_message();
     // While we're waiting for bytes to come in, do other main loop stuff.
     do_other_main_loop_stuff();
 }
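The FIXME notes in the pseudocode above say it only handles messages of 4 data bytes. One possible way to lift that restriction is to accumulate the decimal length digit by digit. The following is only a sketch: `ns_parser_t`, `ns_reset`, `ns_feed` and the limit `NS_MAX_DATA` are names invented for this example, and bytes are fed in by the caller rather than read from a UART.

```c
#include <stddef.h>
#include <stdint.h>

#define NS_MAX_DATA 64 /* hypothetical largest payload we accept */

typedef enum { NS_LENGTH, NS_DATA, NS_COMMA } ns_state_t;

typedef struct {
    ns_state_t state;
    size_t expected;   /* payload length announced in the header */
    size_t received;   /* payload bytes stored so far */
    size_t length;     /* length of the last complete message */
    int have_digit;    /* at least one length digit has been seen */
    uint8_t data[NS_MAX_DATA];
} ns_parser_t;

/* Go back to waiting for a length; the last complete message is kept. */
void ns_reset(ns_parser_t *p)
{
    p->state = NS_LENGTH;
    p->expected = 0;
    p->received = 0;
    p->have_digit = 0;
}

/* Feed one byte. Returns 1 when a complete, well-formed netstring has
 * just ended; its payload is then in p->data[0 .. p->length-1]. */
int ns_feed(ns_parser_t *p, uint8_t byte)
{
    switch (p->state) {
    case NS_LENGTH:
        if (byte >= '0' && byte <= '9') {
            p->expected = p->expected * 10 + (size_t)(byte - '0');
            p->have_digit = 1;
            if (p->expected > NS_MAX_DATA)
                ns_reset(p);            /* too long to be a valid message */
        } else if (byte == ':' && p->have_digit) {
            p->state = (p->expected == 0) ? NS_COMMA : NS_DATA;
        } else {
            ns_reset(p);                /* not a header: discard and retry */
        }
        return 0;
    case NS_DATA:
        p->data[p->received++] = byte;  /* raw data, any value allowed */
        if (p->received == p->expected)
            p->state = NS_COMMA;
        return 0;
    case NS_COMMA:
        if (byte == ',') {
            p->length = p->received;
            ns_reset(p);
            return 1;
        }
        ns_reset(p);                    /* consistency check failed */
        return 0;
    }
    return 0;
}
```

Fed the partial stream from the frame synchronization example (0xab 0xcd ',' '\n' '\n' '4' ':' 0xab 0xcd 0x3f 0x56 ',' '\n'), this parser discards the leading fragment and reports exactly one message containing 0xab 0xcd 0x3f 0x56. It can still be fooled if the fragment it starts in happens to contain digits followed by ':'; in that case the missing ',' usually exposes the error.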

more tips

When defining a serial communication protocol, I find that testing and debugging are much easier if the protocol always uses human-readable ASCII text characters rather than arbitrary binary values.

frame synchronization (again)

I thought of using a marker to start the message, but what if I want to send the number I chose as the marker as data?

We have already covered the case where the receiver already has frame synchronization. The case where the receiver does not yet have frame synchronization is messier.

The simplest solution is to have the transmitter send a series of harmless bytes (perhaps newline or space characters), as long as the longest possible valid message, as a preamble immediately before each netstring. No matter what state the receiver is in when the serial cable is plugged in, those harmless bytes eventually put the receiver in the WAITING_FOR_LENGTH state. Then, when the transmitter sends the packet header (the length followed by ":"), the receiver correctly recognizes it as a packet header and has recovered frame synchronization.

(The transmitter does not need to send this preamble before every packet. Perhaps it could send it before 1 out of every 20 packets; then the receiver is guaranteed to recover frame synchronization within 20 packets (usually fewer) after the serial cable is plugged in.)
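A transmitter following that idea might look roughly like this. The helper `build_packet`, the interval `PREAMBLE_INTERVAL` and the constant `MAX_MESSAGE_LENGTH` are assumptions chosen to match the numbers used in this answer, not part of any existing API.

```c
#include <stddef.h>
#include <string.h>

#define PREAMBLE_INTERVAL  20 /* send a preamble before 1 packet in 20 */
#define MAX_MESSAGE_LENGTH 9  /* longest valid message, as in the receiver */

/* Copy an already-encoded netstring into out, preceded by a preamble of
 * newline characters when packet_number is a multiple of the interval.
 * Returns the number of bytes to transmit, or 0 if out is too small. */
size_t build_packet(unsigned packet_number,
                    const unsigned char *netstring, size_t netstring_len,
                    unsigned char *out, size_t out_size)
{
    size_t preamble =
        (packet_number % PREAMBLE_INTERVAL == 0) ? MAX_MESSAGE_LENGTH : 0;

    if (preamble + netstring_len > out_size)
        return 0;

    memset(out, '\n', preamble);  /* harmless bytes that flush the receiver */
    memcpy(out + preamble, netstring, netstring_len);
    return preamble + netstring_len;
}
```

With a 7-byte netstring, packet number 0 comes out as 16 bytes (9 newlines plus the netstring), while packet numbers 1 through 19 are sent as the bare 7 bytes.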

other protocols

Other systems use a simple Fletcher-32 checksum or something more complicated to detect many kinds of errors that the netstring format cannot detect, and can even synchronize without a preamble.
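As a rough illustration of the checksum idea, here is a straightforward (unoptimized) Fletcher-32 over 16-bit words. The function name `fletcher32` and the choice to take the data as an array of `uint16_t` are assumptions for this sketch; how bytes are packed into words and where the checksum travels in the packet are left to the protocol designer.

```c
#include <stddef.h>
#include <stdint.h>

/* Fletcher-32: two running sums modulo 65535 over 16-bit words.
 * sum1 is the sum of the words, sum2 is the sum of the successive sum1 values. */
uint32_t fletcher32(const uint16_t *words, size_t count)
{
    uint32_t sum1 = 0;
    uint32_t sum2 = 0;
    for (size_t i = 0; i < count; ++i) {
        sum1 = (sum1 + words[i]) % 65535u;
        sum2 = (sum2 + sum1) % 65535u;
    }
    return (sum2 << 16) | sum1;
}
```

For the three words 0x6261, 0x6463, 0x0065 (the ASCII text "abcde" packed low byte first and zero-padded) this gives 0xF04FC729. A receiver that recomputes the checksum over a candidate message and finds a mismatch knows that it either started in the wrong place or received corrupted bytes.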

Many protocols use a special "start of packet" marker and use various "escaping" techniques to avoid ever sending a literal "start of packet" byte in the transmitted data, even if the real data we want to send happens to contain that value. (Consistent Overhead Byte Stuffing, bit stuffing, quoted-printable and other kinds of binary-to-text encoding, etc.)

These protocols have the advantage that the receiver can be sure that when it sees the "start of packet" marker, it is the actual start of a packet (and not some data byte that happens to have the same value). This makes handling loss of synchronization much easier: just discard bytes until the next "start of packet" marker.

Many other formats, including the netstring format, allow any possible byte value to be transmitted as data. So receivers have to be smarter about handling a start-of-header byte value that might be an actual start of header or might be a data byte. But at least they don't have to deal with "escaping", or with the surprisingly large buffer needed, in the worst case, to hold a "fixed 64 byte data message" after escaping.

Neither approach is really simpler than the other; each just pushes the complexity to a different place, as waterbed theory predicts.

Would you mind reviewing the discussion of various ways of handling the start-of-header byte, including these two, in the Serial Programming Wikibook, and editing that book to make it better?
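To make the escaping idea concrete, here is a small sketch of Consistent Overhead Byte Stuffing (COBS) encoding, in which the byte 0x00 is reserved as the packet delimiter and never appears inside an encoded packet. The function name `cobs_encode` is an assumption for this example, and the caller is assumed to supply an output buffer of at least len + len/254 + 1 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* COBS-encode in[0..len-1] into out and return the encoded length.
 * The result contains no 0x00 byte, so the transmitter can send 0x00
 * between packets as an unambiguous delimiter. */
size_t cobs_encode(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t code_idx = 0; /* where the current block's code byte will go */
    size_t out_idx = 1;  /* next free position in out */
    uint8_t code = 1;    /* distance from the code byte to the next zero */

    for (size_t i = 0; i < len; ++i) {
        if (in[i] == 0) {
            out[code_idx] = code;     /* close the block at this zero */
            code_idx = out_idx++;
            code = 1;
        } else {
            out[out_idx++] = in[i];
            if (++code == 0xFF) {     /* block is full: 254 non-zero bytes */
                out[code_idx] = code;
                code_idx = out_idx++;
                code = 1;
            }
        }
    }
    out[code_idx] = code;             /* close the final block */
    return out_idx;
}
```

For example, the data bytes 0x11 0x22 0x00 0x33 encode to 0x03 0x11 0x22 0x02 0x33. After a loss of synchronization the receiver simply discards bytes until the next 0x00 and starts decoding from there.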
