Serialize / deserialize a structure to char * in C++

I have a structure

    struct Packet {
        int senderId;
        int sequenceNumber;
        char data[MaxDataSize];

        char* Serialize() {
            char *message = new char[MaxMailSize];
            message[0] = senderId;
            message[1] = sequenceNumber;
            for (unsigned i = 0; i < MaxDataSize; i++)
                message[i+2] = data[i];
            return message;
        }

        void Deserialize(char *message) {
            senderId = message[0];
            sequenceNumber = message[1];
            for (unsigned i = 0; i < MaxDataSize; i++)
                data[i] = message[i+2];
        }
    };

I need to convert this to a char * of at most MaxMailSize bytes (MaxMailSize > MaxDataSize) so that I can send it over the network and then deserialize it on the other end.

I cannot use tpl or any other library.

Is there a better way to do this? I don’t like it. Or is this the best we can do?

+4
7 answers

Since this needs to be sent over a network, I strongly recommend converting the data to network byte order before sending and back to host byte order on receipt. Byte order is not the same on every machine, and once your bytes are in the wrong order it can be very difficult to untangle them (depending on the programming language used on the receiving side). The byte-ordering functions are defined alongside the sockets API and are called htons(), htonl(), ntohs() and ntohl(). In these names, h means “host” (your computer), n means “network”, s means “short” (a 16-bit value) and l means “long” (a 32-bit value).

After that you are on your own with serialization: C and C++ have no automatic way to do it. Some tools can generate the code for you, such as asn1c, an ASN.1 implementation, but they are hard to use because they involve far more than just copying data over a network.
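
To illustrate the byte-order conversion recommended in this answer, here is a minimal sketch for one 32-bit field. The helper names PutUint32 and GetUint32 are hypothetical (they are not part of the sockets API), and the sketch assumes a POSIX system where htonl() and ntohl() come from <arpa/inet.h>.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cstdint>
#include <cstring>      // memcpy

// Hypothetical helper: store a 32-bit value in buf in network byte order.
void PutUint32(unsigned char *buf, std::uint32_t hostValue) {
    std::uint32_t netValue = htonl(hostValue);      // host -> network order
    std::memcpy(buf, &netValue, sizeof(netValue));  // copy all 4 bytes
}

// Hypothetical helper: read a 32-bit value that was stored in network byte order.
std::uint32_t GetUint32(const unsigned char *buf) {
    std::uint32_t netValue;
    std::memcpy(&netValue, buf, sizeof(netValue));
    return ntohl(netValue);                         // network -> host order
}
```

Whatever the host's native byte order, the most significant byte ends up first in the buffer, so both ends agree on the layout.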

+6

You can have a class that represents the object you use in your software, with all the niceties, member functions and whatever else you need. Then you have a “serialized” struct that is closer to a description of what ends up on the network.

To make sure the compiler lays the struct out exactly as you wrote it, you need to instruct it to “pack” the struct. The directive I use here is for gcc; check your compiler's documentation if you are not using gcc.

The serialize and deserialize routines then simply convert between the two, taking care of byte order and similar details.

    #include <arpa/inet.h>  /* ntohl htonl */
    #include <string.h>     /* memcpy */

    class Packet {
        int senderId;
        int sequenceNumber;
        char data[MaxDataSize];
    public:
        /* Declarations now match the definitions below (void*, not char*). */
        void* Serialize();
        void Deserialize(void *message);
    };

    /* Wire layout; packed so the compiler inserts no padding (gcc syntax). */
    struct SerializedPacket {
        int senderId;
        int sequenceNumber;
        char data[MaxDataSize];
    } __attribute__((packed));

    /* The caller owns the returned buffer; cast it back to
       SerializedPacket* before deleting it. */
    void* Packet::Serialize() {
        struct SerializedPacket *s = new SerializedPacket();
        s->senderId = htonl(this->senderId);
        s->sequenceNumber = htonl(this->sequenceNumber);
        memcpy(s->data, this->data, MaxDataSize);
        return s;
    }

    void Packet::Deserialize(void *message) {
        struct SerializedPacket *s = (struct SerializedPacket*)message;
        this->senderId = ntohl(s->senderId);
        this->sequenceNumber = ntohl(s->sequenceNumber);
        memcpy(this->data, s->data, MaxDataSize);
    }

+3

Depending on whether you have enough space or not, you can simply use streams :)

    #include <sstream>
    #include <string>

    std::string Serialize() {
        std::ostringstream out;
        char version = '1';
        out << version << senderId << '|' << sequenceNumber << '|' << data;
        return out.str();
    }

    void Deserialize(const std::string& iString) {
        std::istringstream in(iString);
        char version = 0, check1 = 0, check2 = 0;
        in >> version;
        switch (version) {
        case '1':
            // Read from the stream 'in'; check1 and check2 receive the '|' separators
            in >> senderId >> check1 >> sequenceNumber >> check2 >> data;
            break;
        default:
            // Handle unknown version
            break;
        }
        // You can check here that 'check1' and 'check2' both equal '|'
    }

I readily admit that it takes up more space ... or that it may.

Actually, on a 32-bit architecture an int usually spans 4 bytes (4 chars). Serializing it through a stream only takes more than 4 chars if the value exceeds 9999, which usually leaves some room.
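
The size claim above can be checked with a small sketch. TextLength is a hypothetical helper, not part of the answer; it just counts how many characters operator<< writes for an int in the default locale.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: number of characters operator<< emits for an int.
std::size_t TextLength(int value) {
    std::ostringstream out;
    out << value;            // decimal text, e.g. 9999 becomes "9999"
    return out.str().size();
}
```

Under these assumptions TextLength(9999) is 4 and TextLength(10000) is 5. Negative values cost one extra character for the sign, and the '|' separators add to the total as well.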

Also note that you should probably include some guards in your stream, just so you can check that the data is intact when you read it back.

Versioning is probably a good idea too; it costs little and leaves room for unplanned later development.

+3
    int senderId;
    int sequenceNumber;
    ...
    char *message = new char[MaxMailSize];
    message[0] = senderId;
    message[1] = sequenceNumber;

Here you are truncating the values. senderId and sequenceNumber are both ints and, on most architectures, take up more than sizeof(char) bytes. Try something more like this:

    char * message = new char[MaxMailSize];
    int offset = 0;
    memcpy(message + offset, &senderId, sizeof(senderId));
    offset += sizeof(senderId);
    memcpy(message + offset, &sequenceNumber, sizeof(sequenceNumber));
    offset += sizeof(sequenceNumber);
    memcpy(message + offset, data, MaxDataSize);

EDIT: Fixed code written in a stupor. Also, as noted in the comments, any such packet is not portable because of endianness differences.
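
For completeness, the receiving side can mirror the same running-offset pattern. The following is only a sketch: it uses free functions instead of the question's member functions, the value of MaxDataSize is assumed for illustration, and, as this answer notes, the fields stay in host byte order, so it only round-trips between machines with the same endianness and int size.

```cpp
#include <cstddef>
#include <cstring>  // memcpy

// Assumed value for illustration; the question does not give the real one.
const std::size_t MaxDataSize = 16;

struct Packet {
    int senderId;
    int sequenceNumber;
    char data[MaxDataSize];
};

// Bytes occupied by the serialized form (no padding, fields back to back).
const std::size_t SerializedSize = 2 * sizeof(int) + MaxDataSize;

// Copy each field into message at a running offset (host byte order).
void Serialize(const Packet &p, char *message) {
    std::size_t offset = 0;
    std::memcpy(message + offset, &p.senderId, sizeof(p.senderId));
    offset += sizeof(p.senderId);
    std::memcpy(message + offset, &p.sequenceNumber, sizeof(p.sequenceNumber));
    offset += sizeof(p.sequenceNumber);
    std::memcpy(message + offset, p.data, MaxDataSize);
}

// Read the fields back in the same order and with the same sizes.
void Deserialize(Packet &p, const char *message) {
    std::size_t offset = 0;
    std::memcpy(&p.senderId, message + offset, sizeof(p.senderId));
    offset += sizeof(p.senderId);
    std::memcpy(&p.sequenceNumber, message + offset, sizeof(p.sequenceNumber));
    offset += sizeof(p.sequenceNumber);
    std::memcpy(p.data, message + offset, MaxDataSize);
}
```

A caller would allocate a buffer of at least SerializedSize bytes, call Serialize on the sending side and Deserialize on the receiving side.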

+1

To answer your question in general: C++ has no reflection mechanism, so hand-written serialization and deserialization functions defined per class are the best you can do. That said, the serialization function you wrote will corrupt your data. Here is a correct implementation:

    char * message = new char[MaxMailSize];
    int net_senderId = htonl(senderId);
    int net_sequenceNumber = htonl(sequenceNumber);
    memcpy(message, &net_senderId, sizeof(net_senderId));
    memcpy(message + sizeof(net_senderId), &net_sequenceNumber, sizeof(net_sequenceNumber));

0

As mentioned in other posts, senderId and sequenceNumber are both of type int, which is likely to be larger than char, so these values will be truncated.

If this is acceptable, then the code is fine. If not, then you need to split them into their component bytes. Given that the protocol you are using will specify the byte order of multi-byte fields, the most portable and least ambiguous way to do this is by shifting.

For example, let's say that senderId and sequenceNumber are 2 bytes long, and the protocol requires the high byte to be first:

    char* Serialize() {
        char *message = new char[MaxMailSize];
        message[0] = senderId >> 8;
        message[1] = senderId;
        message[2] = sequenceNumber >> 8;
        message[3] = sequenceNumber;
        memcpy(&message[4], data, MaxDataSize);
        return message;
    }

I would also recommend replacing the for loop with memcpy (if available), as it is unlikely to be less efficient, and this will make the code shorter.

Finally, this all assumes that char is 8 bits wide. If it is not, then every value should be masked, for example:

  message[0] = (senderId >> 8) & 0xFF; 
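
The matching decode step uses shifts in the other direction. This is a sketch under the same assumption as this answer's example (2-byte fields, high byte first); PutUint16 and GetUint16 are hypothetical helper names.

```cpp
// Hypothetical helper: write the low 16 bits of value, high byte first.
void PutUint16(char *buf, unsigned value) {
    buf[0] = static_cast<char>((value >> 8) & 0xFF);
    buf[1] = static_cast<char>(value & 0xFF);
}

// Hypothetical helper: rebuild a 16-bit value from two bytes, high byte first.
unsigned GetUint16(const char *buf) {
    // Go through unsigned char so bytes >= 0x80 are not sign-extended
    // on platforms where plain char is signed.
    unsigned high = static_cast<unsigned char>(buf[0]);
    unsigned low  = static_cast<unsigned char>(buf[1]);
    return (high << 8) | low;
}
```

Because only arithmetic on values is involved, the result does not depend on the host's byte order.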
0

You can use Protocol Buffers to define and serialize structures and classes. This is what Google uses internally, and it has a very compact transfer format.

http://code.google.com/apis/protocolbuffers/

0