Should network packet payload data be aligned on the appropriate boundaries?

If you have the following class as a network packet payload:

class Payload {char field0; int field1; char field2; int field3; };

Would using this class as the payload, for example, leave the data receiver prone to alignment problems when receiving the data through a socket? I would think the class should either be reordered or have padding added to ensure alignment.

Reorder the fields:

class Payload { int field1; int field3; char field0; char field2; }; 

or add padding:

 class Payload { char field0; char pad0[3]; int field1; char field2; char pad1[3]; int field3; }; 

If for some reason reordering does not make sense, I would think that adding padding would be preferable, since it avoids the alignment problems even though it increases the size of the class.
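
For what it's worth, a quick way to sanity-check the hand-padded version is a couple of compile-time asserts. This is just a sketch: the expected offsets assume a 4-byte int with 4-byte alignment, so treat the numbers as assumptions about the target.

    #include <cstddef>   // offsetof

    // Hand-padded version from the question; expected offsets assume a
    // 4-byte int with 4-byte alignment.
    class Payload {
    public:
        char field0;
        char pad0[3];
        int  field1;
        char field2;
        char pad1[3];
        int  field3;
    };

    static_assert(offsetof(Payload, field1) == 4,  "field1 is not 4-byte aligned");
    static_assert(offsetof(Payload, field3) == 12, "field3 is not 4-byte aligned");
    static_assert(sizeof(Payload) == 16, "compiler inserted unexpected padding");

If the compiler still inserts padding of its own, the asserts fire at compile time rather than when the packets hit the wire.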

What is your experience with such alignment issues in network data?

+4
6 answers

You should look into Google Protocol Buffers or Boost.Serialization, as another poster said.

If you want to roll your own, do it right.

If you use the types from stdint.h (i.e. uint32_t, int8_t, etc.) and make sure every variable is naturally aligned, meaning its address is evenly divisible by its size (an int8_t can go anywhere, a uint16_t on even addresses, a uint32_t on addresses divisible by 4), you don't have to worry about alignment or packing.
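
As an illustration of that rule, here is a small sketch (the PacketHeader name and fields are made up): every member's offset is a multiple of its size, so the compiler adds no padding and the layout is the same everywhere.

    #include <cstddef>
    #include <cstdint>

    // Members ordered so each offset is a multiple of the member's size.
    struct PacketHeader {
        uint32_t sequence;   // offset 0
        uint16_t length;     // offset 4, divisible by 2
        uint8_t  type;       // offset 6
        uint8_t  flags;      // offset 7
    };

    static_assert(offsetof(PacketHeader, length) % sizeof(uint16_t) == 0, "length misaligned");
    static_assert(sizeof(PacketHeader) == 8, "unexpected compiler padding");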

At a previous job, we had all the structures sent over our data bus (Ethernet, CANbus, byteflight or serial ports) defined in XML. There was a parser that would verify the alignment of variables within the structures (warning you if someone wrote bad XML), and it would then generate header files for the different platforms and languages to send and receive the structures. This worked very well for us; we never had to worry about hand-writing message-packing code, and it was guaranteed there would be no silly coding errors across platforms. Some of our data links were bandwidth-limited, so we implemented things like bit fields, and the parser generated the correct code for each platform. We also had enumerations, which were very nice (you would be surprised how easy it is for a person to mess up hand-coding bit fields and enums).

Unless you need to worry about making this work on an 8051 or HC11 in C, or about bandwidth-limited transport layers, you are not going to come up with something better than Protocol Buffers; you will just spend a lot of time trying to reach parity with them.

+4

Blindly ignoring alignment can definitely cause problems, even on the same operating system, if two components were compiled with different compilers or different compiler versions.

Better:
1) Pass your data through some kind of serialization process.
2) Or pass each of your primitives individually, still paying attention to byte order (endianness).

A good place to start would be Boost Serialization.
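
For the question's Payload, the intrusive serialize() member is the usual Boost.Serialization pattern. A minimal sketch, assuming Boost is available and a text archive is acceptable on the wire:

    #include <boost/archive/text_iarchive.hpp>
    #include <boost/archive/text_oarchive.hpp>
    #include <sstream>

    struct Payload {
        char field0; int field1; char field2; int field3;

        template <class Archive>
        void serialize(Archive& ar, const unsigned int /*version*/) {
            ar & field0 & field1 & field2 & field3;   // the archive handles representation
        }
    };

    // Sender:   std::ostringstream os; boost::archive::text_oarchive oa(os); oa << p;
    // Receiver: std::istringstream is(wireData); boost::archive::text_iarchive ia(is); ia >> p;

The archive worries about the representation of each field, so alignment and padding of the struct itself never reach the wire.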

+8

We use packed structures that are overlaid directly onto the binary packet in memory today, and I regret ever deciding to do it. The only way we got this to work was:

  • carefully defining fixed-width types based on the compilation environment ( typedef unsigned int uint32_t )
  • inserting the appropriate compiler-specific pragmas to force tight packing of structure members
  • requiring everything to be in one byte order (we used network / big-endian order)
  • carefully writing both server and client code (a sketch of this style of packed struct follows the list)
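
A minimal sketch of that style of definition (the WirePacket name and fields are invented; #pragma pack is honored by GCC, Clang and MSVC, other compilers may need different pragmas):

    #include <cstdint>

    // Every field is stored big-endian ("network order") by convention, which the
    // sending and receiving code must enforce explicitly; the pragma removes all
    // compiler-inserted padding.
    #pragma pack(push, 1)
    struct WirePacket {
        uint8_t  version;
        uint16_t length;      // big-endian on the wire
        uint32_t sessionId;   // big-endian on the wire
        uint8_t  payload[32];
    };
    #pragma pack(pop)

    static_assert(sizeof(WirePacket) == 39, "packing pragma was not applied");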

If you are just starting out, I would advise you to skip the whole mess of trying to image what's on the wire with structs. Just serialize each primitive element separately. If you decide not to use an existing library like Boost Serialization or middleware like TIBCO, then save yourself a lot of headache by writing an abstraction around a binary buffer that hides the details of your serialization method. Aim for an interface like:

    class ByteBuffer {
    public:
        ByteBuffer(uint8_t *bytes, size_t numBytes) {
            buffer_.assign(&bytes[0], &bytes[numBytes]);
        }

        void encode8Bits(uint8_t n);
        void encode16Bits(uint16_t n);
        //...
        void overwrite8BitsAt(unsigned offset, uint8_t n);
        void overwrite16BitsAt(unsigned offset, uint16_t n);
        //...
        void encodeString(std::string const& s);
        void encodeString(std::wstring const& s);

        uint8_t decode8BitsFrom(unsigned offset) const;
        uint16_t decode16BitsFrom(unsigned offset) const;
        //...

    private:
        std::vector<uint8_t> buffer_;
    };

Each of your packet classes will then have a method to serialize into a ByteBuffer, or to deserialize from a ByteBuffer and an offset. This is one of those things I absolutely wish I could go back in time and correct. I cannot count the number of times I spent debugging a problem caused by forgetting to swap bytes or forgetting to pack a struct.
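
For example, a hypothetical packet class written against the interface above (the field names and widths are made up, and it assumes the 32-bit encode/decode methods hidden behind the //... exist):

    #include <cstdint>

    class HeartbeatPacket {
    public:
        void serializeTo(ByteBuffer& buf) const {
            buf.encode16Bits(messageId_);
            buf.encode32Bits(timestamp_);                  // assumes encode32Bits behind the //...
        }
        void deserializeFrom(ByteBuffer const& buf, unsigned offset) {
            messageId_ = buf.decode16BitsFrom(offset);
            timestamp_ = buf.decode32BitsFrom(offset + 2); // assumes decode32BitsFrom as well
        }
    private:
        uint16_t messageId_ = 0;
        uint32_t timestamp_ = 0;
    };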

Another headache to avoid is using a union to represent the bytes, or memcpy to an unsigned char buffer to extract them. If you always use big-endian on the wire, then you can use simple code to write the bytes into the buffer and not worry about htonl at all:

    void ByteBuffer::encode8Bits(uint8_t n) {
        buffer_.push_back(n);
    }

    void ByteBuffer::encode16Bits(uint16_t n) {
        encode8Bits(uint8_t((n & 0xff00) >> 8));
        encode8Bits(uint8_t((n & 0x00ff)));
    }

    void ByteBuffer::encode32Bits(uint32_t n) {
        encode16Bits(uint16_t((n & 0xffff0000) >> 16));
        encode16Bits(uint16_t((n & 0x0000ffff)));
    }

    void ByteBuffer::encode64Bits(uint64_t n) {
        encode32Bits(uint32_t((n & 0xffffffff00000000) >> 32));
        encode32Bits(uint32_t((n & 0x00000000ffffffff)));
    }

This stays nicely platform agnostic, since the numeric representation is always logically big-endian. This code also lends itself to using templates based on the size of the primitive type (think encode<sizeof(val)>((unsigned char const*)&val))... not quite as pretty, but very, very easy to write and maintain.
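
One way to get that single-call convenience while keeping the shift-based encoders above is a small dispatch template; this differs slightly from the pointer-based form, needs C++17 for if constexpr, and the encodeValue name is just illustrative (it also assumes the 64-bit encoder from the snippet above):

    #include <cstdint>
    #include <type_traits>

    // Dispatch on the size of the integral type; logical big-endian is preserved
    // because we reuse the shift-based encoders rather than copying raw bytes.
    template <typename T>
    void encodeValue(ByteBuffer& buf, T val) {
        static_assert(std::is_integral<T>::value, "encodeValue expects an integral type");
        if constexpr (sizeof(T) == 1)      buf.encode8Bits(static_cast<uint8_t>(val));
        else if constexpr (sizeof(T) == 2) buf.encode16Bits(static_cast<uint16_t>(val));
        else if constexpr (sizeof(T) == 4) buf.encode32Bits(static_cast<uint32_t>(val));
        else                               buf.encode64Bits(static_cast<uint64_t>(val));
    }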

+4

In my experience, the preferred approaches are the following (in order of preference):

  • Use a high-level framework such as Tibco, CORBA, DCOM or anything else that will manage all these issues for you.

  • Write your own libraries on both ends of the connection that are aware of packing, byte order and other issues.

  • Communicate only with string data.

Trying to send raw binary data without any mediation will almost certainly cause a ton of problems.
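
As a sketch of the third option, a plain text encoding sidesteps alignment and byte order entirely; the function and field names here are invented for illustration:

    #include <cstdio>
    #include <string>

    // Encode a reading as one delimited text line: no alignment, padding or
    // byte-order concerns at all, at the cost of size and parsing.
    std::string encodeReading(int sensorId, long timestamp, double value) {
        char line[64];
        std::snprintf(line, sizeof line, "%d,%ld,%.6f\n", sensorId, timestamp, value);
        return line;
    }

    // Decode on the receiving side; returns false if the line is malformed.
    bool decodeReading(const std::string& line, int& sensorId, long& timestamp, double& value) {
        return std::sscanf(line.c_str(), "%d,%ld,%lf", &sensorId, &timestamp, &value) == 3;
    }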

+2

You basically cannot use a class or struct for this if you want any kind of portability. In your example, the ints may be 32-bit or 64-bit depending on your system. You are most likely on a little-endian machine, but Apple's older Macs were big-endian. The compiler is also free to insert padding as it likes.

In general, you will need a method that writes each field to the buffer a byte at a time, after you get the right byte order with htonll, htonl or htons.
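
Something along these lines (packPayload and the field widths are made up; htons/htonl come from <arpa/inet.h> on POSIX, or winsock2.h on Windows):

    #include <arpa/inet.h>   // htons, htonl (POSIX; use <winsock2.h> on Windows)
    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    // Write each field into the send buffer separately, converting to network
    // byte order first; memcpy avoids alignment assumptions about the buffer.
    std::size_t packPayload(uint8_t* buf, uint16_t field1, uint32_t field3) {
        std::size_t off = 0;
        const uint16_t f1 = htons(field1);
        const uint32_t f3 = htonl(field3);
        std::memcpy(buf + off, &f1, sizeof f1); off += sizeof f1;
        std::memcpy(buf + off, &f3, sizeof f3); off += sizeof f3;
        return off;   // number of bytes written
    }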

+1

If your structures do not have natural alignment, compilers will usually insert padding so that alignment is correct. If, however, you use pragmas to "pack" structures (remove the padding), there can be very harmful side effects. On PowerPC, unaligned float accesses throw an exception. If you are working on an embedded system that does not handle that exception, you can get a reset. If there is a handler for that interrupt, it can DRASTICALLY slow down your code, because a software routine has to fix up the misaligned access every time, which will silently destroy your performance.
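
To make the failure mode concrete, here is a hypothetical packed struct; whether reading value traps, gets fixed up in software, or just runs slowly depends entirely on the CPU and OS:

    #include <cstddef>

    #pragma pack(push, 1)
    struct PackedSample {
        char  tag;
        float value;    // lands at offset 1 instead of 4
    };
    #pragma pack(pop)

    static_assert(offsetof(PackedSample, value) == 1, "value is deliberately misaligned");

    // Reading sample.value dereferences a float at an odd address: merely slower
    // on x86, but a trap (followed by a software fix-up or a reset) on cores that
    // fault on unaligned floating-point access, as described above.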

+1
