The correct way to work with a network buffer in modern GCC / C++ without violating strict aliasing rules

Question


The program exchanges old-school binary messages over the network:

    // Common header for all network messages.
    struct __attribute__((packed)) MsgHeader {
        uint32_t msgType;
    };

    // One of the network messages.
    struct __attribute__((packed)) Msg1 {
        MsgHeader header;
        uint32_t field1;
    };

    // Network receive buffer.
    uint8_t rxBuffer[MAX_MSG_SIZE];

    // Receive handler. The received message is already in rxBuffer.
    void onRxMessage() {
        // Detect message type
        if ( ((const MsgHeader*)rxBuffer)->msgType == MESSAGE1 ) { // Breaks strict-aliasing!
            // Process Msg1 message.
            const Msg1* msg1 = (const Msg1*)rxBuffer;
            if ( msg1->field1 == 0 ) { // Breaks strict-aliasing!
                // Some code here;
            }
            return;
        }
        // Process other message types.
    }

This code violates the strict aliasing rules in modern GCC (and is undefined behavior in modern C++). What is the correct way to solve this problem, so that the code no longer produces a strict-aliasing warning?

P.S. If rxBuffer is defined as:

    union __attribute__((packed)) {
        uint8_t rawData[MAX_MSG_SIZE];
    } rxBuffer;

and I then cast &rxBuffer to other pointer types, no warnings are produced. But is it safe, correct, and portable?


c++ gcc strict-aliasing c++11

Sap Jun 05 '15 at 12:45


4 answers

Alberto m · Answer 1 · 2015-06-05T12:53:32+0000

Define rxBuffer as a pointer to a union of uint8_t[MAX_SIZE], MsgHeader, Msg1, and any other type you plan to cast to. Note that this still violates the strict aliasing rules, but GCC guarantees that it works as a non-standard extension.

EDIT: if this approach leads to an overly complex declaration, the completely portable (if slower) way is to keep the buffer as a plain uint8_t[] and memcpy it into the appropriate message struct whenever it needs to be reinterpreted. Whether this is feasible obviously depends on your performance and efficiency requirements.
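The memcpy route described in the EDIT can be sketched roughly as follows. This is only an illustration: `loadFromBuffer` is a hypothetical helper name that does not appear in the answer, and the two structs simply mirror the ones from the question (including the GCC/Clang-specific `packed` attribute).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Message layout mirroring the structs in the question (GCC/Clang attribute).
struct __attribute__((packed)) MsgHeader { uint32_t msgType; };
struct __attribute__((packed)) Msg1 { MsgHeader header; uint32_t field1; };

// Hypothetical helper (an assumption, not part of the answer): copies
// sizeof(T) bytes from the raw buffer into a real T object.
// Returns false, leaving `out` untouched, if the buffer is too short.
template <typename T>
bool loadFromBuffer(const uint8_t* buf, std::size_t len, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(buf != nullptr);
    if (len < sizeof(T)) {
        return false;
    }
    // The message is read through a genuine T object, so no aliasing
    // rule is involved and the source needs no particular alignment.
    std::memcpy(&out, buf, sizeof(T));
    return true;
}
```

A receive handler would call `loadFromBuffer(rxBuffer, receivedLength, header)` first and then load the full message once the type is known. Note that the fields still hold whatever byte order was on the wire; memcpy does not convert endianness.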

~~EDIT 2: A third solution (if you are working on "normal" architectures) is to use char or unsigned char instead of uint8_t. Such types are guaranteed to alias everything.~~ Not valid, because the conversion to the message type may still not work, see here

Vaughn cato · Answer 2 · 2015-06-05T13:49:46+0000

By working with individual bytes, you can avoid pointer casts and eliminate portability problems with endianness and alignment:

    uint32_t decodeUInt32(uint8_t *p) {
        // Decode big-endian, which is network byte order.
        return (uint32_t(p[0])<<24) |
               (uint32_t(p[1])<<16) |
               (uint32_t(p[2])<< 8) |
               (uint32_t(p[3])    );
    }

    void onRxMessage() {
        // Detect message type
        if ( decodeUInt32(rxBuffer) == MESSAGE1 ) {
            // Process Msg1 message.
            if ( decodeUInt32(rxBuffer+4) == 0 ) {
                // Some code here;
            }
            return;
        }
        // Process other message types.
    }
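The same byte-wise idea presumably applies on the sending side too. The sketch below pairs the decoder from the answer with an `encodeUInt32` counterpart; that function name is an assumption made for this illustration and is not part of the answer.

```cpp
#include <cassert>
#include <cstdint>

// Decode a big-endian (network byte order) 32-bit value, as in the answer.
uint32_t decodeUInt32(const uint8_t* p) {
    assert(p != nullptr);
    return (uint32_t(p[0]) << 24) |
           (uint32_t(p[1]) << 16) |
           (uint32_t(p[2]) <<  8) |
            uint32_t(p[3]);
}

// Hypothetical counterpart for outgoing messages (an assumption): writes
// `value` into p[0..3] in big-endian order, most significant byte first.
void encodeUInt32(uint8_t* p, uint32_t value) {
    assert(p != nullptr);
    p[0] = uint8_t(value >> 24);
    p[1] = uint8_t(value >> 16);
    p[2] = uint8_t(value >>  8);
    p[3] = uint8_t(value);
}
```

Because both functions touch only individual bytes, the result does not depend on the host's endianness or on how the buffer is aligned.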

netcat · Answer 3 · 2015-06-07T12:20:18+0000

As Alberto M wrote, you can change the type of your buffer and how you receive into it:

    union {
        uint8_t rawData[MAX_MSG_SIZE];
        struct MsgHeader msgHeader;
        struct {
            struct MsgHeader dummy;
            struct Msg1 msg;
        } msg1;
    } rxBuffer;

    receiveBuffer(&rxBuffer.rawData);
    if (rxBuffer.msgHeader.msgType == MESSAGE1) {
        if (rxBuffer.msg1.msg.field1) {
            // ...

or receive directly into a struct if your receive function takes a char pointer (uint8_t is only guaranteed to alias uint8_t, unlike char, which may alias anything):

    struct {
        struct MsgHeader msgHeader;
        union {
            struct Msg1 msg1;
            struct Msg2 msg2;
        } msg;
    } rxBuffer;

    recv(fd, (char *)&rxBuffer, MAX_MSG_SIZE, 0);
    // handle errors and insufficient recv length
    if (rxBuffer.msgHeader.msgType == MESSAGE1) {
        // ...

~~By the way, type punning through a union is standard and does not violate strict aliasing. See C99-TC3 6.5 (7), and search for "type punning".~~ The question is about C++, not C, so Alberto M is right that this is non-standard there and works only as a GCC extension.

Using memcpy works much like the approaches above, but is standard-conforming: the bytes are copied character by character and are effectively reinterpreted as the struct when you access the destination, just as with type punning through a union:

    struct MsgHeader msgHeader;
    memcpy(&msgHeader, rxBuffer, sizeof(msgHeader));
    if (msgHeader.msgType == MESSAGE1) {
        struct Msg1 msg;
        memcpy(&msg, rxBuffer + sizeof(msgHeader), sizeof(msg));
        if (msg.field1 == 0) {
            // Some code here;
        }
    }

As Vaughn Cato wrote, you can also unpack the received network buffer by hand (and then probably pack the buffers you send as well). Again, this is standard-conforming, and it also lets you deal with padding and byte order in a portable way:

    uint8_t *buf = rxBuffer;
    struct MsgHeader msgHeader;
    // read uint32_t in big endian; cast first so the shift is done on an unsigned 32-bit value
    msgHeader.msgType = ((uint32_t)buf[3]<<0) | ((uint32_t)buf[2]<<8) |
                        ((uint32_t)buf[1]<<16) | ((uint32_t)buf[0]<<24);
    if (msgHeader.msgType == MESSAGE2) {
        struct Msg2 msg;
        buf += sizeof(MsgHeader);
        msg.field1 = (buf[1]<<0) | (buf[0]<<8); // read uint16_t in big endian
        if (msg.field1 == 0) {
            // ...

Note: struct Msg1 and struct Msg2 do not contain struct MsgHeader in the above snippets and look like this:

    struct Msg1 { uint32_t field1; };
    struct Msg2 { uint16_t field1; };
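The manual unpacking shown above can also be combined with the "insufficient recv length" check mentioned earlier. The following is a minimal sketch under that assumption; the `ByteReader` class and its member names are hypothetical and are not taken from any of the answers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical cursor over a received buffer (an assumption for illustration).
// It reads big-endian integers byte by byte and remembers whether any read
// ran past the end of the data.
class ByteReader {
public:
    ByteReader(const uint8_t* data, std::size_t size)
        : data_(data), size_(size), pos_(0), ok_(true) {
        assert(data != nullptr || size == 0);
    }

    // Read a big-endian uint16_t; returns 0 and marks failure if short.
    uint16_t readU16() {
        if (!require(2)) return 0;
        uint16_t v = uint16_t((uint16_t(data_[pos_]) << 8) | data_[pos_ + 1]);
        pos_ += 2;
        return v;
    }

    // Read a big-endian uint32_t; returns 0 and marks failure if short.
    uint32_t readU32() {
        if (!require(4)) return 0;
        uint32_t v = (uint32_t(data_[pos_])     << 24) |
                     (uint32_t(data_[pos_ + 1]) << 16) |
                     (uint32_t(data_[pos_ + 2]) <<  8) |
                      uint32_t(data_[pos_ + 3]);
        pos_ += 4;
        return v;
    }

    // True as long as every read so far stayed inside the buffer.
    bool ok() const { return ok_; }

private:
    // Checks that n more bytes are available; a failure is sticky.
    bool require(std::size_t n) {
        if (!ok_ || size_ - pos_ < n) {
            ok_ = false;
            return false;
        }
        return true;
    }

    const uint8_t* data_;
    std::size_t size_;
    std::size_t pos_;
    bool ok_;
};
```

A handler would construct the reader over rxBuffer and the received length, read msgType with `readU32()`, read the fields of the matching message, and check `ok()` once at the end before acting on the values.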

Aaron mcdaid · Answer 4 · 2015-07-20T22:15:29+0000

It comes down to the following:

  ((const MsgHeader*)rxBuffer)->msgType

rxBuffer has one type, but we want to treat it as if it were of another type. I suggest the following "alias cast":

    // The pointer must be non-const, since memmove writes through its destination.
    MsgHeader * msg_header_p = (MsgHeader *) rxBuffer;
    memmove(msg_header_p, rxBuffer, sizeof(MsgHeader));
    auto msg_type = msg_header_p->msgType;

memmove (like its less flexible cousin memcpy) effectively says that the bit pattern that was available at the source (rxBuffer) will, after the call, be available at the destination (msg_header_p), even if the types differ.

You can argue that the memmove does nothing because the source and destination are identical. But that is exactly the point. Logically, it serves to establish msg_header_p as an alias of rxBuffer, although in practice a good compiler will optimize it away.

(This answer is potentially somewhat controversial. Perhaps I am pushing memmove too far. My logic is this: first, memcpy to a new location is clearly an acceptable answer to this question; second, memmove is just a more general (but possibly slower) version of memcpy; third, if memcpy lets you look at the same bit pattern through a different type, then why should memmove not allow the same idea to "change" the type of a specific bit pattern? If we memcpy to a temporary area and then memcpy back to the original position, would that be OK too?)

If you want to build a complete answer out of this, you will also need to cast the alias back at some point with memmove(rxBuffer, msg_header_p, sizeof(MsgHeader)); but I think I should wait for feedback on my "alias cast" first!
