Google Protocol Buffers: stability and security of the C++ implementation in the face of malicious data

Question

For those who have used the C++ implementation of Google Protocol Buffers: how does it handle malicious or malformed messages? For example, does it crash, or does it keep working? My application will certainly receive malicious data at some point, and I do not want it to crash every time a malformed message arrives. This is the only answer to this question that I could find (from the Google mailing list):

Security was addressed before the code was released. For at least the C++ and Java implementations, there are various safeguards against corrupt or malicious data. The protobuf library enforces limits on total message size (CodedInputStream::SetTotalBytesLimit); it also enforces a recursion limit to prevent deeply nested messages from blowing the stack. There are other internal implementation details that avoid problems such as memory exhaustion (in particular, from receiving messages that declare huge length-delimited values).

c++ security protocol-buffers

Charles Jan 12 '15 at 16:28

1 answer

Richard Hodges · Accepted Answer · 2015-01-12T17:25:10+0000

I use Google's C++ protocol buffers in a web application with strict security requirements.

If you look at the generated code, all of the deserialization work is delegated to the auto-generated <Message-Type>::MergePartialFromCodedStream method of each message type. These methods are generated with thorough checks on data types and lengths, and so far we have not had any problems.

One area of attack that you might want to close off is the framing of protobuf data. Protocol buffers do not themselves serialize the total size of a serialized message into the stream in any standardized header, so you might want to (as I did) wrap all protocol buffer messages in a frame. For my purposes, the frame header simply contains the size of the message, which means that I can determine the memory requirements of the message before I even try to read it off the wire, let alone decode it.

At this point, a simple check can be performed to reject the message (or drop the connection) if the size is unreasonably large.
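As a rough illustration of that size check, here is a minimal sketch. It assumes a frame header consisting of a 4-byte big-endian payload length; that layout, the 1 MiB limit, and the names kMaxMessageBytes and checked_frame_size are all hypothetical choices for this example, not part of protobuf or of the framing described in this answer.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical upper bound for this sketch; choose one that fits your protocol.
constexpr std::uint32_t kMaxMessageBytes = 1u << 20;  // 1 MiB

// Decodes an assumed 4-byte big-endian length prefix. Returns the payload
// size if it is within the limit, or std::nullopt if the announced size is
// unreasonably large, in which case the caller should reject the message
// (or drop the connection) without allocating a buffer for the payload.
std::optional<std::uint32_t> checked_frame_size(
    const std::array<unsigned char, 4>& header) {
    const std::uint32_t size = (std::uint32_t{header[0]} << 24) |
                               (std::uint32_t{header[1]} << 16) |
                               (std::uint32_t{header[2]} << 8) |
                               std::uint32_t{header[3]};
    if (size > kMaxMessageBytes) {
        return std::nullopt;  // too large: refuse before reading the payload
    }
    return size;
}
```

The point of the design is that the decision is made from the header alone, so no memory is committed to a message that will be rejected anyway.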

Further work can be done to wrap this frame in a shared-key encryption scheme to prevent a man-in-the-middle from hijacking your session, if this is a concern.

A buffer overflow within a message (for example, from a string that is too long) cannot occur, because bytes and string fields are represented internally by std::string, which grows its memory automatically as data is appended to it.

However:

There is no guarantee that malicious clients will not encode well-formed messages containing invalid data. For example, if your server application takes a method name from a string field, looks up its address and calls it, then this is an obvious attack vector.

You should never allow client data to select server code without fully verifying that the operation is specifically allowed.

Some examples of things that should never be done:

allowing the client to send you SQL in a text field
allowing the client to send you command lines, which you then pass to system() , exec() , spawn() , etc ...
allowing the client to send you the name of a shared library and the name of a function inside it ...

etc.
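One common way to follow this rule is to resolve client-supplied names only through a fixed table of permitted operations. The sketch below is an illustration under assumptions: the handlers handle_ping and handle_status and the function dispatch are hypothetical names invented for this example, and a real server would carry request data and authorization checks as well.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical handlers standing in for real server operations.
inline std::string handle_ping() { return "pong"; }
inline std::string handle_status() { return "ok"; }

// Resolves a client-supplied operation name through a fixed allowlist.
// A name that is not in the table is refused; it is never handed to a
// symbol lookup, a shell, or an SQL engine.
std::optional<std::string> dispatch(const std::string& requested) {
    static const std::unordered_map<std::string, std::function<std::string()>>
        kAllowed = {
            {"ping", handle_ping},
            {"status", handle_status},
        };
    const auto it = kAllowed.find(requested);
    if (it == kAllowed.end()) {
        return std::nullopt;  // unknown operation: reject
    }
    return it->second();
}
```

With this shape, adding a new operation is an explicit change to the table on the server, not something a client can cause by sending a different string.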
