Are there any performance differences between binary and XML serialization?

In terms of both parsing (serialization, deserialization) and sending packets over the network, is there a good estimate of the performance difference between binary and XML serialization?

+6
c# xml binary-data
4 answers

Nope.

It depends a lot on what data is inside the XML document itself. If you have a lot of structured data, the overhead for XML will be large. For example, if your data looks like this:

<person> <name>Dave</name> <ssn>000-00-0000</ssn> <email1>xxxxxx</email1> </person> ... 

you will have proportionally more markup overhead than with an XML document that looks like this:

 <book name="bible"> In the beginning God created the heavens and the earth. Now the earth was formless and empty ... And if any man shall take away from the words of the book of this prophecy, God shall take away his part out of the book of life, and out of the holy city, and from the things which are written in this book. He which testifieth these things saith, Surely I come quickly. Amen. Even so, come, Lord Jesus. </book> 

So this is not really a fair question. It depends heavily on the data you intend to send, and on how (or whether) you compress it.
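
To make the point above concrete, here is a minimal sketch that estimates what fraction of an XML document is markup rather than content. It is written in Python purely as a language-neutral illustration (it does not use the .NET serializers); the helper name `markup_overhead` and the repeated sentence standing in for a long book are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def markup_overhead(xml_text: str) -> float:
    """Return the fraction of the document's bytes that is markup, not content."""
    root = ET.fromstring(xml_text)
    # Text nodes count as payload...
    payload = "".join(root.itertext())
    # ...and so do attribute values (attribute names and quotes are markup).
    for elem in root.iter():
        payload += "".join(elem.attrib.values())
    total = len(xml_text.encode("utf-8"))
    return 1 - len(payload.encode("utf-8")) / total


# Many short fields: tags dominate the document.
record = "<person><name>Dave</name><ssn>000-00-0000</ssn><email1>xxxxxx</email1></person>"

# One long text body (a repeated sentence stands in for the full book).
book = '<book name="bible">' + "In the beginning God created the heavens and the earth. " * 50 + "</book>"

print(f"record: {markup_overhead(record):.0%} markup")
print(f"book:   {markup_overhead(book):.0%} markup")
```

For the record-style document most of the bytes are tags, while for the book-style document the tags are a negligible share, so a binary format has far less to save in the second case.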

+15

The biggest difference between BinaryFormatter and XML serialization is portability; BinaryFormatter output is hard to guarantee between versions, so it is really only suitable for short-term storage or transfer.

However, you can get the best of both, and make it both smaller and faster, by using custom binary serialization, and you don't even have to write it yourself ;-p

protobuf-net is a .NET implementation of Google's protocol buffers binary serialization spec; its output is smaller than that of XmlSerializer or BinaryFormatter, fully portable (not only between versions: you can load the pb stream into, for example, Java), extensible, and fast. It is also fairly comprehensively tested, with a large number of users.

A complete breakdown of size and speed, covering XmlSerializer, BinaryFormatter, DataContractSerializer and protobuf-net, is here.

+5

Instinctively, you would say that binary is more efficient, but in reality it depends on the data being serialized.

Check out this article: http://www.nablasoft.com/alkampfer/index.php/2008/10/31/binary-versus-xml-serialization-size/
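
As a rough illustration of how much the answer depends on the data, the sketch below compares the encoded size of the same values as XML and as a simple hand-rolled binary layout. It is Python, not .NET, and the binary layout (fixed-width integers, length-prefixed strings) is an assumption chosen for the example; it is not the format used by BinaryFormatter or any other serializer mentioned here.

```python
import struct
import xml.etree.ElementTree as ET


def xml_size(tag, values):
    """Size in bytes of <items><tag>value</tag>...</items>."""
    root = ET.Element("items")
    for v in values:
        ET.SubElement(root, tag).text = str(v)
    return len(ET.tostring(root, encoding="utf-8"))


def binary_ints_size(values):
    """4-byte count followed by fixed-width 4-byte little-endian integers."""
    return len(struct.pack(f"<I{len(values)}i", len(values), *values))


def binary_text_size(values):
    """4-byte count, then each string as a 4-byte length prefix plus UTF-8 bytes."""
    out = struct.pack("<I", len(values))
    for s in values:
        data = s.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return len(out)


numbers = list(range(100000, 101000))
paragraphs = ["In the beginning God created the heavens and the earth. " * 20] * 10

numeric_ratio = xml_size("n", numbers) / binary_ints_size(numbers)
text_ratio = xml_size("p", paragraphs) / binary_text_size(paragraphs)

print(f"numeric data: XML is {numeric_ratio:.2f}x the binary size")
print(f"text data:    XML is {text_ratio:.2f}x the binary size")
```

With many small numeric values the XML form is several times larger, whereas for long text the two sizes are nearly identical, because the text itself dominates either way.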

0

Just pointing out that performance is not the only metric you should look at.

  • Simplicity of design. Do you have several days or weeks to build a custom serializer / deserializer routine and test it thoroughly, or would that time be better spent on features?
  • Ease of consuming the data. Can a client use a prebuilt open source parser, or do they need to implement a bunch of (potentially buggy) code on their own?
  • Ease of debugging. Will you need to inspect data in transit to debug? If so, a binary format will tend to obscure any problems.
  • Maintenance. What is the cost of maintaining each approach?

Personally, I would use a published XML standard and open source libraries until actual testing proves there is a performance bottleneck.
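
A minimal sketch of what "actual testing" could look like: time a full round trip (serialize, then deserialize) for each candidate format on data that resembles your real payload. This is Python rather than .NET, and the record layout, function names, and repeat counts are all hypothetical choices for the example.

```python
import struct
import timeit
import xml.etree.ElementTree as ET

# Hypothetical payload for illustration: 1000 (id, score) pairs.
RECORDS = [(i, i * 0.5) for i in range(1000)]


def xml_round_trip(records):
    """Serialize records to XML bytes, then parse them back into tuples."""
    root = ET.Element("records")
    for rec_id, score in records:
        ET.SubElement(root, "r", id=str(rec_id), score=repr(score))
    data = ET.tostring(root)
    return [(int(e.get("id")), float(e.get("score"))) for e in ET.fromstring(data)]


def binary_round_trip(records):
    """Serialize records to packed bytes, then unpack them back into tuples."""
    fmt = struct.Struct("<id")  # 4-byte int + 8-byte double per record
    data = b"".join(fmt.pack(*rec) for rec in records)
    return [fmt.unpack_from(data, off) for off in range(0, len(data), fmt.size)]


def measure(func, repeat=5, number=20):
    """Best-of-`repeat` average seconds per call; the minimum is less noisy than the mean."""
    return min(timeit.repeat(lambda: func(RECORDS), repeat=repeat, number=number)) / number


print(f"XML round trip:    {measure(xml_round_trip) * 1e3:.3f} ms")
print(f"binary round trip: {measure(binary_round_trip) * 1e3:.3f} ms")
```

The absolute numbers will vary by machine and payload; what matters is whether the difference is large enough, relative to the rest of your request handling, to justify giving up a readable format.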

0