Socket idiom in Python

I have experience programming sockets using the Berkeley Socket APIs in C. Typically, any socket programming requires a strategy that lets the receiving socket know how much data it should receive. This can be done with header fields or delimiters. Generally, I prefer a header field that contains a length.

Of course, we also need to know the size of the length header field itself, which is just a fixed size value that must be agreed upon by both the sender and the receiver. In C, this is easy to implement, since native integer types are fixed in size and in binary format, so you can just say something like:

    uint16_t bytes_to_receive;
    recv(sock, &bytes_to_receive, sizeof(bytes_to_receive), 0);
    bytes_to_receive = ntohs(bytes_to_receive);
    // Now receive 'bytes_to_receive' bytes...

But how is this idiom done using Python sockets? In Python, integers are objects, and pickled integers are byte strings of variable length. Therefore, we cannot use a pickled integer as a length header field, because we cannot be sure of its size in bytes.

Of course, I could always send a byte array of a known size containing a binary integer, such as b'\x05\x00', to represent a 16-bit integer with a value of 5 in little-endian format, but that really doesn't seem like the right approach.

So how is this usually done in Python?

3 answers

You can use the struct module to convert Python integers to and from byte strings. Just read the number of bytes that matches the size of the header type and convert it with the struct module, and you should be good to go. (Note: when encoding and decoding, be sure to use the correct byte-order flags.)
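As a rough sketch of that idea, the snippet below assumes a wire format of a 2-byte unsigned length in network byte order ("!H") followed by the payload. The helper names recv_exact, send_message and recv_message are made up for illustration and are not part of the socket module. It uses socket.socketpair() only so that it can run without a server.

```python
import socket
import struct

# Assumed wire format: a 2-byte unsigned length in network (big-endian)
# byte order, followed by that many payload bytes.
HEADER = struct.Struct("!H")


def recv_exact(sock, count):
    """Read exactly `count` bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send the length header followed by the payload (at most 65535 bytes)."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_message(sock):
    """Read the fixed-size header, then the number of bytes it announces."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


if __name__ == "__main__":
    a, b = socket.socketpair()  # a connected pair, standing in for client/server
    send_message(a, b"hello")
    print(recv_message(b))
    a.close()
    b.close()
```

The "!" prefix plays the role of ntohs()/htons() in the C version, and HEADER.size plays the role of sizeof(bytes_to_receive). With "!H", payloads longer than 65535 bytes make pack() raise struct.error, so a wider format such as "!I" may be a better fit for larger messages.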


The sys module provides the getsizeof() function, which returns the in-memory size of an object in bytes (using the object's __sizeof__ method). If you work with custom objects, you need to thoroughly test the __sizeof__ implementation, but it looks like this works fine for standard types.

Alternatively, you can serialize the data with pickle or json and take the length of the resulting string, although this can hurt performance.

Whichever method you use, if you are transmitting data of variable length, send the size first, then use that value to determine how much more data to read.
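Here is a small sketch of the "serialize, then send the size first" approach. It assumes JSON encoded as UTF-8 and a 4-byte "!I" length header built with the struct module. Those choices, the sample message and the unframe helper are illustrative only. The size that goes in the header is the length of the encoded bytes, which can differ both from the number of characters and from what sys.getsizeof() reports, since the latter includes the interpreter's per-object overhead.

```python
import json
import struct
import sys

message = {"user": "zoë", "values": [1, 3, 4]}

# Serialize to text, then encode to the bytes that will actually be sent.
text = json.dumps(message, ensure_ascii=False)
payload = text.encode("utf-8")

# The header describes the encoded bytes, not the characters.
header = struct.pack("!I", len(payload))
frame = header + payload


def unframe(data):
    """Hypothetical receiver side: read the size, then decode that many bytes."""
    (length,) = struct.unpack("!I", data[:4])
    return json.loads(data[4:4 + length].decode("utf-8"))


print(len(text), len(payload))  # characters vs. bytes ("ë" takes two bytes)
print(sys.getsizeof(payload))   # in-memory size, larger than the wire size
print(unframe(frame) == message)
```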

Other notes:

  • If you have not already done so, you will also want to read the API documentation for sockets.
  • Remember that complex types, such as lists, require extra space, so:
     >>> import sys
     >>> a = [1,3,4]
     >>> sys.getsizeof(a)
     96
     >>> l = 0
     >>> for i in a:
     ...     l += sys.getsizeof(i)
     ...
     >>> print(l)
     72

The ctypes module provides sizeof(), which you can apply to the C uint16 type used in your example:

    >>> import ctypes
    >>> ctypes.sizeof(ctypes.c_uint16)
    2
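For comparison, and only as a sketch, the struct module can report the same sizes through struct.calcsize(). The format strings below are my own choice of examples. With an explicit byte-order prefix such as "!", struct uses standard sizes with no padding, which is usually what you want for a wire format.

```python
import ctypes
import struct

# Standard-size formats agree with the fixed-width C types.
size_u16 = struct.calcsize("!H")
size_u32 = struct.calcsize("!I")
assert size_u16 == ctypes.sizeof(ctypes.c_uint16) == 2
assert size_u32 == ctypes.sizeof(ctypes.c_uint32) == 4

# With "!" the fields are packed back to back: 2 + 4 bytes.
packed_size = struct.calcsize("!HI")

# Without a prefix, struct uses native sizes and alignment, so the result
# is platform dependent and may include padding.
native_size = struct.calcsize("HI")

print(packed_size, native_size)
```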