Parsing binary data into a ctypes Structure object via readinto()

I am trying to handle a binary format, following the example here:

http://dabeaz.blogspot.jp/2009/08/python-binary-io-handling.html

    >>> from ctypes import *
    >>> class Point(Structure):
    ...     _fields_ = [ ('x', c_double), ('y', c_double), ('z', c_double) ]
    ...
    >>> g = open("foo", "rb")  # point structure data
    >>> q = Point()
    >>> g.readinto(q)
    24
    >>> q.x
    2.0

I have defined the structure of my header and I am trying to read the data into my structure, but I am having some difficulty. My structure is as follows:

    class BinaryHeader(BigEndianStructure):
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
            ("ascii_text_32bytes", c_char),
            ("timestamp_4bytes", c_uint),
            ("more_funky_numbers_7bytes", c_uint, 56),
            ("some_flags_1byte", c_byte),
            ("other_flags_1byte", c_byte),
            ("payload_length_2bytes", c_ushort),
        ]

The ctypes documentation says:

For integer type fields like c_int, a third optional item can be given. It must be a small positive integer defining the bit width of the field.

So with ("more_funky_numbers_7bytes", c_uint, 56) I tried to define the field as a 7-byte field, but I get this error:

    ValueError: number of bits invalid for bit field

So my first problem is: how do I define a 7-byte int field?

Then, if I skip that problem and comment out the "more_funky_numbers_7bytes" field, the data does get loaded, but as expected only 1 character is loaded into "ascii_text_32bytes". And for some reason readinto() returns 16, which I assume is the number of bytes it read into the structure. But if I comment out my "funky number" field and "ascii_text_32bytes" holds only one char (1 byte), shouldn't that be 13, not 16?

Then I tried to pull the char field out into a separate structure and reference it from my header structure. But that does not work either:

    class StupidStaticCharField(BigEndianStructure):
        _fields_ = [
            ("ascii_text_1", c_byte),
            ("ascii_text_2", c_byte),
            ("ascii_text_3", c_byte),
            ("ascii_text_4", c_byte),
            ("ascii_text_5", c_byte),
            ("ascii_text_6", c_byte),
            ("ascii_text_7", c_byte),
            ("ascii_text_8", c_byte),
            ("ascii_text_9", c_byte),
            ("ascii_text_10", c_byte),
            ("ascii_text_11", c_byte),
            .
            .
            .
        ]

    class BinaryHeader(BigEndianStructure):
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
            ("ascii_text_32bytes", StupidStaticCharField),
            ("timestamp_4bytes", c_uint),
            #("more_funky_numbers_7bytes", c_uint, 56),
            ("some_flags_1byte", c_ushort),
            ("other_flags_1byte", c_ushort),
            ("payload_length_2bytes", c_ushort),
        ]

So, any ideas on how to:

  • Define a field of 7 bytes (which I will need to decode using a specific function)
  • Define a static char field of 32 bytes

UPDATE

I found a structure that seems to work ...

    class BinaryHeader(BigEndianStructure):
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
            ("ascii_text_32bytes", c_char * 32),
            ("timestamp_4bytes", c_uint),
            ("more_funky_numbers_7bytes", c_byte * 7),
            ("some_flags_1byte", c_byte),
            ("other_flags_1byte", c_byte),
            ("payload_length_2bytes", c_ushort),
        ]

Now, however, my remaining question is why, when using .readinto():

    f = open(binaryfile, "rb")
    mystruct = BinaryHeader()
    f.readinto(mystruct)

it returns 52 and not the expected 51. Where does that extra byte come from, and where does it go?

UPDATE 2

For those interested, here is an example of the alternative struct approach for reading values into a namedtuple, as mentioned by eryksun:

    >>> from struct import unpack
    >>> record = 'raymond   \x32\x12\x08\x01\x08'
    >>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)
    >>> from collections import namedtuple
    >>> Student = namedtuple('Student', 'name serialnum school gradelevel')
    >>> Student._make(unpack('<10sHHb', record))
    Student(name='raymond   ', serialnum=4658, school=264, gradelevel=8)
1 answer

This line is actually the syntax for defining a bit field:

    ...
    ("more_funky_numbers_7bytes", c_uint, 56),
    ...

which is wrong here. The size of a bit field must be less than or equal to the size of its type, so for c_uint it can be at most 32. One extra bit raises the exception:

 ValueError: number of bits invalid for bit field 
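The limit can be checked directly. The following is a minimal sketch, not part of the original answer; the class names `Fits` and `TooWide` and the helper `define_too_wide` are made up for illustration. It declares one bit field exactly as wide as `c_uint` and one that is a single bit wider, and records the exception raised for the second.

```python
from ctypes import Structure, c_uint, sizeof

# A bit width equal to the full size of the type (in bits) is accepted.
class Fits(Structure):
    _fields_ = [("value", c_uint, 8 * sizeof(c_uint))]

def define_too_wide():
    """Try to declare a bit field one bit wider than c_uint (hypothetical helper)."""
    class TooWide(Structure):
        _fields_ = [("value", c_uint, 8 * sizeof(c_uint) + 1)]
    return TooWide

try:
    define_too_wide()
    too_wide_error = None
except ValueError as exc:
    # ctypes rejects the field while the class body is being processed.
    too_wide_error = exc

print(sizeof(Fits))
print(repr(too_wide_error))
```

The exact wording of the message may differ between Python versions, so the sketch only relies on the exception type.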

An example of using a bit field:

    from ctypes import *

    class MyStructure(Structure):
        _fields_ = [
            # c_uint8 is 8 bits long
            ('a', c_uint8, 4),  # first 4 bits of the first byte
            ('b', c_uint8, 2),  # next 2 bits of the first byte
            ('c', c_uint8, 2),  # next 2 bits of the first byte
            ('d', c_uint8, 2),  # the first byte is now full, so a new
                                # byte is created and `d` takes its
                                # first two bits
        ]

    mystruct = MyStructure()

    mystruct.a = 0b0000
    mystruct.b = 0b11
    mystruct.c = 0b00
    mystruct.d = 0b11

    v = c_uint16()

    # copy `mystruct` into `v`, I use Windows
    cdll.msvcrt.memcpy(byref(v), byref(mystruct), sizeof(v))

    print(sizeof(mystruct))  # 2 bytes, so 6 bits are left floating, you may
                             # want to memset with zeros
    print(bin(v.value))      # 0b1100110000

You need 7 bytes, which is what you ended up doing correctly:
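For readers who are not on Windows, the same inspection can be sketched without `msvcrt` by reading the structure's buffer with `bytes()`. This is an illustrative variant rather than the answer's own code, and it assumes Python 3. The exact placement of bit fields inside each byte depends on the platform and compiler conventions, so the printed bit patterns may differ between machines.

```python
from ctypes import Structure, c_uint8, sizeof

class MyStructure(Structure):
    _fields_ = [
        ("a", c_uint8, 4),
        ("b", c_uint8, 2),
        ("c", c_uint8, 2),
        ("d", c_uint8, 2),  # does not fit in the first byte any more
    ]

# Same field values as in the example above, passed as keyword arguments.
mystruct = MyStructure(a=0b0000, b=0b11, c=0b00, d=0b11)

# bytes() copies the underlying buffer, so no memcpy call is needed.
raw = bytes(mystruct)

print(sizeof(mystruct))
print([format(byte, "08b") for byte in raw])
```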

    ...
    ("more_funky_numbers_7bytes", c_byte * 7),
    ...

As for the size of the structure, it will be 52: one extra byte of padding is inserted so that payload_length_2bytes, a 2-byte c_ushort, starts at an even offset (50 instead of 49). Here:

    from ctypes import *

    class BinaryHeader(BigEndianStructure):
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
            ("ascii_text_32bytes", c_char * 32),
            ("timestamp_4bytes", c_uint),
            ("more_funky_numbers_7bytes", c_byte * 7),
            ("some_flags_1byte", c_byte),
            ("other_flags_1byte", c_byte),
            ("payload_length_2bytes", c_ushort),
        ]

    mystruct = BinaryHeader(
        0x11111111,
        b'\x22' * 32,
        0x33333333,
        (c_byte * 7)(*([0x44] * 7)),
        0x55,
        0x66,
        0x7777
    )

    print(sizeof(mystruct))

    with open('data.txt', 'wb') as f:
        f.write(mystruct)

An extra byte is padded between other_flags_1byte and payload_length_2bytes in the file:

    00000000 11 11 11 11  ....
    00000004 22 22 22 22  """"
    00000008 22 22 22 22  """"
    0000000C 22 22 22 22  """"
    00000010 22 22 22 22  """"
    00000014 22 22 22 22  """"
    00000018 22 22 22 22  """"
    0000001C 22 22 22 22  """"
    00000020 22 22 22 22  """"
    00000024 33 33 33 33  3333
    00000028 44 44 44 44  DDDD
    0000002C 44 44 44 55  DDDU
    00000030 66 00 77 77  f.ww
                ^
                extra byte

This is a problem when dealing with file formats and network protocols. To change it, set _pack_ to 1:
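The padding can also be seen without writing a file, because each field descriptor of a ctypes structure exposes `offset` and `size` attributes. The sketch below is an illustration and not part of the original answer; the `layout` helper and the `SingleCharHeader` class are hypothetical names. `SingleCharHeader` mirrors the question's first attempt (a single `c_char` and no 7-byte field), which is where the 16 instead of 13 came from. The numbers assume a typical platform where `c_uint` is 4 bytes with 4-byte alignment.

```python
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class BinaryHeader(BigEndianStructure):
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char * 32),
        ("timestamp_4bytes", c_uint),
        ("more_funky_numbers_7bytes", c_byte * 7),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

class SingleCharHeader(BigEndianStructure):
    # The question's first attempt, with the 7-byte field commented out.
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char),
        ("timestamp_4bytes", c_uint),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

def layout(struct_type):
    """Return (name, offset, size) for every field of a ctypes structure."""
    rows = []
    for field in struct_type._fields_:
        name = field[0]
        descriptor = getattr(struct_type, name)
        rows.append((name, descriptor.offset, descriptor.size))
    return rows

for struct_type in (BinaryHeader, SingleCharHeader):
    print(struct_type.__name__, "sizeof =", sizeof(struct_type))
    for name, offset, size in layout(struct_type):
        print("  %-28s offset=%2d size=%2d" % (name, offset, size))
```

In `BinaryHeader` the gap appears just before `payload_length_2bytes`; in `SingleCharHeader` three padding bytes follow the single char so that `timestamp_4bytes` starts on a 4-byte boundary.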

    ...
    class BinaryHeader(BigEndianStructure):
        _pack_ = 1
        _fields_ = [
            ("sequence_number_4bytes", c_uint),
    ...

The file will then be:

    00000000 11 11 11 11  ....
    00000004 22 22 22 22  """"
    00000008 22 22 22 22  """"
    0000000C 22 22 22 22  """"
    00000010 22 22 22 22  """"
    00000014 22 22 22 22  """"
    00000018 22 22 22 22  """"
    0000001C 22 22 22 22  """"
    00000020 22 22 22 22  """"
    00000024 33 33 33 33  3333
    00000028 44 44 44 44  DDDD
    0000002C 44 44 44 55  DDDU
    00000030 66 77 77     fww

As for struct, it won't make things easier in your case. Unfortunately, it does not support nested tuples in the format. For example:
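To tie this back to the original `readinto()` question, here is a small sketch of a packed header being filled from a 51-byte record. It is an illustration under stated assumptions: the class name `PackedHeader` and the sample field values are invented, the record is built in memory with `struct.pack` instead of coming from a real file, and `io.BytesIO` stands in for the opened binary file.

```python
import io
import struct
from ctypes import BigEndianStructure, c_byte, c_char, c_uint, c_ushort, sizeof

class PackedHeader(BigEndianStructure):
    _pack_ = 1  # no padding between fields
    _fields_ = [
        ("sequence_number_4bytes", c_uint),
        ("ascii_text_32bytes", c_char * 32),
        ("timestamp_4bytes", c_uint),
        ("more_funky_numbers_7bytes", c_byte * 7),
        ("some_flags_1byte", c_byte),
        ("other_flags_1byte", c_byte),
        ("payload_length_2bytes", c_ushort),
    ]

# Made-up sample record: big-endian, 51 bytes, text padded with NULs.
sample = struct.pack(
    ">I32sI7sbbH",
    1, b"hello", 1234567890, b"\x01\x02\x03\x04\x05\x06\x07", 0x55, 0x66, 300,
)

header = PackedHeader()
bytes_read = io.BytesIO(sample).readinto(header)

print(bytes_read, sizeof(PackedHeader))
print(header.sequence_number_4bytes, header.ascii_text_32bytes)
print(bytes(header.more_funky_numbers_7bytes), header.payload_length_2bytes)
```

With the packing set to 1, the number returned by `readinto()` matches the 51 bytes of the on-disk layout, and the 7-byte field is available as raw bytes for a custom decoding function.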

    >>> from struct import *
    >>>
    >>> data = ('\x11\x11\x11\x11'
    ...         '\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22'
    ...         '\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22\x22'
    ...         '\x33\x33\x33\x33'
    ...         '\x44\x44\x44\x44\x44\x44\x44'
    ...         '\x55\x66\x77\x77')
    >>>
    >>> BinaryHeader = Struct('>I32cI7BBBH')
    >>>
    >>> BinaryHeader.unpack(data)
    (286331153, '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"',
     '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"', '"',
     '"', '"', 858993459, 68, 68, 68, 68, 68, 68, 68, 85, 102, 30583)
    >>>

This result can't be used with namedtuple; you would still have to parse it by index. It would work if you could write something like '>I(32c)(I)(7B)(B)(B)H'. This feature has been requested (Extend struct.unpack to produce nested tuples) since 2003, but nothing has been done since.
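One possible workaround, which is not from the answer above, is to use the `s` format code so that the 32-byte text and the 7-byte field each come back as a single byte string. The unpacked tuple is then flat with one item per field and can be fed to a namedtuple. The sketch assumes Python 3 and that raw byte strings are acceptable for those two fields; the names `HEADER_FORMAT` and `Header` and the shortened field names are made up here.

```python
from collections import namedtuple
from struct import Struct

# '>' selects big-endian with no alignment padding, so the size is 51 bytes.
HEADER_FORMAT = Struct(">I32sI7sbbH")

Header = namedtuple(
    "Header",
    "sequence_number ascii_text timestamp funky_numbers "
    "some_flags other_flags payload_length",
)

# Same test pattern as the hex dumps above.
data = (
    b"\x11" * 4 + b"\x22" * 32 + b"\x33" * 4 + b"\x44" * 7
    + b"\x55" + b"\x66" + b"\x77" * 2
)

header = Header._make(HEADER_FORMAT.unpack(data))

print(HEADER_FORMAT.size)
print(header.sequence_number, header.timestamp, header.payload_length)
print(header.ascii_text, header.funky_numbers)
```

Unlike the ctypes structure, the text field keeps any trailing NUL padding, so it may need to be stripped by the caller.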
