Reading binary data structures in Python?

Are there any good Python solutions, like Ruby's BinData, for reading custom binary files or streams? If not, what is the preferred way to do this in Python beyond using the struct module?

I have a binary file that stores the "records" of events. Records are dynamic in size, so I have to read the first few bytes of each record to determine the length of the record and the type of record. Different types of records will have different byte layouts. For example, a warning record may contain three 4-byte ints followed by a 128-byte value, while an information record may contain only five 4-byte ints.

It would be nice to define the different record types and their structures in such a way that I could just pass the binary data to something and have it handle the rest (generating objects, etc.). In short, you are defining patterns / maps for how to interpret binary data.

4 answers

Perhaps you are looking for Construct, a pure-Python binary parsing library that supports Python 2 and 3?


The Python struct module works as follows:

    import struct

    record_header = struct.Struct("<cb")
    warning = struct.Struct("<iii128s")  # three 4-byte ints, then a 128-byte value
    info = struct.Struct("<iiiii")       # five 4-byte ints

    # infile is a file object opened in binary mode ("rb")
    while True:
        header_text = infile.read(record_header.size)
        # end of file
        if not header_text:
            break
        packet_type, extra_data = record_header.unpack(header_text)
        # the 'c' format yields a 1-byte bytes object, so compare against bytes
        if packet_type == b'w':
            warning_data = warning.unpack(infile.read(warning.size))
        elif packet_type == b'i':
            info_data = info.unpack(infile.read(info.size))

See the documentation for more details: http://docs.python.org/library/struct.html
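
The same dispatch can also be written as a lookup table, which is closer to the declarative style the question asks for. This is only a minimal sketch: it assumes a header of a 1-byte type code plus a 1-byte extra field, and the names `RECORD_TYPES`, `WarningRecord`, `InfoRecord`, `read_records` and all field names are made up for illustration, not part of any library.

```python
import io
import struct
from collections import namedtuple

HEADER = struct.Struct("<cb")  # assumed: 1-byte type code, 1-byte extra field

# Hypothetical record classes; the field names are placeholders.
WarningRecord = namedtuple("WarningRecord", "a b c value")
InfoRecord = namedtuple("InfoRecord", "a b c d e")

# Map each type code to its byte layout and the class to build from it.
RECORD_TYPES = {
    b"w": (struct.Struct("<iii128s"), WarningRecord),
    b"i": (struct.Struct("<iiiii"), InfoRecord),
}


def read_records(stream):
    """Yield one namedtuple per record until the stream is exhausted."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            break  # clean end of file
        if len(header) < HEADER.size:
            raise EOFError("truncated header")
        type_code, _extra = HEADER.unpack(header)
        layout, record_cls = RECORD_TYPES[type_code]
        body = stream.read(layout.size)
        if len(body) < layout.size:
            raise EOFError("truncated record body")
        yield record_cls._make(layout.unpack(body))


# Build two records in memory and parse them back.
data = (
    HEADER.pack(b"i", 0) + struct.pack("<iiiii", 1, 2, 3, 4, 5)
    + HEADER.pack(b"w", 0) + struct.pack("<iii128s", 7, 8, 9, b"disk full")
)
records = list(read_records(io.BytesIO(data)))
```

Adding a new record type then only means adding one entry to the table. Note that `128s` pads the value with null bytes, so you may want to strip them after unpacking.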


The struct module will probably work, but you can also use Python bindings to


I would like to give an example of reading a C struct in Python.

    typedef struct {
        ID chunkname;
        long chunksize;
        /* Note: there may be additional fields here, depending upon your data. */
    } Chunk;

How do you read structure data from a file in Python? Here is one way:

    class Chunk:
        def __init__(self, file, align=True, bigendian=True, inclheader=False):
            import struct
            self.closed = False
            self.align = align  # whether to align to word (2-byte) boundaries
            if bigendian:
                strflag = '>'
            else:
                strflag = '<'
            self.file = file
            self.chunkname = file.read(4)
            if len(self.chunkname) < 4:
                # you need to take care of end of file
                raise EOFError
            try:
                # you could use unpack
                # http://docs.python.org/2/library/struct.html#format-characters
                # here 'L' means 'unsigned long', 4 bytes in standard size
                self.chunksize = struct.unpack(strflag+'L', file.read(4))[0]
            except struct.error:
                # you need to take care of end of file
                raise EOFError
            if inclheader:
                self.chunksize = self.chunksize - 8  # subtract header
            self.size_read = 0
            try:
                self.offset = self.file.tell()
            except (AttributeError, IOError):
                self.seekable = False
            else:
                self.seekable = True

So you need to understand the correspondence between the fields of the C struct and the format characters accepted by struct.unpack(): http://docs.python.org/2/library/struct.html#format-characters
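
As a small illustration of that correspondence, a header made of a 4-byte ID followed by a 4-byte size maps to the format `4sL` with an explicit byte-order prefix. This is a sketch assuming big-endian data; the name `CHUNK_HEADER` and the sample values are made up for the example.

```python
import struct

# '>'  = big-endian, standard sizes, no padding
# '4s' = char[4] (the chunk ID)
# 'L'  = unsigned long, 4 bytes in standard size
CHUNK_HEADER = struct.Struct(">4sL")

raw = CHUNK_HEADER.pack(b"FORM", 1234)  # hypothetical header bytes
name, size = CHUNK_HEADER.unpack(raw)
```

The byte-order prefix matters: without it, struct uses native sizes and alignment, where `L` can be 8 bytes on some platforms and the header would no longer be 8 bytes long.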
