Can I stream a pickled Python list, tuple, or other iterable data type?

I often work with comma- or tab-delimited data files that might look like this:

    key1,1,2.02,hello,4
    key2,3,4.01,goodbye,6
    ...

I could read and pre-process this in Python to a list of lists, for example:

    [ [ 'key1', 1, 2.02, 'hello', 4 ], [ 'key2', 3, 4.01, 'goodbye', 6 ] ]

Sometimes I like to keep this list of lists as a pickle, as it preserves the different types in my records. If the pickled file is large, however, it would be great to read this list of lists in a streaming fashion.

In Python, to read a text file as a stream, I use the following to print each line:

    with open( 'big_text_file.txt' ) as f:
        for line in f:
            print line

Is it possible to do something like this for a Python list, i.e.:

    import pickle

    with open( 'big_pickled_list.pkl' ) as p:
        for entry in pickle.load_streaming( p ):  # note: pickle.load_streaming doesn't exist
            print entry

Is there a pickle function like "load_streaming"?

python ipython pickle streaming
2 answers

That will work.

What it does, however, is unpickle a single object from the file and then print the rest of the file's contents to stdout.

What you can do is something like:

    import cPickle

    # Pickle files are binary, so open them in 'rb' mode.
    with open( 'big_pickled_list.pkl', 'rb' ) as p:
        try:
            while True:
                print cPickle.load(p)
        except EOFError:
            pass  # end of file reached

This will unpickle all objects in the file, one at a time, until EOF is reached.
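For repeated load calls to return individual records, the file has to contain one pickle per record; a file holding a single pickled list of lists gives back the whole list on the first load. The following is a minimal sketch of that round trip, written for Python 3 (where `pickle` replaces `cPickle`). The file name `records.pkl` and the sample records are illustrative assumptions, not part of the original question.

```python
import os
import pickle
import tempfile

records = [
    ["key1", 1, 2.02, "hello", 4],
    ["key2", 3, 4.01, "goodbye", 6],
]

# Hypothetical file name, created in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "records.pkl")

# Write one pickle per record instead of a single pickle of the whole list.
with open(path, "wb") as f:
    for record in records:
        pickle.dump(record, f)

# Each load() call returns exactly one record and advances the file position.
loaded = []
with open(path, "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))
        except EOFError:
            break  # clean end of file: no more pickles to read
```

Writing the records separately is what makes the read side streamable: memory use is bounded by the largest single record rather than by the whole data set.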


If you want something that works like for line in f: you can easily wrap this:

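    def unpickle_iter(file):
        try:
            while True:
                yield cPickle.load(file)
        except EOFError:
            # A plain return ends the generator; raising StopIteration inside
            # a generator is an error in Python 3.7+ (PEP 479).
            return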

Now you can simply do this:

    with open('big_pickled_list.pkl', 'rb') as file:
        for item in unpickle_iter(file):
            # use item ...
            pass
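Because the wrapper is a generator, records are unpickled only as they are requested. As a rough illustration in Python 3, the sketch below uses an in-memory `io.BytesIO` buffer as a stand-in for a large file (an assumption made so the example is self-contained) and takes just the first three records with `itertools.islice`.

```python
import io
import itertools
import pickle

def unpickle_iter(fileobj):
    """Yield successive pickled objects from a binary file object."""
    while True:
        try:
            yield pickle.load(fileobj)
        except EOFError:
            return  # a plain return ends the generator

# In-memory stand-in for a large file holding 1000 separately pickled rows.
buf = io.BytesIO()
for i in range(1000):
    pickle.dump(["key%d" % i, i], buf)
buf.seek(0)

# Only the first three records are unpickled; the remaining ones are left alone.
first_three = list(itertools.islice(unpickle_iter(buf), 3))
```

The same pattern works with a real file opened in 'rb' mode in place of the buffer.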

To follow up on the comment I made on the other answer, I recommend a loop more like this:

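    import cPickle
    import io

    # io.open in binary mode returns a buffered reader, which provides peek();
    # the plain Python 2 file object has no peek() method.
    with io.open( 'big_pickled_list.pkl', 'rb' ) as p:
        while p.peek(1):
            print cPickle.load(p)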

This way you will still get an EOFError exception if there is a damaged (for example, truncated) object in the file, instead of having it silently treated as the normal end of the file.

For completeness:

    def unpickle_iter(file):
        while file.peek(1):
            yield cPickle.load(file)
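To see the difference this makes, here is a small Python 3 sketch that feeds the peek-based generator a clean stream and a deliberately truncated one. The in-memory streams and sample records are assumptions for the demo. Note that in Python 3 a truncated pickle may surface as either EOFError or pickle.UnpicklingError, depending on where the data was cut, so the sketch catches both.

```python
import io
import pickle

def unpickle_iter(fileobj):
    """Yield pickled objects; fileobj must be a buffered binary reader with peek()."""
    while fileobj.peek(1):  # empty bytes means a clean end of stream
        yield pickle.load(fileobj)

# Two complete pickles, then the same data followed by a truncated third pickle.
good = pickle.dumps(["key1", 1]) + pickle.dumps(["key2", 3])
damaged = good + pickle.dumps(["key3", 5])[:-4]

clean_items = list(unpickle_iter(io.BufferedReader(io.BytesIO(good))))

recovered = []
error = None
try:
    for item in unpickle_iter(io.BufferedReader(io.BytesIO(damaged))):
        recovered.append(item)
except (EOFError, pickle.UnpicklingError) as exc:
    error = exc  # the damaged trailing record is reported, not hidden
```

With the clean stream the loop ends quietly; with the damaged one the two intact records are still delivered before the error is raised.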