If you want to read large files, just use the file object and read the lines one at a time, processing each line as you need. If you want to save a python session, just use dill.dump_session, and it will save all existing objects. Other answers will fail because pickle cannot serialize the open file. dill, however, can serialize almost every python object, including an open file handle.
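To check the claim about pickle, here is a minimal sketch using only the standard library. The helper name `try_pickle_open_file` and the scratch file are hypothetical stand-ins for illustration (the scratch file plays the role of bigfile1.dat); the only assumption is that the standard pickle module raises a TypeError for an open file object.

```python
import os
import pickle
import tempfile


def try_pickle_open_file(path):
    """Return the error message pickle gives for an open file, or None if it succeeds."""
    with open(path, 'r') as f:
        f.readline()  # advance the handle, as in the session in this answer
        try:
            pickle.dumps(f)
        except TypeError as err:
            return str(err)
    return None


# Hypothetical scratch file standing in for bigfile1.dat.
fd, path = tempfile.mkstemp(suffix='.dat')
with os.fdopen(fd, 'w') as out:
    out.write('line one\nline two\n')

message = try_pickle_open_file(path)
os.remove(path)
print(message)
```

The exact wording of the message varies between Python versions, but the point is that plain pickle refuses the handle. The dill session that follows does not hit this error.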
    Python 2.7.9 (default, Dec 11 2014, 01:21:43)
    [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import dill
    >>> f = open('bigfile1.dat', 'r')
    >>> data = f.readline()
    >>>
    >>> dill.dump_session('session.pkl')
    >>>
Then close the python session and restart. When you call load_session, you load all the objects that existed at the time of the dump_session call.
    dude@hilbert>$ python
    Python 2.7.9 (default, Dec 11 2014, 01:21:43)
    [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import dill
    >>> dill.load_session('session.pkl')
    >>> len(data)
    9
    >>> data += f.readline()
    >>> f.close()
    >>>
Just like that.
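For the first suggestion (reading a large file one line at a time), a minimal sketch might look like the following. The helper `process_lines`, the callback, and the scratch file are hypothetical names chosen for illustration; the technique is just iterating over the file object, which yields lines lazily instead of loading the whole file.

```python
import os
import tempfile


def process_lines(path, handle_line):
    """Stream a file line by line, calling handle_line on each; return the line count."""
    count = 0
    with open(path, 'r') as f:
        for line in f:  # the file object yields one line at a time
            handle_line(line)
            count += 1
    return count


# Hypothetical scratch file standing in for a large data file.
fd, path = tempfile.mkstemp(suffix='.dat')
with os.fdopen(fd, 'w') as out:
    out.write('alpha\nbeta\ngamma\n')

lengths = []
n = process_lines(path, lambda line: lengths.append(len(line.rstrip('\n'))))
os.remove(path)
print(n, lengths)
```

Only the current line is held in memory at any time, so this scales to files far larger than RAM, as long as the per-line work does not accumulate everything itself.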
Get dill here: https://github.com/uqfoundation
Mike McKerns