Caching open netCDF file handles in Python

Is there any way to cache open files in Python? I have a function that takes the path to a netCDF file as input, opens it, extracts some data from it, and closes it. It is called many times, and the overhead of opening the file each time is high.

How can I make this faster, possibly by caching the file descriptor? Perhaps there is a Python library for this.


Yes, you can use the dill and python-memcached libraries.

Let's look at an example. You have two files:

```python
# save.py - serializes a file handle object and puts it into memcached
import dill
import memcache

mc = memcache.Client(['127.0.0.1:11211'], debug=0)
file_handler = open('data.txt', 'r')
mc.set("file_handler", dill.dumps(file_handler))
print('saved!')
```

and

```python
# read_from_file.py - gets the serialized file handle object from memcached,
# deserializes it, and reads lines from it
import dill
import memcache

mc = memcache.Client(['127.0.0.1:11211'], debug=0)
file_handler = dill.loads(mc.get("file_handler"))
print(file_handler.readlines())
```

Now if you run:

```shell
python save.py
python read_from_file.py
```

you can get what you want.

Why does it work?

Because dill does not serialize the operating-system file descriptor itself: it records the file's name, mode, and current position, so deserializing the object reopens the file at the same offset. That is why this works even in a different process — as long as the file still exists and is readable there.

Solution

```python
# solution.py - reuse the cached handle if present, otherwise open and cache it
import dill
import memcache

mc = memcache.Client(['127.0.0.1:11211'], debug=0)
serialized = mc.get("file_handler")
if serialized:
    file_handler = dill.loads(serialized)
else:
    file_handler = open('data.txt', 'r')
    mc.set("file_handler", dill.dumps(file_handler))
print(file_handler.readlines())
```

How about this?

```python
filehandle = None

def get_filehandle(filename):
    global filehandle  # needed to rebind the module-level variable
    if filehandle is None or filehandle.closed:  # .closed is an attribute, not a method
        filehandle = open(filename, "r")
    return filehandle
```

You may want to encapsulate this in a class to prevent other code from tampering with the filehandle variable.

