I am writing a class that has some computation-heavy methods and some parameters that the user will want to tweak iteratively and that are independent of the computation.
The actual use case is visualization, but here is a cartoon example:

    class MyClass(object):

        def __init__(self, x, name, mem=None):
            self.x = x
            self.name = name
            if mem is not None:
                # Wrap the bound method with the user-supplied joblib.Memory
                self.square = mem.cache(self.square)

        def square(self, x):
            """This is the 'computation heavy' method."""
            return x ** 2

        def report(self):
            """Use the results of the computation and a tweakable parameter."""
            print("Here you go, %s" % self.name)
            return self.square(self.x)

The basic idea is that the user may want to create many instances of this class with the same x parameter but different name parameters. I want to let the user provide a joblib.Memory object that caches the computation part, so that they can "report" to many different names without recomputing the squared array every time.
(This is a bit strange, I know. The reason the user needs a separate instance of the class for each name is that they will actually be interacting with an interface function that looks like this:

    def myfunc(x, name, mem=None):
        theclass = MyClass(x, name, mem)
        theclass.report()

But let's ignore that for now.)
Following the joblib docs, I cache the square function with the line self.square = mem.cache(self.square). The problem is that, since self will be different for different instances, the array is recomputed every time, even when the argument is the same.
I assume that the correct way to handle this is to change the line to

    self.square = mem.cache(self.square, ignore=["self"])

However, are there any drawbacks to this approach? Is there a better way to do caching?
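For comparison, here is a minimal sketch of one possible alternative: move the heavy computation into a module-level function so that self never becomes part of the cache key. To keep the sketch runnable without joblib or a cache directory, it uses DictMemory, a hypothetical in-memory stand-in that only mimics the .cache(func) call. The names _square, CALLS, and the example user names are also made up for illustration. Presumably a real joblib.Memory instance could be passed in the same way, since the class only calls mem.cache, but that is an assumption rather than something tested here.

```python
class DictMemory(object):
    """Hypothetical in-memory stand-in for joblib.Memory (illustration only).

    Unlike joblib, it needs hashable arguments, so it would not work with
    arrays; it only serves to show which values end up in the cache key.
    """

    def __init__(self):
        self._store = {}

    def cache(self, func):
        def cached(*args):
            # The key depends only on the function name and its arguments.
            key = (func.__name__, args)
            if key not in self._store:
                self._store[key] = func(*args)
            return self._store[key]
        return cached


CALLS = []  # records every real computation, to make cache hits visible


def _square(x):
    """The 'computation heavy' step, free of any reference to self."""
    CALLS.append(x)
    return x ** 2


class MyClass(object):

    def __init__(self, x, name, mem=None):
        self.x = x
        self.name = name
        # Wrap the module-level function, so the instance is never hashed.
        self._square = _square if mem is None else mem.cache(_square)

    def report(self):
        """Use the results of the computation and a tweakable parameter."""
        print("Here you go, %s" % self.name)
        return self._square(self.x)


mem = DictMemory()
# Three instances with the same x but different (made-up) names.
results = [MyClass(3, name, mem).report() for name in ("Alice", "Bob", "Carol")]
```

In this sketch all three instances share one cached result, so CALLS records a single real computation. The trade-off is that the computation can no longer read instance state implicitly; anything it needs has to be passed in as an explicit argument.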
python memoization joblib
mwaskom