Python: creating an LRU cache

I have around 600,000 entries in MongoDB in the following format:

 feature:category:count 

Where

  • feature can be any word,
  • category is either positive or negative, and
  • count indicates how many times the feature occurred in documents for this category.

I want to cache, say, the top 1000 tuples so that I do not query the database every time.
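
For illustration, the repeated lookup I am trying to avoid looks roughly like this (pymongo, with placeholder database, collection, and field names):

from pymongo import MongoClient

client = MongoClient()
counts = client['mydb']['feature_counts']  # placeholder database/collection names

def get_count(feature, category):
    # One document per (feature, category) pair,
    # e.g. {'feature': 'excellent', 'category': 'positive', 'count': 12}
    doc = counts.find_one({'feature': feature, 'category': category})
    return doc['count'] if doc else 0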

How can I build an LRU cache in Python? Or are there any known solutions?

+7
4 answers

The LRU cache in Python 3.3 has O(1) insertion, deletion, and lookup.

The design uses a circular doubly linked list of entries (ordered oldest to newest) and a hash table to locate individual links. Cache hits use the hash table to find the relevant link and move it to the head of the list. Cache misses delete the oldest link and create a new link at the head of the linked list.

Here is a simplified (but fast) version in 33 lines of very basic Python (using only simple dictionaries and lists). It works on Python 2.0 and later (as well as PyPy, Jython, and Python 3.x):

class LRU_Cache:

    def __init__(self, original_function, maxsize=1024):
        # Link structure: [PREV, NEXT, KEY, VALUE]
        self.root = [None, None, None, None]
        self.root[0] = self.root[1] = self.root
        self.original_function = original_function
        self.maxsize = maxsize
        self.mapping = {}

    def __call__(self, *key):
        mapping = self.mapping
        root = self.root
        link = mapping.get(key)
        if link is not None:
            # Cache hit: unlink the entry and move it to the most-recent end.
            link_prev, link_next, link_key, value = link
            link_prev[1] = link_next
            link_next[0] = link_prev
            last = root[0]
            last[1] = root[0] = link
            link[0] = last
            link[1] = root
            return value
        # Cache miss: compute the value and add a new link at the most-recent end.
        value = self.original_function(*key)
        if len(mapping) >= self.maxsize:
            # Evict the oldest entry (the link right after root).
            oldest = root[1]
            next_oldest = oldest[1]
            root[1] = next_oldest
            next_oldest[0] = root
            del mapping[oldest[2]]
        last = root[0]
        last[1] = root[0] = mapping[key] = [last, root, key, value]
        return value


if __name__ == '__main__':
    p = LRU_Cache(ord, maxsize=3)
    for c in 'abcdecaeaa':
        print(c, p(c))
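
Applied to the question, you would pass your database lookup as original_function; the lookup below is just a stand-in for the real MongoDB query:

def fetch_count(feature, category):
    # Stand-in for the real MongoDB query; replace the body with your lookup.
    return 0

cached_fetch = LRU_Cache(fetch_count, maxsize=1000)
cached_fetch('excellent', 'positive')   # first call runs fetch_count
cached_fetch('excellent', 'positive')   # repeat call is answered from the cache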
+17

In addition to the version included in Python 3.2, there are recipes for an LRU cache in the Python Cookbook, including some by Python core developer Raymond Hettinger.

+5

Python 3.2's functools includes an LRU cache. You could easily pull it from the repo, check whether you need to adapt it to work with Python 2 (it should not be too hard; perhaps using itertools instead of certain built-ins, ask if you need help), and be done. You need to wrap the query in a callable and make sure it depends on the (hashable) function arguments, though.
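
A sketch of that wrapping on Python 3.2+ (the query body here is a stub; the real version would hit MongoDB):

from functools import lru_cache

@lru_cache(maxsize=1000)
def get_count(feature, category):
    # Both arguments are hashable strings, so together they form the cache key.
    return 0  # replace with the real MongoDB lookup

get_count('excellent', 'positive')    # miss: runs the function
get_count('excellent', 'positive')    # hit: served from the cache
print(get_count.cache_info())         # CacheInfo(hits=1, misses=1, maxsize=1000, currsize=1)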

+3

There are also backports of the Python 3.3 version of lru_cache, such as this one, which runs on Python 2.7. If you are interested in two levels of caching (checking a shared cache when the result is not cached in the instance), I have created lru2cache based on the lru_cache backport.
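
For example, with the functools32 backport installed (one possible choice, assumed here; other backports should look similar), the Python 2.7 usage mirrors the 3.x decorator:

# pip install functools32   (assumed backport package; adjust to the one you choose)
from functools32 import lru_cache

@lru_cache(maxsize=1000)
def get_count(feature, category):
    return 0  # stand-in for the real database lookup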

+1
