C ++ container allowing you to sort items when they are in last access?

Is there such a thing? or can anyone recommend how I could implement such a container?

Basically, I have std :: map, which uses an integer of 64 bits and a user data type as the containing element as the key.

I need to be able to periodically delete items that were available after a while in the most optimal way. Does anyone have any suggestions for this?

amuses

+6
c ++
source share
7 answers

One idea: save std :: deque , which gets an iterator into your map element, which is pushed to the front whenever you access the map. Then you can easily look at deque to find out which items were used most recently.

Some C ++ sketches (exception for error checking, the point is to demonstrate that deque is updated when accessing the map, and you can crop the map later).

class MyMap { typedef std::map<int64_t, void *> Map; Map m_map; std::deque<Map::iterator> m_recentlyUsedItems; public: void *getItem( int64_t key ) { Map::iterator it = m_map.find( key ); if ( it == m_map.end() ) { return 0; } m_recentlyUsedItems.push_front( it ); return it->second; } void removeAllButMostRecentlyUsedItems( int n ) { std::deque<Map::iterator> it = m_recentlyUsedItems.begin(); advance( it, n ); std::deque<Map::iterator> it2 = it; for ( ; it2 != m_recentlyUsedItems.end(); ++it2 ) { m_map.erase( *it2 ); } m_recentlyUsedItems.erase( it, m_recentlyUsedItems.end() ); } }; 
+4
source share

Use a priority queue that places the least recently used (LRU) item at the top of the queue. When an item is available, delete it and reinsert it into the current timestamp. When you want to expire items, simply remove them from the head of the queue.

I must point out that you cannot use the standard priority_queue as this does not support random deletion. You will have to use heap functions in combination with the vector.

I must also indicate that updating an access item will be expensive (O (N) to find the item to delete).

EDIT: Please ignore this answer. To rethink this is not the best way to do this. (Also see Comments.)

+5
source share

Here's a sketch of how this can be done using a list to store the most recent items available in order. The list is constantly updated, so there is no significant overhead for accessing the map (unlike some other answers that require a linear search for each access). I supported the interface very simple and did not test it very carefully.

 template <typename KEY, typename VALUE> class Container { public: void Set(const KEY& key, const VALUE& value) { typename Map::iterator it = map.find(key); if (it == map.end()) { list.push_front(it); it = map.insert(std::make_pair(key, std::make_pair(value, list.begin()))).first; list.front() = it; } else { it->second.first = value; Accessed(it); } } const VALUE* Get(const KEY& key) { typename Map::iterator it = map.find(key); if (it == map.end()) return 0; Accessed(it); return &it->second.first; } void Expire(std::size_t new_size) { while (list.size() > new_size) { map.erase(list.back()); list.pop_back(); } } private: // Needed to resolve the semicircular dependency on nested iterator types. struct MapIterator; typedef std::list<MapIterator> List; typedef std::map<KEY, std::pair<VALUE, typename List::iterator> > Map; struct MapIterator : Map::iterator { MapIterator(const typename Map::iterator& it) : Map::iterator(it) {} }; void Accessed(typename Map::iterator it) { list.erase(it->second.second); list.push_front(it); it->second.second = list.begin(); } Map map; List list; }; 
+5
source share

I am going to offer a completely different idea.

The problem of optimality is that it is difficult to understand. Especially: do you want to damage the extraction operation in order to get faster cleaning? Typical cleaning is usually performed during β€œdowntime”, when the speed is not so important, on the other hand, you may need a quick receipt (for access in a loop, etc.)

There, I would suggest that before you try fancy designs, you just save the last available time along your element on the map. Then cleaning simply consists of checking each element on the map and deleting those that you no longer need.

+2
source share

If you just need to know which elements were accessed so that you could delete them, you could probably take a card with multiple indices and save the last access value as an alternate key.

If you want to use this idea to increase productivity, you can implement your own container. The simplest approach would be to create a data structure that is known as an auto-sorting list . In fact, this means that each access operation makes an available item in your list new. In this case, the accessed items will often be close to the beginning, which will lead to a better search time.

Of course there are variations. Automatically sorted lists are not very efficient, and there are many other similar data structures that are actually better.

+1
source share

I implemented a similar type of sound, which I called DynamicCache. It basically stores data in a list sorted by creation date. This can be easily changed to the last available date. The purpose of my cache is to cache database items that don't change very often. It caches items for 5 minutes, then deletes them so that they can be read again when they are available.

A cache search uses a map in which the key and iterator are stored in a list. The search uses the map to find the data in the list, and then, before returning the item, removes all old items from the end of the list. If the item is not in the cache, factory is called to provide data.

This approach should use a list to store data, since iterators on the map must always be valid if it used deque, iterators may be invalid after insertion. The list uses a structure to store data, a key, the time it was created (not the last access), and finally, if the data exists.

 struct Record { KeyT key; DataT data; time_t createTime; bool exists; }; 

If your data is static and you want to keep the last access, then you can add a member of the structure access time and move the item to the top of the list each time it is available.

Here is my code, it looks a little more complicated, but this is mainly due to the template parameters and blocking reader entries.

 #include "BWThread/BWReadersWriterLock.h" #include <list> #include <map> #include <ctime> #include <memory> #include <boost/scoped_ptr.hpp> /** * This is a Generic Cache implementation. * * To implement a cache using this class create a new class * derived from CacheFactory and implement the lookup method. * If the datasource supports updating implement update and * remove method. * * EG * typedef NameCache DynamicCache<int, BWString>; * NameCacheFactory : NameCache::Factory * { * public: * virtual bool lookup(int, BWString *); * }; * * NameCache cache(new NameCacheFactory, "<cache name>" ); * * -------------------------------------------------------- * Implementation note: * This class uses a list as an efficient way to remove stale items from * the cache. The map stores a key and an iterators to the data in the list. * The list and the map are updated together. */ template <class KeyT, class DataT> class CacheFactory { public: virtual ~CacheFactory() {} // Lookup the data for from the data source. // Return true if the data is found. virtual bool lookup(const KeyT & key, DataT * data) = 0; // Update or insert the data in the data source. // Return true if the data can be updated. // Returning false means the cache is not updated either. virtual bool update(const KeyT & key, const DataT & data) { return false; } // Remove the data in the data source. // Return true if the data can be deleted weather it exists or not. // Returning false means the cache is not updated either. virtual bool remove(const KeyT & key) { return false; } }; template <class KeyT, class DataT> class DynamicCache { public: typedef CacheFactory<KeyT, DataT> Factory; DynamicCache(Factory * f, const std::string & name, time_t t = (5 * 60)) : factory(f), timeout(t), lastClean(std::time(0)), lock(name + " DynamicCache") {} /* * Lookup a key in the cache, the cached version is returned if it is * present and the value is not old. If the value is old or is not * present then use the factory to create it and insert the value in the * cache for future lookups. If the factory cannot create it cache this * fact too so we will ignore future lookups. Afterwards any entries in * the cache longer than timeout are removed. * * This is the main method and entry point for the cache. All locking is * performed inside the child methods. */ bool lookup(const KeyT & key, DataT * data, time_t now = std::time(0)) { bool found = false; FindStatus status = find(key, data, now); switch(status & EntryStatus) { case Found: found = true; break; case Create: found = build(key, data, now); break; } if (status & CleanRequired) { cleanOldEntries(now); } return found; } bool update(const KeyT & key, const DataT & data, time_t now = std::time(0)) { if (factory->update(key, data)) { Record record; record.key = key; record.createTime = now; record.data = data; record.exists = true; BWReadersWriterLock::WriteLockGuard guard(lock, __FILE__, __LINE__); updateEntry(key, record); return true; } return false; } bool remove(const KeyT & key, time_t now = std::time(0)) { if (factory->remove(key)) { Record record; record.key = key; record.createTime = now; record.exists = false; BWReadersWriterLock::WriteLockGuard guard(lock, __FILE__, __LINE__); updateEntry(key, record); return true; } return false; } /** * Return the size of the cache (only really useful for unit testing). */ size_t size() const { BWReadersWriterLock::ReadLockGuard guard(lock, __FILE__, __LINE__); return map.size(); } Factory * getFactory() { return factory.get(); } private: // Cache record struct Record { KeyT key; DataT data; time_t createTime; bool exists; }; // Find and Clean status // CleanRequired is part of this so that searching the cache and finding // stale items in the cache can be automic (use a single readlock). enum FindStatus { None, Found, Create, //Add NotExist, EntryStatus=Found|Create|NotExist, CleanRequired = 8 }; typedef std::list<Record> List; typedef typename List::iterator Iterator; typedef std::map<KeyT, typename Iterator> Map; // // The following methods all use and require explicit locking. // FindStatus find(const KeyT & key, DataT * data, time_t now) { BWReadersWriterLock::ReadLockGuard guard(lock, __FILE__, __LINE__); Iterator itr = getEntry(key); if (isValid(itr) && !isOld(itr, now)) { if (itr->exists) { *data = itr->data; return FindStatus(Found | cleanRequired(now)); } else { return FindStatus(NotExist | cleanRequired(now)); } } return FindStatus(Create | cleanRequired(now)); } bool build(const KeyT & key, DataT * data, time_t now) { Record record; record.key = key; record.createTime = now; record.exists = factory->lookup(key, &record.data); BWReadersWriterLock::WriteLockGuard guard(lock, __FILE__, __LINE__); if (record.exists) { *data = record.data; } updateEntry(key, record); return record.exists; } void cleanOldEntries(time_t now) { BWReadersWriterLock::WriteLockGuard guard(lock, __FILE__, __LINE__); lastClean = now; time_t old = now - timeout; typename List::reverse_iterator itr = list.rbegin(); while(!list.empty() && list.back().createTime < old) { removeEntry(getEntry(list.back().key)); } } // // The following methods don't use locking but require the calling // method to already have aquired a lock. // Iterator getEntry(const KeyT & key) { typename Map::const_iterator itr = map.find(key); if (itr != map.end()) { return map.find(key)->second; } return list.end(); } bool updateEntry(const KeyT key, const Record & record) { Iterator itr = getEntry(key); if (isValid(itr)) { removeEntry(itr); } insertEntry(record); return record.exists; } bool isValid(Iterator itr) const { typename List::const_iterator constItr(itr); return constItr != list.end(); } bool isOld(Iterator itr, time_t now) const { // isOld or time_t has wrapped return ((itr->createTime + timeout) < now) || (now < itr->createTime); } Iterator insertEntry(const Record & record) { list.push_front(record); Iterator itr = list.begin(); map.insert(typename Map::value_type(record.key, itr)); return itr; } void removeEntry(Iterator itr) { map.erase(itr->key); list.erase(itr); } FindStatus cleanRequired(time_t now) const { return (lastClean + timeout) < now ? CleanRequired : None; } List list; Map map; time_t timeout; time_t lastClean; boost::scoped_ptr<CacheFactory<KeyT, DataT> > factory; mutable BWReadersWriterLock lock; }; 
+1
source share

You can also use linked_hash_map from the MCT library . In fact, its documentation contains a recipe for this utility.

+1
source share

All Articles