Database (probably NoSQL) with built-in data retention features

Which open source databases have automatic aging features, so you can specify how long a piece of data is stored?

That is, you set a date or time on a piece of data, after which the database deletes all traces of it.

Update: I am looking for ages ranging from a few days to a few years, rather than a few minutes or seconds. So a caching mechanism is not quite what I am looking for.

+4
4 answers

MongoDB has something in the new version 2.2 that might be of interest: TTL collections.

A TTL collection has a special index that tracks insertion time, together with a mongod background process that regularly removes expired documents from the collection. You can use this feature to expire data in replica sets and sharded clusters.

It is very easy to create a TTL collection from the mongo shell. The indexed field (here "status") must hold a date value; a document expires the given number of seconds after that date:

db.mycollection.ensureIndex( { "status": 1 }, { expireAfterSeconds: 3600 } )
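For retention periods of days to years, as the question asks, only the size of expireAfterSeconds changes. A minimal sketch of the arithmetic, written in plain JavaScript so that it could also be pasted into the mongo shell. The helper name retentionSeconds and the field name createdAt are illustrative assumptions, not part of MongoDB:

```javascript
// Hypothetical helper: convert a retention period into the number of seconds
// that expireAfterSeconds expects. A year is approximated as 365 days.
function retentionSeconds(period) {
  var SECONDS_PER_DAY = 24 * 60 * 60;
  var days = (period.days || 0) + (period.years || 0) * 365;
  return days * SECONDS_PER_DAY;
}

var thirtyDays = retentionSeconds({ days: 30 });  // 2592000
var twoYears = retentionSeconds({ years: 2 });    // 63072000

// In the mongo shell this would be used roughly as follows
// (createdAt is an assumed field name and must hold a date):
//   db.mycollection.ensureIndex({ "createdAt": 1 }, { expireAfterSeconds: thirtyDays })
```

Because removal is done by a periodic background process, a document may outlive its deadline by a short time, which rarely matters at these time scales.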

  • Download 2.2rc0 here (release candidate, not quite ready for production ... there will be another release candidate before the production build)

  • Change log here

  • 2.2 release notes can be found here.

I can't speak to other solutions.

+7

I think most NoSQL databases support this feature; Cassandra, for example, has it:

http://www.datastax.com/docs/1.0/ddl/column_family

Cassandra can be downloaded here:

http://cassandra.apache.org/

However, if you would use such a database solely for its expiration feature, consider using a cache instead, because it matches exactly what you are trying to do, especially if your objects' time to live is short. After all, the purpose of a cache is to act "as a container for the objects you want to keep temporarily." Most traditional caches are key-value stores, like most NoSQL databases.

While NoSQL databases like Cassandra tend to retrieve data very quickly, you will find that most of them perform worse than traditional caches if you constantly add and delete data, and they add file system and/or network overhead. If you find that what you need is actually a cache, here are a few suggestions.

http://ehcache.org/

is a non-distributed cache with a very simple API

http://www.jboss.org/infinispan/

is a distributed in-memory cache / key-value store

With caches, however, you are limited in how much you can store, since by default they keep data in memory. Most of them can also persist data to the file system, but if it comes to that, I would use a NoSQL database.

+5

It depends on what type of data you need to store: is a simple key-value store enough, or do you need a document database?

Expiring data is the usual use case for a cache. You can try EHCache, Hazelcast, Memcached, etc., but these are mostly key-value stores. They offer several eviction policies: oldest first, least recently used, and so on. Entries are mostly held in memory. If you need persistent key-value storage with this feature, try Redis.
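To make time-based expiry concrete, here is a minimal sketch of an in-memory key-value store with a per-entry time to live. It does not reflect how EHCache, Memcached, or Redis are actually implemented; the class TtlStore and its injectable clock are hypothetical, chosen so the behaviour can be checked without waiting:

```javascript
// Hypothetical in-memory key-value store with per-entry time to live.
// Expired entries are dropped lazily, on the next read of that key.
class TtlStore {
  constructor(now = () => Date.now()) {
    this.now = now;            // clock returning milliseconds, injectable for tests
    this.entries = new Map();  // key -> { value, expiresAt }
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value: value, expiresAt: this.now() + ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);  // past its deadline: remove the entry
      return undefined;
    }
    return entry.value;
  }
}

// Usage with a fake clock instead of real time
let fakeTime = 0;
const store = new TtlStore(() => fakeTime);
store.set("session", "abc", 1000);
const beforeExpiry = store.get("session");  // "abc"
fakeTime = 1000;
const afterExpiry = store.get("session");   // undefined
```

Lazy removal keeps the sketch short, but an entry that is never read again stays in memory, which is one reason real stores combine expiry with an eviction policy or a background sweep.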

If you collect time-based data, for example usage statistics, you can use a database such as RRD, which consolidates older data instead of deleting it (producing daily, weekly, and monthly aggregates).

If you need more, such as a document database, it looks like MongoDB supports simple document expiration (http://docs.mongodb.org/manual/tutorial/expire-data/). CouchDB does not seem to support this; however, you can run a timer task to delete old data.
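The consolidation idea can be sketched in a few lines: fine-grained samples are replaced by one value per larger time bucket. This is only an illustration of the principle, not RRD's actual storage format; the function name consolidate is hypothetical, and averaging is an assumption (sums or maxima would work the same way):

```javascript
// Hypothetical consolidation step: replace fine-grained samples with one
// average per bucket, so that old data is kept at a coarser resolution.
function consolidate(samples, bucketSize) {
  const averages = [];
  for (let i = 0; i < samples.length; i += bucketSize) {
    const bucket = samples.slice(i, i + bucketSize);
    const sum = bucket.reduce((acc, v) => acc + v, 0);
    averages.push(sum / bucket.length);
  }
  return averages;
}

// 14 daily values become 2 weekly averages
const daily = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14];
const weekly = consolidate(daily, 7);  // [4, 11]
```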

+2

Couchbase offers TTL deletes, using the memcached binary protocol to set the time to live. Thus, you can store a data item together with a timestamp after which it should be deleted (which may be arbitrarily far in the future). When that time arrives, Couchbase will delete the data item.

Here is an example of how to set a TTL from Ruby. There are examples in other languages if you prefer: http://www.couchbase.com/docs/couchbase-sdk-ruby-1.0/couchbase-sdk-ruby-getting-started-hello.html
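One detail matters for long retention periods. In the memcached protocol, an expiry value of up to 30 days is read as a number of seconds from now, while a larger value is read as an absolute Unix timestamp. The sketch below picks the right form, on the assumption that Couchbase follows the same convention; the function name expiryValue and the fixed example clock are hypothetical:

```javascript
// Hypothetical helper: turn "keep for ttlSeconds" into a memcached-style
// expiry value. Periods above 30 days must be sent as absolute Unix time.
const THIRTY_DAYS = 60 * 60 * 24 * 30;

function expiryValue(ttlSeconds, nowSeconds) {
  if (ttlSeconds <= THIRTY_DAYS) {
    return ttlSeconds;             // relative: seconds from now
  }
  return nowSeconds + ttlSeconds;  // absolute: Unix timestamp
}

const now = 1700000000;                         // fixed example clock, in seconds
const oneWeek = expiryValue(7 * 86400, now);    // 604800
const oneYear = expiryValue(365 * 86400, now);  // 1731536000
```

Passing a one-year period as a plain number of seconds would be read as a timestamp far in the past, so the item would expire immediately rather than a year from now.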

+1