MongoDB and Redis as a caching-layer architecture

Suppose we have a social network application (using Node.js and Express) with MongoDB as the main database engine.

For most API calls from clients (mobile app, web app, etc.) I do not want to run a complex database query on every request. Many of these requests could be answered from a caching layer such as Redis.

But my question is how/when I should update the caching layer, since all write operations go to the MongoDB database, not to the caching layer (Redis). What is the right approach/architecture to solve this problem?

+8
caching mongodb redis
6 answers

It really depends on your needs, but here is a pretty common pattern:

    on_get_request:
        if data_in_redis:
            serve_data_from_redis
        else:
            get_data_from_mongo
            set_data_in_redis
            set_expire_in_redis
            serve_data_from_memory
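A minimal sketch of that read path in TypeScript, assuming the node-redis (v4) and official mongodb driver packages; the "app" database, "profiles" collection, key scheme, and 60-second TTL are illustrative choices, not prescriptions:

    import { createClient } from "redis";
    import { MongoClient } from "mongodb";

    interface Profile { _id: string; name: string; }

    const redis = createClient();
    const mongo = new MongoClient("mongodb://localhost:27017");
    await redis.connect();
    await mongo.connect();

    async function getUserProfile(userId: string): Promise<Profile | null> {
      const key = `profile:${userId}`;

      // 1. Try the cache first.
      const cached = await redis.get(key);
      if (cached) return JSON.parse(cached);

      // 2. Cache miss: fall back to MongoDB, the system of record.
      const doc = await mongo.db("app").collection<Profile>("profiles")
        .findOne({ _id: userId });

      // 3. Populate the cache with an expiry so stale entries age out.
      if (doc) await redis.set(key, JSON.stringify(doc), { EX: 60 });
      return doc;
    }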

The data will be slightly stale from time to time, but that is fine for most uses. It works well in combination with explicit cache invalidation when important data is written:

    on_important_data_write:
        delete_invalid_redis_keys
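And the invalidation side could look like this (same hypothetical clients and key scheme as the sketch above):

    async function updateUserProfile(userId: string, changes: Partial<Profile>) {
      // Writes always go to MongoDB first, since it is the system of record...
      await mongo.db("app").collection<Profile>("profiles")
        .updateOne({ _id: userId }, { $set: changes });

      // ...then the cached copy is deleted so the next read repopulates it.
      await redis.del(`profile:${userId}`);
    }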

But all of this assumes a write-light, read-heavy workload and a stable set of queries.

What does your high-load use case look like?

+15

The typical approach is write-through caching: you write to MongoDB first and then write to Redis. This is the most common way.

Another option: you can write to Redis first and send an asynchronous message through Redis (using it as a queue). Worker threads can consume the message, read it, and write the data to MongoDB.

The first option is easier to implement. The second option can sustain a huge number of write transactions. As far as I know, MongoDB's lock contention problem has not been fully resolved (it has improved from a global lock to a database-level lock), and the second option can help significantly in reducing that contention.
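A sketch of the second option, reusing the hypothetical redis/mongo clients from the cache-aside sketch in the first answer; the "writes:pending" queue name and payload shape are invented, and a real deployment would need retries and error handling:

    async function writePost(post: { id: string; body: string }) {
      // Serve reads from Redis immediately...
      await redis.set(`post:${post.id}`, JSON.stringify(post));
      // ...and enqueue the write for a background worker to persist.
      await redis.lPush("writes:pending", JSON.stringify(post));
    }

    // Worker: drain the queue and persist to MongoDB. Blocking commands
    // like BRPOP should run on their own connection in node-redis.
    async function writeWorker() {
      const queue = redis.duplicate();
      await queue.connect();
      for (;;) {
        const msg = await queue.brPop("writes:pending", 0); // block until a job arrives
        if (!msg) continue;
        const post = JSON.parse(msg.element) as { id: string; body: string };
        await mongo.db("app").collection<{ _id: string; body: string }>("posts")
          .updateOne({ _id: post.id }, { $set: { body: post.body } }, { upsert: true });
      }
    }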

+5
source share

This is already implemented in the reference architecture for the open-source MongoDB project called Socialite, although it is in Java rather than Node.js, so my answer is based on my experience with and stress testing of that code.

As you can see from its status-feed implementation, the feed has a fanoutOnWrite cache option, which creates a cache (a document of limited size) for active users, limiting the number of recent entries held in the cache document (the number is configurable).

The key principles of this implementation are that the durability requirements for content are different from the requirements for the cache, and that you write the content to the database first, since it is the system of record for all content, and only then update the cache (if it exists). That part can be performed asynchronously if necessary. The update uses "capped arrays", i.e. a $push update with the $slice modifier, to atomically push the new value onto an array and slice off the oldest entries in the same operation.
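As an illustration (not Socialite's actual code, which is Java), the capped-array update might look like this with the Node.js driver; the collection and field names and the cap of 50 are hypothetical:

    interface TimelineCache {
      _id: string;
      entries: { postId: string; ts: Date }[];
    }

    async function appendToTimelineCache(userId: string, postId: string) {
      // Note: no upsert, so only caches that already exist get updated (see below).
      await mongo.db("app").collection<TimelineCache>("timeline_cache").updateOne(
        { _id: userId },
        {
          $push: {
            entries: {
              $each: [{ postId, ts: new Date() }],
              $slice: -50, // atomically trim to the 50 most recent entries
            },
          },
        }
      );
    }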

Do not create a cache for a user if one does not already exist (if they never log into the system, you are wasting your time). If you wish, you can also expire caches based on some TTL parameter.

When a user logs in and you go to read their cache but it is not there, fall back to "fanoutOnRead" (which queries the content of all the users they follow) and then build their cache from that result.
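A sketch of that fallback, assuming hypothetical "follows" and "posts" collections alongside the timeline cache above:

    async function readTimeline(userId: string) {
      const db = mongo.db("app");
      const cache = await db.collection("timeline_cache").findOne({ _id: userId });
      if (cache) return cache.entries;

      // Cache miss: query the content of everyone this user follows...
      const followees = await db.collection("follows")
        .find({ follower: userId }).map(f => f.followee).toArray();
      const newest = await db.collection("posts")
        .find({ author: { $in: followees } })
        .sort({ ts: -1 }).limit(50).toArray();
      const entries = newest.reverse(); // store oldest-first, matching $push order

      // ...and seed their cache from the result (upsert creates it this time).
      await db.collection("timeline_cache")
        .updateOne({ _id: userId }, { $set: { entries } }, { upsert: true });
      return entries;
    }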

The Socialite project used MongoDB for all of its tiers, but during benchmarking we found that the timeline cache does not need to be replicated or persisted, so its MongoDB servers were configured as in-memory only (no journal, no replication, no flushing to disk), which is similar to how Redis is used. If you lose the cache, it is simply rebuilt on demand from the persistent content database.

+5

Since your question is about architecture and begins with "Suppose ..."

Any reason to choose MongoDB?

With Postgres I get better performance than MongoDB, plus both relational data and schemaless documents via its json/jsonb support, which is actually faster than MongoDB. With Postgres you get a RELIABLE, battle-tested database that has excellent performance and scalability and, most importantly, lets you sleep at night and enjoy your vacation.

You can also use Postgres LISTEN/NOTIFY for real-time events, so you can invalidate the cache.

Here is an example using Postgres LISTEN/NOTIFY in Node.js: http://gonzalo123.com/2011/05/23/real-time-notifications-part-ii-now-with-node-js-and-socket-io/
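A minimal sketch of that pattern with the node-postgres (pg) package, reusing a hypothetical redis client like the one in the first answer; the "cache_invalidation" channel and key-as-payload convention are assumptions:

    import { Client } from "pg";

    const pg = new Client({ connectionString: "postgres://localhost/app" });
    await pg.connect();
    await pg.query("LISTEN cache_invalidation");

    // A writer runs e.g.: NOTIFY cache_invalidation, 'profile:42'
    // and every listening app node drops the stale key from the cache.
    pg.on("notification", async (msg) => {
      if (msg.payload) await redis.del(msg.payload);
    });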

Here are some thorough benchmarks of Postgres 9.4 as a schemaless/NoSQL document store versus MongoDB:

http://thebuild.com/presentations/pg-as-nosql-pgday-fosdem-2013.pdf

+1

It would take serious traffic to make Redis a worthwhile cache layer on top of MongoDB, given that MongoDB keeps its working set in RAM; both of them can genuinely serve from memory if you know what you are doing and plan your schema correctly.

Typically, reaching for Redis as a cache is the province of massive sites like Craigslist ( http://www.slideshare.net/jzawodn/living-with-sql-and-nosql-at-craigslist-a-pragmatic-approach ). As you can see on slide 7 of that presentation, they use it for:

  • counters
  • blobs
  • queues

etc., but you can easily see how their memcached installation could also be folded in to cache certain postings if MongoDB, rather than MySQL, were their main store.

So the presentation itself gives you an idea of how others use Redis alongside MongoDB.

It is mainly used to hold snapshots of data that would usually be too slow to retrieve from the database on every request.

Here is some related information to back up this short answer: What is Redis and what do I use it for? I highly recommend reading that question, as it will give you a better sense of Redis's use cases and of what its caching is good for.

0

Do you need real-time transactions and writes? When someone writes an update to Mongo, is it absolutely imperative that clients are notified of the change immediately (within 1 second / minute / day)?

Is your data so important that no write may be lost? If so, you cannot write to Redis first, except with AOF persistence (which is not Redis's default mode and is much slower). Transactions spanning Mongo and Redis are not simple to implement, for example.

If you write to Redis first, you can use publish/subscribe to notify a subscribed Redis client to update the value in Mongo, but be warned: there is no guarantee that your data will make it there safely! However, this should be the fastest / most efficient way to update all of your clients, as long as they are all connected to Redis.
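A minimal pub/sub sketch along those lines, again with hypothetical redis/mongo clients as in the first answer; the "writes" channel and payload shape are invented:

    const sub = redis.duplicate(); // subscribers need a dedicated connection
    await sub.connect();

    // Consumer: persist announced writes to MongoDB. Delivery is
    // fire-and-forget: a subscriber that is down simply misses messages.
    await sub.subscribe("writes", async (raw) => {
      const { key, doc } = JSON.parse(raw);
      await mongo.db("app").collection("docs")
        .updateOne({ _id: key }, { $set: doc }, { upsert: true });
    });

    // Producer: write to Redis first, then announce the change.
    async function write(key: string, doc: Record<string, unknown>) {
      await redis.set(key, JSON.stringify(doc));
      await redis.publish("writes", JSON.stringify({ key, doc }));
    }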

Another way: you can poll, at whatever interval your real-time requirements allow, and copy changes from Mongo into the Redis cache (decoupling the two) without your application code writing to Redis directly. You can use listeners ("triggers" in Mongo, e.g. tailing the oplog) for this, or use dirty checking.
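A sketch of the polling variant; it assumes documents carry an updatedAt field, and the 5-second interval stands in for whatever staleness you can accept:

    let lastPoll = new Date(0);

    // Every 5 seconds, copy documents modified since the previous poll
    // from MongoDB (the system of record) into the Redis cache.
    setInterval(async () => {
      const since = lastPoll;
      lastPoll = new Date();
      const changed = await mongo.db("app").collection("docs")
        .find({ updatedAt: { $gte: since } }).toArray();
      for (const doc of changed) {
        await redis.set(`doc:${doc._id}`, JSON.stringify(doc));
      }
    }, 5000);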

Finally, some teams have migrated from Mongo + Redis to Couchbase, Viber for example; maybe you should consider it as an option? http://www.couchbase.com/viber

0
