Scheduling MapReduce for MongoDB

This is more of an implementation question, but are there any drawbacks to using something simple, such as cron, to schedule tasks like MapReduce for MongoDB? Say something needs to run every hour; cron seems like an appropriate way to do it. I am mostly asking because of all the popular job queue systems, such as Resque and others.

I suppose my question really comes down to this: is cron a reliable and robust solution? Thoughts?

1 answer

Cron has been around for decades and is quite reliable and robust; if your cron is not reliable, I would suggest that a stern discussion with your OS vendor is in order. In addition, the MongoDB documentation talks about scheduling things with cron jobs (Google for site:mongodb.org cron), so presumably cron jobs are a normal thing to use with MongoDB.

However, if you already have a bunch of infrastructure set up for another scheduling system, then there is probably little reason to use cron for MongoDB and something else for your other tasks.

In either case, you will probably want to layer a simple PID-file locking scheme on top if your cron jobs can run long enough to overlap and you need only one running at a time:

  • The cron task looks for a PID file when it starts.
  • If it finds the file, it reads the old PID from the file and checks whether that process is still running.
    • If the old one is still running, the new one complains and exits.
    • If the old one is not running, the new one carries on.
  • Once the new task decides it is safe to proceed, it writes its own PID to the PID file.
  • When the new task finishes, it deletes the PID file just before exiting (or from an atexit handler or whatever similar mechanism your environment supports).
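
The steps above can be sketched roughly as follows in Python. This is only an illustration, not a definitive implementation: the PID file location, the function names, and the empty run_job placeholder are all made up for the example, and the liveness check relies on POSIX behaviour of os.kill with signal 0. Note that the check-then-write sequence is not atomic, so two jobs starting at almost the same instant could both get through; if that matters, something like flock would be a safer choice.

```python
import atexit
import os
import sys
import tempfile

# Hypothetical location for the PID file; pick whatever suits your system.
PID_FILE = os.path.join(tempfile.gettempdir(), "mongo_mapreduce.pid")


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_pid(path):
    """Return the PID stored in the file, or None if missing or unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def remove_pid_file(path):
    """Delete the PID file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def acquire_pid_file(path):
    """Return False if an earlier run is still alive, else claim the file."""
    old_pid = read_pid(path)
    if old_pid is not None and pid_is_running(old_pid):
        return False
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    atexit.register(remove_pid_file, path)  # clean up on normal exit
    return True


def run_job():
    # Placeholder: the actual MapReduce call against MongoDB would go here.
    pass


if __name__ == "__main__":
    if not acquire_pid_file(PID_FILE):
        sys.stderr.write("previous run still active; exiting\n")
        sys.exit(1)
    run_job()
```

A stale PID file left behind by a crashed run is handled by the liveness check: the old PID is no longer running, so the new job simply overwrites the file and continues.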