Best logging practices with Django and Celery

I am trying to get Celery logging working with Django. I have logging configured in settings.py to go to the console (this works fine, since I host on Heroku). At the top of each module I have:

 import logging
 logger = logging.getLogger(__name__)

And in my tasks.py I have:

 from celery.utils.log import get_task_logger
 logger = get_task_logger(__name__)

This works great for logging calls from within a task, and I get output like this:

 2012-11-13T18:05:38+00:00 app[worker.1]: [2012-11-13 18:05:38,527: INFO/PoolWorker-2] Syc feed is starting 

But if that task then calls a method in another module, for example a queryset method, I get duplicate log entries, e.g.:

 2012-11-13T18:00:51+00:00 app[worker.1]: [INFO] utils.generic_importers.ftp_processor process(): File xxx.csv already imported. Not downloaded
 2012-11-13T18:00:51+00:00 app[worker.1]: [2012-11-13 18:00:51,736: INFO/PoolWorker-6] File xxx.csv already imported. Not downloaded

I think I could use

 CELERY_HIJACK_ROOT_LOGGER = False 

to just use the Django logging config, but it didn't work when I tried it, and even if I got it working, I would lose the "PoolWorker-6" bit that I do want. (Incidentally, I can't figure out how to get the task name to display in the log entry from Celery, since the docs seem to indicate that it should.)

I suspect I'm missing something simple here.

+54
python django logging celery django-celery
Nov 13 '12 at 18:14
3 answers

When your logger is initialized at the top of the "another module", it gets attached to another logger, which then handles your messages. That can be the root logger, or, as I usually see in Django projects, the logger named ''.

The best way here is to override the logging configuration:

 LOGGING = {
     'version': 1,
     'disable_existing_loggers': True,
     'formatters': {
         'simple': {
             'format': '%(levelname)s %(message)s',
             'datefmt': '%y %b %d, %H:%M:%S',
         },
     },
     'handlers': {
         'console': {
             'level': 'DEBUG',
             'class': 'logging.StreamHandler',
             'formatter': 'simple'
         },
         'celery': {
             'level': 'DEBUG',
             'class': 'logging.handlers.RotatingFileHandler',
             'filename': 'celery.log',
             'formatter': 'simple',
             'maxBytes': 1024 * 1024 * 100,  # 100 mb
         },
     },
     'loggers': {
         'celery': {
             'handlers': ['celery', 'console'],
             'level': 'DEBUG',
         },
     }
 }

 from logging.config import dictConfig
 dictConfig(LOGGING)

In this case, I believe that it should work as you expect.

P.S. dictConfig was added in Python 2.7.
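On the task-name part of the question: Celery reads the format for task log records from the CELERYD_TASK_LOG_FORMAT setting, and its default (in the Celery 3.x line, at least) already includes the task name. A sketch for settings.py; treat the setting name and default format as assumptions to check against your Celery version:

```python
# settings.py (sketch). CELERYD_TASK_LOG_FORMAT controls how Celery formats
# log records emitted from inside tasks; %(task_name)s and %(task_id)s are
# the extra fields it supports. This mirrors the Celery 3.x default.
CELERYD_TASK_LOG_FORMAT = (
    '[%(asctime)s: %(levelname)s/%(processName)s] '
    '%(task_name)s[%(task_id)s]: %(message)s'
)
```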

+62
Nov 17 '12 at 10:53

It is troubling that Celery interferes with the root logger (which is not best practice and can't be controlled completely), but it does not disable your app's custom loggers in any way, so use your own logger names and define your own behavior rather than trying to solve this problem with Celery. [I like to keep my application logging separate anyway.] You can use separate handlers for Django code and Celery tasks, or the same ones; you just need to define them in your Django LOGGING config. Add formatting arguments for the module, filename, and process name to your formatter, for sanity, to help you distinguish where messages originate.

[This assumes you've set up a handler for 'yourapp' in the LOGGING setting that points at a console or file handler; it sounds like you're aware of that, though.]

views.py

 log = logging.getLogger('yourapp')

 def view_fun():
     log.info('about to call a task')
     yourtask.delay()

tasks.py

 log = logging.getLogger('yourapp')

 @task
 def yourtask():
     log.info('doing task')
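The formatter suggestion above can be sketched with plain stdlib logging; the logger name 'yourapp' is carried over from the snippets, and the format string is just one illustrative choice:

```python
import io
import logging

# Capture output in a StringIO so the formatted line is easy to inspect;
# in a real config this handler would write to the console or a file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# processName / module / filename make it obvious whether a line came from
# a Django process or a Celery worker, and from which source file.
handler.setFormatter(logging.Formatter(
    '%(levelname)s %(processName)s %(module)s %(filename)s: %(message)s'))

log = logging.getLogger('yourapp')
log.setLevel(logging.INFO)
log.addHandler(handler)

log.info('doing task')
print(stream.getvalue().strip())
```

Run under a Celery worker, %(processName)s yields names like the PoolWorker-6 from the question, so duplicates from the two worlds become easy to tell apart.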

For the logging that Celery itself generates, use the celeryd --logfile flag to send Celery's own output (e.g. worker init, started task, task failed) to a separate place if desired. Or, use the other answer here that sends the 'celery' logger to a file of your choosing.

Note: I would not use RotatingFileHandlers - they are not supported for multi-process applications. Log rotation from another tool like logrotate is safer. The same goes for logging from Django, assuming you have multiple processes there, or if the same log files are shared with the Celery workers. If you're using a multi-server solution you probably want your logs centralized somewhere anyway.
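A sketch of the logrotate-friendly alternative: the stdlib's WatchedFileHandler reopens the log file whenever an external tool moves it aside, so rotation can live in logrotate instead of the handler (the file path and logger name here are illustrative; note it is not supported on Windows):

```python
import logging
import logging.handlers
import os
import tempfile

# WatchedFileHandler checks the file's device/inode before each emit and
# reopens it if logrotate has renamed or removed it, so each process keeps
# a plain append-only handle while rotation happens externally.
path = os.path.join(tempfile.gettempdir(), 'yourapp_celery.log')
handler = logging.handlers.WatchedFileHandler(path)
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))

log = logging.getLogger('yourapp.file_demo')
log.setLevel(logging.INFO)
log.addHandler(handler)

log.info('task finished')
handler.close()
```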

+7
Nov 22 '12 at 1:00

To fix the log duplication issue, setting the propagate value to False when declaring my settings' LOGGING dict worked for me:

 LOGGING = {
     'version': 1,
     'disable_existing_loggers': False,
     'handlers': {
         'console': {
             'level': 'DEBUG',
             'class': 'logging.StreamHandler',
             'formatter': 'verbose'
         },
     },
     'formatters': {
         'verbose': {
             'format': '%(asctime)s %(levelname)s module=%(module)s, '
                       'process_id=%(process)d, %(message)s'
         }
     },
     'loggers': {
         'my_app1': {
             'handlers': ['console'],
             'level': 'DEBUG',
             'propagate': False  # this will do the trick
         },
         'celery': {
             'handlers': ['console'],
             'level': 'DEBUG',
             'propagate': True
         },
     }
 }

Let's say your Django project is laid out as follows:

 my_project/
 - tasks.py
 - email.py

and let's say one of your tasks makes a call to a function in email.py; the logging will happen in email.py, and then that logging will propagate up to the "parent", which in this case happens to be your Celery task. Hence the double logging. But setting propagate to False for a particular logger means that its log records won't propagate to the parent, so they won't be "double"-logged. By default, propagate is set to True.
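The propagation behavior is easy to see with plain stdlib logging; a minimal sketch, with the logger name my_app1 carried over from the config:

```python
import io
import logging

# Two streams stand in for two handlers: one attached to the child logger
# 'my_app1', one attached to its parent (the root logger).
child_stream = io.StringIO()
root_stream = io.StringIO()

root = logging.getLogger()
root.addHandler(logging.StreamHandler(root_stream))

child = logging.getLogger('my_app1')
child.setLevel(logging.INFO)
child.addHandler(logging.StreamHandler(child_stream))

child.propagate = True       # the default: records also reach the root handler
child.info('first message')

child.propagate = False      # "the trick": records stop at my_app1's handler
child.info('second message')

print(child_stream.getvalue())  # both messages
print(root_stream.getvalue())   # only 'first message'
```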

Here's a link to the django docs section on parent/child loggers.

+4
Jul 6 '15 at 17:08
