Celery.delay freezes (last, not auth issue)

I am using Celery 2.2.4 / djCelery 2.2.4 using RabbitMQ 2.1.1 as the backend. I recently brought two new celery servers online - I had 2 workers on two machines, a total of about 18 threads and on my new calipers (36 g RAM + dual hyper-threaded quad-core processor), I start 10 workers with 8 threads each, total 180 threads - my tasks are all pretty small, so that should be fine.

Nodes have been working fine for the past few days, but today I noticed that it was .delaay()hanging. When I interrupt it, I see a trace there:

File "/home/django/deployed/releases/20110608183345/virtual-env/lib/python2.5/site-packages/celery/task/base.py", line 324, in delay
    return self.apply_async(args, kwargs)
File "/home/django/deployed/releases/20110608183345/virtual-env/lib/python2.5/site-packages/celery/task/base.py", line 449, in apply_async
    publish.close()
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/kombu/compat.py", line 108, in close
    self.backend.close()
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/channel.py", line 194, in close
    (20, 41),    # Channel.close_ok
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/abstract_channel.py", line 89, in wait
    self.channel_id, allowed_methods)
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/connection.py", line 198, in _wait_method
    self.method_reader.read_method()
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/method_framing.py", line 212, in read_method
    self._next_method()
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/method_framing.py", line 127, in _next_method
    frame_type, channel, payload = self.source.read_frame()
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/transport.py", line 109, in read_frame
    frame_type, channel, size = unpack('>BHI', self._read(7))
File "/home/django/deployed/virtual-env/lib/python2.5/site-packages/amqplib/client_0_8/transport.py", line 200, in _read
    s = self.sock.recv(65536)

I checked the Rabbit logs and I see that the process is trying to connect as:

=INFO REPORT==== 12-Jun-2011::22:58:12 ===
accepted TCP connection on 0.0.0.0:5672 from x.x.x.x:48569

INFO, . , 2 :

[2011-06-12 22:41:08,033: ERROR/MainProcess] Consumer: Connection to broker lost. Trying to re-establish connection...

.

, (RabbitMQ/Celery Django //etc - ) , . , - amqplib - , amqplib , , .

: celeryctl purge - , - AMQP:

AMQPConnectionException(reply_code, reply_text, (class_id, method_id))
    amqplib.client_0_8.exceptions.AMQPConnectionException: 
    (530, u"NOT_ALLOWED - cannot redeclare exchange 'XXXXX' in vhost 'XXXXX' 
     with different type, durable or autodelete   value", (40, 10), 'Channel.exchange_declare')

inspect stats " " . .

EDIT2: , exchange.delete camqadm, node: (.

EDIT3: , vhost rabbitmq, node.

+5
1

, - ... , , :

/var , . , , /var - /var/lib/rabbitmq, .

+4

All Articles