Using the Django ORM in threads and avoiding the "too many clients" exception with BoundedSemaphore

I am working on a manage.py command that creates about 200 threads to check remote hosts. My database setup allows 120 connections, so I need some kind of pool. I tried using a dedicated thread like this:

    import threading
    from threading import Thread

    class Pool(Thread):
        def __init__(self):
            Thread.__init__(self)
            # Allow at most 10 callers inside give() at the same time.
            self.semaphore = threading.BoundedSemaphore(10)

        def give(self, trackers):
            self.semaphore.acquire()
            data = ...  # some ORM call (not lazy, query triggered here)
            self.semaphore.release()
            return data

I pass an instance of this object to each checker thread, but I still get "OperationalError: FATAL: sorry, too many clients already" inside the pool object once 120 threads have been started. I expected that only 10 database connections would be open and that the other threads would wait for a free semaphore slot. I can verify that the semaphore works by commenting out "release()": in that case only 10 threads do any work, while the others wait until the application exits.

As far as I understand, each thread opens a new connection to the database even if the actual call is made inside a different thread object, but why? Is there a way to run all database queries in only one thread?

multithreading django postgresql orm connection-pooling
1 answer

Django's ORM manages database connections in thread-local variables. Therefore, each distinct thread that accesses the ORM creates its own connection, no matter which object the calling method belongs to. You can see this in the first few lines of django/db/backends/__init__.py.
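To illustrate why the semaphore does not help, here is a minimal sketch that uses only the standard library. It is not Django code: `get_connection` and the `created` list are hypothetical stand-ins for the per-thread connection handling described above. The semaphore limits how many threads run the "query" at once, but every thread still ends up with its own "connection".

```python
import threading

_local = threading.local()   # stand-in for Django's thread-local connection storage
created = []                 # names of threads that "opened a connection"
created_lock = threading.Lock()
semaphore = threading.BoundedSemaphore(2)


def get_connection():
    # Hypothetical helper: create a fake connection the first time
    # the current thread asks for one, then reuse it.
    if not hasattr(_local, "conn"):
        _local.conn = object()
        with created_lock:
            created.append(threading.current_thread().name)
    return _local.conn


def worker():
    # The semaphore caps concurrent "queries" at 2,
    # but it does not cap the number of per-thread connections.
    with semaphore:
        get_connection()


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(created))  # 8: one fake connection per thread
```

In the question's code the situation is the same: `give()` is a method of a `Thread` subclass, but it runs in whichever thread calls it, so that caller's thread-local connection is the one that gets opened.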

If you want to limit the number of database connections, you must limit the number of distinct threads that actually access the ORM. One solution is to implement a service that delegates ORM requests to a pool of dedicated ORM threads. To pass requests and their results between the other threads and the pool, you will have to implement some kind of message-passing mechanism. Since this is a typical producer/consumer problem, the Python documentation on threading should give some hints on how to do this.
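The delegation idea above could be sketched roughly as follows, using `queue.Queue` for message passing. `OrmPool`, `call`, and `fake_query` are hypothetical names made up for this illustration, and `fake_query` merely stands in for a real ORM call. In a real project the callable would have to evaluate the query fully inside the worker (for example by building a list from the queryset), otherwise the lazy query would run in the calling thread after all.

```python
import queue
import threading


class OrmPool:
    """Hypothetical helper: runs callables on a fixed set of worker threads."""

    def __init__(self, workers=10):
        self._tasks = queue.Queue()
        self._threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]
        for t in self._threads:
            t.start()

    def _run(self):
        while True:
            item = self._tasks.get()
            if item is None:          # sentinel: shut this worker down
                break
            func, args, reply = item
            try:
                reply.put((True, func(*args)))
            except Exception as exc:  # hand the error back to the caller
                reply.put((False, exc))

    def call(self, func, *args):
        # Called from any thread; blocks until a worker has produced the result.
        reply = queue.Queue(maxsize=1)
        self._tasks.put((func, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def close(self):
        for _ in self._threads:
            self._tasks.put(None)
        for t in self._threads:
            t.join()


def fake_query(host):
    # Stand-in for an ORM query; reports which thread executed it.
    return threading.current_thread().name


pool = OrmPool(workers=3)
seen = set()
seen_lock = threading.Lock()


def checker(host):
    name = pool.call(fake_query, host)
    with seen_lock:
        seen.add(name)


callers = [threading.Thread(target=checker, args=("host%d" % i,)) for i in range(20)]
for t in callers:
    t.start()
for t in callers:
    t.join()
pool.close()

print(len(seen) <= 3)  # True: at most 3 threads ever ran a "query"
```

With this layout, 20 caller threads share 3 worker threads, so only the workers would ever hold thread-local connections. The standard library's `concurrent.futures.ThreadPoolExecutor` offers the same pattern ready-made, if it is available in your Python version.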

Edit: I just googled for Django connection pooling. Many people complain that Django does not provide a proper connection pool. Some of them managed to integrate a separate pool. For PostgreSQL, I would look at the pgpool middleware.
