Gevent / Eventlet monkey patching for DB drivers

After applying Gevent / Eventlet monkey patching, can I assume that whenever a DB driver (e.g. redis-py, pymongo) does I/O through the standard library (e.g. socket), that I/O will be asynchronous?

In other words, is monkey patching from these packages enough to make, for example, redis-py non-blocking in an eventlet application?

From what I know, it should be enough as long as I take care of how connections are used (for example, use a different connection for each greenlet). But I want to be sure.

If you know what else is required, or how to properly use DB drivers with Gevent / Eventlet, please include that as well.

2 answers

You can assume it will be magically patched if all of the following are true.

  • You are sure the I/O is built on top of standard Python sockets or other things that eventlet / gevent monkeypatches. No files, no native (C) socket objects, etc.
  • You pass aggressive=True to patch_all (or patch_select), or you are sure the library does not use select or anything similar.
  • The driver does not use any (implicit) internal threads. (If the driver does use internal threads, patch_thread may work, but it may not.)

If you're not sure, this is pretty easy to test, probably easier than reading through the code and trying to work it out. Have one greenlet that just does something like this:

    while True:
        print("running")
        gevent.sleep(0.1)

Then have another greenlet that runs a slow query against the database. If the driver is properly monkeypatched, the looping greenlet will keep printing "running" 10 times per second; if not, the looping greenlet will not get to run while the program is blocked on the query.
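The test described above can be sketched roughly as follows, recording timestamps instead of printing. This is only an illustration: measure_stall and longest_stall are hypothetical helper names, and slow_query stands for whatever driver call you want to check.

```python
import time


def longest_stall(tick_times):
    """Return the largest gap, in seconds, between consecutive heartbeat ticks."""
    gaps = [later - earlier for earlier, later in zip(tick_times, tick_times[1:])]
    return max(gaps, default=0.0)


def measure_stall(slow_query, interval=0.1):
    """Run slow_query() in one greenlet while a heartbeat greenlet ticks.

    A result close to `interval` suggests the query cooperates with the
    event loop; a result close to the query's duration suggests it blocks.
    """
    import gevent  # imported lazily so longest_stall() works without gevent

    tick_times = []

    def heartbeat():
        while True:
            tick_times.append(time.monotonic())
            gevent.sleep(interval)

    beat = gevent.spawn(heartbeat)
    gevent.sleep(0)                 # yield once so the heartbeat records a first tick
    gevent.spawn(slow_query).join()  # run the query under test to completion
    gevent.sleep(interval)          # give the heartbeat a chance to tick afterwards
    beat.kill()
    return longest_stall(tick_times)
```

For example, without any monkey patching, measure_stall(lambda: time.sleep(1)) should report a stall of roughly one second, because the unpatched time.sleep blocks the whole thread.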

So what do you do if your driver blocks?

The simplest solution is to use a truly concurrent thread pool for DB queries. The idea is that you submit each query (or batch) as a thread pool job, and the calling greenlet blocks until that job completes. (For really simple cases where you do not need many simultaneous queries, you can just spawn a threading.Thread for each one instead, but usually you can't get away with that.)

If the driver does significant CPU work (for example, you use something that runs an in-process cache, or even an entire in-process DBMS such as sqlite), you want this pool to actually be implemented on top of processes, because otherwise the GIL may prevent your greenlets from running. Otherwise (especially if you care about Windows), you probably want to use OS threads. (However, this means that you cannot use patch_thread(); if you need to do that, use processes.)

If you use eventlet and want to use threads, there is a simple built-in solution called tpool, which may be enough. If you use gevent, or you need to use processes, this will not work. Unfortunately, blocking a greenlet (without blocking the entire event loop) on a real threading object is slightly different between eventlet and gevent and is not well documented, but the tpool source should give you the idea. Apart from that part, the rest is just using concurrent.futures (see futures on PyPI if you need it in 2.x or 3.1) to run the tasks on a ThreadPoolExecutor or ProcessPoolExecutor. (Or, if you prefer, you can go straight to threading or multiprocessing instead of using futures.)
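As a rough sketch of the concurrent.futures part only: slow_query below is a hypothetical stand-in for a blocking driver call, and run_queries and demo are illustrative names. The greenlet-friendly wait is deliberately left out, since that is the framework-specific piece described above.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_query(sql):
    """Hypothetical stand-in for a blocking DB driver call."""
    return "rows for: " + sql


def run_queries(executor, queries):
    """Submit each query as a pool job and collect the results in order."""
    futures = [executor.submit(slow_query, sql) for sql in queries]
    # NOTE: result() blocks the calling OS thread. Called from a greenlet,
    # it would stall the whole event loop; this wait is the part you would
    # replace with the eventlet- or gevent-specific mechanism.
    return [future.result() for future in futures]


def demo():
    # Assumes the threading module has NOT been monkeypatched, so the
    # workers are real OS threads.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return run_queries(pool, ["SELECT 1", "SELECT 2"])
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor keeps the same submit/result interface, which is the main reason to go through futures rather than threading or multiprocessing directly.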


Can you explain why I should use OS threads on Windows?

A brief summary: if you stick to threads, you can just write cross-platform code, but if you go to processes, you are effectively writing code for two different platforms.

First, read the Programming Guidelines for the multiprocessing module (both the "All Platforms" section and the "Windows" section). Fortunately, a DB wrapper should not run into most of this. You only need to deal with processes through the ProcessPoolExecutor. And regardless of whether you wrap things at the cursor level or the query level, all your arguments and return values will be simple types that can be pickled. Still, this is something you have to be careful about, which otherwise would not be a problem.

Meanwhile, Windows has very low overhead for its intra-process synchronization objects, but very high overhead for its inter-process ones. (It also has very fast thread creation and very slow process creation, but that doesn't matter if you use a pool.) So, how do you deal with that? I had a lot of fun creating OS threads to wait on the cross-process synchronization objects and signal the greenlets, but your definition of fun may differ.
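One way to stay careful about this is to check that whatever crosses the process boundary actually pickles. The helper below is only a sketch; is_picklable is a hypothetical name, and the lock is a stand-in for a live driver object such as a connection.

```python
import pickle
import threading


def is_picklable(value):
    """Return True if `value` can be serialized for a process pool."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Plain query text, parameters, and row tuples are simple picklable types.
query_args = ("SELECT * FROM users WHERE id = ?", (42,))
rows = [(1, "alice"), (2, None)]

# A lock stands in here for objects that hold OS-level state;
# such objects cannot be sent to another process.
live_object = threading.Lock()

checks = [is_picklable(query_args), is_picklable(rows), is_picklable(live_object)]
```

In practice this means sending query strings and parameters to the worker process and getting plain rows back, rather than passing connections or cursors across.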

Finally, tpool can be trivially adapted to ppool for Unix, but it requires more work on Windows (and you will have to understand Windows to do this).


Abarnert's answer is correct and very comprehensive. I just want to add that there is no "aggressive" patching in eventlet; that is probably a gevent feature. Also, if the library uses select, that is not a problem, because eventlet can monkeypatch that too.

Indeed, in most cases, eventlet.monkey_patch() is all you need. Of course, this must be done before creating any sockets.

If you still have any problems, feel free to open an issue or write to the eventlet mailing list or the G+ community. All relevant links can be found at http://eventlet.net/
