Does multiprocessing.Queue work with gevent?

Does anyone know what is wrong with this code? It just hangs forever and never exits. `sites` is a list of several dozen strings.

    num_worker_threads = 30

    def mwRegisterWorker():
        while True:
            try:
                print q.get()
            finally:
                pass

    q = multiprocessing.JoinableQueue()
    for i in range(num_worker_threads):
        gevent.spawn(mwRegisterWorker)

    for site in sites:
        q.put(site)

    q.join()  # block until all tasks are done
2 answers

gevent.spawn() creates greenlets, not processes (moreover, all greenlets run in the same OS thread). Therefore, multiprocessing.JoinableQueue is not suitable here.

gevent is based on cooperative multitasking: until you call a blocking function that switches to the gevent event loop, no other greenlet gets to run. For example, conn below uses the socket methods patched by gevent, which let other greenlets run while one is waiting for a response from the site. And without pool.join(), which yields control to the greenlet that runs the event loop, no connections would be made at all.
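The scheduling rule above can be illustrated without gevent. The following is only a toy sketch that uses plain generators in place of greenlets; `task`, `greedy` and `run_round_robin` are hypothetical names invented for this illustration, not gevent APIs. Each `yield` plays the role of a blocking call that hands control back to the event loop.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: does one step, then yields control."""
    for i in range(steps):
        log.append("%s step %d" % (name, i))
        yield  # explicit switch point, like a patched blocking call


def greedy(name, steps, log):
    """A task that does all of its work before it ever yields."""
    for i in range(steps):
        log.append("%s step %d" % (name, i))
    yield  # the only switch point, reached after all steps are done


def run_round_robin(tasks):
    """Toy event loop: resume each task in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)  # run the task until its next yield
        except StopIteration:
            continue  # task finished, drop it
        ready.append(current)  # task yielded, schedule it again


fair_log = []
run_round_robin([task("a", 2, fair_log), task("b", 2, fair_log)])

greedy_log = []
run_round_robin([greedy("g", 2, greedy_log), task("b", 2, greedy_log)])
```

In the first run the two tasks interleave, because each one yields after every step. In the second run `g` finishes all of its steps before `b` gets to start, which is the same starvation the question's code runs into when a call blocks without ever switching to the event loop.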

To limit concurrency during requests to multiple sites, you can use gevent.pool.Pool :

    #!/usr/bin/env python
    from gevent.pool import Pool
    from gevent import monkey; monkey.patch_socket()
    import httplib  # now it can be used from multiple greenlets
    import logging

    info = logging.getLogger().info

    def process(site):
        """Make HEAD request to the `site`."""
        conn = httplib.HTTPConnection(site)
        try:
            conn.request("HEAD", "/")
            res = conn.getresponse()
        except IOError, e:
            info("error %s reason: %s" % (site, e))
        else:
            info("%s %s %s" % (site, res.status, res.reason))
        finally:
            conn.close()

    def main():
        logging.basicConfig(level=logging.INFO, format="%(asctime)s %(msg)s")

        num_worker_threads = 2
        pool = Pool(num_worker_threads)
        sites = ["google.com", "bing.com", "duckduckgo.com", "stackoverflow.com"]*3
        for site in sites:
            pool.apply_async(process, args=(site,))
        pool.join()

    if __name__=="__main__":
        main()

Use gevent.queue.JoinableQueue. Green threads (gevent uses them internally) are neither threads nor processes, but coroutines with user-level scheduling.
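As a minimal sketch of that suggestion, assuming gevent is installed, the question's loop could be rewritten roughly as follows. The names `worker`, `run` and `results` are illustrative only. Note that each worker also calls `task_done()` for every item it takes; the question's worker never does, and `join()` only returns once every item has been marked done.

```python
import gevent
from gevent.queue import JoinableQueue


def worker(q, results):
    """Take sites off the queue forever and record each one."""
    while True:
        site = q.get()  # cooperative: yields to other greenlets when empty
        try:
            results.append(site)  # stand-in for the real per-site work
        finally:
            q.task_done()  # required, otherwise q.join() never returns


def run(sites, num_workers=30):
    q = JoinableQueue()
    results = []
    workers = [gevent.spawn(worker, q, results) for _ in range(num_workers)]
    for site in sites:
        q.put(site)
    q.join()  # switches to the event loop until all items are done
    gevent.killall(workers)  # workers loop forever, so stop them explicitly
    return results
```

Calling `run(sites)` with the question's list should return once every site has been handled, instead of hanging.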

