Python: running multiple queries in parallel and getting the first completed

I am trying to write a Python script that sends requests to multiple sites. The script works well (I use urllib2), but only for a single link. With several sites I make the requests one after another, which is not very efficient.

What is the best approach (threads, I think) to start several requests in parallel and stop the others as soon as one of them returns a specific row?

I found this question, but I could not work out how to adapt it to stop the remaining threads: Python urllib2.urlopen() is slow, need a better way to read several urls

Thank you in advance!

(sorry if I made mistakes in English, I'm French ^^)
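[For context, the pattern the question is asking for can be sketched with today's standard library. This is a sketch, not from the original thread: `fetch` is a stand-in for an urllib2.urlopen-style call, and `first_completed` is a name I made up.]

```python
import concurrent.futures

def first_completed(urls, fetch):
    """Start fetch(url) for every url in its own thread and return
    (url, result) for whichever request finishes first.
    `fetch` is a placeholder for a real download call."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(urls)) as pool:
        futures = {pool.submit(fetch, u): u for u in urls}
        # wait() returns as soon as any one future finishes
        done, pending = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in pending:
            f.cancel()  # only cancels requests that have not started yet
        winner = next(iter(done))
        return futures[winner], winner.result()
```

Note the limitation: `cancel()` cannot interrupt a thread that is already blocked in a read, so leaving the `with` block still waits for the losing downloads to finish on their own.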

3 answers

You can use Twisted to issue multiple requests at the same time. Internally it uses epoll (or IOCP or kqueue, depending on the platform) to be notified of TCP readiness efficiently, which is cheaper than using threads. When one request matches, you cancel the others.

Here is the Twisted HTTP tutorial.


This is usually implemented with the following pattern (sorry, my Python skills are not so good).

Have a Runner start one worker per url. Each worker performs its request; when a worker finds the target row, it reports back to the Runner, which stops the others (e.g. with a request.terminate() call).

The hard part is stopping a worker cleanly: in a threaded design you cannot kill a thread from outside, so the workers have to notice that they are no longer needed and exit on their own.

Good luck with Python :)


You can run your queries with the multiprocessing library, poll for results, and shut down the queries you no longer need. The documentation for the module describes the Process class, which has a terminate() method. If you want to limit the number of requests running at once, look at the module's pooling options (multiprocessing.Pool).
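[A sketch of this answer's approach. The worker's download is faked with a sleep; a real worker would fetch the url with urllib2, and `first_result` is a name I invented.]

```python
import multiprocessing

def fetch(url, results):
    """Worker process: pretend to download url, then report a row.
    (The sleep stands in for the real network request.)"""
    import time
    time.sleep(0.05 if "fast" in url else 2.0)
    results.put((url, "the specific row"))

def first_result(urls):
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=fetch, args=(u, results))
             for u in urls]
    for p in procs:
        p.start()
    url, row = results.get()        # blocks until the first worker reports
    for p in procs:
        p.terminate()               # kill the workers we no longer need
        p.join()
    return url, row
```

Unlike threads, processes really can be stopped from outside with terminate(), which is why this fits the "cancel the losers" requirement so naturally.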

