How to perform a download with a limited response time using python requests?

When downloading a large file with Python, I want to set a time limit not only for the connection process, but also for the download itself.

I am trying to use the following python code:

    import requests

    r = requests.get('http://ipv4.download.thinkbroadband.com/1GB.zip',
                     timeout=0.5, prefetch=False)
    print r.headers['content-length']
    print len(r.raw.read())

This does not work (the download is not limited in time), as the documentation correctly notes: https://requests.readthedocs.org/en/latest/user/quickstart/#timeouts

It would be great if something like this were possible:

    r.raw.read(timeout=10)

The question is: how do I set a time limit on the download itself?
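(For reference: in Requests 1.x and later, prefetch=False was renamed stream=True, and timeout may be given as a (connect, read) tuple. Even then, the read timeout only bounds each individual socket read, never the transfer as a whole -- a sketch of the semantics:)

    import requests

    # stream=True is the later name for prefetch=False; the 0.5 s read
    # timeout caps any single socket read, but a server that delivers a
    # chunk every 0.4 s can keep streaming indefinitely.
    r = requests.get('http://ipv4.download.thinkbroadband.com/1GB.zip',
                     stream=True, timeout=(3.0, 0.5))
    for chunk in r.iter_content(chunk_size=4096):
        pass  # each read is bounded; the total download time is not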

+7
python python-requests urllib3
Nov 26 '12 at 21:05
3 answers

The answer: don't use requests here, since it blocks. Use non-blocking network I/O instead, such as eventlet:

    import eventlet
    from eventlet.green import urllib2
    from eventlet.timeout import Timeout

    url5 = 'http://ipv4.download.thinkbroadband.com/5MB.zip'
    url10 = 'http://ipv4.download.thinkbroadband.com/10MB.zip'
    urls = [url5, url5, url10, url10, url10, url5, url5]

    def fetch(url):
        response = bytearray()
        # Timeout(60, False) silently aborts the block instead of raising,
        # leaving response empty so the caller sees length 0.
        with Timeout(60, False):
            response = urllib2.urlopen(url).read()
        return url, len(response)

    pool = eventlet.GreenPool()
    for url, length in pool.imap(fetch, urls):
        if not length:
            print "%s: timeout!" % url
        else:
            print "%s: %s" % (url, length)

This produces the expected results:

    http://ipv4.download.thinkbroadband.com/5MB.zip: 5242880
    http://ipv4.download.thinkbroadband.com/5MB.zip: 5242880
    http://ipv4.download.thinkbroadband.com/10MB.zip: timeout!
    http://ipv4.download.thinkbroadband.com/10MB.zip: timeout!
    http://ipv4.download.thinkbroadband.com/10MB.zip: timeout!
    http://ipv4.download.thinkbroadband.com/5MB.zip: 5242880
    http://ipv4.download.thinkbroadband.com/5MB.zip: 5242880
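If you would rather keep the Requests API, the same Timeout trick should work once eventlet monkey-patches the standard socket module so that Requests' I/O becomes green -- a sketch, assuming the patch is applied before any sockets are created:

    import eventlet
    eventlet.monkey_patch()  # make socket I/O cooperative (green)

    import requests
    from eventlet.timeout import Timeout

    body = None
    with Timeout(10, False):  # as above, False swallows the timeout
        body = requests.get('http://ipv4.download.thinkbroadband.com/10MB.zip').content

    if body is None:
        print "timeout!"
    else:
        print len(body)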
+7
Nov 27

Using Requests' prefetch=False parameter, you get to pull down and process the response in arbitrarily sized pieces rather than all at once.

What you need to do is tell Requests not to preload the entire response, and then keep track of how much time you have spent reading while consuming it in small pieces. You can fetch a piece with r.raw.read(CHUNK_SIZE). Overall, the code will look something like this:

    import requests
    import time

    CHUNK_SIZE = 2**12             # Bytes
    TIME_EXPIRE = time.time() + 5  # Seconds

    r = requests.get('http://ipv4.download.thinkbroadband.com/1GB.zip',
                     prefetch=False)

    data = ''
    buffer = r.raw.read(CHUNK_SIZE)
    while buffer:
        data += buffer
        buffer = r.raw.read(CHUNK_SIZE)
        if TIME_EXPIRE < time.time():
            # Quit after 5 seconds.
            data += buffer
            break

    r.raw.release_conn()

    print "Read %s bytes out of %s expected." % (len(data), r.headers['content-length'])

Note that this can sometimes run a bit longer than 5 seconds, since the final r.raw.read(...) can block for an arbitrary amount of time. But at least it does not depend on threads or socket timeouts.
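In later Requests versions you can tighten that bound by combining the same deadline loop with a per-read socket timeout -- a sketch using stream=True (the later name for prefetch=False) and iter_content, so the final read can overshoot the deadline by at most the read timeout:

    import requests
    import time

    DEADLINE = time.time() + 5  # overall time budget, in seconds

    # timeout=(3.0, 0.5): 3 s to connect, and no single read may block
    # longer than 0.5 s -- so the loop overshoots the deadline by at
    # most ~0.5 s.
    r = requests.get('http://ipv4.download.thinkbroadband.com/1GB.zip',
                     stream=True, timeout=(3.0, 0.5))
    data = b''
    try:
        for chunk in r.iter_content(chunk_size=4096):
            data += chunk
            if time.time() > DEADLINE:
                break
    except requests.exceptions.RequestException:
        pass  # a single read stalled past the 0.5 s read timeout
    finally:
        r.close()

    print "Read %s bytes out of %s expected." % (len(data), r.headers['content-length'])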

+2
Nov 27

Run the download in a thread, which you can then abandon if it does not finish in time.

    import requests
    import threading

    URL = 'http://ipv4.download.thinkbroadband.com/1GB.zip'
    TIMEOUT = 0.5

    def download(return_value):
        return_value.append(requests.get(URL))

    return_value = []
    download_thread = threading.Thread(target=download, args=(return_value,))
    download_thread.start()
    download_thread.join(TIMEOUT)  # wait at most TIMEOUT seconds

    if download_thread.is_alive():
        print 'The download was not finished on time...'
    else:
        print return_value[0].headers['content-length']
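One caveat: join(TIMEOUT) only stops waiting; the thread itself keeps downloading in the background, since Python threads cannot be killed. If the transfer really has to stop, a separate process can be terminated -- a sketch using multiprocessing instead of threading (not part of the original answer):

    import multiprocessing
    import requests

    URL = 'http://ipv4.download.thinkbroadband.com/1GB.zip'
    TIMEOUT = 0.5

    def download(queue):
        # Runs in a child process, which (unlike a thread) can be killed.
        r = requests.get(URL)
        queue.put(len(r.content))

    if __name__ == '__main__':  # required by multiprocessing on Windows
        queue = multiprocessing.Queue()
        proc = multiprocessing.Process(target=download, args=(queue,))
        proc.start()
        proc.join(TIMEOUT)
        if proc.is_alive():
            proc.terminate()  # actually aborts the transfer
            proc.join()
            print 'The download was not finished on time...'
        else:
            print queue.get()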
-3
Nov 26 '12 at 21:14


