Python: downloading a large file over HTTP to a local path while setting custom HTTP headers

I want to download a file at an HTTP URL to a local file. The file is fairly large, so I want to download and save it in chunks rather than read() and write() the whole thing as one giant string.

The interface of urllib.urlretrieve is essentially what I want. However, I see no way to set request headers on the download through urllib.urlretrieve, which is what I need to do.

If I use urllib2, I can set request headers through its Request object. However, I don't see an API in urllib2 for downloading a file directly to a path on disk, like urlretrieve. It seems that instead I will have to use a loop to iterate over the returned data in chunks, write it to a file, and check when we are done.

What would be the best way to create a function that works like urllib.urlretrieve, but allows you to pass request headers?

2 answers

What's the harm in writing your own function using urllib2?

import urllib2

def urlretrieve(urlfile, fpath):
    # Stream an already-opened URL object to disk in fixed-size chunks.
    chunk = 4096
    f = open(fpath, "wb")
    while 1:
        data = urlfile.read(chunk)
        if not data:
            print "done."
            break
        f.write(data)
        print "Read %s bytes" % len(data)
    f.close()

and use a Request object to set the headers:

request = urllib2.Request("http://www.google.com")
request.add_header('User-agent', 'Chrome XXX')
urlretrieve(urllib2.urlopen(request), "/tmp/del.html")
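
For what it's worth, the two pieces can be tied into a single function with the interface the question asks for. A minimal sketch, where the name urlretrieve_with_headers and its headers parameter are my own invention rather than anything in urllib2:

import urllib2

def urlretrieve_with_headers(url, fpath, headers=None):
    # Hypothetical helper: build a Request carrying the extra headers,
    # then stream the response to disk with the same chunked loop.
    request = urllib2.Request(url, headers=headers or {})
    urlfile = urllib2.urlopen(request)
    f = open(fpath, "wb")
    while 1:
        data = urlfile.read(4096)
        if not data:
            break
        f.write(data)
    f.close()

urlretrieve_with_headers("http://www.google.com", "/tmp/del.html",
                         headers={'User-agent': 'Chrome XXX'})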

Alternatively, if you want to stay with urllib and urlretrieve, you can subclass urllib.URLopener and use its addheader() method to adjust the request headers (e.g. addheader('Accept', 'sound/basic'), the example given in the docstring of urllib.addheader).

To make urllib use your opener for every URL it fetches, set the module variable urllib._urlopener to an instance of your subclass (the leading underscore marks it as private, but it works):

import urllib

class MyURLopener(urllib.URLopener):
    pass # your override here, perhaps to __init__

urllib._urlopener = MyURLopener()  # must be an instance, not the class
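
For example, here is a minimal sketch of such a subclass that appends the docstring's example header in __init__ and then downloads through the ordinary module-level urlretrieve (the URL and target path are placeholders):

import urllib

class MyURLopener(urllib.URLopener):
    def __init__(self, *args, **kwargs):
        urllib.URLopener.__init__(self, *args, **kwargs)
        # Append a custom request header; the example comes from the docstring.
        self.addheader('Accept', 'sound/basic')

urllib._urlopener = MyURLopener()
urllib.urlretrieve("http://www.google.com", "/tmp/del.html")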

As for the loop of read() calls, don't try to avoid it: that is essentially what urlretrieve does internally anyway. Because of how TCP/IP works, the data arrives in chunks regardless, and EOF is signalled by read() returning an empty string, so a chunked read loop is unavoidable one way or another; that's simply how the transfer ends. If you go the urllib2 route, writing that loop yourself, as in the other answer, is the natural approach.
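
Incidentally, the standard library already packages that very loop: shutil.copyfileobj copies one file-like object to another in fixed-size chunks (16 KB by default), so a sketch of the urllib2 route could also read:

import shutil
import urllib2

request = urllib2.Request("http://www.google.com")
request.add_header('User-agent', 'Chrome XXX')
response = urllib2.urlopen(request)
out = open("/tmp/del.html", "wb")
shutil.copyfileobj(response, out)  # chunked read()/write(), 16 KB per chunk
out.close()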

