How to download a PDF file via HTTPS using Python

Question


I am writing a Python script that should save a PDF file locally, in the format specified in the URL. For example:

https://Hostname/saveReport/file_name.pdf # saves the content as a PDF file

I open this url through a python script:

    import webbrowser
    webbrowser.open("https://Hostname/saveReport/file_name.pdf")

The URL contains a lot of images and text. Once this URL is open, I want to save the PDF file using a Python script.

This is what I have done so far. Code 1:

    import requests
    url = "https://Hostname/saveReport/file_name.pdf"  # Note: it's HTTPS
    r = requests.get(url, auth=('usrname', 'password'), verify=False)
    file = open("file_name.pdf", 'w')
    file.write(r.read())
    file.close()

Code 2:

    import urllib2
    import ssl
    url = "https://Hostname/saveReport/file_name.pdf"
    context = ssl._create_unverified_context()
    response = urllib2.urlopen(url, context=context)  # How should I pass authorization details here?
    html = response.read()

With the above code, I get: urllib2.HTTPError: HTTP Error 401: Unauthorized

If I use Code 2, how can I pass the authorization details?
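For the Code 2 route, one possible approach is to attach the credentials as a request header yourself. The sketch below assumes the server uses HTTP Basic authentication (which is what the `auth=('usrname', 'password')` tuple in Code 1 sends); `basic_auth_header` and `build_request` are hypothetical helper names, not part of urllib2. It only builds the request and does not contact the server.

```python
import base64

try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2.7, as in the question


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = "{0}:{1}".format(username, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_request(url, username, password):
    """Return a Request carrying the credentials (hypothetical helper)."""
    request = Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Placeholder URL and credentials taken from the question.
request = build_request("https://Hostname/saveReport/file_name.pdf",
                        "usrname", "password")
# Then, as in Code 2: response = urllib2.urlopen(request, context=context)
```

If the server expects a different scheme (for example Digest or a session cookie), this header will still produce a 401 and a different mechanism is needed.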


python url python-2.7 pdf pdf-generation

user3439895 Nov 02 '15 at 10:26


2 answers

You can try something like:

    import requests
    response = requests.get('https://websitewithfile.com/file.pdf', verify=False, auth=('user', 'pass'))
    # Open in binary mode and write the raw bytes; a PDF is not text
    with open('file.pdf', 'wb') as fout:
        fout.write(response.content)


Raphaël vigée Nov 02 '15 at 10:34


Joran beasley · Accepted Answer · 2015-11-02T22:39:30+0000

I think this will work:

    import shutil
    import requests
    url = "https://Hostname/saveReport/file_name.pdf"  # Note: it's HTTPS
    # stream=True avoids loading the whole PDF into memory at once
    r = requests.get(url, auth=('usrname', 'password'), verify=False, stream=True)
    r.raw.decode_content = True  # decode gzip/deflate content encoding, if any
    with open("file_name.pdf", 'wb') as f:
        shutil.copyfileobj(r.raw, f)
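After a download like this, a quick sanity check can catch the common failure modes (a file written in text mode, or an HTML login or error page saved under a `.pdf` name). A minimal sketch, relying on the fact that PDF files begin with the bytes `%PDF-`; `looks_like_pdf` is a hypothetical helper name:

```python
def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature."""
    with open(path, "rb") as f:  # binary mode, so bytes are read unchanged
        return f.read(5) == b"%PDF-"
```

For example, calling `looks_like_pdf("file_name.pdf")` after the download should return `True`; if it returns `False`, inspecting the first few bytes of the file usually shows what the server actually sent.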
