How to display a PDF that was downloaded in Python

Question

I downloaded a PDF file from the Internet using, for example:

    import requests

    pdf = requests.get("http://www.scala-lang.org/docu/files/ScalaByExample.pdf")

I would like to modify the following code so that it displays the downloaded PDF:

    from gi.repository import Poppler, Gtk

    def draw(widget, surface):
        page.render(surface)

    document = Poppler.Document.new_from_file("file:///home/me/some.pdf", None)
    page = document.get_page(0)

    window = Gtk.Window(title="Hello World")
    window.connect("delete-event", Gtk.main_quit)
    window.connect("draw", draw)
    window.set_app_paintable(True)
    window.show_all()
    Gtk.main()

How do I change the "document = ..." line so that it uses the pdf variable, which holds the downloaded PDF, instead of a file path?

(I don't mind using popplerqt4 or anything else if that makes it easier.)

+7

python pdf poppler pdf-rendering

marshall Feb 10 '14 at 17:46

6 answers

Beatriz kanzki · Answer 1 · 2016-10-16T06:11:32+0000

It depends on the OS you are using. One of the following usually works once the PDF has been saved to disk:

    import os

    os.system('my_pdf.pdf')

or

    import os

    os.startfile('path_to_pdf.pdf')  # os.startfile exists on Windows only

or

    import webbrowser

    webbrowser.open(r'file:///my_pdf.pdf')
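
As a rough sketch of how the options above could be combined, the helper below picks an opener based on the platform. The function names `viewer_command` and `open_with_default_viewer` are made up for illustration, and it assumes that `xdg-open` is available on Linux desktops and `open` on macOS.

```python
import os
import subprocess
import sys


def viewer_command(path, platform=sys.platform):
    """Return the external command that opens `path`, or None on Windows,
    where os.startfile is used instead of an external command."""
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", path]       # macOS
    return ["xdg-open", path]       # assumption: a freedesktop-style desktop


def open_with_default_viewer(path):
    """Open `path` in whatever application the OS associates with it."""
    command = viewer_command(path)
    if command is None:
        os.startfile(path)          # only exists on Windows
    else:
        subprocess.run(command, check=True)
```

Calling `open_with_default_viewer("my_pdf.pdf")` hands the file to the system's default PDF viewer, so this shows the document in an external program rather than inside your own Gtk window.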

logc · Answer 2 · 2014-02-13T14:00:22+0000

How about using a temporary file?

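    import tempfile
    from urllib.parse import urljoin
    from urllib.request import pathname2url

    import requests
    from gi.repository import Poppler, Gtk

    pdf = requests.get("http://www.scala-lang.org/docu/files/ScalaByExample.pdf")

    with tempfile.NamedTemporaryFile() as pdf_contents:
        # Write the response body (bytes), not the Response object itself
        pdf_contents.write(pdf.content)
        pdf_contents.flush()  # make sure the data is on disk before Poppler reads it
        file_url = urljoin('file:', pathname2url(pdf_contents.name))
        # The file is deleted when the with block ends, so load it here
        document = Poppler.Document.new_from_file(file_url, None)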
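
For larger downloads, the same temporary-file idea can be sketched with streaming, so the whole PDF never has to sit in memory at once. The helper name `save_chunks_to_temp_pdf` is hypothetical; it accepts any iterable of bytes chunks and keeps the file on disk (`delete=False`) so that it can be reopened by name afterwards.

```python
import tempfile
from pathlib import Path


def save_chunks_to_temp_pdf(chunks):
    """Write an iterable of bytes chunks to a new temporary .pdf file and
    return its path. The caller is responsible for deleting the file."""
    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as handle:
        for chunk in chunks:
            handle.write(chunk)
    return Path(handle.name)
```

With requests, the chunks could come from `requests.get(url, stream=True).iter_content(chunk_size=8192)`. The returned path's `as_uri()` method gives a `file://` URI of the kind `Poppler.Document.new_from_file` expects, and `path.unlink()` removes the file when you are done.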

Raghav RV · Answer 3 · 2014-02-19T18:34:26+0000

Try this and tell me if it works:

    # Pass the length of the data itself, not the length of its repr()
    document = Poppler.Document.new_from_data(str(pdf.content), len(pdf.content), None)
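
A possible alternative, sketched under the assumption that PyGObject and Poppler-GLib 0.82 or later are installed: newer Poppler versions offer `Poppler.Document.new_from_bytes`, which takes a `GLib.Bytes` object instead of a string plus length. The helper names below are hypothetical, and the header check is only a cheap guard against feeding something like an HTML error page to Poppler.

```python
def looks_like_pdf(data):
    """Cheap sanity check: PDF files normally start with the marker %PDF-."""
    return data[:5] == b"%PDF-"


def load_document_from_bytes(data, password=None):
    """Build a Poppler document from in-memory bytes.

    Assumes PyGObject and Poppler-GLib 0.82+ are installed; the import is
    done inside the function so this module loads without them."""
    import gi
    gi.require_version("Poppler", "0.18")
    from gi.repository import GLib, Poppler

    if not looks_like_pdf(data):
        raise ValueError("data does not start with a PDF header")
    return Poppler.Document.new_from_bytes(GLib.Bytes.new(data), password)
```

With the question's code this would be called as `document = load_document_from_bytes(pdf.content)`. Whether it works depends on the installed Poppler version, so treat it as something to try rather than a guaranteed fix.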

naren · Answer 4 · 2014-02-19T19:10:57+0000

If you want to open the PDF with Acrobat Reader, the code below should work:

    import subprocess

    process = subprocess.Popen(
        ['<here path to acrobat.exe>', '/A', 'page=1', '<here path to pdf>'],
        shell=False,
        stdout=subprocess.PIPE,
    )
    process.wait()

lonelyjohner · Answer 5 · 2014-03-01T07:04:06+0000

There is a library called pyPdf that you can use to work with the PDF file. If you have further questions, send me a message.

Dysmas · Answer 6 · 2015-08-15T14:53:37+0000

August 2015: on a fresh install under Windows 7, the problem remains the same:

 Poppler.Document.new_from_data(data, len(data), None)

fails with a TypeError: a string is expected, not bytes.

 Poppler.Document.new_from_data(str(data), len(data), None)

fails with: the PDF document is corrupted (4).

So I could not get this function to work.

I tried using NamedTemporaryFile instead of a file on disk, but for some unknown reason it fails with an unknown error.
So I am using a temporary file on disk instead. Not the prettiest way, but it works.
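
One likely (but unconfirmed) cause of the NamedTemporaryFile failure: the Python tempfile documentation notes that on Windows the name of such a file cannot be used to open it a second time while it is still open. A minimal sketch of a workaround is to write an ordinary file into a temporary directory and close it before handing out the URI. The name `pdf_file_uri` and the default file name are made up for illustration.

```python
import contextlib
import tempfile
from pathlib import Path


@contextlib.contextmanager
def pdf_file_uri(data, name="download.pdf"):
    """Yield a file:// URI for `data`, stored in a private temp directory.

    The file is closed before the URI is handed out, and the whole
    directory is removed when the with block ends."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / name
        path.write_bytes(data)      # opens, writes, and closes the file
        yield path.as_uri()
```

Usage would look like `with pdf_file_uri(data1) as uri: document = Poppler.Document.new_from_file(uri, None)`. It is safest to finish rendering inside the with block, since the file disappears when the block ends.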

Here is the test code for Python 3.4, if anyone has an idea:

    from gi.repository import Poppler
    import tempfile
    import urllib.parse
    import urllib.request

    testfile = "d:/Mes Documents/en cours/PdfBooklet3/tempfiles/preview.pdf"

    # Test 1: load directly from a file URI
    document = Poppler.Document.new_from_file("file:///" + testfile, None)  # Works fine
    page = document.get_page(0)
    print(page)  # OK

    # Test 2: load from data held in memory
    f1 = open(testfile, "rb")
    data1 = f1.read()
    f1.close()

    data2 = "".join(map(chr, data1))  # converts bytes to string
    print(len(data1))
    document = Poppler.Document.new_from_data(data2, len(data2), None)
    page = document.get_page(0)  # returns None
    print(page)

    # Test 3: load from a NamedTemporaryFile
    pdftempfile = tempfile.NamedTemporaryFile()
    pdftempfile.write(data1)
    file_url = urllib.parse.urljoin('file:', urllib.request.pathname2url(pdftempfile.name))
    print(file_url)
    pdftempfile.seek(0)
    document = Poppler.Document.new_from_file(file_url, None)  # unknown error
