Error in PDF root object

This PDF root object crashes Adobe Reader. Other PDF readers such as Foxit, Nuance, Evince and SumatraPDF open the file without any problems. The problem is /Dests, which resolves the indirect objects (links in the PDF). Removing the /Dests << ... >> entry gets Adobe Reader to open the file, but not print it. All other readers work fine without /Dests. Any ideas on how to correct the syntax in the following root object?

 17 0 obj
 <<
   /Type /Catalog
   /Pages 2 0 R
   /Outlines 15 0 R
   /PageMode /UseOutlines
   /Dests <<
     /__WKANCHOR_2 8 0 R
     /#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83 9 0 R
   >>
 >>
 endobj
3 answers

OK, I found a few spare minutes ...

So, the first thing I noticed is that *all* the other readers can indeed open the file (I only tested a few). But they spit out many, many warnings and error messages ... (Try Ghostscript: gs virkerikke.pdf, or try evince ...) At the very least the PDF has a corrupted xref table (or at least that is one of the complaints).

xpdf complains:

 [....]
 Error: Invalid XRef entry
 Error: Invalid XRef entry
 Error: Invalid XRef entry
 Error (157): Unterminated string
 Error (159): End of file inside dictionary

gv complains:

 Warning: translation table syntax error: Unknown keysym name: apLineDel
 Warning: ... found while parsing '<Key>apLineDel: GV_Page(page+5) '
 Warning: String to TranslationTable conversion encountered errors

evince complains:

 [....]
 Error: Invalid XRef entry
 Error: Invalid XRef entry
 Error: Invalid XRef entry
 Error (157): Unterminated string
 Error (159): End of file inside dictionary
 Error (157): Unterminated string
 Error (159): End of file inside dictionary
 Error (157): Unterminated string
 Error (159): End of file inside dictionary
 [....]
 Error (1918): Unterminated string
 Error (1920): End of file inside dictionary

gs complains:

 **** Warning: File has a corrupted %%EOF marker, or garbage after %%EOF. 

mupdf complains:

 + pdf/pdf_xref.c:60: pdf_read_start_xref(): cannot find startxref
 | pdf/pdf_xref.c:477: pdf_load_xref(): cannot read startxref
 \ pdf/pdf_xref.c:532: pdf_open_xref_with_stream(): trying to repair
 warning: ignoring invalid character in hex string: '!'
 warning: ignoring invalid character in hex string: 'O'
 warning: ignoring invalid character in hex string: 'T'
 warning: ignoring invalid character in hex string: 'Y'
 [....]

qpdf --qdf complains:

 virkerikke.pdf (object 17 0, file position 2234): null character not allowed in name token 
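The qpdf complaint can be reproduced with a rough scan for name tokens that contain the #00 escape. This is only a heuristic sketch: the helper name names_with_nul is made up for illustration, and the scan looks at raw bytes, so it may also report matches inside strings or streams.

```python
import re

# A PDF name starts with '/' and runs until whitespace or a delimiter character.
NAME_TOKEN = re.compile(rb"/[^\x00\t\n\x0c\r ()<>\[\]{}/%]*")


def names_with_nul(data: bytes):
    """Return (offset, token) pairs for name tokens containing the #00 escape.

    Hypothetical helper: a plain byte scan, not a real PDF parser.
    """
    return [
        (m.start(), m.group())
        for m in NAME_TOKEN.finditer(data)
        if b"#00" in m.group()
    ]


# The catalog object from the question, used as sample input.
sample = (
    b"17 0 obj << /Type /Catalog /Pages 2 0 R /Outlines 15 0 R "
    b"/PageMode /UseOutlines /Dests << /__WKANCHOR_2 8 0 R "
    b"/#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83 9 0 R >> >> endobj"
)
hits = names_with_nul(sample)
```

For the object in the question, only the long hex-escaped key should be reported; /__WKANCHOR_2 and the other names are left alone.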

OK, so I opened this crappy file in a text editor and tried to repair it. I found that the file (32746 bytes in size) has serious syntax problems:

  • Garbage after %%EOF: After the %%EOF marker there is a complete and syntactically correct HTML file titled "Wkhtmltopdf - Teknisk regelverk". Its size is 11878 bytes. Delete this part and you have a "better" PDF of only 20868 bytes ... although Acrobat / Adobe Reader still does not open the edited file after saving.
  • Invalid character in name token: This is inside the token /#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83 , which appears twice in the file. Already in my first comments I told you that this key did not look trustworthy to me, because it contains very few ASCII characters but many binary bytes (in their hexadecimal representation). (I had not noticed that it even contains #00 , which is the PDF representation of the nul character and is illegal in PDF name tokens.) Replace both occurrences with another (made-up) token of exactly the same length, so that the byte offsets of everything after it do not shift. I chose /aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa . Save the edited file.

Now even Acrobat / Adobe Reader opens the repaired file without complaint. In addition, the "other readers" now cope better with the file: they emit fewer warnings and can now identify some metadata (such as the creation date and producer == wkhtmltopdf) that they could not extract from the original file.
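The two manual fixes described above could also be scripted. The following is a minimal sketch, not a general PDF repair tool: the function name repair and the filler byte are made up for illustration, and it assumes the trailing garbage does not itself contain a %%EOF marker.

```python
def repair(data: bytes, bad_name: bytes, filler: bytes = b"a") -> bytes:
    """Cut off garbage after the last %%EOF and swap an invalid name token.

    Hypothetical helper. Assumes the trailing garbage contains no %%EOF
    of its own, and that bad_name includes its leading '/'.
    """
    marker = b"%%EOF"
    end = data.rfind(marker)
    if end == -1:
        raise ValueError("no %%EOF marker found")
    # Keep everything up to and including the marker, then end the file.
    data = data[: end + len(marker)] + b"\n"

    # Same-length replacement, so no byte offset after it shifts.
    replacement = b"/" + filler * (len(bad_name) - 1)
    return data.replace(bad_name, replacement)


# Tiny made-up input that mimics the two defects.
broken = (
    b"%PDF-1.4\n"
    b"1 0 obj << /Dests << /#8d#00s 9 0 R >> >> endobj\n"
    b"%%EOF\n"
    b"<html>junk</html>"
)
fixed = repair(broken, b"/#8d#00s")
```

On the real file one would read virkerikke.pdf in binary mode, pass the full invalid key as bad_name, and write the result to a new file; whether Adobe Reader then accepts it should still be checked by hand.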


/Dests should be a dictionary (/Key value pairs) containing names (keys) and their corresponding destinations (values). The /Dests entry first appeared in PDF 1.1.

In PDF 1.1, destination names must be name objects. Since PDF 1.2, they may also be byte strings.

So which PDF version does your file claim to be?
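One quick way to answer that is to read the %PDF-x.y header at the start of the file. A small sketch follows; the function name claimed_pdf_version is made up for illustration, and it only looks at the header, although a /Version entry in the catalog may override the header in newer files.

```python
import re


def claimed_pdf_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent.

    Hypothetical helper: inspects only the first 1024 bytes.
    """
    m = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


version = claimed_pdf_version(b"%PDF-1.4\n1 0 obj << >> endobj\n%%EOF\n")
```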

From the PDF 1.7 specification ("ISO 32000-1"), describing the meaning of /Dests:

In PDF 1.1, the correspondence between name objects and destinations shall be defined by the Dests entry in the document catalog (see 7.7.2, "Document Catalog"). The value of this entry shall be a dictionary in which each key is a destination name and the corresponding value is either an array defining the destination, using the syntax shown in Table 151, or a dictionary with a D entry whose value is such an array.


Seems pretty simple. Move the /Dests dictionary into its own object.

Instead of:

 17 0 obj
 <<
   /Type /Catalog
   /Pages 2 0 R
   /Outlines 15 0 R
   /PageMode /UseOutlines
   /Dests <<
     /__WKANCHOR_2 8 0 R
     /#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83 9 0 R
   >>
 >>
 endobj

you should have:

 17 0 obj
 <<
   /Type /Catalog
   /Pages 2 0 R
   /Outlines 15 0 R
   /PageMode /UseOutlines
   /Dests 1234 0 R
 >>
 endobj

 1234 0 obj
 <<
   /__WKANCHOR_2 8 0 R
   /#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83 9 0 R
 >>
 endobj

The object number (1234 here) is arbitrary; in a real file it would be whatever free number your PDF software assigns.

How you move the /Dests dictionary from the root into its own object depends entirely on which PDF software you use. A hex editor is an option, but then you are drifting toward SuperUser rather than StackOverflow ... technically. I suspect you might get a mulligan on this one. I'd let it slide.
