OK, I found a few spare minutes ...
So, the first thing I noticed is that all the other readers can actually open the file (I only tested a few). But they spit out many, many warnings and error messages. (Try Ghostscript: gs virkerikke.pdf, or try evince.) At the very least the PDF has a corrupted xref table (or at least that is one of the complaints).
xpdf complains:
[....]
Error: Invalid XRef entry
Error: Invalid XRef entry
Error: Invalid XRef entry
Error (157): Unterminated string
Error (159): End of file inside dictionary
gv complains:
Warning: translation table syntax error: Unknown keysym name: apLineDel
Warning: ... found while parsing '<Key>apLineDel: GV_Page(page+5) '
Warning: String to TranslationTable conversion encountered errors
evince complains:
[....]
Error: Invalid XRef entry
Error: Invalid XRef entry
Error: Invalid XRef entry
Error (157): Unterminated string
Error (159): End of file inside dictionary
Error (157): Unterminated string
Error (159): End of file inside dictionary
Error (157): Unterminated string
Error (159): End of file inside dictionary
[....]
Error (1918): Unterminated string
Error (1920): End of file inside dictionary
gs complains:
**** Warning: File has a corrupted %%EOF marker, or garbage after %%EOF.
mupdf complains:
+ pdf/pdf_xref.c:60: pdf_read_start_xref(): cannot find startxref
| pdf/pdf_xref.c:477: pdf_load_xref(): cannot read startxref
\ pdf/pdf_xref.c:532: pdf_open_xref_with_stream(): trying to repair
warning: ignoring invalid character in hex string: '!'
warning: ignoring invalid character in hex string: 'O'
warning: ignoring invalid character in hex string: 'T'
warning: ignoring invalid character in hex string: 'Y'
[....]
qpdf --qdf complains:
virkerikke.pdf (object 17 0, file position 2234): null character not allowed in name token
OK, next I opened this crappy file in a text editor to try to repair it. I found that the file (32746 bytes in size) has serious syntax problems:
- Garbage after %%EOF: After the %%EOF marker there is a complete and syntactically correct HTML file with the title "Wkhtmltopdf - Teknisk regelverk". Its size is 11878 bytes. Delete this part and you get a "better" PDF of only 20868 bytes ... although Acrobat / Adobe Reader still does not open the edited file after saving.
- Invalid character in a name token: The offending token is /#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83 . It appears twice in the file. Already in my first comments I told you that this key did not look trustworthy to me, because it contains only very few ASCII characters but many binary bytes (in their hexadecimal representation). (I had not noticed that it even contains #00 , the PDF representation of the nul character, which is illegal in PDF name tokens.) Replace both occurrences with some other (made-up) token of exactly the same length, so that no byte offsets in the file shift. I chose /aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa . Save the edited file.
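The two manual edits above could also be scripted. What follows is only a minimal Python sketch, not the procedure actually used here (that was done by hand in a text editor). The function names strip_after_eof and replace_name_same_length are made up for illustration, and the sketch assumes that the trailing garbage itself does not contain the string %%EOF and that the bad name is stored in the file literally in its #xx-escaped form.

```python
def strip_after_eof(data: bytes) -> bytes:
    """Drop everything that follows the last %%EOF marker.

    Assumption: the trailing garbage (here, an HTML file) does not
    itself contain the byte sequence %%EOF.
    """
    marker = b"%%EOF"
    pos = data.rfind(marker)
    if pos == -1:
        raise ValueError("no %%EOF marker found")
    # Keep the marker and terminate the file with a single newline.
    return data[: pos + len(marker)] + b"\n"


def replace_name_same_length(data: bytes, old: bytes, fill: bytes = b"a") -> bytes:
    """Replace every occurrence of the name token `old` with a made-up
    name of exactly the same byte length ('/' followed by filler bytes),
    so that the total size and all byte offsets stay unchanged.
    """
    if not old.startswith(b"/"):
        raise ValueError("a PDF name token must start with '/'")
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    new = b"/" + fill * (len(old) - 1)
    return data.replace(old, new)


# Hypothetical usage (file names are placeholders):
#   raw = open("virkerikke.pdf", "rb").read()
#   bad = b"/#8d#c2#ca#ebs#e4#60#00#9e#97l#b9#80#1b#cb#86sQR#83"
#   fixed = replace_name_same_length(strip_after_eof(raw), bad)
#   open("repaired.pdf", "wb").write(fixed)
```

Working on raw bytes (not decoded text) matters here, because the file contains binary data that a text-mode read could mangle.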
Now even Acrobat / Adobe Reader will open the repaired file without complaint. In addition, the "other readers" now cope better with the file: they emit fewer warnings and can now identify some metadata (such as the creation date and producer == wkhtmltopdf) that they could not extract from the original file.