Combining PDF files with ghostscript, including original file names?

I have about 250 one-page PDF files with names like:

file_1_100.pdf, file_1_200.pdf, file_1_300.pdf, file_2_100.pdf, file_2_200.pdf, file_2_300.pdf, file_3_100.pdf, file_3_200.pdf, file_3_300.pdf ...etc 

I use the following command to combine them into a single PDF file:

 gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf file*pdf 

It works great, combining them in the correct order. However, when I look at finished.pdf, I want a link or label that tells me the original file name for each page.

Does anyone have any suggestions? Can I add page names that link to files or something else?

2 answers

It's easy enough to put the file names in the bookmarks list, which many PDF viewers can display.

This is done in PostScript with the pdfmark distiller operator. For example, run the following:

 gs -sDEVICE=pdfwrite -o finished.pdf control.ps 

where control.ps contains PS commands that run each input file and then create a bookmark (outline) entry for it with /OUT:

 (examples/tiger.eps) run
 [ /Page 1 /Title (tiger.eps) /OUT pdfmark
 (examples/colorcir.ps) run
 [ /Page 2 /Title (colorcir.ps) /OUT pdfmark

You can also enumerate the files in PostScript to automate the whole process:

 /PN 1 def
 (file*.pdf) {
   /FN exch def
   FN run
   [ /Page PN /Title FN /OUT pdfmark   % do the file and bookmark it by filename
   /PN PN 1 add def                    % bump the page number
 } 1000 string filenameforall

Note that filenameforall does not guarantee any particular enumeration order, so you may want to sort the list to control the order, using Ghostscript's .sort extension (array { lt } .sort).

Thinking about it further, I realized that if an input file has more than one page, there is a better way to get the correct page number for each bookmark: the "PageCount" device property.

 [ (file*.pdf) { dup length string copy } 1000 string filenameforall ]   % create array of filenames
 { lt } .sort                          % sort in increasing alphabetic order
 /PN 1 def
 { /FN exch def
   /PN currentpagedevice /PageCount get 1 add def   % get current page count done (next is one greater)
   FN run
   [ /Page PN /Title FN /OUT pdfmark   % do the file and bookmark it by filename
 } forall

The above creates an array of strings (copying each name to its own string object, since filenameforall overwrites the same scratch string on every iteration), sorts it, and finally processes the array with the forall operator. Because the PageCount device property gives the number of pages already produced, the page number (PN) for each bookmark will be correct even when an input file has several pages. I tested this snippet as "control.ps".
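
As an alternative to enumerating inside PostScript, control.ps can also be generated from the shell, which keeps the same glob order that the question's gs command already relies on. The following is only a sketch under assumptions: the function name make_control_ps is made up for illustration, and it assumes every input file has exactly one page, as stated in the question.

```shell
#!/bin/sh
# Sketch: print a control.ps that runs each file and bookmarks it by name.
# Assumption: each input file has exactly one page, so the page number
# is simply the position of the file in the argument list.
# make_control_ps is a hypothetical helper name, not part of Ghostscript.

make_control_ps() {
    page=1
    for f in "$@"; do
        # Escape backslashes and parentheses so the name is a valid
        # PostScript string literal.
        esc=$(printf '%s' "$f" | sed -e 's/\\/\\\\/g' -e 's/(/\\(/g' -e 's/)/\\)/g')
        printf '(%s) run\n' "$esc"
        printf '[ /Page %d /Title (%s) /OUT pdfmark\n' "$page" "$esc"
        page=$((page + 1))
    done
}

# Usage (not executed here):
#   make_control_ps file*.pdf > control.ps
#   gs -sDEVICE=pdfwrite -o finished.pdf control.ps
```

On recent Ghostscript releases, where -dSAFER is on by default, the run calls inside control.ps may be refused unless read access is granted, for example with --permit-file-read or -dNOSAFER. Whether that is needed depends on the installed version.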


To stamp the file name on each page, you can use a combination of Ghostscript and pdftk: first create a one-page footer PDF containing the name, then stamp it onto the original file. Taken from https://superuser.com/questions/171790/print-pdf-file-with-file-path-in-footer

 gs \
   -o outdir/footer.pdf \
   -sDEVICE=pdfwrite \
   -c "5 5 moveto /Helvetica findfont 9 scalefont setfont (foobar-filename.pdf) show showpage"

 pdftk \
   foobar-filename.pdf \
   stamp outdir/footer.pdf \
   output outdir/merged_foobar-filename.pdf
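
For the 250 files in the question, the two steps could be wrapped in a small shell function and called once per file. This is a rough sketch, not a tested recipe: stamp_one and the output naming are illustrative assumptions, and it presumes gs and pdftk are both on the PATH.

```shell
#!/bin/sh
# Sketch: stamp one PDF with its own file name in the lower-left corner.
# stamp_one is a hypothetical helper; the merged_ prefix follows the
# naming used in the command above.

stamp_one() {
    src=$1
    outdir=$2
    name=$(basename "$src")
    # Escape backslashes and parentheses for the PostScript string literal.
    esc=$(printf '%s' "$name" | sed -e 's/\\/\\\\/g' -e 's/(/\\(/g' -e 's/)/\\)/g')
    # Build a one-page footer PDF that contains only the file name.
    # showpage is included so that a page is actually emitted.
    gs -q -o "$outdir/footer.pdf" -sDEVICE=pdfwrite \
       -c "5 5 moveto /Helvetica findfont 9 scalefont setfont ($esc) show showpage" \
       || return 1
    # Overlay the footer on the pages of the source file.
    pdftk "$src" stamp "$outdir/footer.pdf" output "$outdir/merged_$name"
}

# Usage (not executed here):
#   mkdir -p outdir
#   for f in file*.pdf; do stamp_one "$f" outdir || break; done
```

The stamped files in outdir can then be combined with the gs command from the question.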
