Split a PDF by bookmarks?

I have to process several PDF files, each of which was created by merging multiple PDFs. In each merged PDF, a bookmark marks the place where each constituent part begins.

Is there a way to split such a file at its bookmarks automatically, using a script?

We only have the bookmarks to mark the parts, not page numbers, so the page numbers would have to be derived from the bookmarks. A Linux tool would be preferable.

+3
3 answers

There are PDF-splitting programs that can do this for you, for example A-PDF Split:

A-PDF Split is a very simple, lightweight desktop program that allows you to split any Acrobat PDF file into smaller PDF files. It provides complete flexibility and user control over how files are split and how the split output files are uniquely named. A-PDF Split provides many alternatives for splitting your large files: by page, by bookmark, and by odd/even page. You can even extract or delete part of a PDF file. A-PDF Split also offers advanced split definitions that can be saved and later imported for repetitive file-splitting tasks. A-PDF Split provides maximum file-splitting flexibility to fit any need.

A-PDF Split works with password-protected PDF files and can apply various PDF security features to the split output files. If necessary, you can recombine the generated split files with other PDF files using a utility such as A-PDF Merger to form new composite PDF files.

A-PDF Split does NOT require Adobe Acrobat and creates documents compatible with Adobe Acrobat Reader version 5 and higher.

Edit: I also found a free, open-source program here, if you do not want to pay.

+2

pdftk can be used to split a PDF file and extract bookmark page numbers.

To get the bookmark page numbers, run

pdftk in.pdf dump_data 

and have your script read the page numbers from the BookmarkPageNumber lines of the output.
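As a rough sketch of that parsing step: the function below reads dump_data output on stdin and prints one tab-separated line (level, page, title) per bookmark. The function name bookmark_pages is made up for illustration, and the sketch assumes each bookmark record lists BookmarkTitle and BookmarkLevel before BookmarkPageNumber, which is what I have seen from pdftk but may vary between versions.

```shell
#!/bin/bash
# bookmark_pages (hypothetical helper): read `pdftk in.pdf dump_data` output
# on stdin and print "level<TAB>page<TAB>title" for every bookmark.
# Assumption: within each record, BookmarkPageNumber comes after
# BookmarkTitle and BookmarkLevel, so the line is emitted when it is seen.
bookmark_pages() {
  awk '
    /^BookmarkTitle: /      { title = substr($0, 16) }  # text after "BookmarkTitle: "
    /^BookmarkLevel: /      { level = $2 }
    /^BookmarkPageNumber: / { printf "%s\t%s\t%s\n", level, $2, title }
  '
}

# Usage (needs pdftk installed):
#   pdftk in.pdf dump_data | bookmark_pages
```

Keeping the level in the output makes it possible to split only at top-level bookmarks if the merged parts carry nested bookmarks of their own.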

Then use

 pdftk in.pdf cat A-B output out_A-B.pdf

to get pages from A to B in out_A-B.pdf.

The script might be something like this:

    #!/bin/bash
    infile=$1         # input pdf
    outputprefix=$2   # prefix for the output file names

    [ -e "$infile" ] && [ -n "$outputprefix" ] || exit 1   # Invalid args

    # Start page of every bookmark, followed by the literal word "end"
    pagenumbers=( $(pdftk "$infile" dump_data | \
                    grep '^BookmarkPageNumber: ' | cut -f2 -d' ' | uniq)
                  end )

    for ((i=0; i < ${#pagenumbers[@]} - 1; ++i)); do
      a=${pagenumbers[i]}     # start page number
      b=${pagenumbers[i+1]}   # end page number
      [ "$b" = "end" ] || b=$((b-1))   # stop one page before the next bookmark
      pdftk "$infile" cat "$a-$b" output "${outputprefix}_$a-$b.pdf"
    done

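The range arithmetic can also be pulled out into a small filter, which makes it easy to check without a PDF at hand. This is only a sketch: page_ranges is a made-up name, and it assumes the input is one bookmark start page per line. Unlike uniq alone, it sorts the pages numerically first, so duplicate or out-of-order bookmark pages do not produce empty or reversed ranges.

```shell
#!/bin/bash
# page_ranges (hypothetical helper): read bookmark start pages on stdin,
# one per line, and print one pdftk page range per line:
# "A-B" for each part and "A-end" for the last one.
page_ranges() {
  sort -n -u | awk '
    NR > 1 { print prev "-" ($1 - 1) }   # previous part ends before this page
           { prev = $1 }
    END    { if (NR > 0) print prev "-end" }
  '
}

# Usage (needs pdftk installed):
#   pdftk in.pdf dump_data | grep '^BookmarkPageNumber: ' | cut -f2 -d' ' |
#     page_ranges |
#     while read -r r; do pdftk in.pdf cat "$r" output "part_$r.pdf"; done
```

As in the script above, any pages before the first bookmark are not written to any output file.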
+10

There is a command-line tool written in Java called Sejda, which has a splitbybookmarks command that does exactly what you asked for. Because it is Java, it runs on Linux, and because it is a command-line tool, you can script it.

Disclaimer: I am one of the authors.

+4