Extra characters when converting PDF to image using PDFBox

I am using Apache PDFBox 1.8.9. I have a one-page PDF that contains text, and I want to convert this page to an image. The PDF was created with LibreOffice. I am using the following code:

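import java.awt.image.BufferedImage;
import java.io.File;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.ImageIOUtil;

// Load the PDF with the non-sequential parser (PDFBox 1.8.x API).
PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
int page = 0;
for (PDPage pdPage : pdPages) {
    ++page;
    // Render each page at 300 DPI and write it as a PNG.
    BufferedImage bim = pdPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
    ImageIOUtil.writeImage(bim, "png", "/home/file" + "-" + page, 300);
}
document.close();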

The code runs and I get a PNG image. The problem is that the image contains many strange extra characters that make the text hard to read. How can I fix this?

This is the image I get (enlarged):

[image: bad conversion]

And this is the same area in a PDF viewer:

[image: original input PDF]

The full PDF file can be downloaded at https://yadi.sk/i/iX-KJwlhhXMY2
