Joining 1000 PDFs with iText throws java.lang.OutOfMemoryError: Java heap space

I am trying to combine 1000 PDF files with iText, but I am not sure where the memory leak occurs. Sample code follows. Note that I delete each child PDF file as soon as it has been merged into the parent file. Please point out the error in the code below, or suggest a better way to do this that does not run out of memory. This process runs inside a servlet (not a standalone program).

 FileInputStream local_fis = null;
 BufferedInputStream local_bis = null;
 File localFileObj = null;

 for (int taIdx = 0; taIdx < totalSize; taIdx++) {
     frObj = (Form3AReportObject) reportRows.get(taIdx);
     localfilename = companyId + "_" + frObj.empNumber + ".pdf";
     local_fis = new FileInputStream(localfilename);
     local_bis = new BufferedInputStream(local_fis);
     pdfReader = new PdfReader(local_bis);
     cb = pdfWriter.getDirectContent();
     document.newPage();
     page = pdfWriter.getImportedPage(pdfReader, 1);
     cb.addTemplate(page, 0, 0);
     local_bis.close();
     local_fis.close();
     localFileObj = new File(localfilename);
     localFileObj.delete();
 }
 document.close();
+4
7 answers

You might want to try something like the following (exception handling, file closure, and deletion removed for clarity):

 for (int taIdx = 0; taIdx < totalSize; taIdx++) {
     Form3AReportObject frObj = (Form3AReportObject) reportRows.get(taIdx);
     localfilename = companyId + "_" + frObj.empNumber + ".pdf";
     FileInputStream local_fis = new FileInputStream(localfilename);
     pdfWriter.freeReader(new PdfReader(local_fis));
     pdfWriter.flush();
 }
 pdfWriter.close();
+8

Who says there is a memory leak? Your combined document has to fit completely into memory; there is no way around that, and it may well be larger than the default heap size of 64 MB. Memory here means RAM, not disk.

I don't see a problem with your code, but if you want to diagnose it in detail, use the VisualVM heap profiler (bundled with the JDK since roughly Java 6 update 10).
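Before profiling, it is worth checking what heap limit the JVM is actually running with. A minimal sketch using only the plain JDK (no iText required); the class name is arbitrary:

```java
public class HeapInfo {
    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to grow to (controlled by -Xmx)
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}
```

If this prints something close to 64 MB, start the servlet container with a larger heap, e.g. pass `-Xmx512m` to the JVM.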

+2

Instead of merging the 1000 PDF files into one document, try zipping them instead.
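If the goal is just to deliver the files together, zipping sidesteps the memory problem entirely: each file is streamed straight into the archive through a small buffer, so no PDF is ever held in memory whole. A minimal sketch using only `java.util.zip` (class and method names are placeholders):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class PdfZipper {
    public static void zipFiles(List<File> files, File zipFile) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zipFile))) {
            byte[] buf = new byte[8192];
            for (File f : files) {
                zos.putNextEntry(new ZipEntry(f.getName()));
                try (InputStream in = new FileInputStream(f)) {
                    int n;
                    // Copy in 8 KB chunks; memory use stays constant
                    // regardless of how many or how large the PDFs are.
                    while ((n = in.read(buf)) > 0) {
                        zos.write(buf, 0, n);
                    }
                }
                zos.closeEntry();
            }
        }
    }
}
```

From a servlet you could write the ZipOutputStream directly to the response's output stream instead of to a file.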

0

What if you do not use an InputStream? If you can, try passing just the file path to the PdfReader constructor: new PdfReader("/somedirectory/file.").

This lets the reader work from the file on disk instead of buffering it in memory.

0

The code above creates the PdfContentByte object (cb) inside the loop. Moving it outside the loop can solve the problem. I used the same code in my application to stitch 13,000 separate PDF files into one PDF without any issues.

0
 public class PdfUtils {

     public static void concatFiles(File file1, File file2, File fileOutput) throws Exception {
         List<File> islist = new ArrayList<File>();
         islist.add(file1);
         islist.add(file2);
         concatFiles(islist, fileOutput);
     }

     public static void concatFiles(List<File> filelist, File fileOutput) throws Exception {
         if (filelist.size() > 0) {
             PdfReader reader = new PdfReader(new FileInputStream(filelist.get(0)));
             Document document = new Document(reader.getPageSizeWithRotation(1));
             PdfCopy cp = new PdfCopy(document, new FileOutputStream(fileOutput));
             document.open();
             for (File file : filelist) {
                 PdfReader r = new PdfReader(new FileInputStream(file));
                 for (int k = 1; k <= r.getNumberOfPages(); ++k) {
                     cp.addPage(cp.getImportedPage(r, k));
                 }
                 cp.freeReader(r);
             }
             cp.close();
             document.close();
         } else {
             throw new Exception("The list of PDFs to concatenate is empty");
         }
     }
 }
0
