What (tf) are the secrets of PDF memory allocation (CGPDFDocumentRef)

For a PDF reader, I want to prepare a document by taking “screenshots” of each page and saving them to disk. The first approach is

    CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((CFURLRef) someURL);
    for (int i = 1; i <= pageCount; i++) {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        CGPDFPageRef page = CGPDFDocumentGetPage(document, i);
        ... // getting + manipulating graphics context etc.
        ...
        CGContextDrawPDFPage(context, page);
        ...
        UIImage *resultingImage = UIGraphicsGetImageFromCurrentImageContext();
        ... // saving the image to disc
        [pool drain];
    }
    CGPDFDocumentRelease(document);

This uses a lot of memory, which does not seem to be released after the first run of the loop (preparation of the first document), although later runs leave no additional unreleased memory:

    MEMORY BEFORE:          6 MB
    MEMORY DURING 1ST DOC: 40 MB
    MEMORY AFTER 1ST DOC:  25 MB
    MEMORY DURING 2ND DOC: 40 MB
    MEMORY AFTER 2ND DOC:  25 MB
    ....

Changing the code to

    for (int i = 1; i <= pageCount; i++) {
        CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((CFURLRef) someURL);
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        CGPDFPageRef page = CGPDFDocumentGetPage(document, i);
        ... // getting + manipulating graphics context etc.
        ...
        CGContextDrawPDFPage(context, page);
        ...
        UIImage *resultingImage = UIGraphicsGetImageFromCurrentImageContext();
        ... // saving the image to disc
        CGPDFDocumentRelease(document);
        [pool drain];
    }

changes memory usage to

    MEMORY BEFORE:         6 MB
    MEMORY DURING 1ST DOC: 9 MB
    MEMORY AFTER 1ST DOC:  7 MB
    MEMORY DURING 2ND DOC: 9 MB
    MEMORY AFTER 2ND DOC:  7 MB
    ....

but is obviously a step backwards in performance.

When I start reading a PDF (later in time, in a different way), in the first case no additional memory is allocated (it stays at 25 MB), and in the second case memory goes up to 20 MB (from 7).

In both cases, when I remove the line CGContextDrawPDFPage(context, page); memory stays (almost) constant at 6 MB during and after all document preparation.

Can anyone explain what is going on there?

1 answer

CGPDFDocument caches quite aggressively, and you have very little control over this, apart from releasing the document and reloading it from disk, as you did.

The reason you don't see many allocations when you remove the call to CGContextDrawPDFPage is that Quartz loads page resources lazily. When you simply call CGPDFDocumentGetPage, all that happens is that it loads some basic metadata, such as bounding rectangles and annotations (very small in memory).

Fonts, images, etc. are only loaded when you actually draw the page, but they are then kept for a relatively long time in an internal cache. This is done to speed up rendering, since page resources are often shared between multiple pages. In addition, it is quite common to render a page several times (for example, when zooming). You will notice that rendering a page is much faster the second time.
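If reopening the document for every page is too slow and keeping it open for the whole run uses too much memory, one possible middle ground is to release and reopen the document every few pages. The following is only a sketch under assumptions: RenderPagesInBatches, kPagesPerBatch and the handler block are hypothetical names that do not appear in the question, the batch size of 10 is arbitrary, ARC is assumed, and the drawing code ignores page rotation and crop boxes. Whether releasing the document actually frees the cached resources is based on the measurements in the question, not on documented behaviour.

```objc
#import <UIKit/UIKit.h>

// Hypothetical tuning knob: pages to draw before dropping the document (and, presumably, its cache).
static const size_t kPagesPerBatch = 10;

// Hypothetical helper: renders each page of the PDF at `url` and hands the image to `handler`.
static void RenderPagesInBatches(NSURL *url, void (^handler)(UIImage *image, size_t pageNumber)) {
    CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((__bridge CFURLRef)url);
    if (document == NULL) return;
    size_t pageCount = CGPDFDocumentGetNumberOfPages(document);

    for (size_t i = 1; i <= pageCount; i++) {
        @autoreleasepool {
            // Cheap: only page metadata is loaded here.
            CGPDFPageRef page = CGPDFDocumentGetPage(document, i);
            if (page != NULL) {
                CGRect box = CGPDFPageGetBoxRect(page, kCGPDFMediaBox);
                UIGraphicsBeginImageContextWithOptions(box.size, YES, 1.0);
                CGContextRef context = UIGraphicsGetCurrentContext();

                // White background, then flip the coordinate system for PDF drawing.
                CGContextSetFillColorWithColor(context, [UIColor whiteColor].CGColor);
                CGContextFillRect(context, CGRectMake(0, 0, box.size.width, box.size.height));
                CGContextTranslateCTM(context, 0, box.size.height);
                CGContextScaleCTM(context, 1.0, -1.0);

                // Expensive: fonts and images are loaded (and cached) during this call.
                CGContextDrawPDFPage(context, page);

                UIImage *image = UIGraphicsGetImageFromCurrentImageContext();
                UIGraphicsEndImageContext();
                if (image != nil) handler(image, i);
            }
        }

        // At a batch boundary, release the document and reopen it from disk.
        if (i % kPagesPerBatch == 0 && i < pageCount) {
            CGPDFDocumentRelease(document);
            document = CGPDFDocumentCreateWithURL((__bridge CFURLRef)url);
            if (document == NULL) return;
        }
    }
    CGPDFDocumentRelease(document);
}
```

A larger batch size keeps more of the speed benefit of shared resources, while a smaller one keeps peak memory closer to the second set of measurements in the question; the right value would have to be found by profiling.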
