Reading PDF with C

I want to be able to read the contents of PDF files. I need to do this in C on Linux.

The closest I could find is Haru, but I think Haru can only create PDF files and cannot read them (not 100% sure).

PS: I only need the plain text from the PDF.

+4
3 answers

Take a look at libpoppler. I have never used it to extract text, only to query PDF attributes, but it is quite easy to use.

+4

How thoroughly do you need to parse them? Simple string extraction should be relatively easy; fully accurate rendering is harder. Take a look at the source for evince or ghostscript.

This is for C++, but it can be a good starting point for understanding the structure of a PDF: http://www.codeproject.com/KB/cpp/ExtractPDFText.aspx (sorry, wrong link earlier)
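To give an idea of what "simple string extraction" means in plain C, here is a rough sketch that pulls the literal strings "( ... )" out of a page content stream. It assumes the stream has already been located and decompressed (real PDFs usually compress streams with FlateDecode, so zlib would be needed first), and it ignores hex strings, font encodings and text positioning. `extract_literal_strings` is a hypothetical name, not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Copies the literal strings "( ... )" found in an uncompressed PDF content
 * stream into out, separated by single spaces. Returns the number of bytes
 * written, not counting the terminating NUL. */
size_t extract_literal_strings(const char *stream, size_t len,
                               char *out, size_t out_size)
{
    size_t n = 0;
    int depth = 0;                  /* nesting level of unescaped parentheses */

    if (out_size == 0)
        return 0;

    for (size_t i = 0; i < len && n + 1 < out_size; i++) {
        char c = stream[i];

        if (depth == 0) {
            if (c == '(')
                depth = 1;          /* a string starts; skip the delimiter */
            continue;               /* operators and operands are ignored */
        }

        if (c == '\\' && i + 1 < len) {
            /* Escape: keep the next byte as-is. Octal escapes and
             * \n, \r, \t are not decoded in this sketch. */
            out[n++] = stream[++i];
        } else if (c == '(') {
            depth++;                /* balanced parentheses may nest */
            out[n++] = c;
        } else if (c == ')') {
            if (--depth == 0)
                out[n++] = ' ';     /* string ended; add a separator */
            else
                out[n++] = c;
        } else {
            out[n++] = c;
        }
    }

    /* Drop the trailing separator, if any, and terminate. */
    if (n > 0 && out[n - 1] == ' ')
        n--;
    out[n] = '\0';
    return n;
}
```

Given the stream "BT /F1 12 Tf (Hello) Tj (World) Tj ET", this writes "Hello World" to the output buffer. Anything beyond such simple cases is where a real library starts to pay off.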

+1

Another possibility, although I have never used it, is VersyPDF. It claims to allow editing PDF files ... http://versypdf.sybrex-systems-ltd.qarchive.org/

0