How do I read a PDF file programmatically and convert it to audio (.mp3 format)?

I want to parse a PDF file from my C# application and create an audio file from it. How can I do it?

I am especially looking for a good PDF text library or a way to extract the text from a PDF.

+5
8 answers

As input, a tagged PDF document is preferable. A tagged PDF contains tags that mark up the logical structure of the document (an ordinary PDF usually contains only visual information).

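The PDF can then be converted to DAISY, i.e. an XML-based format.

The Daisy XML can then be processed by a Daisy reader or a similar tool to produce MP3 audio.

For the PDF-to-Daisy conversion, see:

PDF to DAISY/NIMAS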

+5

Festival can handle the speech synthesis. As for a pdf api ...

+4

Use the Speech SDK from Microsoft.

+2

First, extract the text from the .pdf. PDF libraries are easy to find with Google.

For the speech synthesis itself, consider Cepstral TTS.

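Whichever TTS engine you choose, it is worth reading the W3C speech synthesis specification:

http://www.w3.org/TR/speech-synthesis/

It defines SSML, the Speech Synthesis Markup Language.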
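
As a rough illustration of what SSML input looks like, here is a minimal sketch that wraps plain-text paragraphs in an SSML 1.0 document. It is written in Python for brevity and uses only the standard library. The function name `paragraphs_to_ssml` and the 500 ms pause are assumptions made for the example, and whether a particular TTS engine accepts every element is something to check in that engine's documentation.

```python
import xml.etree.ElementTree as ET

# Namespace given by the W3C SSML 1.0 recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def paragraphs_to_ssml(paragraphs, lang="en-US"):
    """Wrap plain-text paragraphs in a minimal SSML document (hypothetical helper)."""
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    for text in paragraphs:
        p = ET.SubElement(speak, "p")
        p.text = text  # ElementTree escapes &, < and > on serialization
        # Assumed half-second pause after each paragraph.
        ET.SubElement(speak, "break", {"time": "500ms"})
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(paragraphs_to_ssml(["Chapter one.", "Fish & chips were served."]))
```

The resulting string can be handed to any engine that accepts SSML instead of plain text.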

Finally, have the TTS engine render the speech to an audio file and encode it as mp3.

+2

If the input is a PDF, doesn't Acrobat already have a built-in option for reading the document aloud?

+2

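There are two steps. First, extract the text from the pdf; second, run it through a text-to-speech engine. Then save the result as mp3.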
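
The extract-then-synthesize flow could be sketched by chaining existing command-line tools. This example is an assumption, not part of the answer: it presumes that `pdftotext` (from Poppler), `espeak`, and `lame` are installed and on the PATH, and the helper names `build_commands` and `convert` are hypothetical. Python is used only as glue; the same three steps could be started from a C# application as external processes.

```python
import subprocess
from pathlib import Path


def build_commands(pdf_path, workdir="."):
    """Return the three command lines for PDF -> text -> WAV -> MP3."""
    pdf = Path(pdf_path)
    out = Path(workdir)
    txt = out / (pdf.stem + ".txt")
    wav = out / (pdf.stem + ".wav")
    mp3 = out / (pdf.stem + ".mp3")
    return [
        ["pdftotext", str(pdf), str(txt)],          # extract the text
        ["espeak", "-f", str(txt), "-w", str(wav)],  # synthesize speech to WAV
        ["lame", str(wav), str(mp3)],                # encode WAV as MP3
    ]


def convert(pdf_path, workdir="."):
    """Run each step in order, stopping at the first failure."""
    for command in build_commands(pdf_path, workdir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only prints the commands; call convert() to actually run them.
    for command in build_commands("book.pdf", "out"):
        print(" ".join(command))
```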

0

On Mac OS X, extract the text from the pdf and pass it to the "say" command.

0

If the text cannot be extracted directly, you will need to run OCR on the PDF.

The most difficult task is probably working with various PDF layouts (columns, lines, embedded graphics, musical notes, URLs, etc.), which can confuse the text recognition process.
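
Some of the simpler layout artifacts can be cleaned up after extraction and before the text reaches the speech engine. The sketch below is a minimal example under stated assumptions: the helper name `clean_for_speech` is hypothetical, paragraphs are assumed to be separated by blank lines, and URLs are replaced with the spoken word "link". It does not attempt to repair column order or embedded graphics, which need layout-aware extraction.

```python
import re


def clean_for_speech(raw):
    """Turn raw extracted text into a list of speakable paragraphs."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Replace URLs, which speech engines tend to spell out awkwardly.
    text = re.sub(r"https?://\S+", "link", text)
    # Blank lines separate paragraphs; single newlines are just line wraps.
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", text)]
    return [p for p in paragraphs if p]


if __name__ == "__main__":
    sample = "Speech syn-\nthesis is\nfun.\n\nSee http://example.com/x for more."
    for paragraph in clean_for_speech(sample):
        print(paragraph)
```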

However, in general (unless this is meant as a learning exercise), it is certainly easier to just use an existing software solution.

0