What technology is used for A.nnotate.com?

I would like to know how services like A.nnotate.com, Scribd, and Google Docs convert PDF, .doc, and other document formats to HTML, and how the annotation system works.

1 answer

A.nnotate.com converts PDF pages to PNG images on the server, at a given zoom level, using xpdf. These images are what the browser displays.
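As a rough illustration of that rendering step, here is a minimal Python sketch that builds a command line for `pdftopng`, the PNG renderer shipped with Xpdf 4. The answer only says "xpdf", so the choice of `pdftopng`, the mapping of zoom 1.0 to 72 dpi, and the function names are all assumptions, not a description of the site's actual setup.

```python
import subprocess
from typing import List, Optional


def build_render_command(pdf_path: str, out_root: str, zoom: float = 1.0,
                         first_page: Optional[int] = None,
                         last_page: Optional[int] = None) -> List[str]:
    """Build a pdftopng command that renders pages at the given zoom.

    Assumption: zoom 1.0 corresponds to 72 dpi (one pixel per PDF point).
    """
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    dpi = round(72 * zoom)
    cmd = ["pdftopng", "-r", str(dpi)]
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    # pdftopng writes one PNG per page, named from out_root plus a page number.
    cmd += [pdf_path, out_root]
    return cmd


def render_pages(pdf_path: str, out_root: str, zoom: float = 1.0) -> None:
    """Run the renderer; requires pdftopng to be installed and on PATH."""
    subprocess.run(build_render_command(pdf_path, out_root, zoom), check=True)
```

Rendering once per zoom level and caching the PNG files would let the server hand out plain static images afterwards.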

Text highlighting works by extracting the position of each word from the PDF and then placing a transparent overlay on top of the page image, built from absolutely positioned HTML divs, one over each word. Annotations then use an Ajax GUI to attach notes to the selected text.
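The overlay idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `WordBox` type, the function names, and the CSS class names are hypothetical, and the word coordinates are assumed to be in PDF points with the origin at the top-left of the page, so that multiplying by the zoom factor gives pixel positions on the rendered image.

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass
class WordBox:
    """Bounding box of one word (hypothetical type), in PDF points, top-left origin."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def word_div(box: WordBox, zoom: float, index: int) -> str:
    """Return an empty, absolutely positioned div covering one word."""
    left = round(box.x0 * zoom)
    top = round(box.y0 * zoom)
    width = round((box.x1 - box.x0) * zoom)
    height = round((box.y1 - box.y0) * zoom)
    return (
        f'<div class="word" data-index="{index}" '
        f'title="{escape(box.text, quote=True)}" '
        f'style="position:absolute;left:{left}px;top:{top}px;'
        f'width:{width}px;height:{height}px"></div>'
    )


def page_overlay(image_url: str, boxes: List[WordBox], zoom: float) -> str:
    """Wrap the page image and its word divs in a relatively positioned container."""
    divs = "\n".join(word_div(b, zoom, i) for i, b in enumerate(boxes))
    return (
        '<div class="page" style="position:relative">\n'
        f'<img src="{escape(image_url, quote=True)}" alt="">\n'
        f'{divs}\n'
        '</div>'
    )
```

Because the divs carry no background, the page image shows through. A script in the browser could then style the divs under a selection to highlight them and use `data-index` to tell the server which words a note is attached to.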

Other formats (MS Word, PowerPoint, etc.) are first converted to PDF using OpenOffice, and then to page images and text overlays in the same way as PDF files.
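A headless office suite can do this conversion from the command line. The sketch below uses the `soffice --headless --convert-to pdf` invocation documented for LibreOffice, the OpenOffice descendant. Whether the service calls OpenOffice this way is an assumption, and the function names are hypothetical.

```python
import subprocess
from typing import List


def build_convert_command(doc_path: str, out_dir: str,
                          binary: str = "soffice") -> List[str]:
    """Build a headless office command that converts one document to PDF."""
    return [binary, "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, doc_path]


def convert_to_pdf(doc_path: str, out_dir: str) -> None:
    """Run the conversion; requires an office suite with soffice on PATH."""
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
```

The resulting PDF would then go through the same image rendering and word overlay steps as an uploaded PDF.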

I think other sites that present documents as HTML do something similar to render PDF files (page images plus a word overlay of transparent divs). An alternative trick is to convert the fonts embedded in the PDF to web fonts that can be loaded from CSS, render the text itself in absolutely positioned divs, and extract and position the images as well.
