I am looking for a Java library that can do the following:
parse emails in *.eml or *.msg format, extract attachments of the types DOC, DOCX, JPEG, PNG, GIF, TXT, XLS, XLSX, PPT, PDF, and convert the attached files to TIFF format.
It can be either open source or commercial. As an alternative, I'm looking for Linux command line tools that do this. We have already tried OpenOffice, but it has too many problems with some document formats.
UPDATE:
What I have discovered as a result of research so far:
For parsing email and extracting attachments, JavaMail (http://www.oracle.com/technetwork/java/javamail/index.html) is a good choice.
For converting documents, JODConverter (http://code.google.com/p/jodconverter/) is a convenient library. However, it is only a wrapper around OpenOffice, so if OpenOffice has trouble converting a document (and I often have problems with OpenOffice), JODConverter will have the same trouble.
In conclusion, I have not (so far) been lucky enough to find a document conversion library implemented in pure Java that handles all common document formats, either open source or commercial. This seems to be a real market gap.
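One partial exception, for the image attachments only (JPEG, PNG, GIF): as far as I can tell, the JDK's own javax.imageio can write TIFF from Java 9 onward, so those types may not need an external converter at all. Below is a minimal sketch under that assumption. The class name ImageToTiff and the method toTiff are my own illustrative names, not part of any library, and this does nothing for the office formats or PDF.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper: re-encodes an image attachment as TIFF using only the JDK.
// Assumes Java 9 or later, where ImageIO includes a TIFF writer.
public class ImageToTiff {

    /** Decodes a JPEG, PNG, or GIF attachment and re-encodes it as TIFF. */
    public static byte[] toTiff(byte[] imageBytes) throws IOException {
        // ImageIO selects a reader by inspecting the stream content.
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageBytes));
        if (image == null) {
            throw new IOException("No ImageIO reader for this attachment");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // write() returns false when no TIFF writer is registered (e.g. Java 8 or older).
        if (!ImageIO.write(image, "tiff", out)) {
            throw new IOException("No TIFF writer available in this JRE");
        }
        return out.toByteArray();
    }
}
```

The bytes passed in would be the raw content of an attachment extracted with JavaMail. Multi-page GIFs and any TIFF compression settings are not handled here; only the first frame is converted with the writer's defaults.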