This is from lucid :
InputStream input = new FileInputStream(new File(resourceLocation));
ContentHandler textHandler = new BodyContentHandler();
Metadata metadata = new Metadata();
PDFParser parser = new PDFParser();
parser.parse(input, textHandler, metadata);
input.close();
out.println("Title: " + metadata.get("title"));
out.println("Author: " + metadata.get("Author"));
out.println("content: " + textHandler.toString());
Instead of creating, PDFParseryou can use Tika AutoDetectParserto automatically process type file types:
Parser parser = new AutoDetectParser();
source
share