It depends on which Word document format you are targeting. It could be the classic binary .doc format (Word 95, for example), Open XML, RTF, etc.
RTF is probably the easiest to handle. Open XML is standardized, so its documentation is publicly available. The binary .doc format has been reverse-engineered, so it is well understood too, and I think there is in fact a Java library for processing it.
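Since everything hinges on which format you actually have, a first step could be to sniff the file's leading bytes. The following is only a minimal sketch, not a full parser: the function names `detect_word_format` and `detect_file` are made up for illustration, and the check relies on the well-known signatures of the container formats (RTF starts with `{\rtf`, Open XML files are ZIP archives, binary .doc files are OLE2 compound files). A ZIP or OLE2 signature does not prove the file is a Word document, so treat the result as a hint.

```python
# Leading-byte signatures of the three containers discussed above.
RTF_MAGIC = b"{\\rtf"
ZIP_MAGIC = b"PK\x03\x04"                          # Open XML (.docx) is a ZIP archive
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # binary .doc is an OLE2 compound file


def detect_word_format(header: bytes) -> str:
    """Guess the container format from the first bytes of a file.

    Returns "rtf", "zip" (possibly Open XML), "ole2" (possibly binary .doc),
    or "unknown". This is a hint only, not a validation of the content.
    """
    if header.startswith(RTF_MAGIC):
        return "rtf"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def detect_file(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return detect_word_format(f.read(8))
```

Once you know the container, you can pick the right tool: a plain text parser for RTF, a ZIP plus XML reader for Open XML, or a dedicated library for the binary format.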
The exact answer depends on your specific needs ...