You can use a utility that converts HTML to plain text in Java (search for one; for example, this one) by stripping the tags and decoding special HTML characters. However, this will not give you everything you need: formatting (such as lists) and links in particular are lost.
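To illustrate what such a tag-stripping utility does, and what it loses, here is a minimal sketch. The class `HtmlStripper` and its `toText` method are hypothetical names made up for this example, not part of any library, and the regex-based approach is only an assumption about how a simple converter might work. It handles a handful of common entities and will be fooled by comments, scripts, or a `>` inside an attribute value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only: a naive HTML-to-text stripper.
final class HtmlStripper {
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern NUMERIC_ENTITY = Pattern.compile("&#(\\d{1,7});");

    static String toText(String html) {
        // Replace every tag with a space so adjacent words do not run together.
        String text = TAG.matcher(html).replaceAll(" ");

        // Decode decimal numeric entities such as &#65;.
        Matcher m = NUMERIC_ENTITY.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group(); // leave invalid references untouched
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);

        // Decode a few common named entities (far from the full HTML list).
        text = sb.toString()
                .replace("&nbsp;", " ")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;", not "<"

        // Collapse runs of whitespace left behind by the removed tags.
        return text.replaceAll("\\s+", " ").trim();
    }
}
```

Calling `HtmlStripper.toText("<ul><li>one</li><li>two</li></ul>")` returns `one two`: the words survive, but nothing indicates that they were list items, and the `href` of any link disappears together with its tag.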
Another option is to write an XSLT stylesheet that converts XHTML (which must be well-formed) to text, and to run it with an XSLT processor such as Xalan-J or Saxon. This is a fairly simple XSLT exercise if your requirements are simple (for example, if you don't care about CSS issues).
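A rough sketch of the XSLT approach, using the standard JAXP API (`javax.xml.transform`), which runs the stylesheet on whichever processor it finds (the JDK's built-in one, or Xalan-J or Saxon if they are on the classpath). The class name `XhtmlToText`, the stylesheet, and the output conventions (`* ` for list items, the URL in parentheses after link text) are assumptions chosen for this example; a real stylesheet would need more templates.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration only.
final class XhtmlToText {
    // Illustrative XSLT 1.0 stylesheet: drops <head>, ends paragraphs with a
    // newline, prefixes list items with "* " and appends link targets.
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "    xmlns:h='http://www.w3.org/1999/xhtml'>"
        + "<xsl:output method='text'/>"
        + "<xsl:strip-space elements='*'/>"
        + "<xsl:template match='h:head'/>"
        + "<xsl:template match='h:p'>"
        + "<xsl:apply-templates/><xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "<xsl:template match='h:li'>"
        + "<xsl:text>* </xsl:text><xsl:apply-templates/><xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "<xsl:template match='h:a[@href]'>"
        + "<xsl:apply-templates/>"
        + "<xsl:text> (</xsl:text><xsl:value-of select='@href'/><xsl:text>)</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String convert(String xhtml) {
        try {
            // Compile the stylesheet, then apply it to the XHTML input.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Input that is not well-formed XML ends up here.
            throw new IllegalArgumentException("Could not transform input", e);
        }
    }
}
```

With this stylesheet each list item comes out on its own line prefixed with `* `, and link text is followed by its URL in parentheses, so the structure that plain tag stripping discards is preserved. The input must declare the XHTML namespace (`http://www.w3.org/1999/xhtml`), because the templates match on it.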