I am looking for documentation (official documentation, if possible) for the TagSoup and jTidy libraries.
I want to use these libraries to manipulate HTML "tag soup" files that contain XML tags from different namespaces mixed in with the HTML tags (HTML, XHTML or HTML5).
I have tested HTMLCleaner, NekoHTML and Jericho, but for jTidy and TagSoup I have not found any documentation beyond the simplest examples, which only show how to clean up a file.
I need documentation on how to manipulate content, replace tags, retrieve information, etc.
Thanks.
Note: After evaluating all the options, I ended up using StAX / Woodstox.
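For anyone landing here with the same need, the following is a minimal sketch of what retrieving information and replacing tags can look like with StAX. It is not the exact code I used. It relies only on the standard `javax.xml.stream` API (Woodstox can be dropped onto the classpath as the implementation), and it assumes the input is already well-formed XML/XHTML, since StAX does not repair tag soup by itself. The class name `StaxSketch`, the method names, and the `urn:example:custom` namespace are hypothetical and only there for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StaxSketch {

    // Retrieve information: collect the text of every element in the given namespace.
    // Assumes those elements contain text only (a requirement of getElementText()).
    static List<String> collectText(String xml, String namespaceUri) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> texts = new ArrayList<>();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && namespaceUri.equals(reader.getNamespaceURI())) {
                texts.add(reader.getElementText());
            }
        }
        reader.close();
        return texts;
    }

    // Replace tags: copy the document event by event, renaming elements whose
    // local name is `from` to `to` while keeping their namespace, prefix and attributes.
    static String renameElements(String xml, String from, String to) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && event.asStartElement().getName().getLocalPart().equals(from)) {
                StartElement start = event.asStartElement();
                QName old = start.getName();
                event = events.createStartElement(
                        new QName(old.getNamespaceURI(), to, old.getPrefix()),
                        start.getAttributes(), start.getNamespaces());
            } else if (event.isEndElement()
                    && event.asEndElement().getName().getLocalPart().equals(from)) {
                EndElement end = event.asEndElement();
                QName old = end.getName();
                event = events.createEndElement(
                        new QName(old.getNamespaceURI(), to, old.getPrefix()),
                        end.getNamespaces());
            }
            writer.add(event);
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Hypothetical sample: HTML tags mixed with tags from a custom namespace.
        String xml = "<html xmlns:x=\"urn:example:custom\"><body><b>bold</b>"
                + "<x:note>first</x:note><p><x:note>second</x:note></p></body></html>";
        System.out.println(collectText(xml, "urn:example:custom"));
        System.out.println(renameElements(xml, "b", "strong"));
    }
}
```

The cursor API (`XMLStreamReader`) is enough for read-only extraction, while the event API (`XMLEventReader` / `XMLEventWriter`) makes it easier to pass most of the document through unchanged and rewrite only the elements of interest.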