How can I navigate the HTML tree using Jsoup?

Question


I suspect this has been asked before, but I could not find anything.

Starting from a Document in Jsoup, how can I visit every element of the HTML content?

I read the documentation and considered the childNodes() method, but it returns only the nodes one level below (which I understand). I could write my own recursion on top of it, but I would like to know whether there is a more suitable, built-in way to do this.

+7

java jsoup traversal

Renato dinhani Apr 11 '12 at 18:07


3 answers

1) You can select all elements of the document using the * selector.

 Elements elements = document.body().select("*");

2) To get the text that belongs to each element itself (not to its children), use the Element.ownText() method.

 for (Element element : elements) {
     System.out.println(element.ownText());
 }

3) To change the content of an individual element, use Element.html(String html). (It clears any existing inner HTML in the element and replaces it with the parsed HTML.)

 element.html(strHtml);
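
Putting the three steps together, here is a minimal self-contained sketch. The class name SelectAllExample, the helper names ownTexts and replaceInner, and the sample markup are made up for illustration; selectFirst assumes a reasonably recent jsoup release.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class SelectAllExample {

    // Collects "tag: ownText" for every element matched by "*" under <body>.
    static List<String> ownTexts(String html) {
        Document document = Jsoup.parse(html);
        Elements elements = document.body().select("*");
        List<String> result = new ArrayList<>();
        for (Element element : elements) {
            result.add(element.tagName() + ": " + element.ownText());
        }
        return result;
    }

    // Replaces the inner HTML of the first element matching cssQuery
    // and returns the resulting body markup.
    static String replaceInner(String html, String cssQuery, String newHtml) {
        Document document = Jsoup.parse(html);
        Element target = document.selectFirst(cssQuery);
        if (target != null) {
            target.html(newHtml); // drops the old children, parses newHtml in their place
        }
        return document.body().html();
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, not taken from the question.
        String html = "<div><p>First</p><p>Second</p></div>";
        for (String line : ownTexts(html)) {
            System.out.println(line);
        }
        System.out.println(replaceInner(html, "p", "<b>Changed</b>"));
    }
}
```

Note that select("*") called on the body can also match the body element itself, so expect an entry for it in the result.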

Hope this helps you. Thanks!

0

Gaurav darji Jun 12 '16 at 15:43


You can use the following code:

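 import org.jsoup.Jsoup;
 import org.jsoup.nodes.Document;
 import org.jsoup.nodes.Element;
 import org.jsoup.select.Elements;

 public class JsoupDepthFirst {

     // Returns a comma-separated list of tag names; each tag is written
     // once when it is entered and once more when it is left.
     private static String htmlTags(Document doc) {
         StringBuilder sb = new StringBuilder();
         htmlTags(doc.children(), sb);
         return sb.toString();
     }

     // Depth-first recursion over the child elements.
     private static void htmlTags(Elements elements, StringBuilder sb) {
         for (Element el : elements) {
             if (sb.length() > 0) {
                 sb.append(",");
             }
             sb.append(el.nodeName());
             htmlTags(el.children(), sb);
             sb.append(",").append(el.nodeName());
         }
     }

     public static void main(String... args) {
         String s = "<html><head>this is head </head><body>this is body</body></html>";
         Document doc = Jsoup.parse(s);
         System.out.println(htmlTags(doc));
     }
 }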

-1

Shradha shiwani Jan 30 '16 at 1:12


Vivien barousse · Accepted Answer · 2012-04-11T18:09:35+0000

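From a Document (or any other Node), you can use the traverse(NodeVisitor) method. It walks the tree depth-first, calling head() when a node is entered and tail() when it is left.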

For example:

 document.traverse(new NodeVisitor() {
     public void head(Node node, int depth) {
         System.out.println("Entering tag: " + node.nodeName());
     }
     public void tail(Node node, int depth) {
         System.out.println("Exiting tag: " + node.nodeName());
     }
 });
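
For a runnable variant, the sketch below uses the depth argument to build an indented outline of the element tags under the body. The class name TraverseOutline, the outline helper, and the sample markup are hypothetical; the instanceof check is there because traverse also visits text nodes and comments, not only elements.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.select.NodeVisitor;

public class TraverseOutline {

    // Builds one line per element, indented two spaces per depth level.
    static String outline(String html) {
        Document document = Jsoup.parse(html);
        StringBuilder sb = new StringBuilder();
        document.body().traverse(new NodeVisitor() {
            @Override
            public void head(Node node, int depth) {
                if (node instanceof Element) { // skip text nodes, comments, etc.
                    for (int i = 0; i < depth; i++) {
                        sb.append("  ");
                    }
                    sb.append(node.nodeName()).append('\n');
                }
            }

            @Override
            public void tail(Node node, int depth) {
                // Nothing to do when leaving a node in this outline.
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, not taken from the question.
        System.out.print(outline("<div><p>Hi</p><span>there</span></div>"));
    }
}
```

The depth passed to head() starts at 0 for the node traverse() was called on, so here the body sits at the left margin.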
