Jsoup node hash code collisions when traversing the DOM tree

Question

I am using jsoup in Java to build HTML DOM trees, and I rely on Node.hashCode() for the nodes. However, while traversing a DOM tree with the following code, I found many hash code collisions:

doc.traverse(new NodeVisitor() {

    @Override
    public void head(Node node, int depth) {
        // Called when the node is first visited, before its children.
        System.out.println("node hash: " + node.hashCode());

        /* some other operations */
    }

    @Override
    public void tail(Node node, int depth) {
        // Called after all of the node's children have been visited.

        /* some code */
    }
});

When I run this, I see many identical hash codes even within the first few lines of output.

The hash codes are fairly large values, so I did not expect this many collisions. I am using jsoup-1.8.1. Any input would be greatly appreciated, thanks.

java dom html jsoup hash

J freebird Mar 10 '15 at 17:54

1 answer

JonasCz · Accepted Answer · 2015-03-10T18:31:51+0000

Note: This bug was fixed in jSoup 1.8.2, so my answer is no longer relevant for that version and later.

This is a bug in jSoup. Here is the relevant source code:

@Override
public int hashCode() {
   int result = parentNode != null ? parentNode.hashCode() : 0;
   // not children, or will block stack as they go back up to parent)
   result = 31 * result + (attributes != null ? attributes.hashCode() : 0);
   return result;
}

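In Java terms, this means a node's hash code depends only on its parent and its attributes, so two nodes with the same parent and the same attributes get the same hash code. (See also the comment by @alkis.)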
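To make the effect of the quoted method concrete, here is a minimal sketch that needs no jsoup at all. SketchNode is a hypothetical stand-in, not the real org.jsoup.nodes.Node, and a plain HashMap is assumed in place of jsoup's attribute collection; only the shape of hashCode() is taken from the code above.

```java
import java.util.HashMap;
import java.util.Map;

public class HashCollisionSketch {

    /** Hypothetical stand-in for a DOM node; NOT the real jsoup class. */
    static class SketchNode {
        final SketchNode parent;
        final Map<String, String> attributes = new HashMap<>();
        final String text; // deliberately ignored by hashCode(), like node content in the quoted method

        SketchNode(SketchNode parent, String text) {
            this.parent = parent;
            this.text = text;
        }

        SketchNode attr(String key, String value) {
            attributes.put(key, value);
            return this;
        }

        // Same shape as the quoted method: parent hash first, then attribute hash.
        // equals() is left untouched because only the hash values matter here.
        @Override
        public int hashCode() {
            int result = parent != null ? parent.hashCode() : 0;
            result = 31 * result + attributes.hashCode();
            return result;
        }
    }

    public static void main(String[] args) {
        SketchNode body = new SketchNode(null, "");
        SketchNode div1 = new SketchNode(body, "TODO: write content").attr("style", "blah");
        SketchNode div2 = new SketchNode(body, "Nothing here").attr("style", "blah");
        SketchNode p = new SketchNode(body, "Empty").attr("style", "test");

        // Same parent and same attributes: the text never enters the hash.
        System.out.println(div1.hashCode() == div2.hashCode()); // true
        // Same parent but a different attribute value.
        System.out.println(div1.hashCode() == p.hashCode());    // false
    }
}
```

The two div stand-ins differ only in their text, which the hash never looks at, so they collide. The p stand-in differs in an attribute value and therefore hashes differently.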

Example: take the following HTML:

<html>
    <head>
    </head>
    <body>
        <div style="blah">TODO: write content</div>
        <div style="blah">Nothing here</div>
        <p style="test">Empty</p>
        <p style="nothing">Empty</p>
    </body>
</html>

Run it through this code:

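import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String html = "..."; // replace "..." with the HTML posted above

Document doc = Jsoup.parse(html);

// Select every element that has a style attribute and print its hash code.
Elements elements = doc.select("[style]");
for (Element e : elements) {
    System.out.println(e.hashCode());
}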

Output:

-148184373
-148184373
-1050420242
2013043377

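As you can see, the first two elements, which have the same parent and the same attributes, have the same hash code, even though they are different elements.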
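If the traversal only needs a distinct key per node object on an affected version, one possible workaround (not part of the original answer) is to key on object identity instead of hashCode(). The sketch below uses java.util.IdentityHashMap from the standard library. LookAlikeNode and numberByIdentity are hypothetical names introduced for illustration, and the node type is a stand-in rather than a jsoup class.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityKeySketch {

    /** Hypothetical node type whose hashCode()/equals() treat look-alike nodes as the same. */
    static class LookAlikeNode {
        final String style;

        LookAlikeNode(String style) {
            this.style = style;
        }

        @Override
        public int hashCode() {
            return style.hashCode();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof LookAlikeNode && ((LookAlikeNode) o).style.equals(style);
        }
    }

    /** Gives each distinct object its own sequence number, ignoring hashCode() and equals(). */
    static Map<Object, Integer> numberByIdentity(Object... nodes) {
        Map<Object, Integer> ids = new IdentityHashMap<>();
        for (Object node : nodes) {
            // IdentityHashMap compares keys with ==, so look-alike nodes stay separate.
            ids.putIfAbsent(node, ids.size());
        }
        return ids;
    }

    public static void main(String[] args) {
        LookAlikeNode first = new LookAlikeNode("blah");
        LookAlikeNode second = new LookAlikeNode("blah");

        Map<Object, Integer> ids = numberByIdentity(first, second, first);

        System.out.println(first.hashCode() == second.hashCode()); // true
        System.out.println(ids.size());                            // 2
        System.out.println(ids.get(first) + " " + ids.get(second)); // 0 1
    }
}
```

The same idea would apply inside a NodeVisitor's head() callback by putting each visited node into such a map. Note that identity keys are only meaningful within a single parsed document held in memory; they say nothing about whether two nodes have the same content.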

