HashSet Collisions in Java

I have a program for my Java class where I want to use HashSets to compare a directory of text documents. Essentially, my plan is to create a HashSet of strings for each document, then add two of those HashSets together into one HashSet and find the number of six-word sequences they have in common.

My question is: do I need to check for and handle hash collisions manually, or does Java do that for me?

+6
2 answers

Java's hash maps and hash sets handle hash collisions automatically. That is why it is important to override both the equals and hashCode methods on your own classes: sets use the two together to tell duplicate entries from unique ones.

It is also worth noting that hash collisions cost performance, since several objects end up stored under the same hash and each lookup has to compare against all of them.

    import java.util.Objects;

    public class MyObject {
        private String name;

        // getters and setters
        public String getName() { return name; }

        @Override
        public int hashCode() {
            // Derive the hash from the same fields that equals() compares.
            return Objects.hashCode(name);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof MyObject)) return false;
            return Objects.equals(name, ((MyObject) obj).getName());
        }
    }

Note: standard Java classes such as String already implement hashCode and equals, so you only need to do this for your own data classes.
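To see that the set sorts out collisions by itself, here is a minimal sketch. It relies on the fact that the strings "Aa" and "BB" produce the same String.hashCode() value; the class name CollisionDemo and the method collidingSet are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Builds a set from two different strings that share one hash code.
    static Set<String> collidingSet() {
        Set<String> set = new HashSet<>();
        set.add("Aa");
        set.add("BB"); // same hash code as "Aa", but not equal: kept as a second element
        set.add("Aa"); // equal to an existing element: ignored as a duplicate
        return set;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(collidingSet().size());              // 2
    }
}
```

Both strings stay in the set even though their hash codes are identical, because HashSet falls back on equals to tell them apart.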

+3

I don't think you were really asking about hash collisions, were you? The question is what happens when HashSet a and HashSet b are combined into one set, for example with a.addAll(b).

The result will contain all elements of both sets, with no duplicates. For Strings, this means you can count the number of equal Strings in the two sets as a.size() before the addAll, plus b.size(), minus a.size() after the addAll.

It does not matter if some of the strings have the same hash code but are not equal.
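The counting approach can be sketched roughly as follows. The class SharedSequences and the helpers wordSequences and countShared are hypothetical names, and splitting on whitespace is only an assumption about how the documents would be tokenized. The sketch also merges into a copy rather than into a itself, so the original set is left untouched.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SharedSequences {
    // Hypothetical helper: collects every run of n consecutive words as one string.
    static Set<String> wordSequences(String text, int n) {
        String[] words = text.trim().split("\\s+");
        Set<String> result = new HashSet<>();
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    // Counts the elements present in both sets using the size arithmetic:
    // |a| + |b| - |a merged with b|.
    static int countShared(Set<String> a, Set<String> b) {
        Set<String> merged = new HashSet<>(a); // copy so that a is not modified
        merged.addAll(b);                      // duplicates are dropped here
        return a.size() + b.size() - merged.size();
    }

    public static void main(String[] args) {
        Set<String> first = wordSequences("the quick brown fox jumps over the lazy dog", 6);
        Set<String> second = wordSequences("a quick brown fox jumps over the lazy cat", 6);
        System.out.println(countShared(first, second)); // 2
    }
}
```

If only the shared elements are needed, copying a and calling retainAll(b) on the copy gives the intersection directly, and its size is the same count.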

0

Source: https://habr.com/ru/post/927816/

