Compare two lists and remove duplicates from one

I have an object called FormObject that contains two ArrayLists - oldBooks and newBooks - both of which contain Book objects.

oldBooks is allowed to contain duplicate Book objects. newBooks is not: it must not contain duplicate Book objects within itself, and it must not contain any Book that duplicates one in the oldBooks list.

The definition of a duplicate book is complex, and I cannot override the equals method, since the definition is not universal for all uses of the Book object.

I plan to use a method of the FormObject class called removeDuplicateNewBooks that will perform the above functions.

How would you do that? My first thought was to use HashSets to eliminate duplicates, but since I cannot override equals in the Book class, that would not work.

+4
3 answers

You can use a TreeSet with a custom Comparator<Book>:

  • Build a TreeSet with a Comparator that implements the custom duplicate logic you want (compare must return 0 for two books that count as duplicates).
  • Call set.addAll(bookList).

The set now contains only unique books.
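As a rough sketch of how this could cover both requirements from the question: seed the TreeSet with oldBooks, then keep only the new books that the set accepts. The Book class, its three getters, and the "same author, title and ISBN" rule are assumptions for illustration (the real duplicate rule is not given), and the fields are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the real Book class; only these three fields are assumed.
class Book {
    private final String author;
    private final String title;
    private final String isbn;

    Book(String author, String title, String isbn) {
        this.author = author;
        this.title = title;
        this.isbn = isbn;
    }

    String getAuthor() { return author; }
    String getTitle() { return title; }
    String getIsbn() { return isbn; }
}

class TreeSetDedup {
    // Assumed duplicate rule: same author, title and ISBN. Replace with the real rule.
    static final Comparator<Book> DUPLICATE_ORDER =
            Comparator.comparing(Book::getAuthor)
                      .thenComparing(Book::getTitle)
                      .thenComparing(Book::getIsbn);

    // Returns the new books that duplicate neither an old book nor an earlier new book.
    static List<Book> removeDuplicateNewBooks(List<Book> oldBooks, List<Book> newBooks) {
        Set<Book> seen = new TreeSet<>(DUPLICATE_ORDER);
        seen.addAll(oldBooks); // duplicates among the old books simply collapse
        List<Book> unique = new ArrayList<>();
        for (Book book : newBooks) {
            // add() returns false when the comparator reports 0 against an existing element
            if (seen.add(book)) {
                unique.add(book);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<Book> oldBooks = Arrays.asList(
                new Book("Herbert", "Dune", "111"),
                new Book("Herbert", "Dune", "111"));
        List<Book> newBooks = Arrays.asList(
                new Book("Herbert", "Dune", "111"),
                new Book("Gibson", "Neuromancer", "222"),
                new Book("Gibson", "Neuromancer", "222"));
        System.out.println(removeDuplicateNewBooks(oldBooks, newBooks).size()); // prints 1
    }
}
```

Collecting into a separate list keeps the original order of newBooks, which iterating the TreeSet itself would not.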

+7

To create unique books:

Create a wrapper class around the book and define its equals / hashCode methods based on the fields of the wrapped book:

    import java.util.Arrays;

    public class Wrapper {

        private final Book book;

        public Wrapper(final Book book) {
            assert book != null;
            this.book = book;
        }

        public Book getBook() {
            return this.book;
        }

        // Two wrappers are equal when the wrapped books agree on every compared field.
        @Override
        public boolean equals(final Object other) {
            return other instanceof Wrapper
                && Arrays.equals(this.getBookInfo(), ((Wrapper) other).getBookInfo());
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(this.getBookInfo());
        }

        // The fields that define a duplicate.
        private String[] getBookInfo() {
            return new String[] {
                this.book.getAuthor(), this.book.getTitle(), this.book.getIsbn()
            };
        }
    }

EDIT: Optimized equals and hashCode and fixed hashCode bug.

Now use a Set to remove the duplicates:

    // HashSet does not keep insertion order; use LinkedHashSet if the order matters.
    Set<Wrapper> wrappers = new HashSet<Wrapper>();
    for (Book book : newBooks) {
        wrappers.add(new Wrapper(book));
    }
    newBooks.clear();
    for (Wrapper wrapper : wrappers) {
        newBooks.add(wrapper.getBook());
    }

(But, of course, the TreeSet answer with a custom comparator is more elegant, because you can use the Book class itself.)

EDIT: (removed the link to Apache Commons, because my improved equals / hashCode methods are better)
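The snippet above only removes duplicates within newBooks. A possible way to also drop books that already appear in oldBooks, while keeping the order of newBooks, is sketched below. Note that it swaps the Wrapper class for a plain List<String> of the compared fields as the hash key (lists already define equals and hashCode element by element). The Book class, the KeyDedup name, and the three-field rule are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the real Book class.
class Book {
    private final String author;
    private final String title;
    private final String isbn;

    Book(String author, String title, String isbn) {
        this.author = author;
        this.title = title;
        this.isbn = isbn;
    }

    String getAuthor() { return author; }
    String getTitle() { return title; }
    String getIsbn() { return isbn; }
}

class KeyDedup {
    // Assumed duplicate rule: two books are duplicates when these fields match.
    static List<String> duplicateKey(Book book) {
        return Arrays.asList(book.getAuthor(), book.getTitle(), book.getIsbn());
    }

    // Removes, in place, every new book whose key was already seen
    // in oldBooks or earlier in newBooks. newBooks must be modifiable.
    static void removeDuplicateNewBooks(List<Book> oldBooks, List<Book> newBooks) {
        Set<List<String>> seen = new HashSet<>();
        for (Book book : oldBooks) {
            seen.add(duplicateKey(book));
        }
        for (Iterator<Book> it = newBooks.iterator(); it.hasNext();) {
            // add() returns false when an equal key is already in the set
            if (!seen.add(duplicateKey(it.next()))) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Book> oldBooks = Arrays.asList(new Book("Herbert", "Dune", "111"));
        List<Book> newBooks = new ArrayList<>(Arrays.asList(
                new Book("Herbert", "Dune", "111"),
                new Book("Gibson", "Neuromancer", "222"),
                new Book("Gibson", "Neuromancer", "222")));
        removeDuplicateNewBooks(oldBooks, newBooks);
        System.out.println(newBooks.size()); // prints 1
    }
}
```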

+4

HashingStrategy is the concept you are looking for. It is a strategy interface that lets you define custom implementations of equals and hashCode.

    public interface HashingStrategy<E> {
        int computeHashCode(E object);
        boolean equals(E object1, E object2);
    }
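For illustration, an implementation for Book might look like the sketch below. The two-method interface is redeclared locally so the example compiles on its own; in a real project you would implement the library's interface instead. The Book class, the BookDuplicateStrategy name, and the "same author, title and ISBN" rule are assumptions, not part of the original question.

```java
import java.util.Objects;

// Hypothetical stand-in for the real Book class.
class Book {
    private final String author;
    private final String title;
    private final String isbn;

    Book(String author, String title, String isbn) {
        this.author = author;
        this.title = title;
        this.isbn = isbn;
    }

    String getAuthor() { return author; }
    String getTitle() { return title; }
    String getIsbn() { return isbn; }
}

// Local copy of the two-method interface, declared here only to keep the sketch self-contained.
interface HashingStrategy<E> {
    int computeHashCode(E object);
    boolean equals(E object1, E object2);
}

// Assumed duplicate rule: same author, title and ISBN.
class BookDuplicateStrategy implements HashingStrategy<Book> {
    @Override
    public int computeHashCode(Book book) {
        // Must hash exactly the fields that equals() compares.
        return Objects.hash(book.getAuthor(), book.getTitle(), book.getIsbn());
    }

    @Override
    public boolean equals(Book first, Book second) {
        return Objects.equals(first.getAuthor(), second.getAuthor())
            && Objects.equals(first.getTitle(), second.getTitle())
            && Objects.equals(first.getIsbn(), second.getIsbn());
    }
}
```

Because the rule lives in the strategy rather than in Book, other parts of the code can keep using Book.equals unchanged, or use a different strategy.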

Eclipse Collections includes hash tables and iteration patterns based on hashing strategies. First, create your own HashingStrategy that decides whether two Books are equal.

You can then use distinct() to remove duplicates within newBooks, and a UnifiedSetWithHashingStrategy to eliminate the new books that duplicate a book in oldBooks.

    List<Book> oldBooks = ...;
    List<Book> newBooks = ...;
    HashingStrategy<Book> hashingStrategy = new HashingStrategy<Book>() { ... };
    // Set of old books whose contains() uses the hashing strategy, not Book.equals
    Set<Book> set = new UnifiedSetWithHashingStrategy<>(hashingStrategy, oldBooks);
    List<Book> result = ListIterate.distinct(newBooks, hashingStrategy).reject(set::contains);

The distinct() method returns only the elements that are unique according to the hashing strategy. It returns a list, not a set, so the original order is preserved. The reject() call then returns another new list without the elements that the set contains, again according to the same hashing strategy.

If you can change newBooks to an Eclipse Collections type such as MutableList, you can call the distinct() method on it directly.

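    MutableList<Book> newBooks = ...;
    // set is a UnifiedSetWithHashingStrategy built from oldBooks with the same hashingStrategy;
    // oldBooks::contains would fall back to Book.equals.
    MutableList<Book> result = newBooks.distinct(hashingStrategy).reject(set::contains);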

Note: I am a committer for Eclipse Collections.

+1