The AbstractSet.removeAll () method does not work properly

Below is the code shown below:

[B]

[a, b]

However, I would expect it to print two identical lines in the output.

import java.util.*; public class Test{ static void test(String... abc) { Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER); s.addAll(Arrays.asList("a", "b")); s.removeAll(Arrays.asList(abc)); System.out.println(s); } public static void main(String[] args) { test("A"); test("A", "C"); } } 

The specification explicitly states that removeAll

"Deletes all items in the collection that are also contained in the specified collection.

So, from my understanding, the current behavior is unpredictable. Please help me figure this out.

+7
java
source share
3 answers

This is inconsistency in the implementation of TreeSet<E> , bordering on an error. The code will ignore the custom comparator when the number of elements in the collection that you pass to removeAll is greater than or equal to the number of elements in the set.

The inconsistency is caused by a small optimization: if you look at the implementation of removeAll , which is inherited from AbstractSet , the optimization is performed as follows:

 public boolean removeAll(Collection<?> c) { boolean modified = false; if (size() > c.size()) { for (Iterator<?> i = c.iterator(); i.hasNext(); ) modified |= remove(i.next()); } else { for (Iterator<?> i = iterator(); i.hasNext(); ) { if (c.contains(i.next())) { i.remove(); modified = true; } } } return modified; } 

you can see that the behavior is different when c has fewer elements than this set (upper branch), and when it has the same or more elements (lower branch).

The upper branch uses the comparator associated with this set, while the lower branch uses equals to compare c.contains(i.next()) - all in one method!

You can demonstrate this behavior by adding a few additional elements to the original set of trees:

 s.addAll(Arrays.asList("x", "z", "a", "b")); 

Now the output for both test cases becomes identical, because remove(i.next()) uses a set comparator.

+2
source share

You only read the documentation in part. You forgot one important paragraph from TreeSet :

Note that the order supported by the set (regardless of whether an explicit comparator is provided) must be consistent with equals if it implements the Set interface correctly. (See Comparable or Comparator for an exact definition of matching with peers.) This is because the Set interface is defined in terms of the equality operation, but the TreeSet instance does all the element comparisons using its compareTo (or compare), so there are two elements that are for this methods are considered equal, equal, in terms of set. The behavior of the set is well defined, even if its ordering does not match the equal; it just does not fulfill the general contract of the Set interface .

Now the implementation of removeAll comes from AbstractSet and uses the equals method. According to your code, you will have "a".equals("A") not true , so that the elements are not considered equal, even if you provide a comparator that controls them when used in the TreeSet itself. If you try using a wrapper, the problem goes away:

 import java.util.*; import java.lang.*; class Test { static class StringWrapper implements Comparable<StringWrapper> { public final String string; public StringWrapper(String string) { this.string = string; } @Override public boolean equals(Object o) { return o instanceof StringWrapper && ((StringWrapper)o).string.compareToIgnoreCase(string) == 0; } @Override public int compareTo(StringWrapper other) { return string.compareToIgnoreCase(other.string); } @Override public String toString() { return string; } } static void test(StringWrapper... abc) { Set<StringWrapper> s = new TreeSet<>(); s.addAll(Arrays.asList(new StringWrapper("a"), new StringWrapper("b"))); s.removeAll(Arrays.asList(abc)); System.out.println(s); } public static void main(String[] args) { test(new StringWrapper("A")); test(new StringWrapper("A"), new StringWrapper("C")); } } 

This is because you now provide a consistent implementation between equals and compareTo your object so that you never have incoherent behavior between how objects are added inside a sorted set and how all abstract set behavior uses them.

This, in general, is a rule three times for Java code: if you implement compareTo or equals or hashCode , you should always implement them all to avoid problems with standard collections (even if hashCode is less important if you do not use these objects in any hashed collection ) This is indicated many times in the java documentation.

+6
source share

The reason is that the String.CASE_INSENSITIVE_ORDER comparator you are String.CASE_INSENSITIVE_ORDER not consistent with equals.

As indicated by TreeSet :

Note that the ordering supported by the set (whether an explicit comparator is provided) must be compatible with peers if it implements the Set interface correctly.

Consistency with equal values ​​specified in Comparable :

A natural ordering for a class C is called consistent with equalities if and only if e1.compareTo (e2) == 0 has the same logical value as e1.equals (e2) for each e1 and e2 of class C.

And as an example for a case-insensitive comparator, you use:

 "a".compareTo("A") == 0 => true 

but

 "a".equals("A") => false 
+1
source share

All Articles