The most efficient way to see if an ArrayList contains an object in Java

I have an ArrayList of objects in Java. Objects have four fields, two of which I would use to consider an object equal to another. I am looking for the most efficient way, given these two fields, to see if the array contains this object.

The key is that these classes are created based on XSD objects, so I cannot change the classes themselves to overwrite .equals .

Is there a better way than just iterating over and manually comparing two fields for each object, and then breaking when they are detected? It just seems so dirty looking for the best way.

Edit: ArrayList comes from a SOAP response that is not bound to objects.

+66
java optimization arraylist algorithm search
Feb 17 '09 at 10:18
source share
12 answers

It depends on how efficiently you need things. Simply repeating a list looking for an element that satisfies a certain condition is O (n), but it is also ArrayList.Contains if you can implement the Equals method. If you are not doing this in loops or inner loops, this approach is probably just fine.

If you really need very effective search speeds at all costs, you need to do two things:

  • Work on what class: Create an adapter class that can wrap the generated class and that implement equals () based on these two fields (assuming they are public). Remember to also implement hashCode () (*)
  • Wrap each object with this adapter and put it in a HashSet. HashSet.contains () has a constant access time, i.e. O (1) instead of O (n).

Of course, there is still an O (n) cost to build this HashSet. You will only win if the cost of creating a HashSet is negligible compared to the total cost of all contains () checks that you need to do. Trying to create a list without duplicates is such a case.




* () The implementation of hashCode () is best done by the XOR'ing (^) operator with the hash codes of the same fields that you use to implement equals (but multiplying by 31 reduces the chance of getting XOR 0)
+96
Feb 17 '09 at 22:39
source share

You can use Comparator with built-in Java methods for sorting and binary search. Suppose you have a class where a and b are the fields that you want to use for sorting:

 class Thing { String a, b, c, d; } 

You would define your comparator:

 Comparator<Thing> comparator = new Comparator<Thing>() { public int compare(Thing o1, Thing o2) { if (o1.a.equals(o2.a)) { return o1.b.compareTo(o2.b); } return o1.a.compareTo(o2.a); } }; 

Then sort the list:

 Collections.sort(list, comparator); 

And finally do a binary search:

 int i = Collections.binarySearch(list, thingToFind, comparator); 
+34
Feb 17 '09 at 22:58
source share

Given your limitations, you are stuck with brute force search (or creating an index if the search is repeated). Can you talk about how an ArrayList is generated - perhaps there is room for maneuver.

If you are looking for more beautiful code, consider the Apache Commons Collections classes, in particular CollectionUtils.find () , for ready-made syntactic sugar:

 ArrayList haystack = // ... final Object needleField1 = // ... final Object needleField2 = // ... Object found = CollectionUtils.find(haystack, new Predicate() { public boolean evaluate(Object input) { return needleField1.equals(input.field1) && needleField2.equals(input.field2); } }); 
+9
Feb 17 '09 at 22:29
source share

If the list is sorted , you can use binary search . If not, then there is no better way.

If you do this a lot, it is almost certainly worth the time to sort the list for the first time. Since you cannot modify classes, you will need to use Comparator for sorting and searching.

+5
Feb 17 '09 at 22:22
source share

Even if the equals method compared these two fields, then logically, it would be the same code as you manually. Ok, that might be dirty, but it’s still the correct answer

+3
Feb 17 '09 at 22:22
source share

If you are a user of my ForEach DSL , you can do this with a Detect request.

 Foo foo = ... Detect<Foo> query = Detect.from(list); for (Detect<Foo> each: query) each.yield = each.element.a == foo.a && each.element.b == foo.b; return query.result(); 
+3
Mar 01 '09 at 19:19
source share

Is there a better way than just iterating over and manually comparing two fields for each object, and then breaking when they are detected? It just seems so dirty looking for the best way.

If your problem is maintainability, you can do something with Fabian Steeg (what I would do), although this is probably not the “most efficient” (because you need to sort the array first and then perform a binary search), but, by far the cleanest and best option.

If you are really interested in efficiency, you can create your own list implementation that uses the field in your object as a hash and uses the HashMap as storage. But probably that would be too much.

Then you must change the place where you fill the data from ArrayList to YourCustomList.

Like:

  List list = new ArrayList(); fillFromSoap( list ); 

To:

  List list = new MyCustomSpecialList(); fillFromSoap( list ); 

The implementation will look something like this:

 class MyCustomSpecialList extends AbstractList { private Map<Integer, YourObject> internalMap; public boolean add( YourObject o ) { internalMap.put( o.getThatFieldYouKnow(), o ); } public boolean contains( YourObject o ) { return internalMap.containsKey( o.getThatFieldYouKnow() ); } 

}

To a large extent, like a HashSet, the problem is that the HashSet relies on a good implementation of the hashCode method, which you probably don't have. Instead, you use the hash “this field you know” as the hash, which is the one that makes one object equal to another.

Of course, implementing a list from scratch is much more complicated than my snippet above, so I say that the Fabian Steeg proposal would be better and easier to implement (although something like that would be more efficient)

Tell us what you did at the end.

+2
Feb 18 '09 at 0:12
source share

Perhaps the list is not what you need.

Maybe TreeSet would be a better container. You get the embedding and extraction of O (log N) and ordered the iteration (but don't allow duplicates).

LinkedHashMap might be even better for your use case, check this out too.

+2
Feb 18 '09 at 2:01
source share

Constructing a HashMap of these objects based on the field value as a key may be appropriate in terms of performance, for example. fill in Maps once and find objects very efficiently

+1
Feb 17 '09 at 22:30
source share

If you need to find a lot of time in one list, it can pay off to create an index.

Iterate once and build a HashMap with the equivalent value that you are looking for as the key and the corresponding node as the value. If you want everything to be equal instead of any given value, then let the map be of the list value type and build the entire list in the initial iteration.

Note that you need to measure before that, since the overhead of creating an index can only obscure the move until the expected node is found.

+1
Feb 17 '09 at 22:35
source share

There are three main options:

1) If search performance is of paramount importance, and it is practically advisable, use one of the forms of the hash table built once (and modified as / if the list changes).

2) If the list is conveniently sorted or it is advisable to sort it, and the O (log n) search is sufficient, sort and search.

3) If extracting O (n) is fast enough or if it is impractical to manipulate / maintain a data structure or alternative, iterate over the list.

Before writing code that is more complicated than just iterating over a list, it’s worth considering some questions.

  • Why do you need something else? (Time)? Elegance? Maintainability? Reuse? All this is in order, separately or together, but they affect the decision.

  • How much control do you have over the data structure in question? Can you influence its construction? Managed later?

  • What is the life cycle of a data structure (and underlying objects)? Is it created right away and never changed or was it very dynamic? Can your code track (or even change) its life cycle?

  • Are there other important limitations, such as memory size? Is there duplicate information? Etc.

+1
Feb 17 '09 at 23:18
source share

I would say that the simplest solution would be to wrap the object and delegate the call to contains to the collection of the wrapped class. This seems like a comparator, but does not force you to sort the resulting collection, you can simply use ArrayList.contains ().

 public class Widget { private String name; private String desc; public String getName() { return name; } public void setName(String name) { this.name = name; } public String getDesc() { return desc; } public void setDesc(String desc) { this.desc = desc; } } public abstract class EqualsHashcodeEnforcer<T> { protected T wrapped; public T getWrappedObject() { return wrapped; } @Override public boolean equals(Object obj) { return equalsDelegate(obj); } @Override public int hashCode() { return hashCodeDelegate(); } protected abstract boolean equalsDelegate(Object obj); protected abstract int hashCodeDelegate(); } public class WrappedWidget extends EqualsHashcodeEnforcer<Widget> { @Override protected boolean equalsDelegate(Object obj) { if (obj == null) { return false; } if (obj == getWrappedObject()) { return true; } if (obj.getClass() != getWrappedObject().getClass()) { return false; } Widget rhs = (Widget) obj; return new EqualsBuilder().append(getWrappedObject().getName(), rhs.getName()).append(getWrappedObject().getDesc(), rhs.getDesc()).isEquals(); } @Override protected int hashCodeDelegate() { return new HashCodeBuilder(121, 991).append( getWrappedObject().getName()).append( getWrappedObject().getDesc()).toHashCode(); } } 
0
Feb 18 '09 at 0:49
source share



All Articles