Best way to check the contents of a shared list

I need to work on some code that uses generic lists to store a collection of custom objects.

It then does something like the following to check whether a given object is in the collection and, if so, act on it:

    List<CustomObject> customObjects;
    // fill up the list
    List<CustomObject> anotherListofCustomObjects;
    // fill it up
    // ...
    foreach (CustomObject myCustomObject in customObjects)
    {
        if (anotherListofCustomObjects.Contains(myCustomObject))
        {
            // do stuff
        }
    }

The problem is that it takes forever to process 7,000 objects.

This is not my code. I'm just trying to find options to improve it. It seems to me that it would be much faster to use a dictionary and look items up by key, rather than scanning the entire collection as shown above.

Suggestions?

+3
8 answers

Well, you seem to have answered your own question. If you need fast lookups on a data set, a dictionary can be better than a flat list (for the large data sizes that you have).

You could, for example, use the object as its own key:

 Dictionary<CustomObject,CustomObject> ... 

Note that the meaning of equality depends on the context. If you are passing in the original reference, that's fine: ContainsKey will do the job. If you have a different object instance that is meant to compare as equal, you will need to implement your own GetHashCode(), Equals(), and ideally IEquatable<CustomObject>, either on CustomObject itself or in a custom IEqualityComparer<CustomObject>.
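To make the equality point concrete, here is a minimal sketch. The question never shows CustomObject, so the Id and Name properties and the rule "equal when the Ids match" are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's CustomObject.
// The Id and Name members are assumptions, not part of the original code.
public sealed class CustomObject : IEquatable<CustomObject>
{
    public int Id { get; private set; }
    public string Name { get; private set; }

    public CustomObject(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Assumed rule: two instances are equal when their Ids match.
    public bool Equals(CustomObject other)
    {
        return other != null && Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CustomObject);
    }

    // Must agree with Equals: equal objects return equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class EqualityDemo
{
    public static bool ContainsEquivalent()
    {
        var lookup = new Dictionary<CustomObject, CustomObject>();
        var stored = new CustomObject(42, "stored");
        lookup[stored] = stored;

        // A different instance with the same Id is found
        // because of the Equals/GetHashCode overrides above.
        var probe = new CustomObject(42, "probe");
        return lookup.ContainsKey(probe);
    }
}
```

Without the overrides, the lookup falls back to reference equality and the probe instance would not be found.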

+3

Another option besides dictionaries, if you are on .NET 3.5, is to use LINQ to Objects and Intersect:

    foreach (CustomObject c in customObjects.Intersect(anotherListOfCustomObjects))
    {
        // do stuff
    }

According to Reflector, it uses a hash-based set internally to perform the intersection.
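One thing worth knowing before swapping the loop for Intersect: it has set semantics, so each common element comes back only once, and it relies on the default equality comparer unless you pass one. A small sketch with strings standing in for the custom objects (the IntersectDemo class and Common method are names made up for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // Returns the elements present in both sequences.
    // Intersect yields each common element once, in the order
    // in which it first appears in 'first'.
    public static List<string> Common(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Intersect(second).ToList();
    }
}
```

For example, `IntersectDemo.Common(new[] { "a", "b", "b", "c" }, new[] { "b", "c", "d" })` gives "b" and "c", with the duplicate "b" collapsed. If the original loop must run once per duplicate, Intersect is not a drop-in replacement.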

+9

In fact, your code is O(n^2), which will be slow. You can:

  • use dictionaries or KeyedCollections; with hash-based lookups this makes it O(n) on average (a sorted structure searched with binary search would be O(n log n))
  • if you can guarantee that the elements of both lists are in the same order, you can rewrite the loop to walk both lists with a single index, and that will be O(n)
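As a rough sketch of the hash-based option, the loop from the question can keep its shape while the inner Contains call moves from the list to a HashSet built once up front. LookupDemo and CountMatches are hypothetical names, and the counter only stands in for the "do stuff" part.

```csharp
using System.Collections.Generic;

public static class LookupDemo
{
    // Counts how many entries of 'items' also occur in 'candidates'.
    public static int CountMatches<T>(List<T> items, List<T> candidates)
    {
        // Building the set is one O(n) pass; each lookup is then O(1) on average,
        // instead of the O(n) scan that List<T>.Contains performs.
        var candidateSet = new HashSet<T>(candidates);

        int matches = 0;
        foreach (T item in items)
        {
            if (candidateSet.Contains(item))
            {
                matches++; // stands in for "do stuff"
            }
        }
        return matches;
    }
}
```

Unlike a set intersection, this still visits every element of the first list, so duplicates there are processed once each, just as in the original loop.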
+2

You can also consider System.Collections.ObjectModel.KeyedCollection<TKey, TItem> .

In addition to this, I usually create my own IKeyable interface and a concrete KeyedCollection implementation that uses IKeyable for the required override.
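The answer does not show the interface, so the following is only a guess at what such a setup might look like. The shape of IKeyable (a single string Key property), the KeyableCollection name, and the Part class are all assumptions; only KeyedCollection and its GetKeyForItem override come from the framework.

```csharp
using System.Collections.ObjectModel;

// Hypothetical interface: anything that can supply its own lookup key.
public interface IKeyable
{
    string Key { get; }
}

// KeyedCollection requires exactly one override: how to extract the key from an item.
public class KeyableCollection<TItem> : KeyedCollection<string, TItem>
    where TItem : IKeyable
{
    protected override string GetKeyForItem(TItem item)
    {
        return item.Key;
    }
}

// Hypothetical item type used only to exercise the collection.
public sealed class Part : IKeyable
{
    public string Key { get; private set; }

    public Part(string key)
    {
        Key = key;
    }
}
```

With this in place, `parts.Add(new Part("A-1"))` followed by `parts.Contains("A-1")` or `parts["A-1"]` looks the item up by key, while the collection still behaves like an ordered list.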

+1

Testing is your friend. The size of the collection determines which data structure and algorithm you should use. I suggest running some performance tests on the following options:

  • Your current solution
  • BinarySearch on a sorted list
  • A HashSet<CustomObject>

Given the number of elements, I suspect that HashSet<CustomObject> is the way to go.
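A small harness along these lines could be used to compare the three options. It uses int elements to stay self-contained; for CustomObject, BinarySearch would additionally need IComparable<CustomObject> or a comparer, and HashSet would need suitable equality. LookupBenchmark and its method names are made up for this sketch, and no timings are claimed here since they depend on the machine and the data.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupBenchmark
{
    // Option 1: the current approach, a linear scan per item.
    public static int CountWithList(List<int> items, List<int> candidates)
    {
        int matches = 0;
        foreach (int item in items)
        {
            if (candidates.Contains(item)) matches++;
        }
        return matches;
    }

    // Option 2: sort a copy once, then binary-search it per item.
    public static int CountWithBinarySearch(List<int> items, List<int> candidates)
    {
        var sorted = new List<int>(candidates);
        sorted.Sort();
        int matches = 0;
        foreach (int item in items)
        {
            // BinarySearch returns a non-negative index when the item is found.
            if (sorted.BinarySearch(item) >= 0) matches++;
        }
        return matches;
    }

    // Option 3: build a hash set once, then do a hash lookup per item.
    public static int CountWithHashSet(List<int> items, List<int> candidates)
    {
        var set = new HashSet<int>(candidates);
        int matches = 0;
        foreach (int item in items)
        {
            if (set.Contains(item)) matches++;
        }
        return matches;
    }

    // Measures the wall-clock time of a single run of 'action'.
    public static TimeSpan Time(Action action)
    {
        var watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Typical use would be something like `LookupBenchmark.Time(() => LookupBenchmark.CountWithHashSet(items, candidates))` for each option, repeated a few times on lists of realistic size, since a single run can be skewed by JIT warm-up.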

+1

If you have to keep the two lists separate, one of the Set types may be faster (using a join operation). Some of the available libraries

0

Just a small addition to the other comments. If you need the other list of custom objects to be sorted, you can use a SortedList.

0

HashSet also works great.

 new HashSet<CustomObject>().Join() 
0
