Best way to check the contents of a generic list

Question

I need to work on some code that uses generic lists to store a collection of custom objects.

It then does something like the following to check whether a given object is in the collection and, if so, do something with it:

    List<CustomObject> customObjects;
    //fill up the list
    List<CustomObject> anotherListofCustomObjects;
    //fill it up
    //...
    foreach (CustomObject myCustomObject in customObjects)
    {
        if (anotherListofCustomObjects.Contains(myCustomObject))
        {
            //do stuff
        }
    }

The problem is that it takes forever to process 7,000 objects.

This is not my code. I'm just trying to find options to improve it. It seems to me that it would be much faster to use a dictionary and look items up by key, rather than scanning the entire collection as shown above.

Suggestions?

+3

performance generics c# .net

Johnidol Dec 15 '08 at 14:01

8 answers

Another way, besides dictionaries: if you are using .NET 3.5, use LINQ to Objects and Intersect:

    foreach (CustomObject c in customObjects.Intersect(anotherListOfCustomObjects))
    {
        // do stuff.
    }

According to Reflector, it uses hash sets to perform the intersection of the sequences.
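A minimal sketch of the Intersect approach, assuming a hypothetical CustomObject with an Id property (the question does not show the class). Two things worth keeping in mind: Intersect returns each matching element only once, so it behaves differently from the original loop if customObjects contains duplicates, and without an explicit IEqualityComparer it uses the default equality comparer, which is reference equality for a class that does not override Equals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the CustomObject type from the question.
public class CustomObject
{
    public int Id { get; set; }
}

public static class IntersectDemo
{
    // Returns the objects present in both lists, using the default equality
    // comparer (reference equality here, since Equals is not overridden).
    public static List<CustomObject> Common(List<CustomObject> first, List<CustomObject> second)
    {
        return first.Intersect(second).ToList();
    }

    public static void Main()
    {
        var a = new CustomObject { Id = 1 };
        var b = new CustomObject { Id = 2 };
        var c = new CustomObject { Id = 3 };

        var customObjects = new List<CustomObject> { a, b, c };
        var anotherListOfCustomObjects = new List<CustomObject> { c, a };

        foreach (CustomObject match in Common(customObjects, anotherListOfCustomObjects))
        {
            Console.WriteLine(match.Id); // do stuff
        }
    }
}
```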

+9

Frans bouma Dec 15 '08 at 14:10

In fact, your code is O(n^2), which will be slow. You can:

use a Dictionary or KeyedCollection, which makes it roughly O(n) on average with hash-based lookups (or O(n log n) with a sorted, tree-based structure)
if you can guarantee that the elements in both lists are in the same order, rewrite the loop to advance a single index through the second list, which makes it O(n)
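One possible reading of the second suggestion, sketched under assumptions: both lists are sorted ascending by a hypothetical Id property, and objects are matched by that Id rather than by reference. A single index walks the second list and never moves backwards, so the whole pass is linear in the combined length of the lists.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the CustomObject type from the question.
public class CustomObject
{
    public int Id { get; set; }
}

public static class SortedScan
{
    // Assumes both lists are sorted ascending by Id (an assumption of this sketch).
    public static List<CustomObject> Common(List<CustomObject> first, List<CustomObject> second)
    {
        var result = new List<CustomObject>();
        int j = 0; // the single index into the second list; it only ever advances

        foreach (CustomObject item in first)
        {
            // Skip entries in the second list that sort before the current item.
            while (j < second.Count && second[j].Id < item.Id)
            {
                j++;
            }

            if (j < second.Count && second[j].Id == item.Id)
            {
                result.Add(item); // do stuff
            }
        }

        return result;
    }

    public static void Main()
    {
        var first = new List<CustomObject>();
        var second = new List<CustomObject>();
        for (int i = 0; i < 10; i++) first.Add(new CustomObject { Id = i });
        for (int i = 0; i < 10; i += 2) second.Add(new CustomObject { Id = i });

        Console.WriteLine(Common(first, second).Count);
    }
}
```

If the ordering cannot be guaranteed, this scan silently misses matches, so a hash-based lookup is the safer default.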

+2

Grzenio Dec 15 '08 at 14:08

You can also consider System.Collections.ObjectModel.KeyedCollection<TKey, TItem>.

In addition to this, I usually create my own IKeyable interface and a concrete KeyedCollection implementation that uses IKeyable for the required override.
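A rough sketch of what that could look like. The shape of IKeyable, the KeyableCollection name, and the Id property on CustomObject are all assumptions, since the answer does not show them; the only part dictated by the framework is that KeyedCollection requires overriding GetKeyForItem.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical shape for the IKeyable interface mentioned above.
public interface IKeyable<TKey>
{
    TKey Key { get; }
}

// Hypothetical stand-in for the CustomObject type from the question.
public class CustomObject : IKeyable<int>
{
    public int Id { get; set; }
    public int Key { get { return Id; } }
}

// KeyedCollection is abstract; GetKeyForItem tells it how to extract each item's key.
public class KeyableCollection<TKey, TItem> : KeyedCollection<TKey, TItem>
    where TItem : IKeyable<TKey>
{
    protected override TKey GetKeyForItem(TItem item)
    {
        return item.Key;
    }
}

public static class KeyedDemo
{
    public static void Main()
    {
        var lookup = new KeyableCollection<int, CustomObject>();
        lookup.Add(new CustomObject { Id = 7 });

        // Contains(TKey) looks the item up by key instead of scanning the list.
        Console.WriteLine(lookup.Contains(7));
        Console.WriteLine(lookup.Contains(8));
    }
}
```

With an int key, the indexer is ambiguous between position and key, so this sketch sticks to Contains for lookups.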

+1

Joel Coehoorn Dec 15 '08 at 14:15

Tests are your friend. The size of the collection determines which data structure and algorithm you should use. I suggest you run some performance tests on the following options:

Your current solution
Use the BinarySearch algorithm on a sorted list.
Use a HashSet<CustomObject>.

Given the number of elements, I suspect that HashSet<CustomObject> is the way to go.
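A small timing sketch for two of those options, assuming a hypothetical CustomObject with an Id property and 7,000 items as in the question. It is not a rigorous benchmark (no warm-up, a single run), and the numbers will vary by machine, but it is usually enough to show the difference in order of magnitude.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the CustomObject type from the question.
public class CustomObject
{
    public int Id { get; set; }
}

public static class LookupComparison
{
    // The current approach: List.Contains scans the list for every item.
    public static int CountWithList(List<CustomObject> items, List<CustomObject> other)
    {
        int hits = 0;
        foreach (CustomObject item in items)
        {
            if (other.Contains(item)) hits++;
        }
        return hits;
    }

    // Build a HashSet once, then do constant-time lookups on average.
    public static int CountWithHashSet(List<CustomObject> items, List<CustomObject> other)
    {
        var lookup = new HashSet<CustomObject>(other);
        int hits = 0;
        foreach (CustomObject item in items)
        {
            if (lookup.Contains(item)) hits++;
        }
        return hits;
    }

    public static void Main()
    {
        var customObjects = new List<CustomObject>();
        for (int i = 0; i < 7000; i++) customObjects.Add(new CustomObject { Id = i });

        var anotherList = new List<CustomObject>(customObjects);
        anotherList.Reverse();

        Stopwatch sw = Stopwatch.StartNew();
        int listHits = CountWithList(customObjects, anotherList);
        Console.WriteLine("List.Contains:    {0} hits, {1} ms", listHits, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        int setHits = CountWithHashSet(customObjects, anotherList);
        Console.WriteLine("HashSet.Contains: {0} hits, {1} ms", setHits, sw.ElapsedMilliseconds);
    }
}
```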

+1

bruno conde Dec 15 '08 at 14:37

If you must maintain two separate lists, one of the Set types can be faster (using the join operation). Some of the available libraries

0

Anthony mastrean Dec 15 '08 at 14:12

Just a small addition to the other comments. If you need the other list to be kept sorted, you can use a SortedList.

0

Cristian libardo Dec 15 '08 at 14:12

HashSet also works well for this.

 new HashSet<CustomObject>().Join()

0

Brian rudolph Dec 15 '08 at 16:04

Marc gravell · Accepted Answer · 2008-12-15T14:07:53+0000

Well, you seem to have answered it yourself? If you need fast lookups on a set of data, then a dictionary can be better than a flat list (for the large data sizes that you have).

You could, for example, use the object as its own key:

 Dictionary<CustomObject,CustomObject> ...

Note that the meaning of equality depends on the context. If you pass in the original reference, that is fine: ContainsKey will do the job. If you have a different object that, for your purposes, should compare as equal, then you will need to implement your own GetHashCode(), Equals() and ideally IEquatable<CustomObject>, either in CustomObject itself or in a custom IEqualityComparer<CustomObject>.
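A minimal sketch of the value-equality case, assuming CustomObject is identified by a hypothetical Id field (the question does not say what makes two objects "the same"). The field is kept immutable here because changing a value that feeds GetHashCode while the object is used as a dictionary key breaks lookups.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical CustomObject whose identity is defined by Id (an assumption).
public sealed class CustomObject : IEquatable<CustomObject>
{
    public int Id { get; private set; }

    public CustomObject(int id)
    {
        Id = id;
    }

    // Strongly typed equality, used by Dictionary and HashSet via the default comparer.
    public bool Equals(CustomObject other)
    {
        return other != null && other.Id == Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CustomObject);
    }

    // Must agree with Equals: equal objects must return equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class EqualityDemo
{
    public static void Main()
    {
        var lookup = new Dictionary<CustomObject, CustomObject>();
        var original = new CustomObject(42);
        lookup[original] = original;

        // A different instance with the same Id is found, because equality is by value.
        var probe = new CustomObject(42);
        Console.WriteLine(lookup.ContainsKey(probe));
    }
}
```

If CustomObject cannot be modified, the same two methods can live in a separate IEqualityComparer<CustomObject> that is passed to the Dictionary or HashSet constructor.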
