The fastest way to find out if two ICollection<T> collections contain the same objects
What is the fastest way to find out if two ICollection<T> collections contain exactly the same records? Brute force is clear, I was wondering if there is a more elegant method.
We use C# 2.0, so no extension methods if possible, please!
Edit: the answer would be interesting for both ordered and unordered collections, and I expect it will be different for each.
Use C5:
http://www.itu.dk/research/c5/
"Check whether all of the supplied items are in this bag
(counting multiplicities).
items: the items to look for.
Returns: true if all items are found."
[Tested]
public virtual bool ContainsAll<U>(SCG.IEnumerable<U> items) where U : T
{
    // res tracks how many copies of each item have been matched so far
    HashBag<T> res = new HashBag<T>(itemequalityComparer);
    foreach (T item in items)
        if (res.ContainsCount(item) < ContainsCount(item))
            res.Add(item);
        else
            return false; // more copies requested than this bag holds
    return true;
}
First compare the Count of both collections; if the counts match, compare all elements by brute force. The worst case is O(n). This applies when the order of the elements must be the same.
In the second case, when the order does not matter, you need a dictionary to store how many times each element occurs in the collections. Here is a possible algorithm:
- Compare the collections' Count: return false if they are different
- Iterate the first collection
  - If the item does not exist in the dictionary, add an entry with Key = Item, Value = 1 (count)
  - If the item exists, increment its int counter in the dictionary
- Iterate the second collection
  - If the item is not in the dictionary, return false
  - If the item is in the dictionary, decrement the count for the item
  - If count == 0, remove the item
- return Dictionary.Count == 0;
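The steps above could be sketched roughly as follows in C# 2.0 style (no LINQ, no extension methods). The class and method names are made up for illustration, and the sketch assumes that T has consistent Equals/GetHashCode implementations and that the collections contain no null items.

```csharp
using System.Collections.Generic;

public static class CollectionComparer
{
    // Hypothetical helper: true if both collections hold the same items
    // with the same multiplicities, regardless of order.
    // Assumption: no null items (Dictionary rejects null keys).
    public static bool UnorderedEquals<T>(ICollection<T> first, ICollection<T> second)
    {
        if (first.Count != second.Count)
            return false;

        // Count the occurrences of each item in the first collection.
        Dictionary<T, int> counts = new Dictionary<T, int>();
        foreach (T item in first)
        {
            int n;
            counts.TryGetValue(item, out n); // n is 0 when the key is absent
            counts[item] = n + 1;
        }

        // Cancel each item of the second collection against those counts.
        foreach (T item in second)
        {
            int n;
            if (!counts.TryGetValue(item, out n))
                return false;            // item missing, or already used up
            if (n == 1)
                counts.Remove(item);     // last remaining copy matched
            else
                counts[item] = n - 1;
        }

        return counts.Count == 0;
    }
}
```

With average O(1) dictionary operations this should run in roughly O(n) time, at the cost of O(n) extra memory for the dictionary.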
For ordered collections, you can use the SequenceEqual() extension method defined by System.Linq.Enumerable (available from .NET 3.5):
if (firstCollection.SequenceEqual(secondCollection))
Do you mean the same entries, or the same entries in the same order?
In any case, assuming you want to compare whether they contain the same records in the same order, "brute force" is actually your only option in C# 2.0. I know what you mean by inelegant, but if each element comparison is itself O(1), the whole process is O(N), which is not that bad.
If the records must be in the same order (as well as being equal), I suggest, as an optimization, that you iterate both collections at the same time and compare the current record in each. Otherwise brute force is the way to go.
Oh, and another suggestion: you can override Equals for the collection class and implement the equality logic there (depending on your project).
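Iterating both collections at the same time might look something like this in C# 2.0. This is only a sketch: the class and method names are invented for illustration, and it uses EqualityComparer<T>.Default to compare elements, which may or may not be the notion of equality you need.

```csharp
using System.Collections.Generic;

public static class OrderedComparer
{
    // Hypothetical helper: true if both collections yield equal items
    // in the same order.
    public static bool OrderedEquals<T>(ICollection<T> first, ICollection<T> second)
    {
        // Cheap rejection before touching any element.
        if (first.Count != second.Count)
            return false;

        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> e1 = first.GetEnumerator())
        using (IEnumerator<T> e2 = second.GetEnumerator())
        {
            // Advance both enumerators in lockstep; stop at the first mismatch.
            while (e1.MoveNext() && e2.MoveNext())
            {
                if (!comparer.Equals(e1.Current, e2.Current))
                    return false;
            }
        }
        return true;
    }
}
```

The same body could be called from an overridden Equals on your own collection class, if that suits your project.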
Again using the C5 library: given two sets, you can use:
C5.ICollection<T> set1 = new C5.HashSet<T>();
C5.ICollection<T> set2 = new C5.HashSet<T>();
if (set1.UnsequencedEquals(set2)) {
// Do something
}
The C5 library includes a heuristic that first checks the unsequenced hash codes of the two sets (see C5.ICollection<T>.GetUnsequencedHashCode()), so if the hash codes of the two sets are unequal, it does not need to iterate over every element to test for equality.
You should also note that C5.ICollection<T> inherits from System.Collections.Generic.ICollection<T>, so you can use C5 implementations while still using the .NET interfaces (although you have access to less functionality through .NET's more limited interfaces).
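To illustrate the idea behind the hash-code shortcut (this is not C5's actual implementation, just a minimal sketch with made-up names), an order-independent hash can be built by combining per-item hash codes with a commutative operation such as addition:

```csharp
using System.Collections.Generic;

public static class UnsequencedHash
{
    // Hypothetical order-independent hash: addition is commutative, so any
    // ordering of the same items produces the same result.
    public static int Compute<T>(IEnumerable<T> items)
    {
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;
        int hash = 0;
        foreach (T item in items)
        {
            // unchecked: integer overflow simply wraps around
            unchecked { hash += comparer.GetHashCode(item); }
        }
        return hash;
    }
}
```

Different hash values prove that the collections differ. Equal hash values prove nothing, so a full element-by-element comparison is still needed in that case.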
Brute force takes O(n), comparing all elements (assuming they are sorted), which in my opinion is the best you can do, unless some property of the data makes it easier.
For the unsorted case, I guess it's O(n * n).
In that case, I think a solution based on merge sort would probably help.
For example, could you remodel it so that there is only one collection? Or three collections: one for items that are only in collection A, one for items only in B, and one for items in both. Then if the A-only and B-only collections are empty, A and B are the same... I am probably going off on completely the wrong tangent here...
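The sort-based idea mentioned above could be sketched like this in C# 2.0. Note the assumptions: it uses List<T>.Sort rather than a merge sort specifically (any O(n log n) sort serves the same purpose), it requires T to be comparable via Comparer<T>.Default, and the class and method names are invented for illustration.

```csharp
using System.Collections.Generic;

public static class SortedComparer
{
    // Hypothetical helper: sorts copies of both collections, then compares
    // them position by position. The inputs are left unmodified.
    // Assumption: T implements IComparable<T> or IComparable.
    public static bool SortedEquals<T>(ICollection<T> first, ICollection<T> second)
    {
        if (first.Count != second.Count)
            return false;

        List<T> a = new List<T>(first);
        List<T> b = new List<T>(second);
        Comparer<T> comparer = Comparer<T>.Default;
        a.Sort(comparer);
        b.Sort(comparer);

        // After sorting, equal multisets line up element by element.
        for (int i = 0; i < a.Count; i++)
        {
            if (comparer.Compare(a[i], b[i]) != 0)
                return false;
        }
        return true;
    }
}
```

This brings the unsorted case down from O(n * n) to roughly O(n log n), and unlike the dictionary approach it needs an ordering on T rather than a hash code.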