A set that allows you to quickly insert / delete and arbitrary selection in C #

Question

A set that allows you to quickly insert / delete and arbitrary selection in C #

What data structure can I use in C # to provide quick insert / delete and even random selection? The list has a slow deletion over the element (since it must find the index of the element every time), while the HashSet does not seem to allow random selection of the element (without copying to the list.)

The data structure will be constantly updated, so insertion and deletion should be online procedures. There seems to be a way to do the insertion, deletion and random selection of all O (log n).

A binary search tree with arbitrary integer keys assigned to objects would solve all these problems, but I cannot find the corresponding class in the C # standard library. Is there a canonical way to solve this problem without writing a custom binary search tree?

+4

c # data-structures

Corey staten Sep 22 '11 at 18:16

source share

3 answers

Search binary trees and derived structures, such as SortedDictionary or SortedSet , work by comparing keys.

Your objects are not comparable on their own, but they offer an object identifier and a hash value. Consequently, a HashSet is the correct data structure. Note: A Dictionary<int,YourType> not suitable, since deletion becomes a linear search (O (n)) and does not solve the random problem after deletion.

The insert is O (1)
Delete - O (1)
RandomElement is O (n). It can be easily implemented, for example.
```
 set.ElementAt(random.Next(set.Count)) 
```
No need to copy to the staging list.

+2

Mark Sep 23 '11 at 12:04

source share

I understand that this question is older than 3 years, but only for people who come across this page:

If you do not need to store items in a sorted dataset, you can simply use List<ItemType> .

Insert and random selection - O (1). You can do the deletion in O (1) by simply moving the last item to the position of the item you want to delete and delete it from the end.

code:

 using System; // For the Random using System.Collections.Generic; // The List // List: List<ItemType> list = new List<ItemType>(); // Add x: ItemType x = ...; // The item to insert into the list list.Add( x ); // Random selection Random r = ...; // Probably get this from somewhere else int index = r.Next( list.Count ); ItemType y = list[index]; // Remove item at index list[index] = list[list.Count - 1]; // Copy last item to index list.RemoveAt( list.Count - 1 ); // Remove from end of list

EDIT: Of course, to remove an item from a List<ItemType> , you need to know its index. If you want to remove a random element, you can use a random index (as is done in the example above). If you want to remove this element, you can save a Dictionary<ItemType,int> , which maps the elements to their indexes. Adding, deleting and updating these indexes can be done in O (1) (amortized).

Together, this leads to complexity O (1) (amortized) for all transactions.

0

Rdj Dec 08 '14 at 20:50

source share

James Michael Hare · Accepted Answer · 2011-09-22T18:38:39+0000

C # BCL already has a BST, it is called SortedDictionary<TKey, TValue> , if you do not want Key Value Pairs, but instead want individual elements, you can use SortedSet<T> (SortedSet is in .NET 4.0).

It looks like in your example you will need SortedDictionary<int, WhateverValueType> . Although I don’t know exactly what you need when you say “uniform random choice”.

Of course, Dictionary<TKey, TValue> is O (1), which is much faster. Therefore, if you do not need a sorted order of keys, I would use this.

UPDATE . From the sounds of your needs, you will have the efficiency of catch-22. To be able to go into a random continuous index in the data structure, how often will you insert / delete? If not often, you can use an array and just Sort () after (O (n log n)) or always insert / delete in order (O (n)).

Or you can wrap Dictionary<int, YourType> and save the parallel List<int> and update it after each addition / removal:

 _dictionary.Add(newIndex, newValue); _indexes.Add(newIndex);

And then just access the random index from the list of search queries. The best part is that in this method the Add () method will be ~ O (1) (unless you change the size of the list, but you can set the initial capacity to avoid some of them), but you will incur the cost of O (n) when deleting.

I am afraid the problem is that you will sacrifice time in the search or delete / insert. The problem is that all the best access time containers are non-contiguous. Together with the double combination List<int>/Dictionary<int, YourValue> you will have a good mix.

UPDATE 2 . It seems like from our continued discussion that if this absolute performance is your requirement, you might be better off if you skip yourself. It was interesting to think about, though, I will update if I think about anything else.

A set that allows you to quickly insert / delete and arbitrary selection in C #

More articles: