Data structure: insert, delete, contains, get a random element, all in O(1)

I was given this problem in an interview. How would you answer?

Create a data structure that offers the following operations in O(1):

  • insert
  • remove
  • contains
  • get a random item
+79 · data-structures · Apr 15 '11
13 answers

Consider a data structure consisting of a hash table H and an array A. The hash table's keys are the elements in the data structure, and its values are their positions in the array.

  • insert(value): append value to the array and let i be its index in A. Set H[value] = i.
  • remove(value): overwrite the cell holding value in A with the last element of A. Let d be the last element of A, at index m. Let i be H[value], the array index of the value to be deleted. Set A[i] = d and H[d] = i, decrease the size of the array by one, and remove value from H.
  • contains(value): return H.contains(value)
  • getRandomElement(): let r = random(current size of A). Return A[r].

Since the array has to grow automatically, inserting an element is amortized O(1) rather than strictly O(1), but I think that's OK.
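As a sketch, the operations described above can be written out directly. This is an illustration of the scheme in Python (class and method names are my own choices, not from the answer):

```python
import random

class RandomizedSet:
    def __init__(self):
        self.pos = {}   # H: value -> index in the array
        self.arr = []   # A: the values themselves

    def insert(self, value):
        if value in self.pos:           # already present
            return False
        self.pos[value] = len(self.arr)  # H[value] = i
        self.arr.append(value)
        return True

    def remove(self, value):
        if value not in self.pos:
            return False
        i = self.pos[value]        # index of the value to delete
        last = self.arr[-1]        # d: the last element of A
        self.arr[i] = last         # overwrite the hole with d
        self.pos[last] = i         # H[d] = i
        self.arr.pop()             # shrink the array by one
        del self.pos[value]        # remove value from H
        return True

    def contains(self, value):
        return value in self.pos

    def get_random(self):
        # uniform over the current contents
        return random.choice(self.arr)
```

Note that remove works even when the element being removed is itself the last one: the dictionary entry for it is overwritten and then deleted.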

+118 · Apr 16 '11 at 6:21

O(1) lookup implies a hash-based data structure.

For comparison:

  • O(1) insertion/deletion with O(N) lookup implies a linked list.
  • O(1) insert, O(N) delete, and O(N) lookup implies an array-backed list.
  • O(logN) insert/delete/lookup implies a tree or a heap.
+19 · Apr 15 '11

You may not like this because they are probably looking for a clever solution, but sometimes it pays to stick to your guns... A hash table already satisfies the requirements (probably better than anything else will, albeit in amortized constant time and with different trade-offs from other solutions).

The requirement that is tricky is choosing a "random element": in a hash table, you would have to scan or probe for such an element.

For closed hashing / open addressing, the probability that any given bucket is occupied is size() / capacity(), but crucially this is typically kept in a constant multiplicative range by the hash table implementation (for example, the table may be kept larger than its current contents by anywhere from 1.2x to ~10x, depending on performance/memory tuning). This means that on average we can expect to probe 1.2 to 10 buckets, completely independent of the total size of the container; amortized O(1).

I can imagine two simple approaches (and many more):

  • search linearly from a random bucket

    • consider runs of empty buckets a la "-AC-----BD": you could say the first "random" choice is fair even though it favors B, since B was no more likely than the other elements to land after a long empty run, but if you make repeated "random" picks over the same contents then B clearly being favored more often may be undesirable (nothing in the question requires even probabilities, though)
  • try random buckets repeatedly until you find an occupied one

    • "only" capacity() / size() buckets visited on average (as above), but in practice more expensive because random-number generation is relatively costly, and infinitely bad if the infinitely-unlikely worst case occurs...
      • a faster compromise would be to use a list of pre-generated random offsets from the initial randomly chosen bucket, %-ing them into the bucket count

Not a great solution, but it may still be a better overall compromise than the memory and performance overhead of maintaining a second array of indices at all times.
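The second approach (rejection sampling over random buckets) can be sketched as follows, assuming a toy open-addressed table where None marks an empty bucket (the table layout and names are illustrative, not from the answer):

```python
import random

# toy open-addressed backing array at ~30% load; None = empty bucket
table = [None] * 10
for i, v in enumerate(("A", "B", "C")):
    table[i * 3] = v   # occupied slots 0, 3, 6

def get_random(buckets, max_tries=1000):
    """Rejection-sample a uniformly random occupied bucket.

    If size/capacity is kept above some constant load factor, the
    expected number of probes is capacity/size, independent of the
    number of elements: amortized O(1).
    """
    for _ in range(max_tries):
        slot = buckets[random.randrange(len(buckets))]
        if slot is not None:
            return slot
    # with load factor >= 0.2 the chance of reaching this line is
    # at most 0.8**max_tries -- effectively never
    raise RuntimeError("no occupied bucket found")
```

Because every occupied slot is equally likely to be hit on each probe, this variant does give a uniform distribution, unlike the linear-scan variant.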

+5 · Apr 18 '11 at 12:41

The best solution is probably the hash table + array one: it's very fast and deterministic.

But the lowest-rated answer (just use a hash table!) is actually cool too!

  • a hash table with rehashing or another open-addressing scheme (i.e. one item per bucket, no linked lists)
  • getRandom() repeatedly tries a random bucket until it finds a non-empty one
  • as a fail-safe, getRandom() could, after N unsuccessful attempts (N being the number of elements), pick a random index i in [0, N-1] and then walk the hash table linearly to take the i-th occupied element

People may not like this because of "possible infinite loops", and I have seen very smart people have this reaction too, but it is wrong! Infinitely unlikely events simply do not happen.

Assuming your pseudo-random source is well behaved (which is easy to establish for this particular use) and that the hash table is always at least 20% full, it is easy to see that:

getRandom() will never have to try more than 1000 times. Just never. Indeed, the probability of such an event is 0.8^1000, which is about 10^-97, so we would have to repeat the experiment 10^88 times to have a one-in-a-billion chance of it ever happening once. Even if this program ran full-time on all of humanity's computers until the Sun dies, it would never happen.
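The arithmetic behind that claim checks out directly:

```python
import math

# Each probe misses with probability at most 0.8 (table >= 20% full),
# so 1000 consecutive misses have probability at most 0.8**1000.
log10_p = 1000 * math.log10(0.8)
print(f"0.8^1000 = 10^{log10_p:.1f}")   # roughly 10^-96.9
```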

+2 · Jan 13

Here is a C # solution for this problem, I came up a little back when I asked the same question. It implements Add, Remove, Contains and Random along with other standard .NET interfaces. It's not that you ever needed to translate it during an interview, but it's nice to have a specific solution to watch ...

 using System;
 using System.Collections;
 using System.Collections.Generic;
 using System.Linq;
 using System.Threading;

 /// <summary>
 /// This class represents an unordered bag of items with the
 /// capability to get a random item. All operations are O(1).
 /// </summary>
 /// <typeparam name="T">The type of the item.</typeparam>
 public class Bag<T> : ICollection<T>, IEnumerable<T>, ICollection, IEnumerable
 {
     private Dictionary<T, int> index;
     private List<T> items;
     private Random rand;
     private object syncRoot;

     /// <summary>
     /// Initializes a new instance of the <see cref="Bag&lt;T&gt;"/> class.
     /// </summary>
     public Bag()
         : this(0)
     {
     }

     /// <summary>
     /// Initializes a new instance of the <see cref="Bag&lt;T&gt;"/> class.
     /// </summary>
     /// <param name="capacity">The capacity.</param>
     public Bag(int capacity)
     {
         this.index = new Dictionary<T, int>(capacity);
         this.items = new List<T>(capacity);
     }

     /// <summary>
     /// Initializes a new instance of the <see cref="Bag&lt;T&gt;"/> class.
     /// </summary>
     /// <param name="collection">The collection.</param>
     public Bag(IEnumerable<T> collection)
     {
         this.items = new List<T>(collection);
         this.index = this.items
             .Select((value, index) => new { value, index })
             .ToDictionary(pair => pair.value, pair => pair.index);
     }

     /// <summary>
     /// Get random item from bag.
     /// </summary>
     /// <returns>Random item from bag.</returns>
     /// <exception cref="System.InvalidOperationException">
     /// The bag is empty.
     /// </exception>
     public T Random()
     {
         if (this.items.Count == 0)
         {
             throw new InvalidOperationException();
         }

         if (this.rand == null)
         {
             this.rand = new Random();
         }

         int randomIndex = this.rand.Next(0, this.items.Count);
         return this.items[randomIndex];
     }

     /// <summary>
     /// Adds the specified item.
     /// </summary>
     /// <param name="item">The item.</param>
     public void Add(T item)
     {
         this.index.Add(item, this.items.Count);
         this.items.Add(item);
     }

     /// <summary>
     /// Removes the specified item.
     /// </summary>
     /// <param name="item">The item.</param>
     /// <returns></returns>
     public bool Remove(T item)
     {
         // Replace index of value to remove with last item in values list
         int keyIndex = this.index[item];
         T lastItem = this.items[this.items.Count - 1];
         this.items[keyIndex] = lastItem;

         // Update index in dictionary for last item that was just moved
         this.index[lastItem] = keyIndex;

         // Remove old value
         this.index.Remove(item);
         this.items.RemoveAt(this.items.Count - 1);

         return true;
     }

     /// <inheritdoc />
     public bool Contains(T item)
     {
         return this.index.ContainsKey(item);
     }

     /// <inheritdoc />
     public void Clear()
     {
         this.index.Clear();
         this.items.Clear();
     }

     /// <inheritdoc />
     public int Count
     {
         get { return this.items.Count; }
     }

     /// <inheritdoc />
     public void CopyTo(T[] array, int arrayIndex)
     {
         this.items.CopyTo(array, arrayIndex);
     }

     /// <inheritdoc />
     public bool IsReadOnly
     {
         get { return false; }
     }

     /// <inheritdoc />
     public IEnumerator<T> GetEnumerator()
     {
         foreach (var value in this.items)
         {
             yield return value;
         }
     }

     /// <inheritdoc />
     IEnumerator IEnumerable.GetEnumerator()
     {
         return this.GetEnumerator();
     }

     /// <inheritdoc />
     public void CopyTo(Array array, int index)
     {
         this.CopyTo(array as T[], index);
     }

     /// <inheritdoc />
     public bool IsSynchronized
     {
         get { return false; }
     }

     /// <inheritdoc />
     public object SyncRoot
     {
         get
         {
             if (this.syncRoot == null)
             {
                 Interlocked.CompareExchange<object>(
                     ref this.syncRoot, new object(), null);
             }

             return this.syncRoot;
         }
     }
 }
+1 · Dec 22

It's an old question already, but since there is no C++ answer yet, here are my two cents.

 #include <iostream>
 #include <vector>
 #include <unordered_map>
 #include <stdlib.h>

 template <typename T>
 class bucket {
     std::vector<T> v;
     std::unordered_map<T, int> m;

 public:
     void insert(const T& item) {
         // prevent insertion of duplicates
         if (m.find(item) != m.end()) {
             return;
         }
         v.push_back(item);
         m.emplace(item, (int)v.size() - 1);
     }

     void remove(const T& item) {
         // do nothing if the item is not present
         auto it = m.find(item);
         if (it == m.end()) {
             return;
         }
         int idx = it->second;
         // move the last element into the vacated slot and update its index
         m[v.back()] = idx;
         v[idx] = v.back();
         v.pop_back();
         m.erase(it);
     }

     T& getRandom() {
         int idx = rand() % v.size();
         return v[idx];
     }

     bool lookup(const T& item) {
         return m.find(item) != m.end();
     }

     // method to check that remove has worked
     void print() {
         for (auto it = v.begin(); it != v.end(); ++it) {
             std::cout << *it << " ";
         }
     }
 };

Here is a piece of client code to verify the solution.

 int main() {
     bucket<char>* b = new bucket<char>();
     b->insert('d');
     b->insert('k');
     b->insert('l');
     b->insert('h');
     b->insert('j');
     b->insert('z');
     b->insert('p');
     std::cout << b->getRandom() << std::endl;
     b->print();
     std::cout << std::endl;
     b->remove('h');
     b->print();
     delete b;
     return 0;
 }
+1 · Oct 19 '15 at 3:46

For this question I would use two data structures:

  • HashMap
  • ArrayList / Array / Doubly Linked List

Steps:

  • Insert: check whether X is already present in the HashMap - O(1) time complexity. If not present, add it to the end of the ArrayList - O(1) time complexity - and put it in the HashMap with X as the key and the last index as the value - O(1) time complexity.
  • Delete: check whether X is present in the HashMap - O(1) time complexity. If present, get its index and remove it from the HashMap - O(1) time complexity. Swap that element with the last element in the ArrayList and delete the last element - O(1) time complexity. Update the index of the moved last element in the HashMap - O(1) time complexity.
  • GetRandom: generate a random number from 0 to the last ArrayList index, and return the ArrayList element at that randomly generated index - O(1) time complexity.
  • Search: look X up as a key in the HashMap - O(1) time complexity.

The code:

 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.Random;
 import java.util.Scanner;

 public class JavaApplication1 {
     public static void main(String args[]) {
         Scanner sc = new Scanner(System.in);
         ArrayList<Integer> al = new ArrayList<Integer>();
         HashMap<Integer, Integer> mp = new HashMap<Integer, Integer>();
         while (true) {
             System.out.println("**menu**");
             System.out.println("1.insert");
             System.out.println("2.remove");
             System.out.println("3.search");
             System.out.println("4.random");
             int ch = sc.nextInt();
             switch (ch) {
                 case 1:
                     System.out.println("Enter the Element ");
                     int a = sc.nextInt();
                     if (mp.containsKey(a)) {
                         System.out.println("Element is already present ");
                     } else {
                         al.add(a);
                         mp.put(a, al.size() - 1);
                     }
                     break;
                 case 2:
                     System.out.println("Enter the Element Which u want to remove");
                     a = sc.nextInt();
                     if (mp.containsKey(a)) {
                         int size = al.size();
                         int index = mp.get(a);
                         int last = al.get(size - 1);
                         Collections.swap(al, index, size - 1);
                         al.remove(size - 1);
                         mp.remove(a); // without this, a stale entry is left behind
                         if (last != a) {
                             mp.put(last, index); // update index of the moved element
                         }
                         System.out.println("Data Deleted");
                     } else {
                         System.out.println("Data Not found");
                     }
                     break;
                 case 3:
                     System.out.println("Enter the Element to Search");
                     a = sc.nextInt();
                     if (mp.containsKey(a)) {
                         System.out.println(mp.get(a));
                     } else {
                         System.out.println("Data Not Found");
                     }
                     break;
                 case 4:
                     Random rm = new Random();
                     int rIndex = rm.nextInt(al.size());
                     System.out.println(al.get(rIndex));
                     break;
             }
         }
     }
 }

Time complexity: O(1). Space complexity: O(N).

+1 · Oct 16 '17 at 9:32

In C # 3.0 + .NET Framework 4, a generic Dictionary<TKey,TValue> even better than Hashtable, because you can use the System.Linq ElementAt() extension method to index into a basic dynamic array, where the KeyValuePair<TKey,TValue> saved:

 using System;
 using System.Collections.Generic;
 using System.Linq;

 Random _generator = new Random((int)DateTime.Now.Ticks);
 Dictionary<string, object> _elements = new Dictionary<string, object>();

 // ....

 public object GetRandom()
 {
     return _elements.ElementAt(_generator.Next(_elements.Count)).Value;
 }

However, as far as I know, the Hashtable (or its Dictionary offspring) is not a true solution to this problem, because Put()/Add() is only amortized O(1), not true O(1): it is O(N) at the boundary of a dynamic resize.

Is there a true O(1) solution to this problem? All I can think of is that if you specify an initial Dictionary/Hashtable capacity an order of magnitude beyond what you would ever expect to need, then you get O(1) operations, because you never have to resize.

0 · Mar 14 '12 at 4:40

We can use hashing to support the operations in Θ(1).

insert(x)
 1) Check whether x is already present with a hash map lookup.
 2) If not present, insert it at the end of the array.
 3) Also add it to the hash map, with x as the key and the last array index as the value.

delete(x)
 1) Check whether x is present with a hash map lookup.
 2) If present, get its index and remove it from the hash map.
 3) Swap the last element with this element in the array and delete the last element. The swap is done because the last element can be removed in O(1) time.
 4) Update the index of the moved last element in the hash map.

getRandom()
 1) Generate a random number from 0 to the last index.
 2) Return the array element at the randomly generated index.

search(x)
Look x up in the hash map.

0 · Jul 30 '15 at 12:33

I agree with Anon. Except for the last requirement, where a random element must be returned with equal probability, all the other requirements can be handled with a single hash-based DS. I would pick a HashSet in Java. Taking an element's hash code modulo the size of the underlying array gives me its index in O(1) time. I can use that for the add, remove, and contains operations.

0 · Sep 05 '16 at 10:11

Can't we do this with Java's HashSet? It provides insert, delete, and search in O(1) by default. For getRandom we can use the Set's iterator, which gives arbitrary-order behavior anyway, and simply return the first element from the set without worrying about the rest:

 public Integer getRandom() {
     Iterator<Integer> sitr = s.iterator();
     Integer x = sitr.next();
     return x;
 }
0 · Jan 24 '17 at 7:03
 /* Java program to design a data structure that supports the following
    operations in Theta(1) time:
    a) Insert b) Delete c) Search d) getRandom */
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.HashMap;
 import java.util.Random;

 // class to represent the required data structure
 class MyDS {
     ArrayList<Integer> arr;   // A resizable array

     // A hash where keys are array elements and values are
     // indexes in arr[]
     HashMap<Integer, Integer> hash;

     // Constructor (creates arr[] and hash)
     public MyDS() {
         arr = new ArrayList<Integer>();
         hash = new HashMap<Integer, Integer>();
     }

     // A Theta(1) function to add an element to MyDS
     void add(int x) {
         // If element is already present, then nothing to do
         if (hash.get(x) != null)
             return;

         // Else put element at the end of arr[]
         int s = arr.size();
         arr.add(x);

         // And put in hash also
         hash.put(x, s);
     }

     // A Theta(1) function to remove an element from MyDS
     void remove(int x) {
         // Check if element is present
         Integer index = hash.get(x);
         if (index == null)
             return;

         // If present, then remove element from hash
         hash.remove(x);

         // Swap element with last element so that removal from
         // arr[] can be done in O(1) time
         int size = arr.size();
         Integer last = arr.get(size - 1);
         Collections.swap(arr, index, size - 1);

         // Remove last element (This is O(1))
         arr.remove(size - 1);

         // Update hash table for new index of last element
         // (skip when x itself was the last element, or it
         // would be re-inserted)
         if (!last.equals(x))
             hash.put(last, index);
     }

     // Returns a random element from MyDS
     int getRandom() {
         // Find a random index from 0 to size - 1
         Random rand = new Random();
         int index = rand.nextInt(arr.size());

         // Return element at randomly picked index
         return arr.get(index);
     }

     // Returns index of element if present, otherwise null
     Integer search(int x) {
         return hash.get(x);
     }
 }

 // Driver class
 class Main {
     public static void main(String[] args) {
         MyDS ds = new MyDS();
         ds.add(10);
         ds.add(20);
         ds.add(30);
         ds.add(40);
         System.out.println(ds.search(30));
         ds.remove(20);
         ds.add(50);
         System.out.println(ds.search(50));
         System.out.println(ds.getRandom());
     }
 }
0 · Aug 29 '17 at 8:48

I think we can use a doubly linked list together with a hash table: the key is an element, and its associated value is the node in the doubly linked list.

  • insert(H, E): insert the node into the doubly linked list and record it as H[E] = node; O(1)
  • delete(H, E): get the node's address from H[E], link its previous node to its next node and delete it, and set H[E] to NULL; therefore O(1)
  • contains(H, E) and getRandom(H) are obviously O(1)
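A minimal sketch of the insert, delete, and contains operations from the bullets above, in Python (the class and field names here are my own, purely for illustration):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

class LinkedBag:
    def __init__(self):
        self.h = {}        # H: element -> its node
        self.head = None   # front of the doubly linked list

    def insert(self, e):
        # O(1): push a new node at the front and record H[E] = node
        node = Node(e)
        node.next = self.head
        if self.head:
            self.head.prev = node
        self.head = node
        self.h[e] = node

    def delete(self, e):
        # O(1): find the node via H[E] and unlink it from its neighbors
        node = self.h.pop(e)
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev

    def contains(self, e):
        # O(1) hash lookup
        return e in self.h
```

Note that a uniformly random element is the hard part of this variant: picking one would mean walking the list, which is why the array-backed answers above index elements by position instead.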
-2 · Jun 02 '13 at 19:50


