(Algorithm) Find if two unsorted arrays have any common elements in O (n) time without sorting?

We have two unsorted arrays, and each array has a length n. These arrays contain random integers in the range 0-n 100 . How to find if these two arrays have any common elements in O (n) / linear time? Sorting is not allowed.

+7
source share
8 answers

You have not defined a calculation model. Assuming you can only read O (1) bits O (1) times (something else would be a pretty exotic computational model), there could be no algorithm to solve the problem in O (n) worst case time complexity.

Sketch of the proof: Each input number takes O (log (n ^ 100)) = O (100 log n) = O (log n) bits. Thus, the entire input bit is O (n log n), which cannot be read in O (n) time. Thus, any O (n) algorithm cannot read the entire input and therefore does not respond if these bits matter.

+6
source

Hashtable will save you. Indeed, it looks like a Swiss algorithm knife.
Just put all the values ​​from the first array into it and then check if any value from the second array is present.

+11
source

Answering Neil: Since you know at the beginning that your is N (two arrays of size N), you can create a hash with the size of the array 2 * N * some_ratio (for example: some_ratio = 1.5). With this size, almost all simple hash functions will provide you with a good distribution of objects.

You can also implement find_or_insert to search for an existing one or insert a new one with the same action, this will reduce the hash function and comparison calls. (C ++ stl find_or_insert is not good enough, as it does not tell you whether this element was before or not).

+2
source

Linearity test

Using the Mathematica hash function and arbitrary integers.

alt text

Tested up to n = 2 ^ 20 , generating random numbers up to (2 ^ 20) ^ 100 = (approximately 10 ^ 602)

Just in case ... the program:

k = {}; For[t = 1, t < 21, t++, i = 2^t; Clear[a, b]; Table[a[RandomInteger[i^100]] = 1, {i}]; b = Table[RandomInteger[i^100], {i}]; Contains = False; AppendTo[k, {i, First@Timing @For[j = 2, j <= i, j++, Contains = Contains || (NumericQ[a[b[[j]]]]); ]}]]; ListLinePlot[k, PlotRange -> All, AxesLabel -> {"n", "Time(secs)"}] 
+2
source

If storage is not important, then a hash table of zero in favor of an array of n in length. Mark true when you meet this number in the first array. When passing through the second array, if you find that any of them is true, you have your own answer. O (n).

 Define largeArray(n) // First pass for(element i in firstArray) largeArray[i] = true; // Second pass Define hasFound = false; for(element i in secondArray) if(largeArray[i] == true) hasFound = true; break; 
+1
source

Put the elements of the first array in the hash table and check for a scan of the second array. This gives you a solution in the middle case of O (N).

If you really want the O (N) worst-case solution, instead of using a hash table, use a linear array in the range 0-n ^ 100. Note that you only need to use one bit for each record.

+1
source

Have you tried sorting counting ? It is easy to implement, uses O (n) space, and also has \ theta (n) time complexity.

0
source

Based on ideas published before the date. We can store the whole elements of a single array in a hash map. The maximum number of different integers can be stored in RAM. The hash map will only have unique integer values. Duplicates are ignored.

Here is a Perl implementation.

 #!/usr/bin/perl use strict; use warnings; sub find_common_elements{ # function that prints common elements in two unsorted array my (@arr1,@array2) =@ _; # array elements assumed to be filled and passed as function arguments my $hash; # hash map to store value of one array # runtime to prepare hash map is O(n). foreach my $ele ($arr1){ $hash->{$ele}=true; # true here element exists key is integer number and value is true, duplicate elements will be overwritten # size of array will fit in memory as duplicate integers are ignored ( mx size will be 2 ^( 32) -1 or 2^(64) -1 based on operating system) which can be stored in RAM. } # O(n ) to traverse second array and finding common elements in two array foreach my $ele2($arr2){ # search in hash map is O(1), if all integers of array are same then hash map will have only one entry and still search tim is O(1) if( defined $hash->{$ele}){ print "\n $ele is common in both array \n"; } } } 

Hope this helps.

0
source

All Articles