Algorithm for finding duplicate records in constant space and time O (n)

Question

Algorithm for finding duplicate records in constant space and time O (n)

For an array of N integer for which only one integer is repeated. Find a repeating integer in O (n) time and constant space. For integer value or N value

no range of values

For example, an array of 6 integers is given as 23 45 67 87 23 47. Answer 23 (I hope this covers the ambiguous and foggy part)

I searched the net, but could not find a question in which the range of integers was not fixed. In addition, here is an example that answers a similar question to mine, but here he created a hash table with the highest integer value in C ++. But cpp does not allow you to create an array with an element of 2 ^ 64 (on a 64-bit computer).

I'm sorry I didn't mention this before the array is immutable

+7

c ++ algorithm

Anubhav agarwal Nov 24 '11 at 16:43

source share

7 answers

Jun Tarui has shown that any duplicate crawler using O (log n) requires at least Ω (log n / log log n), which exceeds linear time, i.e. your question is provably insoluble, even if you resolve the logarithmic space.

There is an interesting algorithm of Gopalan and Radhakrishnan that finds duplicates in one pass on the input and O ((log n) ^ 3), which sounds like your best choice a priori.

Radix sort has a time complexity of O (kn), where k> log_2 n is often regarded as a constant, albeit a large one. Of course, you cannot implement radix sorting in constant space, but you can probably reuse your data entry space.

There are numerical tricks, if you assume the features of the numbers themselves. If almost all numbers from 1 to n are present, just add them and subtract n (n + 1) / 2. If all numbers are prime numbers, you can cheat by ignoring the division time.

Aside, there is the well-known lower bound of Ω (log_2 (n!)) When sorting comparisons, which suggests that google can help you find the lower bounds of simple problems like finding duplicates.

+8

Jeff burdges Nov 25 '11 at 0:08

source share

If the range of integers is limited, you can do the sort option in O (n) time. The complexity of the space is O (k), where k is the upper bound of the integers (*), but constant, therefore O (1).

If the range of integers is unlimited, then I don’t think there is a way to do this, but I am not an expert in complex puzzles.

(*) It O (k), since there is also a constant upper bound on the number of occurrences of each integer, namely 2.

+4

Fred foo Nov 24 '11 at 18:03

source share

In the case when the entries are limited by the length of the array, you can check Find any of several possible repeated integers in the list and O (N) and O (1) space solutions .

The generalization that you mentioned is discussed in the following question: Algorithm for finding a duplicate number in a list that can contain any number of repetitions , and O (n log ^ 2 n) and O (1) spatial solution .

+3

Pengone Dec 01 '11 at 21:37

source share

The approach that will be closest to O (N) in time is probably a regular hash table, where hash entries are just numbers used as keys. You went through the list, inserting each record in the hash table, after you first checked if it was already in the table.

Not strictly O (N), however, since hash search / insertion slows as the table populates. And from the storage point of view, this would be expensive for large lists - at least 3x and possibly 10-20x the size of an array of numbers.

+2

Hot licks Nov 24 '11 at 18:29

source share

As others have already mentioned, I see no way to do this in O (n).

However, you can try the probabilistic approach using the Bloom Filter . This will give you O (n) if you are lucky.

+2

ruslik Nov 28 '11 at 12:23

source share

Since the extra space is unacceptable, this cannot be done without comparison. The concept of a lower bound on the time complexity of sorting sorting can be applied here to prove that the problem in its original form cannot be solved in O (n) in the worst case.

0

bashrc Dec 9 '11 at 17:49

source share

Igor Oks · Accepted Answer · 2011-11-24T16:58:59+0000

If the array is not sorted, you can only do this in O(nlogn) .

Some approaches can be found here .

Algorithm for finding duplicate records in constant space and time O (n)

More articles: