Find duplicate element in array in time O (n)

Question

Find duplicate element in array in time O (n)

I was asked this question at an interview, and I wondered about the correct answer.

You have an array of numbers from 0 to n-1, one of the numbers is deleted and replaced by the number already in the array, which makes a duplicate of this number. How can we detect this duplicate in O (n) time?

For example, an array of 1,2,3,4 becomes 1,2,2,4 .

A simple O (n ² ) time solution is to use a nested loop to find a duplicate of each element.

+60

java arrays algorithm

Marwan Tushyeh Feb 18 '13 at 20:09

source share

24 answers

This can be done in the spaces O(n) and O(1) .

(The algorithm only works because numbers are integers in a known range):

In one pass through the vector, calculate the sum of all numbers and the sum of the squares of all numbers.

Subtract the sum of all numbers from N(N-1)/2 . Call it A

Subtract the sum of the squares from N(N-1)(2N-1)/6 . Divide it by A Call result B

The number that has been deleted is (B + A)/2 , and the number that has been replaced is (B - A)/2 .

Example:

Vector [0, 1, 1, 2, 3, 5] :

N = 6
The sum of the vector is 0 + 1 + 1 + 2 + 3 + 5 = 12. N (N-1) / 2 is 15. A = 3.
The sum of the squares is 0 + 1 + 1 + 4 + 9 + 25 = 40. N (N-1) (2N-1) / 6 is 55. B = (55-40) / A = 5.
The number that was deleted is (5 + 3) / 2 = 4.
The number that it has been replaced is (5 - 3) / 2 = 1.

Why does it work:

The sum of the original vector [0, ..., N-1] is N(N-1)/2 . Suppose the value of A been deleted and replaced with B Now the sum of the modified vector will be N(N-1)/2 + b - a . If we subtract the sum of the modified vector from N(N-1)/2 , we get a - b . So A = a - b .
Similarly, the sum of the squares of the original vector is N(N-1)(2N-1)/6 . The sum of the squares of the modified vector is N(N-1)(2N-1)/6 + b ² - a ² . Subtracting the sum of the squares of the modified vector from the original sum, we get a ² - b ² , which coincides with (a+b)(ab) . So, if we divide it by a - b (i.e., A ), we get B = a + b .
Now B + A = a + b + a - b = 2a and B - A = a + b - (a - b) = 2b .

+151

rici Feb 18 '13 at 22:50

source share

You can do this O (N) times without extra space. Here's how the algorithm works:

Iterate through the array as follows:

For each item encountered, set its corresponding index value to negative. For example: if you find [0] = 2. Received [2] and negated the value.
By doing this, you note that this is occurring. Since you know that you cannot have negative numbers, you also know that it is you who deny this.
Make sure the index corresponding to the value is already marked negative, if so, you get a duplicate element. For example: if a [0] = 2, go to [2] and check if it is negative.

Let's say you have the following array:

 int a[] = {2,1,2,3,4};

After the first element, your array will be:

 int a[] = {2,1,-2,3,4};

After the second element, your array will be:

 int a[] = {2,-1,-2,3,4};

When you reach the third element, you go to [2] and see it is already negative. You get a duplicate.

+30

user1117564 May 30 '13 at 13:20

source share

Scan an array 3 times:

XOR combine all elements of the array -> A XOR combines all numbers from 0 to N-1 → B Now A XOR B = X XOR D , where X is the deleted item and D is the duplicate item.
Select any nonzero bit in A XOR B XOR to combine all the elements of the array in which this bit is set → A1 . XOR combines all the numbers from 0 to N-1 where this bit is set → B1 . Now either A1 XOR B1 = X or A1 XOR B1 = D
Scan the array again and try to find A1 XOR B1 . If found, this is a duplicate of the item. If not, then the repeating element is A XOR B XOR A1 XOR B1 .

+10

Evgeny Kluev Feb 18 '13 at 20:42

source share

Use a HashSet to hold all the numbers you have already seen. It works in (depreciated) time O(1) , so the total value is O(N) .

+7

parsifal Feb 18 '13 at 20:12

source share

I suggest using BitSet. We know that N is small enough to index the array, so the bit-bit will have a reasonable size.

For each element of the array, check the bit corresponding to its value. If it is already installed, then there is a duplicate. If not, set the bit.

+7

Patricia Shanahan Feb 18 '13 at 20:23

source share

Use a hash table. The inclusion of an item in a hash table is O (1).

+3

Juan Lopes Feb 18 '13 at 20:12

source share

@rici is right about using time and space: "This can be done in O (n) time and O (1) space."

However, the question can be expanded to a broader requirement: there is no need for only one duplicate number, and the numbers could not be sequential.

OJ puts it like this here : (Note 3, apparently, can be narrowed down)

Given the nums array containing n + 1 integers, where each integer is between 1 and n (inclusive), prove that there must be at least one duplicate number. Assuming there is only one duplicate number, find the duplicate.
Note:
You should not modify the array (suppose the array is read-only).
You should use only constant, O (1) extra space.
The complexity of the execution must be less than O (n2).
There is only one duplicate number in the array, but it can be repeated more than once.

The question is very well explained and answered here by Keith Schwartz using the Floyd Loop Search Algorithm :

The main trick we need to use to solve this problem is to notice that, since we have an array of n elements from 0 to n - 2, we can represent the array as a definition of a function f from the set {0, 1 , ..., n - 1} on itself. This function is defined as f (i) = A [i]. Given this setting, the duplicated value corresponds to a pair of indices i! = J such that f (i) = f (j). Therefore, our task is to find this pair (i, j). Once we get it, we can easily find the duplicate value by simply choosing f (i) = A [i].
But how do we find this recurring value? It turns out that this is a well-studied problem in computer science called loop detection. The general view of the problem is as follows. We are given the function f. Define the sequence x_i as

  x_0 = k (for some k) x_1 = f(x_0) x_2 = f(f(x_0)) ... x_{n+1} = f(x_n)

Assuming that f maps from region to region, this function will have one of three forms. Firstly, if the region is infinite, then the sequence can be infinitely long and not repeating. For example, the function f (n) = n + 1 for integers has this property - not a single number is duplicated. Secondly, the sequence can be closed, which means that there exists some self such that x_0 = x_i. In this case, the sequence cycles through some fixed set of values infinitely. Finally, the sequence may be "rho-shaped." In this case, the sequence looks something like this:

  x_0 -> x_1 -> ... x_k -> x_{k+1} ... -> x_{k+j} ^ | | | +-----------------------+

That is, the sequence begins with a chain of elements that enters the cycle, and then cyclically continues indefinitely. We will denote the first element of the cycle, which is achieved in the sequence of "record" of the cycle.

The python implementation can also be found here :

 def findDuplicate(self, nums): # The "tortoise and hare" step. We start at the end of the array and try # to find an intersection point in the cycle. slow = 0 fast = 0 # Keep advancing 'slow' by one step and 'fast' by two steps until they # meet inside the loop. while True: slow = nums[slow] fast = nums[nums[fast]] if slow == fast: break # Start up another pointer from the end of the array and march it forward # until it hits the pointer inside the array. finder = 0 while True: slow = nums[slow] finder = nums[finder] # If the two hit, the intersection index is the duplicate element. if slow == finder: return slow

+3

Stephenye Jun 22 '16 at 2:52

source share

One working solution:

number of paragraphs - integers

create an array from [0 .. N]

 int[] counter = new int[N];

Then read again and add a counter:

  if (counter[val] >0) { // duplicate } else { counter[val]++; }

+2

AlexWien Feb 18 '13 at 20:26

source share

This is an alternative solution in the spaces O(n) and O(1) . He looks like rici's . I find it a little easier to understand, but in practice it will overflow faster.

Let X be the missing number and R repeating number.

We can assume that the numbers are taken from [1..n] , i.e. zero does not appear. In fact, while looping through the array, we can check if zero was found and return immediately if not.

Now consider:

 sum(A) = n (n + 1) / 2 - X + R product(A) = n! R / X

where product(A) is the product of all elements in A skipping zero. We have two equations with two unknowns, from which X and R can be obtained algebraically.

Edit : for a popular query, here is an example:

Let set:

 S = sum(A) - n (n + 1) / 2 P = n! / product(A)

Then our equations become:

 R - X = S X = RP

which can be decided to:

 R = S / (1 - P) X = PR = PS / (1 - P)

Example:

 A = [0 1 2 2 4] n = A.length - 1 = 4 S = (1 + 2 + 2 + 4) - 4 * 5 / 2 = -1 P = 4! / (1 * 2 * 2 * 4) = 3 / 2 R = -1 / (1 - 3/2) = -1 / -1/2 = 2 X = 3/2 * 2 = 3

+1

Gil Vegliach Jan 11 '17 at 20:55 on

source share

You can proceed as follows:

sort your array using linear time sorting algorithm (e.g. sorting sort) - O (N)
scan the sorted array and stop as soon as two consecutive elements are equal - O (N)

0

Andrea Scarcella Feb 18 '13 at 21:33

source share

 public class FindDuplicate { public static void main(String[] args) { // assume the array is sorted, otherwise first we have to sort it. // time efficiency is o(n) int elementData[] = new int[] { 1, 2, 3, 3, 4, 5, 6, 8, 8 }; int count = 1; int element1; int element2; for (int i = 0; i < elementData.length - 1; i++) { element1 = elementData[i]; element2 = elementData[count]; count++; if (element1 == element2) { System.out.println(element2); } } } }

0

Sourav Apr 21 '14 at 0:10

source share

  public void duplicateNumberInArray { int a[] = new int[10]; Scanner inp = new Scanner(System.in); for(int i=1;i<=5;i++){ System.out.println("enter no. "); a[i] = inp.nextInt(); } Set<Integer> st = new HashSet<Integer>(); Set<Integer> s = new HashSet<Integer>(); for(int i=1;i<=5;i++){ if(!st.add(a[i])){ s.add(a[i]); } } Iterator<Integer> itr = s.iterator(); System.out.println("Duplicate numbers are"); while(itr.hasNext()){ System.out.println(itr.next()); } }

First of all, create an array of integers using the Scanner class. Then repeating the cycle through the numbers and checking whether it is possible to add a number for installation (numbers can be added for installation only when this particular number should not already be installed, means that set does not allow duplicating the number for adding and returning boolean vale FALSE when adding duplicate value). If not. cannot be added, which means it is duplicated, so add a duplicate number to another set so that we can print later. Please note that we add a duplicate number to the set, because it is possible that a repeating number can be repeated several times, so add it only once. At the end, we print the set using Iterator.

0

Khan Sep 17 '14 at 8:57

source share

// This is similar to the HashSet approach, but uses only one data structure:

  int[] a = { 1, 4, 6, 7, 4, 6, 5, 22, 33, 44, 11, 5 }; LinkedHashMap<Integer, Integer> map = new LinkedHashMap<Integer, Integer>(); for (int i : a) { map.put(i, map.containsKey(i) ? (map.get(i)) + 1 : 1); } Set<Entry<Integer, Integer>> es = map.entrySet(); Iterator<Entry<Integer, Integer>> it = es.iterator(); while (it.hasNext()) { Entry<Integer, Integer> e = it.next(); if (e.getValue() > 1) { System.out.println("Dupe " + e.getKey()); } }

0

Max Tomlinson Oct 10 '14 at 23:38

source share

We can use hashMap effectively:

 Integer[] a = {1,2,3,4,0,1,5,2,1,1,1,}; HashMap<Integer,Integer> map = new HashMap<Integer,Integer>(); for(int x : a) { if (map.containsKey(x)) map.put(x,map.get(x)+1); else map.put(x,1); } Integer [] keys = map.keySet().toArray(new Integer[map.size()]); for(int x : keys) { if(map.get(x)!=1) { System.out.println(x+" repeats : "+map.get(x)); } }

0

hitesh141 Apr 23 '15 at

source share

This program is based on C #, and if you want to make this program using a different programming language, you must first change the array in ascending order and compare the first element with the second element. If it is equal, then a repeated number is found. Program

 int[] array=new int[]{1,2,3,4,5,6,7,8,9,4}; Array.Sort(array); for(int a=0;a<array.Length-1;a++) { if(array[a]==array[a+1] { Console.WriteLine("This {0} element is repeated",array[a]); } } Console.WriteLine("Not repeated number in array");

0

Rishabh Rawat Sep 07 '15 at 13:17

source share

sort array O (n ln n)

using a sliding window trick to move an O (n) array

The space is O (1)

 Arrays.sort(input); for(int i = 0, j = 1; j < input.length ; j++, i++){ if( input[i] == input[j]){ System.out.println(input[i]); while(j < input.length && input[i] == input[j]) j++; i = j - 1; } }

Test case int [] {1, 2, 3, 7, 7, 8, 3, 5, 7, 1, 2, 7}

output 1, 2, 3, 7

0

AhDah Sep 14 '15 at 2:58

source share

This question may be the same question as check whether there is a circle in the array , here is one of the good links to solve it.

0

zangw Sep 29 '15 at 1:55

source share

Go through the array and check the sign of array[abs(array[i])] , if positive make it negative, and if it is negative, print it as follows:

 import static java.lang.Math.abs; public class FindRepeatedNumber { private static void findRepeatedNumber(int arr[]) { int i; for (i = 0; i < arr.length; i++) { if (arr[abs(arr[i])] > 0) arr[abs(arr[i])] = -arr[abs(arr[i])]; else { System.out.print(abs(arr[i]) + ","); } } } public static void main(String[] args) { int arr[] = { 4, 2, 4, 5, 2, 3, 1 }; findRepeatedNumber(arr); } }

Link: http://www.geeksforgeeks.org/find-duplicates-in-on-time-and-constant-extra-space/

0

Arpit Jan 16 '16 at 11:09

source share

As described

You have an array of numbers from 0 to n-1, one of the numbers is deleted and replaced by the number already in the array, which makes a duplicate of this number.

I assume that the elements in the array are sorted, except for the duplicate entry. If this is a scenario, we can easily achieve the goal, as shown below:

  public static void main(String[] args) { //int arr[] = { 0, 1, 2, 2, 3 }; int arr[] = { 1, 2, 3, 4, 3, 6 }; int len = arr.length; int iMax = arr[0]; for (int i = 1; i < len; i++) { iMax = Math.max(iMax, arr[i]); if (arr[i] < iMax) { System.out.println(arr[i]); break; }else if(arr[i+1] <= iMax) { System.out.println(arr[i+1]); break; } } }

O (n) time and O (1) space, please share your thoughts.

0

sahaS Jun 23 '17 at 17:08

source share

Sketch of my solution:

save Sum variable, which is initially zero.
Iterate over the array and add the current record to Sum, if it is odd,
subtract it when it is even.

As a result, you get the Sum value, which is equal to plus or minus twice the number.

You will need to correct the sum so that the length N of the array is odd or even (add or subtract N).

0

John Pool Dec 07 '18 at 7:02

source share

This can be done in time O (n) and space O (1) . Without changing the input array

The idea is similar to finding the start node of a loop in a linked list.
Maintain two pointers: fast and slow

 slow = a[0] fast = a[a[0]]

cycle to slow! = fast
As soon as we find the loop (slowly == fast)
Reset Slow Return to Zero

 slow = 0

find start node

 while(slow != fast){ slow = a[slow]; fast = a[fast]; }

slow your duplicate.

Here is the Java implementation:

 class Solution { public int findDuplicate(int[] nums) { if(nums.length <= 1) return -1; int slow = nums[0], fast = nums[nums[0]]; //slow = head.next, fast = head.next.next while(slow != fast){ //check for loop slow = nums[slow]; fast = nums[nums[fast]]; } if(slow != fast) return -1; slow = 0; //reset one pointer while(slow != fast){ //find starting point of loop slow = nums[slow]; fast = nums[fast]; } return slow; } }

0

Ankit Sharma Jun 02 '19 at 9:02

source share

Here is a simple solution with hashmap in O (n) time.

 #include<iostream> #include<map> using namespace std; int main() { int a[]={1,3,2,7,5,1,8,3,6,10}; map<int,int> mp; for(int i=0;i<10;i++){ if(mp.find(a[i]) == mp.end()) mp.insert({a[i],1}); else mp[a[i]]++; } for(auto i=mp.begin();i!=mp.end();++i){ if(i->second > 1) cout<<i->first<<" "; } }

0

Shadab Eqbal Jul 25 '19 at 8:32

source share

 int a[] = {2,1,2,3,4}; int b[] = {0}; for(int i = 0; i < a.size; i++) { if(a[i] == a[i+1]) { //duplicate found //copy it to second array b[i] = a[i]; } }

-2

mirfan899 May 20 '13 at 18:58

source share

qPCR4vir · Accepted Answer · 2013-02-18 20:26

We have the original array int A[N]; We create the second array bool B[N] also of the type bool=false . Iterate the first array and set B[A[i]]=true if it was false, otherwise bing!

Find duplicate element in array in time O (n)

Example:

Why does it work:

More articles: