Find duplicate value from array of strings

I found two ways to find the duplicate value from a string array.

First way:

private static String FindDupValue(String[] sValueTemp) { for (int i = 0; i < sValueTemp.length; i++) { String sValueToCheck = sValueTemp[i]; if(sValueToCheck==null || sValueToCheck.equals(""))continue; for (int j = 0; j < sValueTemp.length; j++) { if(i==j)continue; String sValueToCompare = sValueTemp[j]; if (sValueToCheck.equals(sValueToCompare)){ return sValueToCompare; } } } return ""; } 

The second way:

 private static String FindDupValueUsingSet(String[] sValueTemp) { Set<String> sValueSet = new HashSet<String>(); for(String tempValueSet : sValueTemp) { if (sValueSet.contains(tempValueSet)) return tempValueSet; else if(!tempValueSet.equals("")) sValueSet.add(tempValueSet); } return ""; } 

Both methods are correct.

My question is which of the best ways and why? Or is there any other better way to find duplicate value from an array?

+6
source share
5 answers

The good thing about Set is that adding an operation returns true if this set does not already contain the specified element.

 public static void main(String[] args) { Set<String> set = new HashSet<>(); String[] stringsToTest = {"a", "b", "c", "a"}; for (String s : stringsToTest) { boolean notInSetYet = set.add(s); if (!notInSetYet) { System.out.println("Duplicate: " + s); } } } 

Output:

Duplicate: a

+2
source

The second way.

Using Set is much more efficient for this operation because sValueSet.contains(tempValueSet) uses a reverse map (and therefore hash codes and fast search time) instead of a full iteration.

+1
source

Both approaches are almost identical in terms of algorithmic complexity.

The first difficulty of the approach is O(N * N) , where N is the length of the array. I donโ€™t think itโ€™s necessary to explain why, but just in case, these nested loops take N * N units of time, which gives complexity.

As for the second approach - the presence of a HashSet allows you to search with constant complexity ( O(1) ), since the search is based on a hashed String value. You might think that this approach is more efficient, but in reality it is not so much, because we need to meet the paste operation on a HashSet .

Adding to a HashSet has O(N) complexity (worst case scenario). For N String objects, you can have N insert operations, which again gives O(N * N) complexity.

So, to summarize, both approaches have similar costs. I would prefer the second option, although it would be more readable.

+1
source

in your first approach in the second cycle, I believe that if you change the starting point of the cycle from j = 0 to j = i, it will make it faster. because you will avoid comparing two values โ€‹โ€‹twice

 private static String FindDupValue(String[] sValueTemp) { for (int i = 0; i < sValueTemp.length; i++) { String sValueToCheck = sValueTemp[i]; if(sValueToCheck==null || sValueToCheck.equals(""))continue; for (int j = i; j < sValueTemp.length; j++) { if(i==j)continue; String sValueToCompare = sValueTemp[j]; if (sValueToCheck.equals(sValueToCompare)){ return sValueToCompare; } } } return ""; 

}

+1
source

This is apparently one of the fastest approaches running in O (n), assuming O (1) HashSet.add for HashSet.add , plus requiring only one hash calculation per iteration, omitting the use of contains . It is true that String caches hashcode (thanks to Jonah), this code generalizes the concept of lowering contains .

 private static String FindDupValueUsingSet(String[] sValueTemp) { Set<String> sValueSet = new HashSet<String>(); for(String tempValueSet : sValueTemp) if (!tempValueSet.equals("")) //exclude empty Strings (add null checking if required) if (!sValueSet.add(tempValueSet)) return tempValueSet; return ""; } 
+1
source

All Articles