Two pairs of numbers with the same sum

Given a list [a_1, a_2, ..., a_n] of (not necessarily distinct) integers, determine whether there exist pairwise distinct indices w, x, y, z such that a_w + a_x = a_y + a_z.

I know that one way is to use four nested for loops, each iterating over one of the indices. Whenever the two sums are equal, check whether all four indices are pairwise distinct; if they are, return true. If we run out of options, return false. This has O(n^4) runtime.
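
For reference, a minimal Java sketch of this brute-force check (the class and method names are mine):

```java
public class BruteForce {
    // Brute force: try every combination of four indices, O(n^4).
    static boolean hasTwoEqualSumPairs(int[] a) {
        int n = a.length;
        for (int w = 0; w < n; w++)
            for (int x = 0; x < n; x++)
                for (int y = 0; y < n; y++)
                    for (int z = 0; z < n; z++)
                        if (w != x && w != y && w != z
                                && x != y && x != z && y != z
                                && a[w] + a[x] == a[y] + a[z]) {
                            return true;
                        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasTwoEqualSumPairs(new int[] {1, 2, 4, 5, 8})); // true: 1+8 = 4+5
        System.out.println(hasTwoEqualSumPairs(new int[] {1, 2, 4, 9}));    // false
    }
}
```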

Can we do better?

+7
3 answers

Compute all possible values of a_w + a_x and insert them into a hash table. Also insert (a_w + a_x, w) and (a_w + a_x, x) into a second hash table.

Before inserting a value into the first hash table, check whether it is already there. If it is, check the second table: if either (a_w + a_x, w) or (a_w + a_x, x) is present, do not insert anything (the new pair shares an index with the pair already recorded for this sum). If neither of these pairs is in the second table, we have a positive answer.

If, after processing all (w, x) pairs, we have not found a positive answer, then no such pairwise distinct indices exist.

The time complexity is O(n^2). The space requirement is also O(n^2).
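
Here is a minimal Java sketch of the idea. Rather than two literal hash tables, it keeps one map from each sum to the first (w, x) pair that produced it and compares new pairs against that, which amounts to the same check; all names are mine:

```java
import java.util.HashMap;
import java.util.Map;

public class EqualSumPairsHash {

    static boolean hasTwoEqualSumPairs(int[] a) {
        // Maps each sum to the first (w, x) index pair that produced it.
        Map<Long, int[]> firstPair = new HashMap<>();
        int n = a.length;
        for (int w = 0; w < n; w++) {
            for (int x = w + 1; x < n; x++) {
                long sum = (long) a[w] + a[x];   // long to avoid int overflow
                int[] old = firstPair.get(sum);
                if (old == null) {
                    firstPair.put(sum, new int[] {w, x});
                } else if (old[0] != w && old[0] != x && old[1] != w && old[1] != x) {
                    return true;   // same sum, four pairwise distinct indices
                }
                // Otherwise the new pair shares an index with the stored one; skip it.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasTwoEqualSumPairs(new int[] {1, 2, 4, 5, 8})); // true: 1+8 = 4+5
        System.out.println(hasTwoEqualSumPairs(new int[] {1, 2, 4, 9}));    // false
    }
}
```

One can check that comparing only against the first stored pair is sufficient: if two index-disjoint pairs with the same sum exist, some enumerated pair is necessarily disjoint from the first pair recorded for that sum.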

You can do the same in O(n) space but O(n^2 * log(n)) time with a slightly modified algorithm from this answer: Sum-subset with a fixed subset size:

  • Sort the list.
  • Use a priority queue whose items contain a_w + a_x as the key and w, x as values. Pre-fill this queue with n-1 elements, where x = 0 and w = 1 .. n-1.
  • Repeatedly remove the minimum element (sum, w, x) from this queue and push the element (a_w + a_{x+1}, w, x+1) (but do not push it when x+1 >= w). Stop if two consecutive items removed from the queue have the same sum.
  • To handle duplicates, you could compare the w, x of two consecutive elements that have an equal sum. But it is easier to deal with duplicates in a pre-processing step. If the sorted list contains two duplicated values, or one value duplicated 4 times, we have immediate success. Otherwise, at most one value a is duplicated; keep only one copy of it in the list and add its doubled value to the priority queue together with a "special" pair of indices: (2a, -1, -1). A sketch of the main loop follows after this list.
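
Below is a minimal Java sketch of the priority-queue loop. It assumes the duplicate pre-processing from the last step has already been applied, so all values are distinct and two pairs with equal sums can never share an index; names are mine:

```java
import java.util.Arrays;
import java.util.PriorityQueue;

public class EqualSumPairsHeap {

    // Assumes the values are pairwise distinct (duplicates handled by the
    // pre-processing step above); then two pairs with equal sums can never
    // share an index.
    static boolean hasTwoEqualSumPairs(long[] values) {
        long[] a = values.clone();
        Arrays.sort(a);
        int n = a.length;

        // Heap entries are {sum, w, x} with x < w, ordered by sum.
        PriorityQueue<long[]> pq = new PriorityQueue<>((p, q) -> Long.compare(p[0], q[0]));
        for (int w = 1; w < n; w++) {
            pq.add(new long[] {a[w] + a[0], w, 0});
        }

        long prevSum = Long.MIN_VALUE;
        boolean havePrev = false;
        while (!pq.isEmpty()) {
            long[] e = pq.poll();
            long sum = e[0];
            int w = (int) e[1], x = (int) e[2];
            if (havePrev && sum == prevSum) {
                return true;              // two consecutive extracted pairs with equal sums
            }
            havePrev = true;
            prevSum = sum;
            if (x + 1 < w) {              // advance x, staying below w
                pq.add(new long[] {a[w] + a[x + 1], w, x + 1});
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasTwoEqualSumPairs(new long[] {1, 2, 4, 5, 8})); // true: 1+8 = 4+5
        System.out.println(hasTwoEqualSumPairs(new long[] {1, 2, 4, 9}));    // false
    }
}
```

The heap never holds more than n-1 entries, which is what gives the O(n) space bound.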
+4

Evgeny's solution can be simplified somewhat by preprocessing the original array as follows.

First, we use a hash table to compute the frequency of each element in the original array. If at least 2 distinct elements have duplicates (frequency at least 2 each), or if some element has frequency at least 4, then the answer is true. Otherwise, if an element a occurs with frequency 2 or 3 (there can be at most one such element at this point), we add 2a to a second hash table and replace all copies of a with a single copy in the original array.

Then, in the modified array, for each pair of indices i, j with i < j, we add a_i + a_j to the second hash table and return true as soon as we find a duplicate entry in that hash table.
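
A minimal Java sketch of this simplified approach (names are mine):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class EqualSumPairsSimplified {

    static boolean hasTwoEqualSumPairs(long[] a) {
        // Frequency of each value.
        Map<Long, Integer> freq = new HashMap<>();
        for (long v : a) {
            freq.merge(v, 1, Integer::sum);
        }

        Set<Long> sums = new HashSet<>();   // the "second hash table" of pair sums
        int duplicatedValues = 0;
        for (Map.Entry<Long, Integer> e : freq.entrySet()) {
            if (e.getValue() >= 4) {
                return true;                // a+a = a+a over four distinct indices
            }
            if (e.getValue() >= 2) {
                if (++duplicatedValues >= 2) {
                    return true;            // a+b = a+b using different copies of a and b
                }
                sums.add(2 * e.getKey());   // the pair formed by the two copies of a
            }
        }

        // De-duplicated values: every remaining pair uses two distinct values,
        // so equal sums always come from index-disjoint pairs.
        long[] b = freq.keySet().stream().mapToLong(Long::longValue).toArray();
        for (int i = 0; i < b.length; i++) {
            for (int j = i + 1; j < b.length; j++) {
                if (!sums.add(b[i] + b[j])) {
                    return true;            // this sum was already recorded
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasTwoEqualSumPairs(new long[] {1, 2, 4, 5, 8})); // true: 1+8 = 4+5
        System.out.println(hasTwoEqualSumPairs(new long[] {5, 5, 3, 7}));    // true: 5+5 = 3+7
        System.out.println(hasTwoEqualSumPairs(new long[] {1, 2, 4, 9}));    // false
    }
}
```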

+3

If you have 8.5 GB of memory (more for unsigned integers, less if the sums or indices do not cover the entire int range), create three arrays. The first uses 1 bit for each possible sum; it is a bitmap of the sums seen so far. The second uses 32 bits for every possible sum; it records the index j of the pair that produced the sum. The third uses 1 bit for every possible sum; it is a bit field recording whether the sum was encountered during the current i iteration, and it is zeroed at the start of each i iteration.

Iterate i = 0 .. n-1 and j = i + 1 .. n-1. For each sum, check whether its bit is set in the first bit field (whether it was seen before). If so, check whether the index recorded in the second array matches either i or j (whether the old j matches either the new i or the new j). If it does not, check whether the bit in the third array is set (whether the sum was already seen during the current iteration, in which case the old pair's i is the same as the new i). If it is not set, you have a match! (The old i can never match the old j or the new j, and the new i can never match the new j.) Exit. Otherwise, record the sum in all three arrays and continue.

Although this uses $40 worth of memory (I love the present day :), it is probably much faster than using hash maps and boxing. It may even use less memory for large n. The one flaw is that the data will almost never be in the L2 cache. But try setting the JVM up to use huge pages, so that at least a TLB miss does not also hit main memory. This is O(n^2) processing and O(1) memory.
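
For illustration, a minimal Java sketch of the flat-array idea. To keep it runnable it assumes every pairwise sum falls in a small range [0, maxSum) rather than the full int range, and for simplicity it stores both indices of the first pair that produced each sum instead of the j-plus-per-iteration-bit-field bookkeeping described above; names are mine:

```java
public class EqualSumPairsFlatArrays {

    // maxSum is an assumed upper bound on any pairwise sum; the answer above
    // instead sizes the arrays for the whole int range.
    static boolean hasTwoEqualSumPairs(int[] a, int maxSum) {
        boolean[] seen = new boolean[maxSum];   // has this sum occurred before?
        int[] firstI = new int[maxSum];         // i of the first pair with this sum
        int[] firstJ = new int[maxSum];         // j of the first pair with this sum
        int n = a.length;

        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                int sum = a[i] + a[j];
                if (!seen[sum]) {
                    seen[sum] = true;
                    firstI[sum] = i;
                    firstJ[sum] = j;
                } else if (firstI[sum] != i && firstI[sum] != j
                        && firstJ[sum] != i && firstJ[sum] != j) {
                    return true;                // equal sums, four distinct indices
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasTwoEqualSumPairs(new int[] {1, 2, 4, 5, 8}, 32)); // true
        System.out.println(hasTwoEqualSumPairs(new int[] {1, 2, 4, 9}, 32));    // false
    }
}
```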

0
