Find the minimum cost of converting an array to an arithmetic progression

I recently came across this question in an interview. I could not come up with an algorithm for this.

Given an array of unsorted integers, we need to find the minimum cost at which this array can be converted into an arithmetic progression, where the cost of 1 unit occurs if any element changes in the array. In addition, the value of an element changes between (-inf, inf).

I realized that DP can be used here, but I could not solve the equation. There were some limitations on values, but I don’t remember them. I'm just looking for high level pseudo code.

+7
algorithm
source share
5 answers

EDIT Here, unfortunately, the correct solution, although it is simple to understand it, is not very effective for O (n ^ 3).

function costAP(arr) { if(arr.length < 3) { return 0; } var minCost = arr.length; for(var i = 0; i < arr.length - 1; i++) { for(var j = i + 1; j < arr.length; j++) { var delta = (arr[j] - arr[i]) / (j - i); var cost = 0; for(var k = 0; k < arr.length; k++) { if(k == i) { continue; } if((arr[k] + delta * (i - k)) != arr[i]) { cost++; } } if(cost < minCost) { minCost = cost; } } } return minCost; } 
  • Find the relative delta between each individual pair of indices in an array
  • Use relative delta to check the cost of converting the entire array to AP using this delta
  • Minimum refund
+2
source share

Louis Ricci had the correct basic idea of ​​finding the largest existing arithmetic progression, but suggested that it should appear in one pass, when in fact elements of this progression can appear in any subset of positions, for example:

 1 42 3 69 5 1111 2222 8 

only 4 changes required:

  42 69 1111 2222 1 3 5 8 

To calculate this, note that each access point has the rightmost element. We can assume that each element i of the input vector will be in the extreme right position of the AP in turn, and for each such, we will consider all positions j to the left of i, determining the step size implied for each (i, j) combination, and when it an integer (indicating a valid access point), add one to the number of elements that imply this step size and end at position i, since all such elements belong to the same access point. The total maximum is the longest AP:

 struct solution { int len; int pos; int step; }; solution longestArithProg(vector<int> const& v) { solution best = { -1, 0, 0 }; for (int i = 1; i < v.size(); ++i) { unordered_map<int, int> bestForStep; for (int j = 0; j < i; ++j) { int step = (v[i] - v[j]) / (i - j); if (step * (i - j) == v[i] - v[j]) { // This j gives an integer step size: record that j lies on this AP int len = ++bestForStep[step]; if (len > best.len) { best.len = len; best.pos = i; best.step = step; } } } } ++best.len; // We never counted the final element in the AP return best; } 

The above C ++ code uses O (n ^ 2) time and O (n) space, as it iterates over each pair of positions i and j, performing one hash for reading and writing for each. To answer the original problem:

 int howManyChangesNeeded(vector<int> const& v) { return v.size() - longestArithProg(v).len; } 
+2
source share

Given an array a = [a_1, a_2, ..., a_n] unsorted integers, let diffs = [a_2-a_1, a_3-a_2, ..., a_n-a_(n-1)] .

Find the maximum value in diffs and adjust any values ​​in a necessary so that all neighboring values ​​differ by this value.

0
source share

Interestingly, even I had the same question in my campus recruitment test. During the test itself, I realized that this logic of changing elements, based on the most frequent differences between the two subsequent elements of the array, fails in some cases. For example, 4,5,8,9. According to the logic a2-a1, a3-a2, as suggested above, answer 1, which is not so. Since you proposed DP, I feel that it may be in the order of considering 2 values ​​for each element in the value of the array when it changes, and also when it does not change and returns a minimum of 2. Finally it ends when you reach the end of the array.

0
source share

This problem has a simple geometric interpretation, which shows that it can be solved O (n ^ 2) times and probably cannot be solved faster (short for 3SUM ). Suppose our array is [1, 2, 10, 3, 5] . We can write this array as a sequence of points

 (0,1), (1,2), (2,10), (3,3), (4,5) 

in which the value x is the index of the array element, and the y-value is the value of the array element. Now the question is to find a line that conveys the maximum possible number of points in this set. The cost of transforming an array is the number of points that are not on the line, which is minimized when the maximum number of points on the line.

A decisive answer to this question is given in this SO publication: What is the most efficient algorithm for finding a straight line through most points?

Idea: for each point P in the set from left to right, find the line passing through this point and the maximum number of points to the right of P. (We do not need to look at the points to the left of P, because they would be caught at an earlier iteration).

To find the maximum number of P-collinear points to the right of P, for each such point Q we calculate the slope of the segment PQ. Focus the various slopes on the hash map. The slope that displays the maximum number of hits is what you are looking for.

Technical issue: You probably don't want to use floating point arithmetic to calculate slopes. On the other hand, if you use rational numbers, you will probably have to calculate the greatest common factor to compare fractions by comparing the numerator and denominator, which multiplies the operating time by the coefficient log n. Instead, you should verify that the rational numbers a/b and c/d equal by checking ad == bc .

The SO link, the link to which is given above, gives an abbreviation for 3SUM , i.e. this problem is 3SUM-hard, which shows that if this problem can be solved much faster than O (n ^ 2), then 3SUM can also be solved much faster than O (n ^ 2). Here a condition occurs, which includes integers (-inf, inf). If it is known that integers are taken from a limited set, the abbreviation from 3SUM is not final.

An interesting question is whether the idea on Wikipedia can solve 3SUM in O (n + N log N) time, when integers are in a limited set (-N, N), can be used to solve the minimum cost to convert an array to an AP problem in time faster than O (n ^ 2).

0
source share

All Articles