The minimum steps required to make an array of integers adjacent

Given a sorted array of different integers, what is the minimum number of steps required for integer continuity? Here the condition is that: at one step only one element can be changed and can be increased or decreased by 1. For example, if we have 2,4,5,6 , then “2” can be made “3”, which makes the elements adjacent ( 3,4,5,6 ). Here the minimum steps are 1. Similarly for an array: 2,4,5,8 :

  • Step 1: "2" can be done "3"
  • Step 2: "8" can be done "7"
  • Step 3: "7" can be done "6"

So now the sequence is 3,4,5,6 , and the number of steps is 3.

I tried to do the following, but not sure if it is correct?

  //n is the number of elements in array a int count=a[n-1]-a[0]-1; for(i=1;i<=n-2;i++) { count--; } printf("%d\n",count); 

Thanks.

+7
source share
7 answers

The intuitive guess is that the "center" of the optimal sequence will be the arithmetic mean, but this is not so. Find the right solution with some vector math:

Part 1: Assuming that the first number will be left alone (we will consider this assumption later), calculate the differences, so 1 12 3 14 5 16 - 1 2 3 4 5 6 will give 0 -10 0 -10 0 -10 .

sidenote: Note that an “adjacent” array, by your implied definition, will be an increasing arithmetic sequence with a difference of 1. (Note that there are other reasonable interpretations of your question: some people may consider 5 4 3 2 1 adjacent or 5 3 1 be adjacent or 1 2 3 2 3 be adjacent. You also did not indicate whether negative numbers should be treated differently.)

: Adjacent numbers must lie between the minimum and maximum numbers. [proof left to reader]

Part 2: Now back to our example, assuming that we have completed 30 steps (sum (abs ( 0 -10 0 -10 0 -10 )) = 30) to turn 1 12 3 14 5 16 into 1 2 3 4 5 6 . This is one correct answer. But 0 -10 0 -10 0 -10 + c is also an answer that gives an arithmetic sequence of difference 1 for any constant c. In order to minimize the number of “steps”, we must choose the appropriate c. In this case, every time we increase or decrease c, we increase the number of steps by N = 6 (vector length). So, for example, if we wanted to turn our original sequence 1 12 3 14 5 16 into 3 4 5 6 7 8 (c = 2), then the differences would be 2 -8 2 -8 2 -8 and sum(abs(2 -8 2 -8 2 -8))=30 .

Now it is very clear if you can visualize it, but it is difficult to type in the text. First we took our difference vector. Imagine you drew it like this:

  4| 3| * 2| * | 1| | | * 0+--+--+--+--+--* -1| | -2| * 

We can “shift” this vector up and down by adding or subtracting 1 from everything. (This is equivalent to finding c.) We want to find a shift that minimizes the number | that you see (the area between the curve and the x axis). This is NOT average (this will be the minimization of the standard deviation or RMS error , not the absolute error). To find the minimization of c, we consider this as a function and consider its derivative. If the differences are all far from the x axis (we are trying to make 101 112 103 114 105 116 ), it makes sense just not to add this additional material, so we shift the function down along the x axis. Each time we decrease c, we improve the solution by 6. Now suppose that one of * passes the x axis. Each time we decrease c, we improve the solution by 5-1 = 4 (we save 5 work steps, but must perform 1 additional step for * below the x axis). In the end, when HALF * passes along the x axis, we SHOULD NOT IMPROVE THE SOLUTION (derivative: 3-3 = 0). (In fact, soon we begin to make the decision worse and never improve it again. We not only found the minimum of this function, but also see that this is a global minimum.)

So the solution is this: Pretend the first number in place. Calculate the vector of differences. Minimize the sum of the absolute value of this vector; do this by finding the median DIFFERENCES and subtracting it from the differences to get an improved vector of differences. The sum of the absolute value of the "improved" vector is your answer. This is O(N) . Equal optimality solutions will (as indicated above) always be “adjoining”. The only solution exists only if there is an odd number of numbers; otherwise, if there is an even number of numbers, and the median difference is not an integer, then in equally optimal solutions there will be difference vectors with correcting factors of any number between the two medians.

So, I think that this would not be complete without a final example.

  • input: 2 3 4 10 14 14 15 100
  • vector of differences: 2 3 4 5 6 7 8 9 - 2 3 4 10 14 14 15 100 = 0 0 0 -5 -8 -7 -7 -91
  • note that the medians of the difference vector are no longer in the middle, we need to run the search algorithm median O(N) to extract them ...
  • the medians of the difference vector -5 and -7
  • take -5 as our correction factor (any number between medians, such as -6 or -7, would also be the right choice)
  • Thus, our new goal is 2 3 4 5 6 7 8 9 + 5 = 7 8 9 10 11 12 13 14 , and the new differences are 5 5 5 0 -3 -2 -2 -86 *
  • this means that we need to perform 5+5+5+0+3+2+2+86 = 108 steps

* (we get this by repeating step 2 with our new goal or adding 5 to each number of the previous difference ... but since you only care about the amount, we just add 8 * 5 (vector length multiplies the correct coefficient) by the previously calculated amount )

As an alternative, we could also take -6 or -7 as our correction factor. Say we took -7 ...

  • then the new goal would be 2 3 4 5 6 7 8 9 + 7 = 9 10 11 12 13 14 15 16 and the new differences would be 7 7 7 2 1 0 0 -84
  • this would mean that we would need to do 7 + 7 + 7 + 2 + 1 + 0 + 0 + 84 = 108 steps, the same as above

If you yourself imitate this, you can see that the number of steps becomes> 108, as we make offsets further from the range [-5, -7].

pseudo code:

 def minSteps(array A of size N): A' = [0,1,...,N-1] diffs = A'-A medianOfDiffs = leftMedian(diffs) return sum(abs(diffs-medianOfDiffs)) 

Python:

 leftMedian = lambda x:sorted(x)[len(x)//2] def minSteps(array): target = range(len(array)) diffs = [ta for t,a in zip(target,array)] medianOfDiffs = leftMedian(diffs) return sum(abs(d-medianOfDiffs) for d in diffs) 

change

It turns out that for arrays of different integers this is equivalent to a simpler solution : choosing one of (up to 2) medians, assuming that they do not move, and accordingly move the other numbers. This simple method often gives incorrect answers if you have duplicates, but the OP did not ask about it, so this would be a simpler and more elegant solution. In addition, we can use the proof that I gave in this decision to justify the “suppose that the median is not moving” solution as follows: the correction factor will always be in the center of the array (that is, the median of differences will be from the median of numbers). Thus, any restriction that also guarantees this can be used to create variations of this brainteaser.

+10
source

Get one of the medians of all numbers. Since the numbers are already sorted, this should not be a big problem. Suppose the median is not moving. Then calculate the total cost of moving all the numbers accordingly. This should give an answer.

community change:

 def minSteps(a): """INPUT: list of sorted unique integers""" oneMedian = a[floor(n/2)] aTarget = [oneMedian + (i-floor(n/2)) for i in range(len(a))] # aTargets looks roughly like [mn/2?, ..., m-1, m, m+1, ..., m+n/2] return sum(abs(aTarget[i]-a[i]) for i in range(len(a))) 
+7
source

This is probably not an ideal solution, but the first idea.

Given the sorted sequence [x 1 , x 2 , ..., x n ]:

  • Write a function that returns the differences of an element to the previous and next element, that is (x n - x n - 1 , x n + 1 - x n ).

  • If the difference in the previous element is> 1, you will need to increase all the previous elements x n - x n - 1 - 1. That is, the number of necessary steps would increase by the number of previous elements and times; (x n - x n - 1 - 1). Let me call this number a.

  • If the difference in the next element is> 1, you will have to reduce all subsequent elements x n + 1 - x n - 1. That is, the number of necessary steps will increase by the number of subsequent elements and times; (x n + 1 - x n - 1). Let me call this number b.

  • If a <b, then increase all previous elements until they are adjacent to the current element. If a> b, then reduce all subsequent elements until they are adjacent to the current element. If a = b, it does not matter which of these two actions is selected.

  • Add the number of steps performed in the previous step (by increasing the total number of necessary steps with a or b) and repeat until all elements are adjacent.

+3
source

First of all, imagine that we choose an arbitrary target of adjacent increasing values, and then calculate the cost (the number of necessary steps) to modify the array that must correspond.

 Original: 3 5 7 8 10 16 Target: 4 5 6 7 8 9 Difference: +1 0 -1 -1 -2 -7 -> Cost = 12 Sign: + 0 - - - - 

Since the input array is already ordered and different, it strictly increases. Because of this, it can be shown that the differences will always not increase.

If we change the goal by increasing it by 1, the cost will change. Each position in which the difference is currently positive or equal to zero will increase the cost by 1. Each positive result in which the difference is currently negative will lead to a decrease in value by 1:

 Original: 3 5 7 8 10 16 New target: 5 6 7 8 9 10 New Difference: +2 +1 0 0 -1 -6 -> Cost = 10 (decrease by 2) 

Conversely, if we reduce the goal by 1, each position in which the difference is currently positive will lead to a decrease in value by 1, while each position in which the difference is zero or negative will increase the value by 1

 Original: 3 5 7 8 10 16 New target: 3 4 5 6 7 8 New Difference: 0 -1 -2 -2 -3 -8 -> Cost = 16 (increase by 4) 

To find the optimal values ​​for the target array, we must find a goal so that any change (increment or decrement) does not reduce the cost. Please note that incrementing the goal can only lower the cost when there are more positions with a negative difference than with a zero or positive difference. A decrease can only reduce the cost when there are more positions with a positive difference than with a zero or negative difference.

Here are some examples of difference sign distributions. Remember that the array of differences does not increase, so the positive points should always be the first, and the negative ones should be the last:

  CC + + + - - - optimal + + 0 - - - optimal 0 0 0 - - - optimal + 0 - - - - can increment (negatives exceed positives & zeroes) + + + 0 0 0 optimal + + + + - - can decrement (positives exceed negatives & zeroes) + + 0 0 - - optimal + 0 0 0 0 0 optimal CC 

Note that if one of the central elements (indicated by C) is zero, the goal should be optimal. In this case, at best, any increment or decrement will not change the value, but may increase it. This result is important because it gives us a trivial solution. We select the target so that a[n/2] remains unchanged. There may be other possible goals that give the same cost, but certainly not, which is better. Here the source code is changed to calculate this value:

 //n is the number of elements in array a int targetValue; int cost = 0; int middle = n / 2; int startValue = a[middle] - middle; for (i = 0; i < n; i++) { targetValue = startValue + i; cost += abs(targetValue - a[i]); } printf("%d\n",cost); 
+1
source

You cannot do this by repeating once on the array, that's for sure.
You need to first check the difference between every two numbers, for example:
2,7,8,9 can be 2,3,4,5 with 18 steps or 6,7,8,9 with 4 steps.
Create a new array with the difference: for 2,7,8,9 it will be 4,1,1 . Now you can decide whether to increase or decrease the first number.

0
source

Suppose an adjacent array looks something like this:

cc + 1 c + 2 c + 3 .. etc.

Now let's take an example -

5 7 8 10

The adjacent array in this case will be -

cc + 1 c + 2 c + 3

To get the minimum steps, the sum of the modulus of the integer difference (before and after) wrt of the i-th index should be minimal. In this case

(c-5) ^ 2 + (c-6) ^ 2 + (c-6) ^ 2 + (c-7) ^ 2 should be minimal

Let f (c) = (c-5) ^ 2 + (c-6) ^ 2 + (c-6) ^ 2 + (c-7) ^ 2 = 4c ^ 2 - 48c + 146

Using differential calculus to get the lows,

f '(c) = 8c - 48 = 0 => c = 6

So, our adjacent array is 6 7 8 9, and the minimum cost is 2.

To summarize, just generate f (c), get the first differential and find out c. This should take O (n).

0
source

Brute Force Approach O(N*M)

If you draw a line through each point of array a , then y0 is the value, where each line starts at index 0 . Then the answer will be the minimum of the number of steps required to go from a to every line starting with y0 in Python:

 y0s = set((y - i) for i, y in enumerate(a)) nsteps = min(sum(abs(y-(y0+i)) for i, y in enumerate(a)) for y0 in xrange(min(y0s), max(y0s)+1))) 

Enter

 2,4,5,6 2,4,5,8 

Exit

 1 3 
-one
source

All Articles