This looks like a problem for dynamic programming. Given the input array, you first build a second array containing the cumulative sum up to each index, so entry i of this array holds the sum of the values 1..i.
Now it is easy to see that the sum of the values at indices p..q is SUM(q) - SUM(p-1) (with the special case SUM(0) = 0). I am using 1-based indices here. This lookup is O(1), so all you need now is an O(n) algorithm to find the best subarray.
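As a quick Python sketch of this (0-based lists, so `prefix[i]` plays the role of SUM(i); the function names are mine):

```python
def prefix_sums(vals):
    """Build prefix sums: prefix[0] = 0 and prefix[i] = vals[0] + ... + vals[i-1]."""
    prefix = [0]
    for v in vals:
        prefix.append(prefix[-1] + v)
    return prefix

def range_sum(prefix, p, q):
    """Sum of vals[p..q] (0-based, inclusive) in O(1) using the prefix array."""
    return prefix[q + 1] - prefix[p]

vals = [2, 1, 3, 4, 1]
prefix = prefix_sums(vals)   # [0, 2, 3, 6, 10, 11]
```

Building the prefix array is a single O(n) pass; after that, every range-sum query is a single subtraction.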
A simple approach is to keep two pointers p and q and move them along the array. You start with p = q = 1, then alternately expand q and contract p, like a caterpillar crawling over your array.
To expand q:

    p <- 1
    q <- 1
    while SUM(q) - SUM(p-1) < K
        q <- q + 1
    end while
Now q is at a position where the sum of the subarray has just exceeded (or equals) K. The length of the subarray is q - p + 1.
After the q loop, you check whether the length of this subarray beats your current best. Then you advance p one step (so you don't accidentally skip the optimal solution) and re-enter the loop.
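Putting the expand/contract steps together with the prefix sums, the whole scan might look like this Python sketch (0-based indices; the function name and returning None when no subarray qualifies are my choices):

```python
def shortest_subarray_at_least_k(vals, K):
    """Length of the shortest subarray of positive ints with sum >= K,
    or None if even the whole array falls short."""
    n = len(vals)
    prefix = [0]
    for v in vals:
        prefix.append(prefix[-1] + v)   # prefix[i] = sum of vals[0..i-1]
    best = None
    p = q = 0
    while q < n:
        # Expand q until the window sum reaches K (or we run out of array)
        while q < n and prefix[q + 1] - prefix[p] < K:
            q += 1
        if q == n:
            break  # even vals[p..n-1] is too small; no later p can help
        if best is None or q - p + 1 < best:
            best = q - p + 1
        p += 1  # contract: advance p one step and try again
    return best
```

Both pointers only ever move forward, so the scan is O(n) on top of the O(n) prefix pass.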
You don't actually need to create a SUM array; you can maintain the sum of the current subarray as you go. You just have to be careful to work with the "real" p rather than the position one before it.
    subsum <- VAL(1)
    p <- 1
    q <- 1
    while p <= N
        -- Expand
        while q < N and subsum < K
            q <- q + 1
            subsum <- subsum + VAL(q)
        end while
        -- If even the full tail falls short of K, no later p can help
        if subsum < K
            break
        end if
        -- Check the length against our current best
        len <- q - p + 1
        if len < bestlen
            ...
        end if
        -- Contract
        subsum <- subsum - VAL(p)
        p <- p + 1
    end while
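That pseudocode translates to Python roughly as follows (0-based indices; initialising `bestlen` to infinity and breaking out once the remaining tail can no longer reach K are my assumptions about the intended behaviour):

```python
import math

def shortest_subarray_running_sum(vals, K):
    """Same caterpillar scan, but maintaining the window sum incrementally
    instead of building a SUM array. Assumes vals is non-empty and positive."""
    n = len(vals)
    subsum = vals[0]
    p = q = 0
    bestlen = math.inf
    while p < n:
        # Expand
        while q < n - 1 and subsum < K:
            q += 1
            subsum += vals[q]
        if subsum < K:
            break  # the remaining tail cannot reach K
        # Check the length against our current best
        if q - p + 1 < bestlen:
            bestlen = q - p + 1
        # Contract
        subsum -= vals[p]
        p += 1
    return bestlen if bestlen < math.inf else None
```

The only difference from the prefix-sum version is that the window sum is updated by a single addition or subtraction each time a pointer moves.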
Notes:
j_random_hacker suggested adding this: it helps to explain why it is acceptable to consider only the O(n) different subarrays that this algorithm examines, rather than all O(n^2) possible subarrays.
The philosophy of dynamic programming:
- never follow a solution path that leads to a suboptimal result; and
- use the knowledge of previous solutions to calculate a new solution.
In this case, a candidate solution (some (p,q) with p <= q) is evaluated by summing its elements. Since the elements are positive integers, we know that for any candidate (p,q), the candidate (p,q+1) will have a strictly larger sum.
So we know that if (p,q) is a solution, then (p,q+1) is a longer one and cannot be minimal. We therefore stop expanding as soon as we have a candidate, and check whether this candidate is better than any we have seen so far. This means that for each p we only need to check one candidate. Since both p and q only ever increase, the search is linear.
Another part of this (using the previous solutions) comes from recognizing that sum(p,q+1) = sum(p,q) + X(q+1) and similarly sum(p+1,q) = sum(p,q) - X(p) . Therefore, we do not need to sum all the elements between p and q at each step. We only need to add or subtract one value when we advance one of the search pointers.
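The two identities can be checked in isolation with a short Python snippet (the brute-force `window_sum` is only a reference for comparison, not part of the algorithm):

```python
vals = [2, 1, 3, 4, 1]

def window_sum(vals, p, q):
    """O(q - p) reference: sum of vals[p..q], 0-based inclusive."""
    return sum(vals[p:q + 1])

p, q = 1, 3
s = window_sum(vals, p, q)
# Advance q: add one element instead of re-summing the window
assert s + vals[q + 1] == window_sum(vals, p, q + 1)
# Advance p: subtract one element instead of re-summing the window
assert s - vals[p] == window_sum(vals, p + 1, q)
```

This is why each pointer move costs O(1), and the whole scan stays O(n).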
Hope this helps.