Your algorithm sounds good, with O(n * n * k * log k) time complexity and O(k * k) memory. If you know that the values in your example are integers in a bounded range, you can get rid of the log k factor by replacing the map with an array indexed by value. Otherwise, it is possible that your code simply implements the algorithm inefficiently. Try timing it as you vary n and k to see whether the running time grows as expected.
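For reference, here is a minimal sketch of the array-based version, assuming the input is an n x n grid of integers in [0, V) and the goal is the maximum number of distinct values over any k x k window (the function name and signature are placeholders, not your actual code):

```cpp
#include <vector>
#include <algorithm>

// Sketch: maximum number of distinct values over all k x k windows of an
// n x n grid, assuming values are integers in [0, V). A plain count array
// replaces the map, removing the log k factor from each update.
int maxDistinctInWindow(const std::vector<std::vector<int>>& grid, int n, int k, int V) {
    int best = 0;
    std::vector<int> count(V, 0);
    for (int y = 0; y + k <= n; ++y) {
        // Reset counts and build the leftmost window of this row of windows.
        std::fill(count.begin(), count.end(), 0);
        int distinct = 0;
        for (int yy = y; yy < y + k; ++yy)
            for (int xx = 0; xx < k; ++xx)
                if (count[grid[yy][xx]]++ == 0) ++distinct;
        best = std::max(best, distinct);
        // Slide right: remove the leftmost column, add the new rightmost one.
        for (int x = 1; x + k <= n; ++x) {
            for (int yy = y; yy < y + k; ++yy) {
                if (--count[grid[yy][x - 1]] == 0) --distinct;
                if (count[grid[yy][x + k - 1]]++ == 0) ++distinct;
            }
            best = std::max(best, distinct);
        }
    }
    return best;
}
```

Each slide touches only k cells, so this is roughly O(n * n * k) time plus an O(V) reset per window row, with O(V) extra memory.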
As another possible direction, you can try a dynamic programming solution. Define a function f(x, y, a, b) that computes the set of unique values (possibly as a bitmap) in the a x b rectangle anchored at (x, y). The problem is then to find the maximum |f(x, y, k, k)|. f(x, y, a, b) is calculated as the union of 4 or more smaller rectangular sets, each roughly a/2 x b/2 in size. If the smaller rectangular sets are cached, you do not have to recompute them. The cache will take a lot of memory, but you can limit it by organizing the decomposition around power-of-2 sizes. For instance,
f(x, y, 21, 21) = f(x, y, 16, 16)
                  union f(x + 16, y, 4, 16) union f(x + 20, y, 1, 16)
                  union f(x, y + 16, 16, 4) union f(x, y + 20, 16, 1)
                  union f(x + 16, y + 16, 4, 4) union f(x + 20, y + 16, 1, 4)
                  union f(x + 16, y + 20, 4, 1) union f(x + 20, y + 20, 1, 1)
I think this approach is more like O(n * n * log k * log k), so it will only pay off for large values of k, say more than 1000.
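If you want to experiment with the cached power-of-2 idea, here is a rough sketch. It assumes values fit in [0, MAXV) and uses a slight simplification of the decomposition above: because set union is idempotent, an arbitrary k x k window can be covered by four possibly overlapping 2^p x 2^p squares with p = floor(log2 k), so only square power-of-2 rectangles need to be cached (names and constants are placeholders):

```cpp
#include <bitset>
#include <vector>
#include <algorithm>

// Sketch, assuming values lie in [0, MAXV). table[p][y][x] holds the value
// set of the 2^p x 2^p square anchored at (x, y) as a bitset; each level is
// built as the union of four squares from the level below.
constexpr int MAXV = 1024;
using ValueSet = std::bitset<MAXV>;

int maxDistinctPow2(const std::vector<std::vector<int>>& grid, int n, int k) {
    int levels = 1;
    while ((1 << levels) <= k) ++levels;   // levels - 1 = floor(log2 k)
    std::vector<std::vector<std::vector<ValueSet>>> table(
        levels, std::vector<std::vector<ValueSet>>(n, std::vector<ValueSet>(n)));
    for (int y = 0; y < n; ++y)
        for (int x = 0; x < n; ++x)
            table[0][y][x].set(grid[y][x]);
    for (int p = 1; p < levels; ++p) {
        int half = 1 << (p - 1);
        for (int y = 0; y + (1 << p) <= n; ++y)
            for (int x = 0; x + (1 << p) <= n; ++x)
                table[p][y][x] = table[p - 1][y][x] | table[p - 1][y + half][x]
                               | table[p - 1][y][x + half] | table[p - 1][y + half][x + half];
    }
    // Cover each k x k window with four cached squares placed at its corners;
    // overlap is harmless because union is idempotent.
    int p = levels - 1, side = 1 << p, best = 0;
    for (int y = 0; y + k <= n; ++y)
        for (int x = 0; x + k <= n; ++x) {
            ValueSet s = table[p][y][x] | table[p][y + k - side][x]
                       | table[p][y][x + k - side] | table[p][y + k - side][x + k - side];
            best = std::max(best, (int)s.count());
        }
    return best;
}
```

Building the table costs about O(n * n * log k) bitset unions and each query is a constant number of unions, every union touching O(MAXV / 64) machine words; the table itself is the large cache discussed above.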