List grid points on a 2D plane in descending order (x * y)

Given N > 0 and M > 0 , I want to list all the pairs (x, y), so 1 <= x <= N and 1 <= y <= M in descending order (x * y). Example: for N = 3 and M = 2, the sequence of transfers should be:

 1. (3, 2) -- 3 * 2 = 6 2. (2, 2) -- 2 * 2 = 4 3. (3, 1) -- 3 * 1 = 3 4. (2, 1) -- 2 * 1 = 2 5. (1, 2) -- 1 * 2 = 2 6. (1, 1) -- 1 * 1 = 1 

The order (2, 1) and (1, 2) can be replaced. One obvious way is to list them all, insert into vector<pair<int, int> > and call std::sort() with my own comparison function. However, since N and M can be large, and most of the time I only need the first few members of the sequence, I hope that there can be some smarter way that generates such a sequence, instead of generating them all and -sorting that requires as much as the elements of the N*M array.

Update: I forgot to mention that although in most cases I only need the first few terms, the number of required terms is unknown before listing.

+7
source share
8 answers

If you just want to keep it in place, keeping the time more or less equal, you can expect that each subsequent smaller element should be adjacent (in the two-dimensional grid that you referenced) to one of the elements that you have already encountered. (You can prove this by induction, it is not particularly difficult. I will assume for the rest of that M> = N.)

The basic algorithm looks something like this:

 Start with a list (Enumerated Points) containing just the maximum element, M*N Create a max heap (Candidates) containing (M-1),N and M,(N-1). Repeat this: 1.Pick the largest value (x,y) in Candidates and append it to Enumerated Points 2.Add (x-1),y and x,(y-1) to Candidates if they are not there already 

You can repeat this if you need more elements at enumerated points. The maximum size of the Candidates should be M + N, so I think this is O (k log (M + N)), where k is the number of points required.

ADDITION: The issue of avoiding duplication is not entirely difficult, but worth mentioning. I’ll take in this algol that you set the grid aside so that the numbers go down when you move down and to the right. In any case, this happens as follows:

At the beginning of the algorithm, create an array (column size) that has one element for each column. You must make this array contain the number of rows in each column that are part of the list of enumerated points.

After you add a new element and update this array, you will check the size of the column on each side to decide whether the grid squares will be directly to the right and below this new enumeration point already in the candidate list.

Check the size of the column on the left - if it is larger than this, you do not need to add an element below your new enumeration point.

Check the size of the column to the right - if it is smaller than the same size of this column, you do not need to update the element to the right of it.

To make this obvious, consider this partially completed diagram for M = 4, N = 2:

 4 3 2 1 * * * 2 |2 * 3 2 1 |1 

Elements (4.2), (3.2), (2.2) and (4.1) are already included in the list. (The first coordinate is M, the second is N.). The size of the column array is [2 1 1 0], since this is the number of elements in each column that are listed in the list of enumerated points. We are going to add (3.1) to the new list. We can look at the size of the column to the right and conclude that adding (2.1) is not required, since the size of the column for M = 2 is greater than 1-1. The conclusions are pretty obvious - we already added (2.1) when we added (2.2).

+6
source

Here's the O (K logK) solution, where K is the number of terms you want to generate.
Edit: Q stores only one copy of each item; the insert fails if the element is already present.

 priority_queue Q Q.insert( (N*M, (N,M)) ) // priority, (x,y) repeat K times: (v, (x,y)) = Q.extract_max() print(v, (x,y)) Q.insert( (x-1)*y, (x-1,y) ) Q.insert( x*(y-1), (x,y-1) ) 

This works because before visiting (x, y) you must visit either (x + 1, y) or (x, y + 1). Complexity is O (KlogK), since Q no more than 2K elements are inserted into it.

+1
source

This is actually equivalent to listing primes; the numbers you want are all numbers that are not first (except for all those with x or y equal to 1).

I'm not sure that the method of listing prime numbers will be faster than what you already offer (at least in terms of algorithmic complexity).

0
source

Since you mention that most of the time you need the first few members of a sequence; after generating them all, you don’t need to sort them all to find these first few terms. You can use Max Heap depending on the number of terms you want, say k. Therefore, if the heap has size k (<N & <<M), then you can have the largest k terms after nlogk, which is better than nlogn for sorting.

Here n = N * M

0
source

A dummy approach that loops from NxM to 1 looks for pairs that, when multiplied, produce the current number:

 #!/usr/bin/perl my $n = 5; my $m = 4; for (my $p = $n * $m; $p > 0; $p--) { my $min_x = int(($p + $m - 1) / $m); for my $x ($min_x..$n) { if ($p % $x == 0) { my $y = $p / $x; print("x: $x, y: $y, p: $p\n"); } } } 

For N = M, the complexity is O (N 3 ), but the memory usage is O (1).

Refresh . Please note that the complexity is not as bad as it seems, because the number of elements that need to be generated is already N 2 . For comparison, the generate-all-the-pairs-and-sort method is O (N 2 logN) using O (N 2 ) memory.

0
source

Here is the algorithm for you. I will try to give you a description in English.

In the rectangle we are working with, we can always assume that the point P(x, y) has a larger "area" than the point below it P(x, y-1) . Therefore, when we look for points of maximum area, we need to compare only the top sloppy point in each column (i.e., for every possible x ). For example, when looking at a pristine 3 x 5 grid

 5 abc 4 def 3 ghi 2 jkl 1 mno 1 2 3 

we really need to compare a , b and c . All other points are guaranteed to have a smaller area than at least one of them. So build the maximum heap that contains only the topmost point in each column. When you push a point from the heap, click the point located directly below it (if that point exists). Repeat until the heap is empty. This gives you the following algorithm (tested, but it is in Ruby):

 def enumerate(n, m) heap = MaxHeap.new n.times {|i| heap.push(Point.new(i+1, m))} until(heap.empty?) max = heap.pop puts "#{max} : #{max.area}" if(max.y > 1) max.y -= 1 heap.push(max) end end end 

This gives you O((2k + N) log N) runtime. Cost of operations heap log N ; we make N of them when constructing the initial heap, and then 2k when we pull out the points k maximum area (2, provided that each pop is followed by a push of the point below it).

It has the added advantage that you do not need to create all your points, unlike the initial proposal to create the entire set, and then sort it. You create as many points as necessary so that your heap is accurate.

And finally: Improvements can be made! I did not do this, but you can get even better performance with the following settings:

  • Just go down to y = x in each column instead of y = 1 . To create points that you no longer check, use the fact that the area P(x, y) is equal to the area P(y, x) . Note. . If you use this method, you will need two versions of the algorithm. Columns work when M >= N , but if M < N , you need to do this line by line.
  • Only consider columns that can contain a maximum. In the example I gave, there is no reason to include a in the heap from the very beginning, since it is guaranteed to be less than b . Therefore, you only need to add columns to the heap when the top points of adjacent columns appear.

And it turned into a small essay ... In any case - I hope this helps!

Edit: A complete algorithm that includes both the improvements I mentioned above (but still in Ruby because I'm lazy). Note that there is no need for any additional structures to avoid duplicate insertion - unless it is a “top” point, each point will only insert another row in it in the row / column.

 def enumerate(n, m, k) heap = MaxHeap.new heap.push(Point.new(n, m)) result = [] loop do max = heap.pop result << max return result if result.length == k result << Point.new(max.y, max.x) if max.x <= m && max.y <= n && max.x != max.y return result if result.length == k if(m < n) # One point per row heap.push(Point.new(max.x, max.y - 1)) if max.x == n && max.y > 1 heap.push(Point.new(max.x - 1, max.y)) if max.x > max.y else # One point per column heap.push(Point.new(max.x - 1, max.y)) if max.y == m && max.x > 1 heap.push(Point.new(max.x, max.y - 1)) if max.y > max.x end end end 
0
source

I understood!

Consider a grid as a set of M columns, where each column is a stack containing elements from 1 at the bottom to N at the top. Each column is labeled with its x coordinate.

Elements within each column of a column are ordered by its y value, as well as x * y, since x has the same value for all of them.

So, you just need to select the stack that has the larger x * y value on top, pop it and repeat it.

In practice, you will not need stacks, only the index of the top value, and you can use the priority queue to get a column with a large x * y value. Then decrease the index value and, if it is greater than 0 (indicating that the stack is not exhausted), re-insert the stack into the queue with a new priority x * y.

The complexity of this algorithm for N = M is O (N 2 logN) and its memory usage is O (N).

Update : implemented in Perl ...

 use Heap::Simple; my ($m, $n) = @ARGV; my $h = Heap::Simple->new(order => '>', elements => [Hash => 'xy']); # The elements in the heap are hashes and the priority is in the slot 'xy': for my $x (1..$m) { $h->insert({ x => $x, y => $n, xy => $x * $n }); } while (defined (my $col = $h->extract_first)) { print "x: $col->{x}, y: $col->{y}, xy: $col->{xy}\n"; if (--$col->{y}) { $col->{xy} = $col->{x} * $col->{y}; $h->insert($col); } } 
0
source

In Haskell, it produces output immediately . Here is an illustration:

  ------- -*------ -**------ -***------ -****------ -*****------ -******------ -*******------ 

Each marked point creates both (x, y) and (y, x). The algorithm "eats" this thing from the upper right corner, comparing the upper elements in each column. The length of the boundary does not exceed N (we assume N >= M ).

 enuNM nm | n<m = enuNM mn -- make sure N >= M enuNM nm = let b = [ [ (x*y,(x,y)) | y<- [m,m-1..1]] | x<-[n,n-1..m+1]] a = [ (x*x,(x,x)) : concat [ [(z,(x,y)),(z,(y,x))] -- two symmetrical pairs, | y<- [x-1,x-2..1] -- below and above the diagonal , let z=x*y ] | x<-[m,m-1..1]] in foldi (\(x:xs) ys-> x : merge xs ys) [] (b ++ a) merge a@ (x:xs) b@ (y:ys) = if (fst y) > (fst x) then y : merge a ys else x : merge xs b merge a [] = a merge [] b = b foldi fz [] = z foldi fz (x:xs) = fx (foldi fz (pairs f xs)) pairs f (x:y:t) = fxy : pairs ft pairs ft = t 

foldi creates a skewed infinitely deepening tree that serves as a heap, combining all producer flows, each for each x , which are created already sorted in descending order. Since all initial values ​​of producer flows are guaranteed to be in descending order, each initial value can be pushed without comparison, which allows you to build a tree lazily.

Code a creates pairs above the diagonal line using the corresponding pairs from the bottom of the diagonal line (assuming N >= M , for each (x,y) , where x <= M & y < x , (y,x) should also be produced. )

It should be practically O (1) for each of the first few values ​​that are very close to the top of the comparison tree.

  Prelude Main> take 10 $ map snd $ enuNM (2000) (3000)
 [(3000,2000), (2999,2000), (3000,1999), (2998,2000), (2999,1999), (3000,1998), (2997,2
 000), (2998,1999), (2999,1998), (2996,2000)]
 ( 0.01 secs , 1045144 bytes)

 Prelude Main> let xs = take 10 $ map (log.fromIntegral.fst) $ enuNM (2000) (3000)
 Prelude Main> zipWith (> =) xs (tail xs)
 [True, True, True, True, True, True, True, True, True]

 Prelude Main> take 10 $ map snd $ enuNM (2 * 10 ^ 8) (3 * 10 ^ 8)
 [(300000000,200000000), (299999999,200000000), (300000000,199999999), (299999998,20
 0000000), (299999999,199999999), (300000000,199999998), (299999997,200000000), (2999
 99998,199999999), (299999999,199999998), (299999996,200000000)]
 ( 0.01 secs , 2094232 bytes)

We can evaluate the empirical complexity at runtime:

  Prelude Main> take 10 $ drop 50,000 $ map (log.fromIntegral.fst) $ enuNM (2 * 10 ^ 8)
  (3 * 10 ^ 8)
 [38.633119670465554.38.633119670465554.38.63311967046555.38.63311967046554.38.63
 311967046554,38.63311967046553,38.63311967046553,38.633119670465526,38.633119670
 465526.38.63311967046552]
 ( 0.17 secs , 35425848 bytes)

 Prelude Main> take 10 $ drop 100000 $ map (log.fromIntegral.fst) $ enuNM (2 * 10 ^ 8
 ) (3 * 10 ^ 8)
 [38.63311913546512,38.633119135465115,38.633119135465115,38.63311913546511,38.63
 311913546511.38.6331191354651.38.6331191354651.38.633119135465094.38.63311913546
 5094.38.63311913546509]
 ( 0.36 secs , 71346352 bytes)

 * Main> let x = it
 * Main> zipWith (> =) x (tail x)
 [True, True, True, True, True, True, True, True, True]

 Prelude Main> logBase 2 (0.36 / 0.17)
 1.082462160191973 - O (n ^ 1.08) for n = 100000 values ​​produced

This can be translated, for example, using Python generators for Haskell threads, as can be seen here .

0
source

All Articles