I am working on a sorting / ranking algorithm that works with a fairly large number of elements, and I need to effectively implement the following algorithm to make it work:
There are two lists of numbers. They are equally long, about 100-500 thousand units. From this I need to find the nth largest product between these lists, i.e. if you create a matrix where there is one list on the top, on the side you have another, and each cell is the product of the number above and the number on the side.
Example: lists A=[1, 3, 4] and B=[2, 2, 5] . Then the works [2, 2, 5, 6, 6, 15, 8, 8, 20] . If I wanted the 3rd largest of this, there would be 8.
A naive solution would be to simply generate these numbers, sort them, and then choose the nth largest. But this is O(m^2 * log m^2) , where m is the number of items in small lists, and this is simply not fast enough.
I think I need to sort two small lists first. This is O(m * log m) . Then I know for sure that the largest is A [0] * B [0]. The second largest is A [0] * B [1] or [1] * B [0], ...
It seems to me that this can be done in steps O(f(n)) , regardless of the size of the matrix. But I can not find an effective way to do this part.
Edit: an answer was found that suggested storing the position in two sorted sets, and then looking at A [a] * B [b + 1] and A [a + 1] * B [b], returning a large one and increasing a / b . I was going to post this comment before it is deleted:
This will not work. Imagine two lists A = B = [3,2,1]. This will give you a matrix like [9,6,3; 6.4.2; 3,2,1]. So, you start with (0,0) = 9, go to (0,1) = 6, and then the choice is (0,2) = 3 or (1,1) = 4. However, this will skip (1, 0) = 6, which is more than both. So you canβt just look at the two neighbors, but you need to step back.