What is the fastest way to return the x, y coordinates that are on both lists A and list B?

I have two lists (list A and list B) of x, y coordinates, where 0 <x <4000, y <4000, and they will always be integers. I need to know which coordinates are in both lists. What will be your suggestion on how to approach this?

I was thinking of presenting lists as two grids of bits and a bitwise and possibly?

List A contains about 1,000 entries and changes, possibly every 10,000 requests. List B will vary greatly in length and will vary with each passage.

EDIT: I have to note that no coordinate will be in the lists twice; 1.1 cannot be on list A more than once.

+5
source share
8 answers

Think of (x, y) as one 24-bit number, as described in the comments.

Maintain A in numerical order (you said that it is not much different, so it is unlikely to cost).

For each B, do a binary search in the list. Since A is about 1,000 titles, you will need no more than 10 whole comparisons (in the worst case) to verify membership.

If you have a little more memory (about 2 MB) to play with you, you can create a bit vector to support all possible 24-bit numbers, and then perform a one-bit action for each item to verify membership. Thus, A will be represented by a single number 2 ^ 24 bits with a bit set, if the value is (otherwise 0). To verify membership, you simply use the appropriate bit and operation.

+6

A - (, , bst ), , B.

, , , , .

, - , , , , -, .

bst , , - , , , - .

+1

, ( x, y). O (n log n), n . , ; , ( ). O (n) ( O (m log m), m - , ).

( ): , , , () , O (n), .

+1

A , B, A. std:: set > A, , A.find(element_from_b)!= A.end()...

, O (b log a) ( b - B a ). , a 10000, log a .

+1

, STL, (.. return (R.x < L.x || (R.x==L.x && R.y < L.y). std::list::sort std::set_intersection, .

+1
0

, , X Y - () A B? STL:

#include <vector>
#include <std>
using namespace std;
// ...
set<int> a; // x coord (assumed populated elsewhere)
set<int> b; // y coord (assumed populated elsewhere)
set<int> in; // intersection
// ...
set_intersection(a.begin(), a.end(), b.begin(), b.end(), insert_iterator<set<int> >(in,in.begin()));
0

I think hashing is your best bet.

// Psuedocode:

  • INPUT: two lists, each with coordinates (x, y)
  • find a longer list, call it A
  • hash of each item in A
  • go to another list, name it B
  • hash of each item in B and see it in the table.
  • if there is a match, return / store (x, y) somewhere
  • repeat # 4 to the end

Assuming length A is m and length B is n, the runtime is O (m + n) β†’ O (n)

0
source

All Articles