SQL: matching tables using a math operation

I have a table with a set of 300,000 units, where the location of each unit is given by its coordinates (X, Y). For each unit, I would like to know which other units lie within a certain distance of it.

Ex.

    UnitID  X   Y
    A       10  15
    B       10  25
    C       25  15

    proc sql;
      create table work.Test2 as
      select distinct
        a.UnitID, a.X, a.Y,
        b.UnitID as CloseUnit label="CloseUnit",
        sqrt( (a.X-b.X)**2 + (a.Y-b.Y)**2 ) as distance
      from work.Test as a
      left join work.Test as b
        on 0 < sqrt( (a.X-b.X)**2 + (a.Y-b.Y)**2 ) <= 15;
    quit;

Result:

    UnitID  X   Y   CloseUnit  Distance
    A       10  15  B          10
    A       10  15  C          15
    B       10  25  A          10
    C       25  15  A          15

This takes a lot of processor time for the whole table, since it performs 300,000^2 comparisons. How could I complete this task more efficiently?

1 answer

There are several possible optimizations. First, you can check the distance along the X and Y axes separately: if either difference is greater than 15, the points cannot be in range. The subquery below lets the database perform this cheaper check first:

    select *
    from (
      select a.X as aX, b.X as bX, a.Y as aY, b.Y as bY
      from Test a
      join Test b
        on abs(a.X - b.X) <= 15
       and abs(a.Y - b.Y) <= 15
    ) as SubQueryAlias
    where sqrt( (aX-bX)**2 + (aY-bY)**2 ) <= 15

The second optimization is to take the sqrt out of the comparison by squaring both sides:

    where (aX-bX)**2 + (aY-bY)**2 <= 15**2

Squaring is faster than taking a square root, especially when it is applied to a constant.
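To see both optimizations together on the sample data from the question, here is a minimal sketch that runs the query through Python's built-in sqlite3 module rather than SAS. Some things here are my assumptions, not part of the original: SQLite has no `**` operator, so the squares are written as products; self-matches are excluded with `a.UnitID <> b.UnitID` in place of the `0 <` test; and the `:r` parameter and the `pairs` variable are just illustrative names.

```python
import math
import sqlite3

# In-memory table holding the three sample units from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (UnitID TEXT PRIMARY KEY, X REAL, Y REAL)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [("A", 10, 15), ("B", 10, 25), ("C", 25, 15)],
)

RADIUS = 15

# Cheap per-axis check in the join condition, squared-distance check in WHERE.
# SQLite has no ** operator, so each square is spelled out as a product.
query = """
SELECT a.UnitID,
       b.UnitID AS CloseUnit,
       (a.X - b.X) * (a.X - b.X) + (a.Y - b.Y) * (a.Y - b.Y) AS dist_sq
FROM Test AS a
JOIN Test AS b
  ON abs(a.X - b.X) <= :r
 AND abs(a.Y - b.Y) <= :r
 AND a.UnitID <> b.UnitID          -- assumption: drop self-matches
WHERE (a.X - b.X) * (a.X - b.X) + (a.Y - b.Y) * (a.Y - b.Y) <= :r * :r
ORDER BY a.UnitID, b.UnitID
"""

# The square root is taken only for the rows that survive the filter.
pairs = [
    (unit, close_unit, math.sqrt(dist_sq))
    for unit, close_unit, dist_sq in conn.execute(query, {"r": RADIUS})
]

for row in pairs:
    print(row)
```

On the three sample rows this yields the same four pairs as the result table in the question. Note that without an index the database may still evaluate every pair, so the per-axis check mainly pays off when X (or Y) is indexed.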

For even more optimization, see the Wikipedia article on Geohash.
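Geohash itself is meant for latitude/longitude, but the underlying idea (bucket the points into cells and only compare points in neighbouring cells) also works for plain X/Y coordinates. Below is a rough sketch of that idea using a uniform grid instead of Geohash. The function name `close_pairs` and the choice of cell size equal to the search radius are my own assumptions, not something from the article.

```python
import math
from collections import defaultdict


def close_pairs(units, radius):
    """Return sorted (unit, close_unit, distance) tuples for units within radius.

    units maps a unit ID to its (x, y) coordinates.
    """
    # Bucket every unit into a square cell whose side equals the radius.
    cells = defaultdict(list)
    for name, (x, y) in units.items():
        cells[(int(x // radius), int(y // radius))].append(name)

    radius_sq = radius * radius
    result = []
    for name, (x, y) in units.items():
        cx, cy = int(x // radius), int(y // radius)
        # Two points within `radius` of each other can differ by at most one
        # cell per axis, so only the 3x3 block around the unit is scanned.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for other in cells.get((cx + dx, cy + dy), ()):
                    if other == name:
                        continue  # skip the unit itself
                    ox, oy = units[other]
                    dist_sq = (x - ox) ** 2 + (y - oy) ** 2
                    if dist_sq <= radius_sq:
                        result.append((name, other, math.sqrt(dist_sq)))
    return sorted(result)


sample = {"A": (10, 15), "B": (10, 25), "C": (25, 15)}
for row in close_pairs(sample, 15):
    print(row)
```

The work now depends on how many units share neighbouring cells rather than on the square of the table size, so it helps most when the units are spread out relative to the radius. The same bucketing could in principle be done in SQL by adding precomputed cell columns and joining on them.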
