How good is the geospatial data type in SQL Server 2008?

I have a large database full of customers, implemented in SQL Server 2005. Customers have a latitude and longitude stored as Decimal(18,15). The most important search query in the database tries to find all customers near a specific location, as follows:

 ((Addresses.Latitude - @SearchInLat) BETWEEN -1 * @LatitudeBound AND @LatitudeBound) AND ((Addresses.Longitude - @SearchInLng) BETWEEN -1 * @LongitudeBound AND @LongitudeBound)

So this is a very simple method. @LatitudeBound and @LongitudeBound are just numbers used to pull back all customers within a rough bounding rectangle around the point (@SearchInLat, @SearchInLng). Once the results reach the client PC, some of them are filtered out so that the search area is a circle rather than a rectangle. (This is done on the client PC to avoid calculating square roots on the server.)

This method has worked quite well in the past. However, we now want the search to do more interesting things - for example, making the number of results returned more predictable, or letting the user dynamically increase the size of the search radius. To do this, I have been considering upgrading to SQL Server 2008, with its geography data type, spatial indexes, and distance functions. My question is: how fast are they?

The advantage of the simple query we have at the moment is that it is very fast and not very resource-intensive, which is important because it is called often. How fast would a query based on the following be

 SearchInPoint.STDistance(Addresses.GeographicPoint) < @DistanceBound 

by comparison? Do spatial indexes work well, and is STDistance fast?

1 answer

If you are only processing a standard Lat / Lng pair, as you describe, and all you are doing is a simple lookup, then you may not see much of a speed increase from using a geometry type.

However, if you want to get more adventurous, as you say, then switching to the geometry types will open up a whole world of new possibilities for you, not just for searching.

For example (based on a project I'm working on), you could (if it is UK data) load the polygon definitions for all the towns / villages / cities in a given area and then cross-reference your searches against a specific town. Or, if you have a road map, you could find which customers live near major delivery routes, motorways, primary roads, all sorts of things.

You can also do some very fancy reporting. Imagine a map of towns where each outline is plotted and then shaded to show the density of customers in that area; some simple geometry SQL will easily return a count straight from the database for calculating that kind of information.
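As a rough illustration of that kind of count, here is a minimal T-SQL sketch. It assumes a hypothetical `dbo.Cities` table with a `Boundary` column of type `geography`, an `AddressId` key on `Addresses`, and the `GeographicPoint` column mentioned in the question; none of these are confirmed by the original posts.

```sql
-- Sketch only: dbo.Cities, CityName, Boundary and AddressId are hypothetical names.
-- Counts the customers whose point falls inside each city polygon.
SELECT c.CityName,
       COUNT(a.AddressId) AS CustomerCount   -- 0 for cities with no customers
FROM dbo.Cities AS c
LEFT JOIN dbo.Addresses AS a
       ON c.Boundary.STIntersects(a.GeographicPoint) = 1
GROUP BY c.CityName
ORDER BY CustomerCount DESC;
```

The resulting counts could then be passed to whatever draws the map in order to choose a shade per city.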

Then there is tracking. I don't know what data you are processing or why you have customers, but if you are delivering something, you could feed in the coordinates of a delivery van and have the database tell you how close it is to a given customer.

As for the question "is STDistance fast?": well, it's hard to say really. I think a better question is "Is it fast compared to ...?" It's hard to answer "yes" or "no" if you have nothing to compare it with.

Spatial indexes are one of the main reasons for moving your data into a geographically aware database: they are optimized to give the best results for this kind of task. But as with any database, if you create poor indexes, you will get poor performance.
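For reference, a spatial index on a `geography` column might be declared roughly like this in SQL Server 2008. This is a sketch rather than a tuned configuration: the index name is made up, `Addresses` is assumed to have a clustered primary key (which spatial indexes require), and the grid settings shown are simply the defaults, included to show which knobs exist.

```sql
-- Sketch only: the index name is hypothetical; grid values are the defaults.
CREATE SPATIAL INDEX SIX_Addresses_GeographicPoint
ON dbo.Addresses (GeographicPoint)
USING GEOGRAPHY_GRID
WITH (
    GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
    CELLS_PER_OBJECT = 16
);

-- Example search point and radius (illustrative values; metres for SRID 4326).
DECLARE @SearchInPoint GEOGRAPHY = geography::Point(51.5, -0.12, 4326);
DECLARE @DistanceBound FLOAT = 5000;

-- The table hint forces the spatial index, which helps when checking whether
-- the optimizer would otherwise ignore it.
SELECT a.AddressId
FROM dbo.Addresses AS a WITH (INDEX(SIX_Addresses_GeographicPoint))
WHERE a.GeographicPoint.STDistance(@SearchInPoint) < @DistanceBound;
```

Comparing the query plan with and without the hint is one way to see whether the index is actually being used.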

In general, you should definitely see some kind of speed increase, because the maths behind the sorting and indexing is aware of what the data represents, rather than just treating it as a linear sequence the way a regular index does.

Keep in mind that the more wisely you use SQL Server, the better the results you will get.

One last thing to mention is data management. If you use a GIS-oriented database, it opens up the possibility of using a GIS package, such as ArcMap or MapInfo, to manage, correct, and visualize your data, which means corrections are very easy to make by pointing, clicking, and dragging.

My advice would be to create a table alongside your existing one that is formatted for spatial operations, then write some stored procedures and run some timing tests to see which approach is better. If you get a significant increase on just the basic operations you are already performing, then that is justification on its own. If it is roughly equal, then your decision really depends on what new functionality you actually want to achieve.
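A minimal sketch of such a side-by-side test is shown below. It assumes `Addresses` has an integer key called `AddressId` (a hypothetical name) alongside the `Latitude` and `Longitude` columns from the question; the table name, index name, search point, and bounds are all illustrative.

```sql
-- Sketch only: AddressesSpatial, AddressId and the sample values are assumptions.
CREATE TABLE dbo.AddressesSpatial (
    AddressId       INT       NOT NULL PRIMARY KEY CLUSTERED,
    GeographicPoint GEOGRAPHY NOT NULL
);

-- Build the points from the existing columns (SRID 4326 = WGS 84).
INSERT INTO dbo.AddressesSpatial (AddressId, GeographicPoint)
SELECT AddressId, geography::Point(Latitude, Longitude, 4326)
FROM dbo.Addresses
WHERE Latitude IS NOT NULL AND Longitude IS NOT NULL;

CREATE SPATIAL INDEX SIX_AddressesSpatial_Point
ON dbo.AddressesSpatial (GeographicPoint)
USING GEOGRAPHY_GRID;

DECLARE @SearchInLat    DECIMAL(18,15) = 51.5,
        @SearchInLng    DECIMAL(18,15) = -0.12,
        @LatitudeBound  DECIMAL(18,15) = 0.05,
        @LongitudeBound DECIMAL(18,15) = 0.08,
        @DistanceBound  FLOAT          = 5000;   -- metres
DECLARE @SearchInPoint GEOGRAPHY =
        geography::Point(@SearchInLat, @SearchInLng, 4326);

SET STATISTICS TIME ON;   -- prints CPU and elapsed time per statement

-- Existing bounding-rectangle search.
SELECT COUNT(*) FROM dbo.Addresses
WHERE (Latitude  - @SearchInLat) BETWEEN -1 * @LatitudeBound  AND @LatitudeBound
  AND (Longitude - @SearchInLng) BETWEEN -1 * @LongitudeBound AND @LongitudeBound;

-- Spatial search against the side table.
SELECT COUNT(*) FROM dbo.AddressesSpatial
WHERE GeographicPoint.STDistance(@SearchInPoint) < @DistanceBound;

SET STATISTICS TIME OFF;
```

Running each query several times, and with realistic search points and radii, should give a fairer picture than a single run, since the first execution includes compilation and cold-cache effects.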
