Why does this speed up my SQL query?

I learned a trick a while back from a DBA friend to speed up certain SQL queries. I remember him mentioning that it had something to do with how SQL Server compiles the query, and that the query path is forced to use an indexed value.

Here is my original query (takes 20 seconds):

SELECT Part.Id AS PartId, Location.Id AS LocationId
FROM Part, PartEvent PartEventOuter, District, Location
WHERE PartEventOuter.EventType = '600'
  AND PartEventOuter.AddressId = Location.AddressId
  AND Part.DistrictId = District.Id
  AND Part.PartTypeId = 15
  AND District.SubRegionId = 11
  AND PartEventOuter.PartId = Part.Id
  AND PartEventOuter.EventDateTime <= '4/28/2009 4:30pm'
  AND NOT EXISTS (
      SELECT PartEventInner.EventDateTime
      FROM PartEvent PartEventInner
      WHERE PartEventInner.PartId = PartEventOuter.PartId
        AND PartEventInner.EventDateTime > PartEventOuter.EventDateTime
        AND PartEventInner.EventDateTime <= '4/30/2009 4:00pm')

Here is the "optimized" query (less than 1 second):

SELECT Part.Id AS PartId, Location.Id AS LocationId
FROM Part, PartEvent PartEventOuter, District, Location
WHERE PartEventOuter.EventType = '600'
  AND PartEventOuter.AddressId = Location.AddressId
  AND Part.DistrictId = District.Id
  AND Part.PartTypeId = 15
  AND District.SubRegionId = 11
  AND PartEventOuter.PartId = Part.Id
  AND PartEventOuter.EventDateTime <= '4/28/2009 4:30pm'
  AND NOT EXISTS (
      SELECT PartEventInner.EventDateTime
      FROM PartEvent PartEventInner
      WHERE PartEventInner.PartId = PartEventOuter.PartId
        AND EventType = EventType  -- the only added condition
        AND PartEventInner.EventDateTime > PartEventOuter.EventDateTime
        AND PartEventInner.EventDateTime <= '4/30/2009 4:00pm')

Can someone explain in detail why this works much faster? I'm just trying to better understand this.

+4
6 answers

Perhaps because you get a Cartesian product without your EventType = EventType.

From Wikipedia: http://en.wikipedia.org/wiki/SQL

"[SQL] makes it too easy to do a Cartesian join (joining all possible combinations), which results in "run-away" result sets when WHERE clauses are mistyped. Cartesian joins are so rarely used in practice that requiring an explicit CARTESIAN keyword may be warranted. (SQL 1992 introduced the CROSS JOIN keyword that allows the user to make clear that a Cartesian join is intended, but the shorthand "comma-join" with no predicate is still acceptable syntax, which still invites the same mistake.)"

In effect, you are processing more rows than necessary with your first query.

http://www.fluffycat.com/SQL/Cartesian-Joins/
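
To make the comma-join point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tiny Part and District tables and their rows are made up for illustration, and SQLite stands in for SQL Server here, so this only shows the row-count effect, not anything about SQL Server's plans.

```python
import sqlite3

# In-memory database with hypothetical sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Part (Id INTEGER PRIMARY KEY, DistrictId INTEGER);
    CREATE TABLE District (Id INTEGER PRIMARY KEY, SubRegionId INTEGER);
    INSERT INTO Part VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO District VALUES (10, 11), (20, 12);
""")

# Comma join with no join predicate: every Part row pairs with every District row.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM Part, District"
).fetchone()[0]

# Same comma join, but with the join predicate supplied in the WHERE clause.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM Part, District WHERE Part.DistrictId = District.Id"
).fetchone()[0]

print(cartesian_rows, joined_rows)  # 3 * 2 pairs versus one District per Part
```

Note that the outer query in the question does appear to supply a join predicate for each table in its comma list, so it is worth checking the actual plan before blaming a Cartesian product.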

+3

Are there a large number of records with a NULL EventType?
Before you added the extra restriction, your subquery would return all of those NULL records, each of which then has to be examined by the NOT EXISTS predicate for every row of the outer query. So the more you restrict what the subquery returns, the fewer rows have to be scanned to evaluate NOT EXISTS.

If this is the issue, it may be even faster if you also restricted the subquery to EventType = '600' records.

SELECT Part.Id AS PartId, Location.Id AS LocationId
FROM Part, PartEvent PartEventOuter, District, Location
WHERE PartEventOuter.EventType = '600'
  AND PartEventOuter.AddressId = Location.AddressId
  AND Part.DistrictId = District.Id
  AND Part.PartTypeId = 15
  AND District.SubRegionId = 11
  AND PartEventOuter.PartId = Part.Id
  AND PartEventOuter.EventDateTime <= '4/28/2009 4:30pm'
  AND NOT EXISTS (
      SELECT PartEventInner.EventDateTime
      FROM PartEvent PartEventInner
      WHERE PartEventInner.PartId = PartEventOuter.PartId
        AND EventType = '600'
        AND PartEventInner.EventDateTime > PartEventOuter.EventDateTime
        AND PartEventInner.EventDateTime <= '4/30/2009 4:00pm')
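
One caveat worth checking before adopting the variant above: restricting the subquery to EventType = '600' can change which rows come back, not just how fast. A minimal sketch using Python's sqlite3 module, with a cut-down PartEvent table, made-up rows, shortened aliases (o, i) and ISO-formatted dates, all of which are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PartEvent (PartId INTEGER, EventType TEXT, EventDateTime TEXT);
    -- Part 1 has a later event of a different type; part 2 has no later event.
    INSERT INTO PartEvent VALUES
        (1, '600', '2009-04-27 09:00'),
        (1, '700', '2009-04-29 09:00'),
        (2, '600', '2009-04-27 09:00');
""")

# Simplified form of the question's query; {extra} is where the added
# subquery condition goes.
QUERY = """
    SELECT o.PartId
    FROM PartEvent o
    WHERE o.EventType = '600'
      AND o.EventDateTime <= '2009-04-28 16:30'
      AND NOT EXISTS (
          SELECT 1
          FROM PartEvent i
          WHERE i.PartId = o.PartId
            {extra}
            AND i.EventDateTime > o.EventDateTime
            AND i.EventDateTime <= '2009-04-30 16:00')
    ORDER BY o.PartId
"""

def part_ids(extra):
    """Run the query with an optional extra subquery condition."""
    return [row[0] for row in conn.execute(QUERY.format(extra=extra))]

unrestricted = part_ids("")                        # any later event excludes the part
restricted = part_ids("AND i.EventType = '600'")   # only later '600' events exclude it

print(unrestricted, restricted)
```

With this data the unrestricted subquery excludes part 1 (it has a later '700' event), while the restricted one keeps it, so the two queries are only interchangeable if later events of other types should be ignored.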
+1

Odd, do you have an index defined with both EventType and EventDateTime in it?

Edit:
Wait, is EventType a nullable column? Column = Column will evaluate to FALSE* if its value is NULL. At least using the default SQL Server settings.

A safer equivalent would be EventType IS NOT NULL. See if that gives the same speedup?

*: My T-SQL reference states that it should be TRUE with ANSI_NULLS set to OFF, but my query window says otherwise. *confuzzled now*.
Any ruling? TRUE, FALSE, NULL or UNKNOWN? :) Gotta love the "binary" logic in SQL :(
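
On the ruling: under standard three-valued logic NULL = NULL is UNKNOWN, and a WHERE clause only keeps rows where the condition is TRUE, so the row is filtered out just as if the result were FALSE. As I understand the SET ANSI_NULLS documentation, the OFF setting only affects comparisons against a literal NULL or a NULL variable, not column-to-column comparisons, which would explain the query window. A minimal sketch of the standard behavior using Python's sqlite3 module (SQLite rather than SQL Server, with made-up rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PartEvent (PartId INTEGER, EventType TEXT);
    -- Two rows with a value, two rows with NULL (hypothetical data).
    INSERT INTO PartEvent VALUES (1, '600'), (2, NULL), (3, '700'), (4, NULL);
""")

def count(where):
    """Count PartEvent rows that satisfy the given WHERE condition."""
    sql = f"SELECT COUNT(*) FROM PartEvent WHERE {where}"
    return conn.execute(sql).fetchone()[0]

self_equal = count("EventType = EventType")   # NULL rows compare as UNKNOWN
not_null = count("EventType IS NOT NULL")     # the explicit equivalent

# The comparison itself yields NULL (Python None), neither true nor false.
null_eq = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(self_equal, not_null, null_eq)
```

Both conditions keep only the non-NULL rows here, which is why EventType IS NOT NULL states the intent more clearly.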

0

SQL Server uses an index seek if and only if all columns of that index are in the query.
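
For what it's worth, the "if and only if" may be too strict: engines can generally seek on a composite index when only its leading column(s) are constrained. A minimal sketch using Python's sqlite3 module to look at a plan. SQLite's planner is not SQL Server's, and the index name IX_PartEvent_Type_Date and the column layout are made up for illustration, so the real execution plan is still the thing to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PartEvent (
        PartId INTEGER, EventType TEXT, EventDateTime TEXT, AddressId INTEGER);
    -- Hypothetical composite index: EventType is the leading column.
    CREATE INDEX IX_PartEvent_Type_Date ON PartEvent (EventType, EventDateTime);
""")

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Only the leading column of the index appears in the WHERE clause.
leading_only = plan("SELECT * FROM PartEvent WHERE EventType = '600'")

print(leading_only)
```

If the printed plan names the index, the engine chose it even though EventDateTime is not mentioned in the query.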

0

Each non-indexed column that you add to the WHERE clause causes a table scan. If you narrow down the rows earlier in your WHERE clause, subsequent scans are faster. So by adding a condition that can use an index, your table scans run against less data.

0

This sort of thing used to be much more common than it is now. Oracle 6, for example, was sensitive to the order in which you placed the restrictions in the WHERE clause. The reason it surprises you is that we have come to expect the database engine to always work out the best access path no matter how you structure your SQL. Oracle 6 and 7 (after which I switched to MSSQL) also had a hint extension that you could use to tell the database how you would like it to build the query plan.

In this particular case it is difficult to give a definitive answer without seeing the actual query plans, but I suspect the difference is that you have a composite index that includes EventType, which is not used for the first query but is used for the second. This would be unusual, since I would expect your first query to use it anyway, so I suspect the database statistics might be out of date, so run

UPDATE STATISTICS

then try again and post the results here.

0