I am using SQL Server 2008. I have a table with over 3 million records that is linked to another table with a million records.
I spent several days experimenting with various ways to query these tables. It has come down to two completely different queries, both of which take about 6 seconds to run on my laptop.
The first query uses brute force to find probable matches and removes the incorrect ones with a summing calculation.
The second gets all possible matches and then removes the wrong ones with an EXCEPT clause, which uses two dedicated indexes to find the low and high mismatches.
Logically, one would expect brute force to be slow and the index-based approach to be fast. That is not the case here, even though I experimented a lot with the indexes until I got the best speed.
In addition, the brute force query does not need as many indexes, which should mean better overall system performance, since there are fewer indexes to maintain.
Below are the two execution plans. If you can't see them, let me know and I will repost them in landscape or email them to you.
Brute force query:
SELECT ProductID, [Rank]
FROM (
    SELECT p.ProductID, ptr.[Rank],
           SUM(CASE WHEN p.ParamLo < si.LowMin OR p.ParamHi > si.HiMax THEN 1 ELSE 0 END) AS Fail
    FROM dbo.SearchItemsGet(@SearchID, NULL) AS si
    JOIN dbo.ProductDefs AS pd ON pd.ParamTypeID = si.ParamTypeID
    JOIN dbo.Params AS p ON p.ProductDefID = pd.ProductDefID
    JOIN dbo.ProductTypesResultsGet(@SearchID) AS ptr ON ptr.ProductTypeID = pd.ProductTypeID
    WHERE si.Mode IN (1, 2)
    GROUP BY p.ProductID, ptr.[Rank]
) AS t
WHERE t.Fail = 0

Index-based EXCEPT query:
WITH si AS (
    SELECT DISTINCT pd.ProductDefID, si.LowMin, si.HiMax
    FROM dbo.SearchItemsGet(@SearchID, NULL) AS si
    JOIN dbo.ProductDefs AS pd ON pd.ParamTypeID = si.ParamTypeID
    JOIN dbo.ProductTypesResultsGet(@SearchID) AS ptr ON ptr.ProductTypeID = pd.ProductTypeID
    WHERE si.Mode IN (1, 2)
)
SELECT p.ProductID
FROM dbo.Params AS p
JOIN si ON si.ProductDefID = p.ProductDefID
EXCEPT
SELECT p.ProductID
FROM dbo.Params AS p
JOIN si ON si.ProductDefID = p.ProductDefID
WHERE p.ParamLo < si.LowMin OR p.ParamHi > si.HiMax
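
To make the two strategies easier to compare without my real schema, here is a minimal sketch on toy data. It runs in SQLite through Python's sqlite3 module rather than SQL Server, and the Criteria table, the simplified Params table and all the values are hypothetical stand-ins (Criteria plays the role of the joined search items; Rank and the table-valued functions are left out). It only shows that the two query shapes select the same products; it says nothing about performance.

```python
import sqlite3

# Toy stand-ins for the real tables (hypothetical names and values).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Criteria (ProductDefID INTEGER, LowMin INTEGER, HiMax INTEGER);
CREATE TABLE Params (ProductID INTEGER, ProductDefID INTEGER,
                     ParamLo INTEGER, ParamHi INTEGER);
INSERT INTO Criteria VALUES (1, 10, 20), (2, 100, 200);
INSERT INTO Params VALUES
    (1, 1, 12, 18), (1, 2, 120, 180),   -- inside both ranges
    (2, 1, 5, 18),  (2, 2, 120, 180),   -- ParamLo below LowMin
    (3, 1, 12, 25),                     -- ParamHi above HiMax
    (4, 1, 10, 20), (4, 2, 100, 200),   -- exactly on the boundaries
    (5, 3, 0, 0);                       -- no matching criteria row
""")

# Strategy 1: brute force, count failing rows per product and keep zero-fail products.
brute_force_sql = """
SELECT ProductID
FROM (
    SELECT p.ProductID,
           SUM(CASE WHEN p.ParamLo < c.LowMin OR p.ParamHi > c.HiMax
                    THEN 1 ELSE 0 END) AS Fail
    FROM Params AS p
    JOIN Criteria AS c ON c.ProductDefID = p.ProductDefID
    GROUP BY p.ProductID
) AS t
WHERE t.Fail = 0
"""

# Strategy 2: all candidate products EXCEPT those with at least one mismatch.
except_sql = """
SELECT p.ProductID
FROM Params AS p
JOIN Criteria AS c ON c.ProductDefID = p.ProductDefID
EXCEPT
SELECT p.ProductID
FROM Params AS p
JOIN Criteria AS c ON c.ProductDefID = p.ProductDefID
WHERE p.ParamLo < c.LowMin OR p.ParamHi > c.HiMax
"""

brute_force = sorted(row[0] for row in conn.execute(brute_force_sql))
except_based = sorted(row[0] for row in conn.execute(except_sql))

print(brute_force, except_based)
```

On this toy data both queries keep products 1 and 4 and drop the rest, so any difference between the approaches is purely in how the engine executes them.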

My question is: based on the execution plans, which query looks more efficient? I understand that this may change as my data grows.
EDIT:
I updated the indexes and now have the following execution plan for the second query:
