NESTED LOOPS is good if the condition inside the loop is sargable, that is, an index can be used to limit the number of records.
For a query like:
SELECT * FROM a JOIN b ON b.b1 = a.a1 WHERE a.a2 = @myvar
with a as the leading (outer) table, each record from a is taken in turn, and all matching records in b have to be found.
If b.b1 is indexed and has high cardinality, then NESTED LOOPS will be the preferred way.
In SQL Server, this is also the only way to execute non-equijoins (something other than an = condition in the ON clause).
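To make the access pattern concrete, here is a minimal Python sketch of a nested loops join for the query above. It is not how SQL Server implements it; the sorted list searched with bisect is only an assumed stand-in for an index on b.b1, and the function names and the dict-per-row layout are hypothetical.

```python
from bisect import bisect_left, bisect_right

def build_sorted_index(b_rows):
    """Stand-in for an index on b.b1: rows sorted by key, plus the key list."""
    ordered = sorted(b_rows, key=lambda row: row["b1"])
    keys = [row["b1"] for row in ordered]
    return keys, ordered

def nested_loops_join(a_rows, b_index, myvar):
    """For each row of a that passes a.a2 = myvar, seek its matches in b."""
    keys, ordered = b_index
    result = []
    for a_row in a_rows:                  # outer loop over the leading table
        if a_row["a2"] != myvar:          # WHERE a.a2 = @myvar
            continue
        lo = bisect_left(keys, a_row["a1"])   # index seek: first matching key
        hi = bisect_right(keys, a_row["a1"])  # one past the last matching key
        for b_row in ordered[lo:hi]:      # inner loop touches matches only
            result.append((a_row, b_row))
    return result
```

The more distinct values b.b1 has, the shorter the slice ordered[lo:hi] is for each outer row, which is why a selective index favours this method.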
HASH JOIN is the fastest method if all (or almost all) records have to be processed.
It takes all the records from b, builds a hash table over them, then takes all the records from a and uses the value of the join column as a key to look up the hash table.
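The build and probe phases described above can be sketched in Python as follows. This is an illustration only, with a hypothetical function name and rows modelled as dicts; a real engine also deals with memory grants and spilling to disk, which are left out here.

```python
from collections import defaultdict

def hash_join(a_rows, b_rows):
    """Equijoin on b.b1 = a.a1 using a build phase and a probe phase."""
    # Build phase: read every row of b once and bucket it by join key.
    table = defaultdict(list)
    for b_row in b_rows:
        table[b_row["b1"]].append(b_row)

    # Probe phase: read every row of a once and look its key up in the table.
    result = []
    for a_row in a_rows:
        for b_row in table.get(a_row["a1"], ()):
            result.append((a_row, b_row))
    return result
```

Each input is scanned exactly once, and the lookup relies on key equality, which is why this method only applies to equijoins.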
NESTED LOOPS takes this time:
Na * (Nb / C) * R
where Na and Nb are the numbers of records in a and b, C is the index cardinality, and R is the constant time required for a row lookup (1 if all fields in the SELECT, WHERE and ORDER BY clauses are covered by the index, about 10 if they are not).
HASH JOIN takes this time:
Na + (Nb * H)
where H is the sum of the constants required to build and look up the hash table (per record). They are programmed into the engine.
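As a rough worked example, the two formulas can be evaluated side by side in Python. The row counts and the value H = 2 below are made-up assumptions for illustration, not constants taken from SQL Server, and the function names are hypothetical.

```python
def nested_loops_cost(na, nb, c, r):
    """Na * (Nb / C) * R, where c is the number of distinct keys in the index on b.b1."""
    return na * (nb / c) * r

def hash_join_cost(na, nb, h):
    """Na + (Nb * H), where h is the per-record hash build and lookup constant."""
    return na + nb * h

def cheaper_plan(na, nb, c, r, h):
    """Return the name of the method with the lower estimated cost."""
    nl = nested_loops_cost(na, nb, c, r)
    hj = hash_join_cost(na, nb, h)
    return "NESTED LOOPS" if nl <= hj else "HASH JOIN"

# Few outer rows against a unique, non-covering index: nested loops is cheaper.
few_rows = cheaper_plan(na=10, nb=1_000_000, c=1_000_000, r=10, h=2)

# Every row of a joined to b: the hash join is cheaper.
all_rows = cheaper_plan(na=1_000_000, nb=1_000_000, c=1_000_000, r=10, h=2)
```

With these assumed numbers the first case costs 100 for nested loops against 2,000,010 for the hash join, while the second costs 10,000,000 against 3,000,000.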
SQL Server calculates cardinality using table statistics, computes and compares the two values, and chooses the best plan.
Quassnoi