Since you have two datasets ordered by the same value, have you tried a merge join instead of a nested loop join?
SET STATISTICS IO ON
SET STATISTICS TIME ON

SELECT COUNT(*)
FROM Address adr INNER JOIN Auditable a ON adr.UniqueId = a.UniqueId
OPTION (LOOP JOIN)

SELECT COUNT(*)
FROM Address adr INNER JOIN Auditable a ON adr.UniqueId = a.UniqueId
OPTION (MERGE JOIN)

SELECT COUNT(*)
FROM Address adr INNER JOIN Auditable a ON adr.UniqueId = a.UniqueId
OPTION (HASH JOIN)
Edit:
These explanations are conceptual. SQL Server may perform more sophisticated operations than my examples show. This conceptual understanding, combined with measuring time and logical IO using the SET STATISTICS commands and examining query execution plans, forms the basis of my query optimization technique (developed over four years). May it serve you as well as it has served me.
Setup
- Get 5 decks of cards.
- Take 1 deck and create a parent dataset.
- Take the remaining 4 decks and create a child dataset.
- Order each dataset by card value.
- Let m be the number of cards in the parent dataset.
- Let n be the number of cards in the child dataset.
Nested loop
- Take the card from the top of the parent dataset.
- Search (using binary search) within the child dataset for the first occurrence of a match.
- Seek forward in the child dataset from the first match until a non-match is found. You have now found all of the matches.
- Repeat this for each card in the parent dataset.
The nested loop algorithm iterates over the parent dataset and searches the child dataset once for each parent card, which makes its cost: m * log(n)
Merge
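The nested loop steps can be sketched in a few lines of Python. This is only a conceptual illustration on plain lists of card values, not what SQL Server actually executes; the function name `nested_loop_join` and the sample lists are made up for the example, and the child list is assumed to be sorted already.

```python
from bisect import bisect_left

def nested_loop_join(parent, child):
    """Conceptual nested loop join; `child` is assumed to be sorted."""
    pairs = []
    for p in parent:                      # one iteration per parent card (m times)
        i = bisect_left(child, p)         # binary search for the first match, O(log n)
        # Seek forward from the first match until a non-match is found.
        while i < len(child) and child[i] == p:
            pairs.append((p, child[i]))
            i += 1
    return pairs

# Hypothetical sample data: card values only, suits ignored.
parent = [2, 5, 9]
child = [2, 2, 3, 5, 9, 9, 9]
matches = nested_loop_join(parent, child)
```

The binary search is what keeps each lookup at log(n); without an ordered (or indexed) child dataset, every parent card would have to scan the whole child dataset.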
- Take the card from the top of the parent dataset.
- Take the card from the top of the child dataset.
- If the cards match, remove cards from the top of each deck until a non-match is found. Produce each matching pair between the parent and child matches.
- If the cards do not match, find the smaller of the parent and child cards and remove that card from the top of its dataset.
The merge join algorithm iterates over the parent dataset once and the child dataset once, which makes its cost: m + n. It relies on the data being ordered. If you ask for a merge join on data that is not ordered, you will incur a sort operation! This brings the cost to (m * log(m)) + (n * log(n)) + m + n. Even that may be better than a nested loop in some cases.
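As a rough sketch of the merge steps, the following Python walks two sorted lists of card values with one cursor each. The name `merge_join` and the sample lists are assumptions for illustration only; real engines handle duplicates and memory quite differently.

```python
def merge_join(parent, child):
    """Conceptual merge join; both lists are assumed to be sorted."""
    pairs = []
    i = j = 0
    while i < len(parent) and j < len(child):
        if parent[i] < child[j]:
            i += 1                        # parent card is smaller: discard it
        elif parent[i] > child[j]:
            j += 1                        # child card is smaller: discard it
        else:
            value = parent[i]
            # Take cards off each deck until a non-match is found.
            i_end = i
            while i_end < len(parent) and parent[i_end] == value:
                i_end += 1
            j_end = j
            while j_end < len(child) and child[j_end] == value:
                j_end += 1
            # Produce every pair between the matching parent and child cards.
            for p in parent[i:i_end]:
                for c in child[j:j_end]:
                    pairs.append((p, c))
            i, j = i_end, j_end
    return pairs

# Hypothetical sample data, already ordered by card value.
merged = merge_join([2, 2, 5], [2, 2, 2, 3, 5])
```

Each cursor only ever moves forward, which is where the m + n cost comes from. If the inputs were not sorted, calling `sorted()` on each first would add the (m * log(m)) + (n * log(n)) term.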
Hash
- Get a card table.
- Take each card from the parent dataset and place it on the card table where you can find it (the spot does not have to have anything to do with card value; it just has to be convenient for you).
- Take each card from the child dataset, locate its matching parent on the card table, and produce the matching pair.
The hash join algorithm iterates over the parent dataset once and the child dataset once, which makes its cost: m + n. It relies on having a big enough card table to hold the entire contents of the parent dataset.
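The card table can be pictured as a dictionary. Here is a minimal Python sketch under that assumption; `hash_join` and the sample lists are hypothetical names for illustration, and neither input needs to be sorted.

```python
from collections import defaultdict

def hash_join(parent, child):
    """Conceptual hash join; neither input needs to be ordered."""
    # Build phase: place every parent card on the "card table".
    table = defaultdict(list)
    for p in parent:
        table[p].append(p)
    # Probe phase: look up each child card's matching parents on the table.
    pairs = []
    for c in child:
        for p in table.get(c, ()):        # .get avoids adding empty slots
            pairs.append((p, c))
    return pairs

# Hypothetical unsorted sample data.
joined = hash_join([5, 2, 2], [2, 9, 5, 2])
```

The whole parent dataset lives in `table` at once, which mirrors the requirement for a big enough card table.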
Amy b