Question about how to read a SQL Server execution plan

I ran the query and included the actual execution plan. There is one Hash Match that interests me because its subtree uses an Index Scan instead of an Index Seek. When I click on this Hash Match, there is a Probe Residual section. I assumed these are the values that I am joining on. Am I correct here, or is there a better explanation of what this means?

The second question I had was about the indexes it used. In my example, I am sure that this particular join is on two columns. The index that it scans has both of these columns in it, as well as another column that is not used in the join. I was under the impression that this would result in an Index Seek, not a scan. Am I wrong about this?

+6
sql sql-server sql-server-2005 sql-execution-plan
4 answers

A hash join usually (always?) uses a scan, or at least a range scan. It works by scanning both the left and right inputs (or a range of each table): one input is used to build an in-memory hash table containing all the values "seen" during its scan, and the rows of the other input are then probed against that table.
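To make the build and probe phases concrete, here is a minimal Python sketch of a hash join. It is only an illustration of the general technique, not how SQL Server implements it, and the `customers` and `orders` rows are made-up sample data.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Equi-join two lists of dicts on `key` using a build phase and a probe phase."""
    # Build phase: read the whole build input once and bucket its rows by join key.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[key]].append(row)

    # Probe phase: read the whole probe input once and look each key up in the table.
    joined = []
    for row in probe_rows:
        for match in buckets.get(row[key], []):
            joined.append({**match, **row})
    return joined

# Hypothetical sample data
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [
    {"cust_id": 2, "order_id": 10},
    {"cust_id": 2, "order_id": 11},
    {"cust_id": 3, "order_id": 12},
]

result = hash_join(customers, orders, "cust_id")
```

Note that each input is read exactly once from start to end, which is why a scan is the natural access method underneath a hash join.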

What happened in your case: the query optimizer (QO) noticed that it can get all the values of column C from a non-clustered index that contains this column (as a key or as an included column). Being a non-clustered index, it is probably fairly narrow, so the total amount of I/O needed to scan the entire index is not excessive. The QO also considered that the system has enough RAM to hold the hash table in memory. When it compared the cost of this plan (a scan of the non-clustered index from end to end, say 10,000 pages) with the cost of a nested loop that used seeks (say 5,000 probes at 2-3 pages each), the scan won as requiring less I/O. Of course, this is largely speculation on my part, but I'm trying to present the case from the QO's point of view, and the plan is probably optimal.
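The arithmetic behind that comparison, using only the illustrative numbers from the paragraph above (they are guesses, not figures from a real plan), works out roughly like this. The real cost model also weighs things such as CPU and sequential versus random I/O, which this sketch ignores.

```python
# Illustrative page counts taken from the paragraph above, not from a real plan.
scan_pages = 10_000          # one end-to-end scan of the narrow non-clustered index

probes = 5_000               # estimated rows driving the nested loop
pages_per_probe_low = 2      # pages touched per seek, optimistic
pages_per_probe_high = 3     # pages touched per seek, pessimistic

seek_pages_low = probes * pages_per_probe_low    # 10,000 page reads
seek_pages_high = probes * pages_per_probe_high  # 15,000 page reads

# Even in the optimistic case the scan reads no more pages than the seeks do.
scan_is_no_worse = scan_pages <= seek_pages_low
```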

The factors that contributed to this particular choice of plan would be:

  • a large estimated number of candidates on the right side of the join
  • availability of the join column in a narrow non-clustered index for the left side
  • a lot of RAM

For a large estimated number of candidates, the only better choice than a hash join is a merge join, and that requires the inputs to be pre-sorted. If the left side can offer an access path that guarantees order on the joined column, and the right side has a similar possibility, then you may end up with a merge join, which is the fastest join.
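As a rough sketch of why a merge join needs sorted inputs, here is a small Python version. It assumes both lists are already sorted ascending on the join key; the `left` and `right` rows and their column names are hypothetical.

```python
def merge_join(left, right, key):
    """Join two lists of dicts that are ALREADY sorted ascending by `key`."""
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key is behind, advance the left side
        elif lk > rk:
            j += 1          # right key is behind, advance the right side
        else:
            # Find the run of equal keys on the right side...
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # ...and pair it with every left row that has the same key.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    joined.append({**left[i], **r})
                i += 1
            j = j_end
    return joined

# Hypothetical sample data, both sorted by "k"
left = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}, {"k": 2, "a": "z"}, {"k": 4, "a": "w"}]
right = [{"k": 2, "b": 10}, {"k": 2, "b": 20}, {"k": 3, "b": 30}, {"k": 4, "b": 40}]

pairs = [(row["a"], row["b"]) for row in merge_join(left, right, "k")]
```

Both cursors only ever move forward, and no hash table has to be held in memory. If either input were not sorted on the key, the comparisons would skip matching rows, which is why the optimizer needs an ordered access path (or an explicit sort) on both sides.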

+4

This blog post is likely to answer your first question.

As for your second, index scans can be chosen by the optimizer in a number of situations. Off the top of my head:

  • If the index is very small
  • If most of the rows in the index are selected by the query
  • If you use functions in the WHERE clause of your query

In the first two cases, it is more efficient to scan, so the optimizer chooses a scan over a seek. In the third case, the optimizer has no choice.
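The third case can be illustrated with a small Python analogy. A sorted list stands in for the index key column (made-up data): a lookup on the raw value can use binary search, while a predicate on a function of the value cannot use the sort order and has to evaluate every entry.

```python
import bisect

# A sorted list of names stands in for the key column of an index (hypothetical data).
index_keys = sorted(["Adams", "Baker", "Clark", "Davis", "Evans", "Smith", "Young"])

def seek_equal(keys, value):
    """Binary search on the raw key: touches O(log n) entries, like a seek."""
    pos = bisect.bisect_left(keys, value)
    return [keys[pos]] if pos < len(keys) and keys[pos] == value else []

def scan_with_function(keys, func, value):
    """Predicate on func(key): the sort order of the keys says nothing about
    func(key), so every entry has to be evaluated, like a scan."""
    examined = 0
    hits = []
    for k in keys:
        examined += 1
        if func(k) == value:
            hits.append(k)
    return hits, examined

seek_hits = seek_equal(index_keys, "Smith")
scan_hits, examined = scan_with_function(index_keys, str.upper, "SMITH")
```

Both calls find the same row, but the second one has to touch all seven entries to do so.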

+4

1/ A Hash Match means that it takes a hash of the columns used in the equality join, but needs to include all the other columns involved in the join (for >, etc.) so that they can be checked too. These are the residual columns.

2/ An Index Seek can be done if it can go straight to the rows you want. Perhaps you are applying a calculation to the columns and using that? Then it will use the index as a smaller version of the data, but it will still need to check every row (applying the calculation to each one).
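The residual check from point 1 can be sketched in Python as follows. This is a simplified illustration under my own assumptions, not SQL Server's actual implementation: rows are bucketed by a hash of the equality columns, and every candidate found in a bucket is then rechecked against the full join condition. The `prices` and `sales` rows and the `valid_from`/`day` predicate are hypothetical.

```python
from collections import defaultdict

def hash_join_with_residual(build_rows, probe_rows, eq_cols, residual):
    """Hash on the equality columns, then recheck each candidate pair."""
    # Build phase: bucket rows by the hash of their equality-column values.
    buckets = defaultdict(list)
    for b in build_rows:
        buckets[hash(tuple(b[c] for c in eq_cols))].append(b)

    out = []
    for p in probe_rows:
        p_key = tuple(p[c] for c in eq_cols)
        for b in buckets.get(hash(p_key), []):
            # Residual check: landing in the same bucket only means the hashes
            # agree, so recheck real equality plus any non-equality predicate.
            if tuple(b[c] for c in eq_cols) == p_key and residual(b, p):
                out.append((b, p))
    return out

# Hypothetical sample data: join on (region, sku) AND sale day >= price valid_from
prices = [
    {"region": "EU", "sku": "A1", "valid_from": 5},
    {"region": "EU", "sku": "A1", "valid_from": 20},
    {"region": "US", "sku": "A1", "valid_from": 1},
]
sales = [
    {"region": "EU", "sku": "A1", "day": 10},
    {"region": "US", "sku": "B2", "day": 3},
]

matches = hash_join_with_residual(
    prices, sales, ["region", "sku"],
    lambda b, p: p["day"] >= b["valid_from"],
)
```

Here only the EU/A1 price with `valid_from` 5 survives the recheck: the other EU/A1 row hashes to the same bucket but fails the `>=` comparison.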

+2

Check out the great articles on execution plans at simple-talk.com.

They also have a free e-book, SQL Server Execution Plans, available for download.

+2
