How can I optimize a join that sorts on columns from multiple tables in T-SQL?

How can I optimize the following query?

SELECT TOP 50 * FROM A LEFT JOIN B ON A.b_id = B.id ORDER BY A.number, B.name DESC 

I created a non-clustered index on (A.number asc, A.creation_date desc), which includes all columns from A, and another non-clustered index on B.origination_date desc, which includes all columns from B (except for text columns). None of these indexes are used according to the actual execution plan from SQL Server Management Studio.

What seems to be causing the performance hit is the sort on B.origination_date. When I look at the actual execution plan in SQL Server Management Studio, I see that a “Top N Sort” on these three fields takes 91% of the execution time. If I omit the sort on B.origination_date, the query finishes almost instantly using the index on A.

Edit: Updated the query to provide a better, simpler example.

+4
3 answers

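Because you are sorting on columns from two different tables, SQL Server must join the tables first and then sort. Once the tables are joined, the indexes on the individual tables no longer help with the sort. An indexed view may be your best bet, but note that SQL Server does not allow outer joins in an indexed view, so this only applies if the LEFT JOIN can be written as an INNER JOIN.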
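A rough sketch of what an indexed view for this query could look like. Several things here are assumptions and not from the question: the dbo schema, a unique key column A.id, the view and index names, and that B.id is unique so each row of A appears at most once. It also assumes the required SET options (ANSI_NULLS, QUOTED_IDENTIFIER, and so on) are ON and that the key columns fit within the index key size limit.

```sql
-- Assumed names: dbo schema, A.id as a unique key, view/index names are made up.
-- Indexed views require SCHEMABINDING, two-part table names, an explicit
-- column list, and no outer joins, hence the INNER JOIN.
CREATE VIEW dbo.A_B_sorted
WITH SCHEMABINDING
AS
SELECT A.id AS a_id, A.[number], B.[name]
FROM dbo.A AS A
INNER JOIN dbo.B AS B ON A.b_id = B.id;
GO

-- The first index on a view must be unique and clustered.
-- a_id is appended only to make the key unique (assumes one B row per A row).
CREATE UNIQUE CLUSTERED INDEX IX_A_B_sorted
ON dbo.A_B_sorted ([number] ASC, [name] DESC, a_id ASC);
GO

-- NOEXPAND asks the optimizer to read the view's index directly
-- instead of expanding the view back into the base tables.
SELECT TOP 50 a_id, [number], [name]
FROM dbo.A_B_sorted WITH (NOEXPAND)
ORDER BY [number], [name] DESC;
```

The view is maintained on every insert, update and delete against A and B, so this trades write cost for read speed.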

+1

I would suggest that A.number LIKE '%%' is your problem. What is it meant to do? You should not use a wildcard as the first character of a pattern if you want indexes to be used. As it stands, the filter looks pointless anyway, as there is nothing between the wildcards.
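To illustrate the point about leading wildcards, here is a small comparison. The search value 'ABC' is made up, and A.number is assumed to be a character column with an index that leads with it.

```sql
-- Leading wildcard: matches can start anywhere in the value, so an index
-- on A.number can at best be scanned, not seeked.
SELECT TOP 50 * FROM A
WHERE A.number LIKE '%ABC%'
ORDER BY A.number;

-- Prefix pattern: the optimizer can turn this into a range seek
-- on an index that leads with A.number.
SELECT TOP 50 * FROM A
WHERE A.number LIKE 'ABC%'
ORDER BY A.number;

-- A pattern made only of wildcards matches every non-NULL value,
-- so it filters nothing except NULLs. If that is the intent,
-- this states it directly:
SELECT TOP 50 * FROM A
WHERE A.number IS NOT NULL
ORDER BY A.number;
```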

+5

Without hands-on access, it’s hard to come up with a definitive quick fix. Some ideas and suggestions:

Without the join on table B, all SQL Server has to do (with the index on A.Number) is walk through the index until it finds the first 50 rows that match your pattern. The more unique the Number values are (not many duplicates; this is the cardinality), the less Creation_Date matters in the index.

Why the left outer join on B? Is the relationship one to [zero or one], or one to [zero or many]? If the cardinality is low (many duplicates in A), then the join is needed to determine the "first 50" unambiguously; otherwise one would think the join would not affect performance (beyond the cost of doing the join itself). I can’t see any index on B (other than one on the Id column) that would matter here. Um, you do have an index on B.Id, right? If not, that could slow things down a lot (assuming B has a significant number of rows, of course).

For more detail, I’d want to look at the cardinality of the join and the column order, and look very carefully at the execution plan of the query with the join.
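A quick way to check for that index, and to add one if it is missing. The index name is made up, and UNIQUE is an assumption that only holds if B.id really contains no duplicates; if B.id is already the primary key, an index exists and none of this is needed.

```sql
-- List the indexes that currently exist on B.
EXEC sp_helpindex 'B';

-- If nothing covers B.id, create one (hypothetical name).
-- Drop the UNIQUE keyword if B.id can contain duplicates.
CREATE UNIQUE NONCLUSTERED INDEX IX_B_id ON B (id);
```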


Addendum

If A has low cardinality (many duplicates), then the query optimizer may “think” that it has to go through the whole set of B rows to resolve the ordering (which it needs in order to find the top 50). This may explain why it is doing what it is doing.

If they produce 100% equivalent results, I would recommend replacing the LEFT JOIN with an INNER JOIN. In general, query plans can become much simpler when more restrictive join conditions exist.
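One way to spare the optimizer that work is to narrow A down first, and only then join and sort. This is a sketch of that idea, not something tested against the asker's data. Because a LEFT JOIN keeps every row of A, the final top 50 can only come from A rows whose number is no greater than the 50th smallest number, which is exactly what TOP 50 WITH TIES returns. The alias A_top is made up.

```sql
SELECT TOP 50 *
FROM (
    -- Candidate rows only: the 50 smallest numbers plus any ties with
    -- the 50th, which the index on A.number can supply in order.
    SELECT TOP 50 WITH TIES *
    FROM A
    ORDER BY A.number
) AS A_top
LEFT JOIN B ON A_top.b_id = B.id
-- The expensive two-table sort now runs over the small candidate set.
ORDER BY A_top.number, B.name DESC;
```

If A.number has very few distinct values, WITH TIES may still return a large share of the table, in which case the gain is small. The reasoning does not carry over to an INNER JOIN, where candidate rows can be dropped by the join.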
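A simple check for whether the two joins are equivalent on the current data, followed by the rewritten query. A result of zero only proves equivalence for the data as it stands today; a NOT NULL foreign key from A.b_id to B.id would be needed to guarantee it going forward.

```sql
-- Count A rows that have no matching B row (including NULL b_id).
-- If this returns 0, LEFT JOIN and INNER JOIN give the same rows today.
SELECT COUNT(*) AS unmatched_rows
FROM A
WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.id = A.b_id);

-- The original query with the more restrictive join.
SELECT TOP 50 *
FROM A
INNER JOIN B ON A.b_id = B.id
ORDER BY A.number, B.name DESC;
```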

+1
