SQL Server - why is a scan performed twice for the same table?

Question

Does anyone know why SQL Server chooses to access the table twice? Is there an explanation? Can this be done with a single access to the table?

This is a sample code:

    DECLARE @id1stBuild INT = 1
        ,@number1stBuild INT = 2
        ,@idLastBuild INT = 5
        ,@numberLastBuild INT = 1;

    DECLARE @nr TABLE (nr INT);

    INSERT @nr
    VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);

    CREATE TABLE building (
        id INT PRIMARY KEY identity(1, 1)
        ,number INT NOT NULL
        ,idStreet INT NOT NULL
        ,surface INT NOT NULL
        )

    INSERT INTO building (number,idStreet,surface)
    SELECT bl.b
        ,n.nr
        ,abs(convert(BIGINT, convert(VARBINARY, NEWID()))) % 500
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY n1.nr) b
        FROM @nr n1
        CROSS JOIN @nr n2
        CROSS JOIN @nr n3
        ) bl
    CROSS JOIN @nr n

    --***** execution plan for the select below
    SELECT *
    FROM building b
    WHERE b.id = @id1stBuild
        AND b.number = @number1stBuild
        OR b.id = @idLastBuild
        AND b.number = @numberLastBuild

    DROP TABLE building

The execution plan for this is always the same: two Clustered Index Seek operators combined by a Merge Join (Concatenation). The rest is less important. Here is the execution plan:

[image: execution plan]

+5

sql sql-server sql-execution-plan

Emarian Jan 19 '15 at 11:35

3 answers

It does not scan twice. It seeks twice.

Your query is semantically the same as below.

    SELECT *
    FROM building b
    WHERE b.id = @id1stBuild
        AND b.number = @number1stBuild
    UNION
    SELECT *
    FROM building b
    WHERE b.id = @idLastBuild
        AND b.number = @numberLastBuild

The execution plan performs two seeks and unions the results.
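
One way to check the seek-versus-scan distinction without the graphical plan is to ask SQL Server for the plan as a result set. This is only a sketch: it assumes the building table from the question still exists and is populated (that is, the DROP TABLE has not been run yet), and the exact operators shown can vary with version and statistics.

```sql
-- Assumption: the building table from the question exists and is populated.
DECLARE @id1stBuild INT = 1
    ,@number1stBuild INT = 2
    ,@idLastBuild INT = 5
    ,@numberLastBuild INT = 1;

-- Return the executed plan as an extra result set after the query results.
SET STATISTICS PROFILE ON;

SELECT *
FROM building b
WHERE b.id = @id1stBuild
    AND b.number = @number1stBuild
    OR b.id = @idLastBuild
    AND b.number = @numberLastBuild;

SET STATISTICS PROFILE OFF;
```

In the extra result set, the StmtText column lists one row per operator. For the plan described in the question it should name Clustered Index Seek twice under a Merge Join (Concatenation), rather than Clustered Index Scan.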

+5

Martin Smith Jan 19 '15 at 11:56

Why is scanning performed twice for the same table?

It is not a scan, it is a seek, and that makes a difference.

The optimizer implements the OR as a UNION, and then implements the UNION via a Merge Join. This is called a 'merge union':

Merge Union
Now let's change the query slightly:

    select a from T where b = 1 or c = 3

      |--Stream Aggregate(GROUP BY:([T].[a]))
           |--Merge Join(Concatenation)
                |--Index Seek(OBJECT:([T].[Tb]), SEEK:([T].[b]=(1)) ORDERED FORWARD)
                |--Index Seek(OBJECT:([T].[Tc]), SEEK:([T].[c]=(3)) ORDERED FORWARD)

Instead of the concatenation and sort distinct operators, we now have a merge join (concatenation) and a stream aggregate. What happened? The merge join (concatenation), or "merge union", is not really a join at all. It is implemented by the same iterator as the merge join, but it actually performs a UNION ALL while preserving the order of the input rows. Finally, we use the stream aggregate to eliminate duplicates. (See this post to learn more about using a stream aggregate to eliminate duplicates.) This plan is generally the better choice, since the sort distinct uses memory and can spill data to disk if it runs out of memory, while the stream aggregate does not use memory.
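
To make the quoted plan more concrete, here is a hedged sketch of a schema that would fit it. The table and index definitions are assumptions chosen only to match the object names in the plan (T, Tb, Tc); the clustered index Ta is hypothetical and not shown in the quote. Whether the optimizer actually picks a merge union depends on data volume and statistics.

```sql
-- Hypothetical schema; only the names T, Tb and Tc come from the quoted plan.
CREATE TABLE T (
    a INT NOT NULL
    ,b INT NOT NULL
    ,c INT NOT NULL
    );

-- Assumption: a is the unique clustering key, so each nonclustered index
-- also carries a, and a seek on b = 1 (or c = 3) returns rows ordered by a.
CREATE UNIQUE CLUSTERED INDEX Ta ON T (a);
CREATE INDEX Tb ON T (b);
CREATE INDEX Tc ON T (c);

-- The OR query from the quote:
SELECT a FROM T WHERE b = 1 OR c = 3;

-- What the merge union plan computes logically: two ordered seeks,
-- concatenated in order, with duplicates on a removed.
SELECT a FROM T WHERE b = 1
UNION
SELECT a FROM T WHERE c = 3;

DROP TABLE T;
```

The two SELECT statements return the same rows here only because a is unique in this sketch: a row that satisfies both b = 1 and c = 3 comes out of both seeks, and removing duplicates on a collapses it back to one row.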

+3

Remus Rusanu Jan 19 '15 at 11:59

Steve Ford · Accepted Answer · 2015-01-19T12:05:15+0000

You can try the following, which gives only one seek and a slight performance improvement. As @Martin_Smith says, what you have coded is equivalent to a UNION.

    SELECT *
    FROM building b
    WHERE b.id IN (@id1stBuild, @idLastBuild)
        AND (
            (b.id = @id1stBuild AND b.number = @number1stBuild)
            OR (b.id = @idLastBuild AND b.number = @numberLastBuild)
            )
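
A rough way to compare the two forms is to run them back to back with I/O statistics switched on. This is a sketch, assuming the building table from the question exists and is populated; on a table this small the difference may be too slight to measure.

```sql
-- Assumption: the building table from the question exists and is populated.
DECLARE @id1stBuild INT = 1
    ,@number1stBuild INT = 2
    ,@idLastBuild INT = 5
    ,@numberLastBuild INT = 1;

-- Report logical reads per statement in the messages output.
SET STATISTICS IO ON;

-- Original form from the question.
SELECT *
FROM building b
WHERE b.id = @id1stBuild
    AND b.number = @number1stBuild
    OR b.id = @idLastBuild
    AND b.number = @numberLastBuild;

-- Rewritten form with the IN list on the primary key.
SELECT *
FROM building b
WHERE b.id IN (@id1stBuild, @idLastBuild)
    AND (
        (b.id = @id1stBuild AND b.number = @number1stBuild)
        OR (b.id = @idLastBuild AND b.number = @numberLastBuild)
        );

SET STATISTICS IO OFF;
```

Both statements should return the same rows; the logical reads reported for each, together with the actual execution plans, show whether the rewrite changes how the table is accessed on a given server.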
