SQL Server: estimated number of rows

I get weird execution plan behavior from SQL Server (2005).

Table name: LOG
... contains about 1000 rows

  • ID int
  • Name varchar (50)

Query:

SELECT * FROM (SELECT ROW_NUMBER() OVER (ORDER BY ID DESC) as Row, ID, Name FROM Log) AS LogWithRowNumbers WHERE Row >= 1 AND Row <= 2 

The plan estimates the number of rows returned as 9 (although it is obviously 2 or fewer).
In addition, removing "AND Row <= 2" makes the query take roughly 5 times longer. ("AND Row <= 2" and "AND Row <= 9999999999999" behave the same.)

I have updated the statistics, but the behavior is still strange. Why would adding "Row < 99999999999" make the query faster?

+7
sql sql-server tsql
2 answers

I am not an expert on SQL Server internals or query optimization, but here are my 2 pence (or cents, if you prefer).

I believe this is due to filtering on the ROW_NUMBER() value in the WHERE clause. To test it, I created a sample table filled with 1000 rows, with IDs from 1 to 1000 (ID as the primary key), as you described.
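For anyone who wants to reproduce this, a test table along these lines should do. This is only a sketch: the 'Entry N' values in Name and the NOT NULL constraints are my own assumptions, since the question only gives the column names and types.

```sql
-- Sample table matching the columns described in the question.
CREATE TABLE Log (
    ID   int         NOT NULL PRIMARY KEY,
    Name varchar(50) NOT NULL
);

-- Fill it with 1000 rows, IDs 1 to 1000.
-- DECLARE and SET are kept separate so this also runs on SQL Server 2005.
DECLARE @i int;
SET @i = 1;

WHILE @i <= 1000
BEGIN
    -- 'Entry N' is a placeholder value, not data from the original table.
    INSERT INTO Log (ID, Name)
    VALUES (@i, 'Entry ' + CAST(@i AS varchar(10)));

    SET @i = @i + 1;
END;
```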

If you take out ROW_NUMBER() and filter on the ID column instead, for example:

 Select * FROM ( SELECT ID, Name FROM Log ) as LogWithRowNumbers WHERE ID>=1 and ID<=2 

Then it correctly shows the number of rows as 2 - as expected.

Now, working backwards, add ROW_NUMBER() to the inner SELECT, but leave the WHERE clause as is:

 Select * FROM ( SELECT ROW_NUMBER() OVER (ORDER BY ID DESC) AS RowNo, ID, Name FROM Log ) as LogWithRowNumbers WHERE ID>=1 AND ID <=2 

This still shows the correct number of rows as 2.

Finally, change the WHERE clause to filter on RowNo instead of ID. This is the point at which the estimated number of rows changes to 9.

Therefore, I believe the cause is filtering on the result of the ROW_NUMBER() function in the WHERE clause. I would imagine this is because the actual table columns have better, more accurate statistics than an expression generated by a function.
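One way to see the estimate next to the real row count is SET STATISTICS PROFILE, which returns the plan as an extra result set with Rows (actual) and EstimateRows columns for each operator. A rough sketch, reusing the Log table and the RowNo alias from the queries above:

```sql
-- Return the plan with actual and estimated row counts for each operator.
SET STATISTICS PROFILE ON;

-- Filter on the table column.
SELECT *
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY ID DESC) AS RowNo, ID, Name
    FROM Log
) AS LogWithRowNumbers
WHERE ID >= 1 AND ID <= 2;

-- Filter on the ROW_NUMBER() expression.
SELECT *
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY ID DESC) AS RowNo, ID, Name
    FROM Log
) AS LogWithRowNumbers
WHERE RowNo >= 1 AND RowNo <= 2;

SET STATISTICS PROFILE OFF;
```

Comparing the Rows and EstimateRows columns of the two plans shows where the estimate and the actual count diverge.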

I hope this is at least a good starting point and that it helps!

+6

AdaTheDev is right. You see this behavior because SQL Server must work out the row numbers for the table before it can use them in the WHERE clause.

This should be a more efficient way to get the same results:

  SELECT TOP(2) ROW_NUMBER() OVER (ORDER BY ID DESC) as Row, ID, Name FROM Log ORDER BY ID DESC 
+2