SQL: Weird query performance after "select into" a new table

I am seeing a strange situation, described below:

I have a huge table in a database, called "Table1". I duplicated it exactly with the following code.

Select * into Table2 from Table1

After that, I find that query performance is dramatically different.

Select count (distinct ID) from Table1 takes almost 2 minutes. (Old table)

At the same time, Select count (distinct ID) from Table2 only takes about 10 seconds to complete (New table)

By the way, I found that the data was reordered in the new table after the "select into". Also, before I ran "select into" to create the new table, a column had been added to Table 1 (the old table) (that is, an alter table: add col1 as col2).

So what is going on?

(NB: the original version of this question stated that the new table was the slow one. That was a mistake. In addition, it did not mention the data manipulation on Table 1.)


Responses to requests for more information

This is the result of Sebastian's code.

    SELECT  QUOTENAME(OBJECT_SCHEMA_NAME(t.object_id)) + '.' + QUOTENAME(t.name) tbl,
            s.name stats_name,
            cols.cols,
            t.create_date table_date,
            STATS_DATE(s.object_id, s.stats_id) AS statistics_date,
            s.auto_created,
            s.user_created,
            s.no_recompute,
            s.has_filter,
            s.filter_definition
    FROM    sys.tables t
    LEFT OUTER JOIN sys.stats s
            ON s.object_id = t.object_id
    OUTER APPLY (
            SELECT STUFF((SELECT ',' + c.name
                          FROM   sys.stats_columns sc
                          JOIN   sys.columns c
                                 ON  sc.column_id = c.column_id
                                 AND sc.object_id = c.object_id
                          WHERE  sc.object_id = s.object_id
                            AND  sc.stats_id = s.stats_id
                          ORDER BY sc.stats_column_id
                          FOR XML PATH(''), TYPE
                         ).value('.', 'NVARCHAR(MAX)'), 1, 1, '') cols
            ) cols
    --Update Table Name(s) here:
    WHERE   t.OBJECT_ID IN (
                OBJECT_ID('[Sales].[SpecialOffer]'),
                OBJECT_ID('[Sales].[SalesOrderDetail]')
            );

and

    SELECT  name,
            compatibility_level,
            is_auto_close_on,
            is_auto_shrink_on,
            state_desc,
            is_auto_create_stats_on,
            is_auto_update_stats_on,
            is_auto_update_stats_async_on
    FROM    sys.databases
    WHERE   database_id = DB_ID();

Actually, I am copying the table into a new table in another database, and the table is actually called ID2000.

The upper image refers to "Table 1" (Database 1) the lower image refers to "Table 2" (Database 2)

"Table1"

"Table2"


Well, since the XML plan is too long, here is an alternative listing, following Hamlet's advice. I am using SET SHOWPLAN_ALL ON (followed by GO) instead of pasting all of the XML. Hope this helps.

The red one is the "Table 1" plan, and the black one is the "Table 2" plan. The text in the image is a little small, but zooming in on the page will enlarge it.

Thank you very much! Figure 1


Result of SELECT * FROM sys.dm_db_index_physical_stats(db_id(),object_id('YourTable'),NULL,NULL,'DETAILED'):

Indeed, there is a huge difference between the two tables. As before, red refers to "Table 1" and the other to "Table 2".

This problem is quite annoying and is driving me crazy, because I keep asking myself whether I should rebuild the whole table or not. :(

This is actually rather strange: note that record_count is different. However, when I double-check with select COUNT (ID) from id2000 (i.e. count all the rows in the table), both tables return 2324798, which is the record_count of Table 2.

In addition, "Table2" was created with the "select * into" statement, so I believed the two should be the same, but now I'm confused.

The table above is the output of Sebastian's code (Running stat).


3 answers

So, now that we have established that the old table was the slow one and not the new one, everything points to an extremely high number of forwarded records.

To remove the forwarded records, you can use this statement:

    ALTER TABLE dbo.Table1 REBUILD;

Adding a column to the heap most likely caused each row to be moved, resulting in a very large number of forwarded records. The forwarded_record_count column returned by the sys.dm_db_index_physical_stats DMV shows the number of forwarded records: almost every row in your case.
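To look at just the relevant columns, something like the following could be used. This is only a sketch: dbo.Table1 is a placeholder for the real table name, and the query is meant to be run in the database that holds the table. As far as I know, record_count on a heap also counts the forwarding stubs, so it can be larger than what COUNT returns, which may be the mismatch noted in the question.

```sql
-- Sketch: inspect forwarded records on a heap.
-- dbo.Table1 is a placeholder; replace it with the real table name.
SELECT  index_type_desc,                 -- 'HEAP' for a table without a clustered index
        alloc_unit_type_desc,
        page_count,
        record_count,                    -- on a heap this may exceed the number of rows
        forwarded_record_count,          -- NULL in LIMITED mode, so SAMPLED or DETAILED is needed
        avg_page_space_used_in_percent
FROM    sys.dm_db_index_physical_stats(
            DB_ID(),                     -- current database
            OBJECT_ID(N'dbo.Table1'),
            0,                           -- index_id 0 = the heap itself
            NULL,                        -- all partitions
            'DETAILED');
```

Note that DETAILED mode reads every page of the table, so on a huge table it takes a while.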

A SELECT * INTO does not copy the forwarding pointers; it writes the rows out fresh into the new table. Hence the difference in performance that you saw.
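If you want to see the effect in isolation, a small repro along these lines should show it. It is a sketch under assumptions: dbo.FwdDemo and dbo.FwdDemoCopy are made-up names, the row count and column widths are arbitrary, and GO is the batch separator used by SSMS and sqlcmd. Run it in a scratch database.

```sql
-- Hypothetical repro: dbo.FwdDemo and dbo.FwdDemoCopy are made-up names.
CREATE TABLE dbo.FwdDemo (ID int NOT NULL, Filler varchar(100) NOT NULL);  -- no clustered index: a heap
GO
-- Fill the heap with narrow rows so the pages are densely packed.
INSERT INTO dbo.FwdDemo (ID, Filler)
SELECT TOP (100000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'x'
FROM   sys.all_columns a CROSS JOIN sys.all_columns b;
GO
-- Add a column and populate it, which widens every row in place.
ALTER TABLE dbo.FwdDemo ADD Col2 varchar(200) NULL;
GO
UPDATE dbo.FwdDemo SET Col2 = REPLICATE('y', 200);
GO
-- Rows that no longer fit on their page are moved and leave a forwarding stub behind.
SELECT 'original' AS tbl, page_count, record_count, forwarded_record_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.FwdDemo'), 0, NULL, 'DETAILED');
GO
-- Copy the table; the copy is written as a brand-new heap.
SELECT * INTO dbo.FwdDemoCopy FROM dbo.FwdDemo;
GO
SELECT 'copy' AS tbl, page_count, record_count, forwarded_record_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.FwdDemoCopy'), 0, NULL, 'DETAILED');
GO
-- Clean up the demo tables.
DROP TABLE dbo.FwdDemoCopy;
DROP TABLE dbo.FwdDemo;
GO
```

I would expect the first result to report a large forwarded_record_count and the second to report none, but the exact numbers depend on the data and the page layout.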

While we are on the subject of forwarded records: in most cases it is a very good idea to have a clustered index on a table. That avoids problems like this one.

In your case, the ID column appears to be a candidate for the clustered primary key (if it is unique), but I would need to know more about the model to give you a recommendation here.
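As a rough sketch of how that could look, assuming the table is dbo.Table1 and the column is ID (the constraint and index names PK_Table1 and CIX_Table1_ID are made up for illustration):

```sql
-- Step 1: check whether ID could serve as a key (no NULLs, no duplicates).
-- All three numbers must be equal for ID to qualify.
SELECT  COUNT(*)           AS total_rows,
        COUNT(ID)          AS non_null_ids,
        COUNT(DISTINCT ID) AS distinct_ids
FROM    dbo.Table1;
GO
-- Step 2: if they match and ID is declared NOT NULL, make it the clustered primary key.
-- PK_Table1 is a hypothetical constraint name.
ALTER TABLE dbo.Table1
    ADD CONSTRAINT PK_Table1 PRIMARY KEY CLUSTERED (ID);
GO
-- Alternative if ID is not unique: a non-unique clustered index still turns the heap
-- into a clustered table. CIX_Table1_ID is a hypothetical index name.
-- CREATE CLUSTERED INDEX CIX_Table1_ID ON dbo.Table1 (ID);
```

Building a clustered index rewrites the whole table, so on a huge table it needs time and free space; it also gets rid of the existing forwarded records as a side effect, since a table with a clustered index has none.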


Another thing to try: run this and post the text output as well as the query results. As always, be sure to replace Table1 and Table2 with the real names. In this case you also need to replace the database names.

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    GO
    SELECT COUNT(DISTINCT ID) FROM DB1.dbo.Table1
    GO
    SELECT COUNT(DISTINCT ID) FROM DB2.dbo.Table2
    GO
    SELECT COUNT(DISTINCT ID) FROM DB1.dbo.Table1
    GO
    SELECT COUNT(DISTINCT ID) FROM DB2.dbo.Table2
    GO
    SELECT COUNT(DISTINCT ID) FROM DB1.dbo.Table1
    GO
    SELECT COUNT(DISTINCT ID) FROM DB2.dbo.Table2
    GO
    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;
    GO
    SELECT * FROM sys.dm_db_index_operational_stats(DB_ID('DB1'),OBJECT_ID('DB1.dbo.Table1'),NULL,NULL);
    SELECT * FROM sys.dm_db_index_operational_stats(DB_ID('DB2'),OBJECT_ID('DB2.dbo.Table2'),NULL,NULL);

I assume this is due to outdated statistics, but we need more information about your environment. Could you run these two queries and post the results? Make sure you use the names of your two tables instead of the two supplied.

    SELECT  QUOTENAME(OBJECT_SCHEMA_NAME(t.object_id)) + '.' + QUOTENAME(t.name) tbl,
            s.name stats_name,
            cols.cols,
            t.create_date table_date,
            STATS_DATE(s.object_id, s.stats_id) AS statistics_date,
            s.auto_created,
            s.user_created,
            s.no_recompute,
            s.has_filter,
            s.filter_definition
    FROM    sys.tables t
    LEFT OUTER JOIN sys.stats s
            ON s.object_id = t.object_id
    OUTER APPLY (
            SELECT STUFF((SELECT ',' + c.name
                          FROM   sys.stats_columns sc
                          JOIN   sys.columns c
                                 ON  sc.column_id = c.column_id
                                 AND sc.object_id = c.object_id
                          WHERE  sc.object_id = s.object_id
                            AND  sc.stats_id = s.stats_id
                          ORDER BY sc.stats_column_id
                          FOR XML PATH(''), TYPE
                         ).value('.', 'NVARCHAR(MAX)'), 1, 1, '') cols
            ) cols
    --Update Table Name(s) here:
    WHERE   t.OBJECT_ID IN (
                OBJECT_ID('[Sales].[SpecialOffer]'),
                OBJECT_ID('[Sales].[SalesOrderDetail]')
            );

and

    SELECT  name,
            compatibility_level,
            is_auto_close_on,
            is_auto_shrink_on,
            state_desc,
            is_auto_create_stats_on,
            is_auto_update_stats_on,
            is_auto_update_stats_async_on
    FROM    sys.databases
    WHERE   database_id = DB_ID();
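If the statistics dates do turn out to be old, one quick way to test the theory would be to refresh them on the slow table and time the query again. A minimal sketch, assuming the slow table is dbo.Table1 (a placeholder name):

```sql
-- Sketch: rebuild all statistics on the table from a full scan of its data.
-- dbo.Table1 is a placeholder; replace it with the real table name.
UPDATE STATISTICS dbo.Table1 WITH FULLSCAN;
GO
-- Re-run the slow query with timing enabled to see whether anything changed.
SET STATISTICS TIME ON;
GO
SELECT COUNT(DISTINCT ID) FROM dbo.Table1;
GO
SET STATISTICS TIME OFF;
GO
```

If the timing does not change after this, statistics are probably not the cause and the physical layout of the table is the next thing to check.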
