Best way to update table schema for huge tables (SQL Server)

I have some huge tables in a SQL Server 2005 production database that require a schema update. This is basically adding columns with default values and some column type changes that need only a simple conversion. All of it can be done with a simple SELECT INTO, where the target is a table with the new schema.
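A minimal sketch of that kind of SELECT INTO, with illustrative table and column names (BigTable, Amount, and NewFlag are assumptions, not the actual schema):

    -- Copy the data into a table with the new shape: one column gets a type
    -- conversion, one new column is populated with its default value.
    SELECT
        Id,
        CAST(Amount AS decimal(18, 4)) AS Amount,   -- column type change
        CAST(0 AS int)                 AS NewFlag   -- new column, default value
    INTO dbo.BigTable_New
    FROM dbo.BigTable;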

Our tests so far show that even this simple operation, performed entirely inside the server (no data being pulled to or pushed from a client), can take hours, if not days, on a table with many millions of rows.

Is there a better update strategy for such tables?

Edit 1: We are still experimenting and have no final result yet. To complicate matters, one of the conversions into the new table involves merging every five rows into one, and some code has to run for each converted row. The best performance we have managed so far would still take at least several days to convert a 30M-row table.

Would using SQLCLR in this case (doing the conversion with code running inside the server) give me a significant speedup?
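For the five-rows-into-one conversion, a set-based sketch may be worth timing before reaching for SQLCLR. This assumes the rows have an ordering key and that the merge is an aggregate such as SUM; the names and the aggregate are placeholders for the real logic:

    -- Number the rows, bucket them in groups of five, aggregate each bucket.
    WITH Numbered AS (
        SELECT Id,
               Amount,
               ROW_NUMBER() OVER (ORDER BY Id) - 1 AS rn
        FROM dbo.BigTable
    )
    SELECT rn / 5      AS GroupId,
           SUM(Amount) AS Amount   -- replace with the actual merge logic
    INTO dbo.BigTable_Merged
    FROM Numbered
    GROUP BY rn / 5;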

+4
5 answers

We had a similar problem, and I found that the fastest way was to export the data to delimited files in chunks (depending on row size; in our case each file held 500,000 rows), applying any conversions at export time, dropping and re-creating the table with the new schema, and then importing the files with bcp.

A 30-million-row table took a couple of hours with this method, whereas an ALTER TABLE took more than 30 hours.
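Roughly what that pipeline looks like from the command line (server, database, file, and column names are illustrative, and the chunking into 500,000-row files is omitted here):

    rem Export with a query that applies the conversions, in native format.
    bcp "SELECT Id, CAST(Amount AS decimal(18,4)), 0 FROM MyDb.dbo.BigTable" queryout bigtable.dat -n -S MyServer -T

    rem Drop and re-create the table with the new schema (script not shown).
    sqlcmd -S MyServer -d MyDb -i recreate_bigtable.sql

    rem Bulk load back in, batched and with a table lock.
    bcp MyDb.dbo.BigTable in bigtable.dat -n -S MyServer -T -b 50000 -h "TABLOCK"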

+3

Do you apply the indexes immediately or in a second phase? The load should go much faster without the indexes in place during the build.
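For example, dropping a nonclustered index before the load and rebuilding it afterwards (index and table names are illustrative):

    -- Drop nonclustered indexes before the bulk copy...
    DROP INDEX IX_BigTable_New_Amount ON dbo.BigTable_New;

    -- ...load the data here (INSERT ... SELECT, bcp, etc.)...

    -- ...then rebuild them once the data is in place.
    CREATE NONCLUSTERED INDEX IX_BigTable_New_Amount
        ON dbo.BigTable_New (Amount);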

+3

Have you tried using ALTER TABLE instead of moving the data to a new table? Why do you need the SELECT INTO at all? Just change the current structure in place.
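If the in-place route fits your case, it is just a couple of statements (names are illustrative; note that on SQL Server 2005 adding a NOT NULL column with a default still writes every row, so it is not free on a huge table):

    -- Add the new column with its default in place.
    ALTER TABLE dbo.BigTable
        ADD NewFlag int NOT NULL CONSTRAINT DF_BigTable_NewFlag DEFAULT (0);

    -- Change the column type in place (any indexes or constraints that
    -- reference the column must be dropped first).
    ALTER TABLE dbo.BigTable
        ALTER COLUMN Amount decimal(18, 4);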

+3

Add a column that allows NULL, then manually update it to the default value, and finally alter the table to add the default constraint. That way you stay in control of the updates and can do them in small batches.
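A sketch of that pattern, with illustrative names and batch size:

    -- 1. Add the column as NULLable: a cheap, metadata-only change.
    ALTER TABLE dbo.BigTable ADD NewFlag int NULL;

    -- 2. Backfill the value in small batches to keep transactions short.
    WHILE 1 = 1
    BEGIN
        UPDATE TOP (50000) dbo.BigTable
        SET NewFlag = 0
        WHERE NewFlag IS NULL;

        IF @@ROWCOUNT = 0 BREAK;
    END;

    -- 3. Attach the default so future inserts get the value automatically.
    ALTER TABLE dbo.BigTable
        ADD CONSTRAINT DF_BigTable_NewFlag DEFAULT (0) FOR NewFlag;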

0

I have a similar problem, which is probably common enough.

Our database caches the results of a remote stored procedure, which sometimes expands with new fields.

The table runs to millions of rows (and is now up to about 80 fields) and has multiple indexes. Having played with #temp tables and the like (even using bcp to temporary files), I settled on the select-into-a-new-table approach:

  • create a new table with the new structure
  • select the data into that table
  • drop the original table
  • rename the new table to the old name
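The tail end of that sequence might look like this (names are illustrative; INSERT ... SELECT is used here because the new table is pre-created with the exact new schema):

    -- Copy the data across, applying any conversions.
    INSERT INTO dbo.BigTable_New (Id, Amount, NewFlag)
    SELECT Id, CAST(Amount AS decimal(18, 4)), 0
    FROM dbo.BigTable;

    -- Swap the tables.
    BEGIN TRANSACTION;
        DROP TABLE dbo.BigTable;
        EXEC sp_rename 'dbo.BigTable_New', 'BigTable';
    COMMIT TRANSACTION;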
0
