Accelerate SQL INSERT

I have the following way of inserting millions of rows of data into a table (I am using SQL Server 2008), and it seems slow. Is there any way to speed up the INSERTs?

Here is a snippet of code - I use the corporate MS library

    public void InsertHistoricData(List<NasdaqHistoricDataRow> dataRowList)
    {
        string sql = @"INSERT INTO [MyTable]
                       ([Date],[Open],[High],[Low],[Close],[Volumn])
                       VALUES (@DateVal, @OpenVal, @High, @Low, @CloseVal, @Volumn)";

        DbCommand dbCommand = VictoriaDB.GetSqlStringCommand(sql);

        DB.AddInParameter(dbCommand, "DateVal", DbType.Date);
        DB.AddInParameter(dbCommand, "OpenVal", DbType.Currency);
        DB.AddInParameter(dbCommand, "High", DbType.Currency);
        DB.AddInParameter(dbCommand, "Low", DbType.Currency);
        DB.AddInParameter(dbCommand, "CloseVal", DbType.Currency);
        DB.AddInParameter(dbCommand, "Volumn", DbType.Int32);

        foreach (NasdaqHistoricDataRow dataRow in dataRowList)
        {
            DB.SetParameterValue(dbCommand, "DateVal", dataRow.Date);
            DB.SetParameterValue(dbCommand, "OpenVal", dataRow.Open);
            DB.SetParameterValue(dbCommand, "High", dataRow.High);
            DB.SetParameterValue(dbCommand, "Low", dataRow.Low);
            DB.SetParameterValue(dbCommand, "CloseVal", dataRow.Close);
            DB.SetParameterValue(dbCommand, "Volumn", dataRow.Volumn);
            DB.ExecuteNonQuery(dbCommand); // one round trip per row
        }
    }
sql-server-2008
4 answers

Use a bulk insert instead.

SqlBulkCopy lets you efficiently load a SQL Server table with data from another source. The SqlBulkCopy class can only write data to SQL Server tables, but the source is not limited to SQL Server; any data source can be used, as long as the data can be loaded into a DataTable instance or read with an IDataReader instance. For this example the file contains roughly 1000 records, but the same code can handle much larger amounts of data.

In this example, a DataTable is first created and populated with the data; it is held entirely in memory.

    DataTable dt = new DataTable();
    string line = null;
    bool firstRow = true;

    using (StreamReader sr = File.OpenText(@"c:\temp\table1.csv"))
    {
        while ((line = sr.ReadLine()) != null)
        {
            string[] data = line.Split(',');
            if (data.Length > 0)
            {
                // Use the first row to decide how many columns the table needs
                if (firstRow)
                {
                    foreach (var item in data)
                    {
                        dt.Columns.Add(new DataColumn());
                    }
                    firstRow = false;
                }
                DataRow row = dt.NewRow();
                row.ItemArray = data;
                dt.Rows.Add(row);
            }
        }
    }

Then we push the DataTable to the server in a single call.

    string connectionString = ConfigurationManager
        .ConnectionStrings["ConsoleApplication3.Properties.Settings.daasConnectionString"]
        .ConnectionString;

    using (SqlConnection cn = new SqlConnection(connectionString))
    {
        cn.Open();
        using (SqlBulkCopy copy = new SqlBulkCopy(cn))
        {
            // Map source column ordinals to destination column ordinals
            copy.ColumnMappings.Add(0, 0);
            copy.ColumnMappings.Add(1, 1);
            copy.ColumnMappings.Add(2, 2);
            copy.ColumnMappings.Add(3, 3);
            copy.ColumnMappings.Add(4, 4);
            copy.DestinationTableName = "Censis";
            copy.WriteToServer(dt);
        }
    }
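
If the load is still slower than you would like, SqlBulkCopy has a few knobs worth experimenting with. A minimal sketch, reusing the connection string and destination table from above; the batch size and timeout values are arbitrary starting points, not recommendations:

    using (SqlConnection cn = new SqlConnection(connectionString))
    {
        cn.Open();
        // TableLock takes a bulk-update lock on the destination table, which
        // generally speeds up large loads when nothing else needs the table.
        using (SqlBulkCopy copy = new SqlBulkCopy(cn, SqlBulkCopyOptions.TableLock, null))
        {
            copy.DestinationTableName = "Censis";
            copy.BatchSize = 10000;      // rows per batch sent to the server; tune for your data
            copy.BulkCopyTimeout = 600;  // seconds; the 30-second default can be too short
            copy.WriteToServer(dt);
        }
    }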

One general tip for any relational database when performing a large number of inserts, or indeed any bulk change to the data, is to drop all your secondary indexes first and then recreate them afterwards.

Why does this work? With secondary indexes in place, the index data lives in a different place on disk from the table data, so at best each record written forces an extra read/write to update the index. In practice it can be much worse, because from time to time the database decides it needs to carry out a more serious index reorganization.

When you recreate the index at the end of the insert run, the database performs just one full table scan to read and process the data. Not only do you end up with a better-organized index on disk, the total amount of work required is also less.

When is it worth it? That depends on your database, the index structure, and other factors (for example, whether your indexes are on a separate disk from your data), but my rule of thumb is to consider it when I am processing more than 10% of the records in a table of a million rows or more, and then verify with test inserts whether it actually pays off.
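
To make the pattern concrete, here is a minimal sketch of dropping an index before the load and recreating it afterwards, assuming connectionString is defined as in the SqlBulkCopy answer above. The index name IX_MyTable_Date, its definition, and the BulkLoadRows helper are hypothetical placeholders for your own schema and load routine:

    using (SqlConnection cn = new SqlConnection(connectionString))
    {
        cn.Open();

        // Drop the secondary index before the big load (hypothetical index name)
        using (SqlCommand drop = new SqlCommand("DROP INDEX IX_MyTable_Date ON [MyTable]", cn))
        {
            drop.ExecuteNonQuery();
        }

        BulkLoadRows(cn); // placeholder: your bulk load goes here

        // Recreate it afterwards: one pass over the finished table
        using (SqlCommand create = new SqlCommand("CREATE INDEX IX_MyTable_Date ON [MyTable] ([Date])", cn))
        {
            create.ExecuteNonQuery();
        }
    }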

Of course, any particular database will also provide its own specialized bulk-load facilities, and you should look at those as well.


FYI: looping over a record set and performing a million-plus row-by-row inserts into a relational database is the worst-case scenario for loading a table. Some languages now offer set-based objects for this kind of work. For maximum performance SMINK is right: use BULK INSERT. Millions of rows can be loaded in minutes rather than hours, orders of magnitude faster than any other method.

As an example, I worked on an eCommerce project that required the product list to be refreshed every night. 100,000 rows inserted into a high-end Oracle database took 10 hours. If I remember correctly, the top speed for row-by-row inserts was about 10 rows/sec. Painfully slow and completely unnecessary. With a bulk insert, 100,000 rows should take less than a minute.
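
For reference, a minimal sketch of what issuing BULK INSERT from client code could look like, assuming connectionString is defined as in the first answer. The file path, table name, and CSV layout are assumptions for illustration, and note that the path must be visible to the SQL Server machine, not the client:

    const string bulkInsertSql = @"
        BULK INSERT [MyTable]
        FROM 'c:\temp\table1.csv'
        WITH (
            FIELDTERMINATOR = ',',
            ROWTERMINATOR = '\n',
            TABLOCK
        )";

    using (SqlConnection cn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand(bulkInsertSql, cn))
    {
        cn.Open();
        cmd.CommandTimeout = 0; // no timeout; large loads can run for a while
        cmd.ExecuteNonQuery();
    }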

Hope this helps.


Where does the data come from? Could you load it with a bulk insert? If so, that is the best option available to you.

