Quick import of a CSV file into a database table

I have implemented an import function that takes data from a CSV file in an ASP.NET application. File size can vary from a few kilobytes to 10 MB.

However, importing a file with more than 50,000 records takes about 20 minutes, which is far too long. I need to import about 300,000 records in 2-3 minutes.

I know that import speed also depends on the physical memory of the db server. I currently generate bulk insert scripts and execute them. I also know that SqlBulkCopy would be another option, but in my case it is not only inserts that take place but also updates and deletes: a field called "FUNCTION CODE" determines whether each record should be inserted, updated or deleted.

Any suggestions on how to do this would be greatly appreciated.

One approach would be to introduce multiple threads that execute the work at the same time, but I have never worked with threads before and so am not aware of the complications I would incur when implementing them.

Thanks and Regards, Francis P.


SqlBulkCopy will definitely be the fastest. I would approach this by inserting data into a temporary table in the database. Once the data is in the temp table, you can use SQL to merge / insert / delete accordingly.
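For illustration, a minimal sketch of this approach, assuming the CSV has already been parsed into a DataTable; the table and column names (dbo.ProductStaging, dbo.Product, Id, Name, FunctionCode) are hypothetical placeholders for the real schema:

    using System.Data;
    using System.Data.SqlClient;

    static void BulkImport(DataTable csvRows, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // 1. Bulk-load the parsed CSV (including the FUNCTION CODE column)
            //    into the staging table.
            using (var bulk = new SqlBulkCopy(conn))
            {
                bulk.DestinationTableName = "dbo.ProductStaging";
                bulk.BatchSize = 5000; // stream in batches rather than one huge round trip
                bulk.WriteToServer(csvRows);
            }

            // 2. Apply all changes set-based. On SQL Server 2008 a single MERGE
            //    can honour the FUNCTION CODE; on 2005, run separate
            //    INSERT/UPDATE/DELETE statements joined to the staging table instead.
            const string merge = @"
                MERGE dbo.Product AS t
                USING dbo.ProductStaging AS s ON t.Id = s.Id
                WHEN MATCHED AND s.FunctionCode = 'D' THEN DELETE
                WHEN MATCHED AND s.FunctionCode = 'U' THEN UPDATE SET t.Name = s.Name
                WHEN NOT MATCHED AND s.FunctionCode = 'I' THEN
                    INSERT (Id, Name) VALUES (s.Id, s.Name);";
            using (var cmd = new SqlCommand(merge, conn))
                cmd.ExecuteNonQuery();
        }
    }

Two round trips total, regardless of row count, which is where the speedup over per-row commands comes from.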


I assume you are using SQL Server ...

If you are using SQL Server 2005/2008, consider using SSIS to process the file (see TechNet).

Importing a huge amount of data inside the ASP.NET process is not the best thing you can do. You can upload the file and start a separate process that does the magic for you.


If this is a repeating process, the file is uploaded via ASP.NET, and you are making a per-row decision on whether to insert, update or delete, try http://www.codeproject.com/KB/database/CsvReader.aspx - it is a fast CSV reader that is pretty economical with memory.
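For illustration, a minimal sketch combining that reader with the SqlBulkCopy idea above; it relies on the article's CsvReader implementing IDataReader, and the file path and staging table name are placeholders:

    using System.Data.SqlClient;
    using System.IO;
    using LumenWorks.Framework.IO.Csv;

    string connectionString = "..."; // your connection string

    // CsvReader implements IDataReader, so it can be streamed straight into
    // SqlBulkCopy without materialising the whole file in memory.
    using (var csv = new CsvReader(new StreamReader("import.csv"), true)) // true = header row present
    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = "dbo.ProductStaging"; // placeholder staging table
        bulk.WriteToServer(csv); // WriteToServer accepts any IDataReader
    }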


You are executing all your database queries serially over one connection. So for each insert/update/delete you send a command over the wire, wait for the database to execute it, and only wake up again when the result is sent back.

Databases are optimized for heavy concurrent access, so there are two simple routes to a significant speedup:

Open X connections to the database (X needs tuning, but start with 5) and either:

  • spawn X threads, each of which does a chunk of the same work you were doing, or
  • use asynchronous calls, where each completion callback issues the next request.
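For the threaded route, a rough sketch, assuming the rows are already parsed into a list; CsvRow and ExecuteRowCommand are hypothetical placeholders for your row type and per-row insert/update/delete logic:

    using System;
    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Threading;

    static void ImportInParallel(List<CsvRow> rows, string connectionString, int threadCount)
    {
        int chunkSize = (rows.Count + threadCount - 1) / threadCount;
        var threads = new List<Thread>();

        for (int i = 0; i < threadCount; i++)
        {
            int start = i * chunkSize;
            if (start >= rows.Count) break;
            var chunk = rows.GetRange(start, Math.Min(chunkSize, rows.Count - start));

            var t = new Thread(() =>
            {
                // One connection per thread: SqlConnection is not thread-safe.
                using (var conn = new SqlConnection(connectionString))
                {
                    conn.Open();
                    foreach (var row in chunk)
                        ExecuteRowCommand(conn, row); // INSERT/UPDATE/DELETE per FUNCTION CODE
                }
            });
            t.Start();
            threads.Add(t);
        }

        foreach (var t in threads)
            t.Join(); // wait for every chunk to finish
    }

Connection pooling makes opening the extra connections cheap; the tuning knob is threadCount.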


I suggest using the XML functionality in SQL Server 2005/2008, which allows you to perform bulk inserts and bulk updates. I would take the following approach:

  • Process the entire file into an in-memory data structure.
  • Create one XML document from that structure to pass to the stored procedure.
  • Create a stored procedure that loads the data from the XML document into a temporary table and then performs the inserts and updates. A guide to creating such a stored procedure follows below.

There are many advantages to this approach:

  • The whole operation completes in one database call, although if your data set is really large you may want to batch it.
  • You can easily wrap all the database changes in a single transaction and roll back if anything fails.
  • You are not using dynamic SQL, which could pose a security risk.
  • You can return the identifiers of inserted, updated and/or deleted records using the OUTPUT clause.

For a stored procedure, you will need the following:

    CREATE PROCEDURE MyBulkUpdater ( @p_XmlData VARCHAR(MAX) )
    AS
    DECLARE @hDoc INT
    EXEC sp_xml_preparedocument @hDoc OUTPUT, @p_XmlData

    -- Temporary table, should have the same schema as the table you want to update
    CREATE TABLE #MyTempTable (
        -- ...
    )

    INSERT INTO #MyTempTable ( [Field1], [Field2] )
    SELECT XMLData.Field1, XMLData.Field2
    FROM OPENXML (@hDoc, 'ROOT/MyRealTable', 1)
    WITH (
        [Field1] int,
        [Field2] varchar(50),
        [__ORDERBY] int
    ) AS XMLData

    EXEC sp_xml_removedocument @hDoc

Now you can insert, update and delete in the real table from your temporary table as required, for example:

    INSERT INTO MyRealTable (Field1, Field2)
    SELECT Field1, Field2
    FROM #MyTempTable
    WHERE ...

    UPDATE rt
    SET rt.Field2 = tt.Field2
    FROM MyRealTable rt
    JOIN #MyTempTable tt ON tt.Field1 = rt.Field1
    WHERE ...

To generate an example of the XML to pass in, you can run:

    SELECT TOP 1 *, 0 AS __ORDERBY
    FROM MyRealTable AS MyRealTable
    FOR XML AUTO, ROOT('ROOT')

See OPENXML, sp_xml_preparedocument, and sp_xml_removedocument for more information.
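For completeness, a sketch of the application side: building the XML in the attribute-centric shape that OPENXML(..., 1) expects and calling the procedure above. It uses XElement (.NET 3.5; on 2.0, build the string with XmlWriter instead), and the field names follow the example:

    using System.Data;
    using System.Data.SqlClient;
    using System.Xml.Linq;

    static string BuildXml(DataTable rows)
    {
        var root = new XElement("ROOT");
        int order = 0;
        foreach (DataRow r in rows.Rows)
        {
            // One attribute-centric <MyRealTable> element per record.
            root.Add(new XElement("MyRealTable",
                new XAttribute("Field1", r["Field1"]),
                new XAttribute("Field2", r["Field2"]),
                new XAttribute("__ORDERBY", order++)));
        }
        return root.ToString(SaveOptions.DisableFormatting);
    }

    static void RunBulkUpdater(DataTable rows, string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("MyBulkUpdater", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            // Size -1 maps the parameter to VARCHAR(MAX).
            cmd.Parameters.Add("@p_XmlData", SqlDbType.VarChar, -1).Value = BuildXml(rows);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }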

