Slow write to MSAccess

I am trying to write a VB.Net program that saves 1-2 million records of 5 fields (plus an indexed identifier) ​​in the MSAccess table every day. The conservation process currently takes 13-20 hours, which obviously cannot be right.

Its a flat table with minimal indexing, currently only 156 MB. With the exception of one double field, the fields are small strings, dates, or long. The disk itself is 15,000 SATA, which are used only for this file. The computer and the program do nothing during the save procedure. The save procedure is a simple FOR-NEXT loop that issues a short and simple INSERT statement for each record in the data set.

Anyone have any ideas on what I need to change to make this work better?

+2
source share
20 answers

A trick that can work in any DBMS to significantly speed up the insertion is to temporarily disable indexes, foreign keys and restrictions before mass loading data - then turn them on again after your data in the database.

In particular, indexes can be performance killers for sequential insertion, faster, at least in the order of (sometimes 2!) Values, to fill the table first, and then create an index from data already filled in than to insert it into the place. In this case, you may need to drop the index and then recreate it.

Then, as many other posters have said, it’s actually a waste of time to insert material into a string if you can do this in bundles. You will get a slight speed improvement if you open the table without locking at all or only optimistic locking.

And then you can get another tiny gain by using DAO recordsets instead of ADO. I noticed this again in the days when I developed in VB6, probably now this is not the case with ADO.NET

+2
source

You really have to control the volume insert. Each insertion has a bunch of overhead and by doing one line at a time in the next cycle, you lose more than 2 / 3rds of computer power. If the data comes in one row at a time, you will need to create a buffer to collect it before inserting the array into the database. Kibby suggested writing data to a csv file and then dumping it into a database, and if you need to write data, this is a good method. I would recommend collecting data in memory for several minutes at a time.

+1
source

Okay, back from a long lunch.

PAUL, PINEDA, your suggestion that PC indexing was a problem was correct. Get rid of the index and suddenly it stores 40,000 records per minute, fast enough to do the whole day for an hour. And this does not affect the speed of applications that use data at all.

The rest of you, generous people ... I will leave all your suggestions for the rest of the day, and hope this is even better.

You were extremely helpful. I all get you a beer.

+1
source

Do you have auto-commit enabled?

This would really slow you down, since each insert had to be phsyically written to disk before the next could be processed.

Try manually entering about 1000 inserts.

+1
source

(Disclaimer: I don't know anything about access)

What do profiling experts say? (the task manager will give you a few tips - add more columns to the display to see the inputs / outputs, VM usage, etc.)? Does it make a lot of disk access, or is it the whole processor? Does it consume a huge amount of memory?

Mention of the index concerns me, since it is potentially necessary for updating after each INSERT - can you turn off the index, create, and then index the full file?

Is it linear in time? (i.e. if you create a file 10% of the size, it takes 10% of the time)?

0
source

Thanks for your questions, Paul.

Using the page file is 600 MB, the processor is about 5% in most of the time, with spikes up to 80% of the range every 20 seconds or so. Memory: only 2G, 1.3G available, 1G system cache.

Yes, it seems linear, the first 15,000 records take 10 minutes.

AS to the index, I have not tried this, but Access always complains if you do not index at least the ID field.

I / O readings seem a lot, though, almost 6M after 20 minutes of run time and only 25,000 records.

0
source

First try using one of the many import options in Access. (Where does the data come from? Is it differentiated or the length of the face? How do you parse it with VB?)

You should be able to create a table without an index. by rejecting the Access clause to add it. But first do the import.

0
source

Thanks, Doofledorfer.

The data comes from the Internet from the data provider in real time in its own "vector" format, which I analyze in the data set. I view the entire file before starting the save procedure. Yes, it would be nice if I could "import" the data set directly into the database, and not save each record separately. But I do not know how to do this.

0
source

You can try to write the data to a CSV file, and then the data should be easily imported into access with a single command. This can significantly speed up the process.

0
source

How many connections did you open for the MSAccess database (mdb)? From what I saw, single-user mode is significantly faster than multi-user mode. Any open MS Access program is calculated as a single connection (you can view the ldb file using ldbview ).

Do you use row or page level locking? Starting with some version of Access (2000?), The default is row-level locking; I assume page-level locking will be faster.

You do not have an antivirus? They can intercept file operations and noticeably slow down the whole process.

0
source

Only one connection opens, and this is a single-user system.

Tried to disable my antivirus, but it only saves 800 entries per second.

I am not familiar with “blocking”, but my OLEDB connection string contains “Blocking mode = 1”

By the way, if that matters, the size of the virtual machine grew after 157 minutes to 157,000 K.

0
source

Here are some notes on the append request for a delimited text file, I'm afraid, but that might help.

Set cn = CreateObject("ADODB.Connection") strFile="C:\ltd.mdb" strCon="Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" _ & strFile & ";" cn.Open strCon strSQL="INSERT INTO tableX ( Name1,Name2 ) " _ & "SELECT Name1,Name2 " _ & "FROM [ltd.txt] IN '' [Text;Database=c:\docs\;HDR=YES;]" cn.Execute strSQL 
0
source

Try lock mode = 0 - this is the page level. 800 rps (records per second) gives 480,000 records in 10 minutes - did you mean 800 rpm?

0
source

Mike, the virus scanner turned off improved things, but something still sounds due to noise ... Agree with @Remou that volumetric insertion will be much better if you can do it.

Assuming you can't embed an array, there was just an example application (C #, sorry - but VB.NET would be similar) to create a directory, a table, and do some inserts. At the moment, I have missed the PK restriction on ID.

This gave me about 1000 lines in 550 ms working in a virtual machine with an anti-virus scanner on a laptop running. You can easily handle this fast drive. Where are the differences?

One thought - how do you insert an identifier? Autogenerated and omitted from an INSERT statement? Or did you insert a value and the column is labeled PK? The latter will undoubtedly trigger an index search (your significant IO?) To check the uniqueness of the data in the table?

 using System; using System.Data; using System.Data.OleDb; namespace ConsoleApplication1 { class Program { static void Main(string[] args) { String jetConnection = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\\jetsample.mdb;"; ADOX.CatalogClass cat = new ADOX.CatalogClass(); cat.Create(jetConnection); using(OleDbConnection conn = new OleDbConnection(jetConnection)) { conn.Open(); using(OleDbCommand cmd = new OleDbCommand("CREATE TABLE test ([ID] INTEGER, [TestDouble] DOUBLE, [TestText] TEXT, [TestDate] DATE, [TestInt] INTEGER)",conn)) { cmd.CommandType = CommandType.Text; cmd.ExecuteNonQuery(); } using (OleDbCommand cmd = new OleDbCommand("INSERT INTO [Test] VALUES (@id, @testDouble, @testText, @testDate, @testInt)", conn)) { OleDbParameter id = cmd.Parameters.Add("@id", OleDbType.Integer); OleDbParameter testDouble = cmd.Parameters.Add("@testDouble", OleDbType.Double); OleDbParameter testText = cmd.Parameters.Add("@testText", OleDbType.VarWChar); OleDbParameter testDate = cmd.Parameters.Add("@testDate", OleDbType.Date); OleDbParameter testInt = cmd.Parameters.Add("@testInt", OleDbType.Integer); DateTime start = DateTime.Now; for (int index = 1; index <= 2000000; index++) { if (index % 1000 == 0) { System.Diagnostics.Debug.WriteLine(((TimeSpan)(DateTime.Now - start)).Milliseconds); start = DateTime.Now; } id.Value = index; testDouble.Value = index; testText.Value = String.Format("{0} DBL", index); testDate.Value = DateTime.Now.AddMilliseconds(index); testInt.Value = index; cmd.ExecuteNonQuery(); } } } } } } 
0
source

ARVO; Disabling the anti-virus scanner and switching to lock mode = 0 helped ... its up to 1300 entries per minute (yes, I meant minutes), but it is still pretty slow.

STEPHBU: my C skills are minimal, but as far as I can see, you are doing the same thing as me. Your "JetConnection" line is much simpler than mine ... mine here

  Me.connSPY.ConnectionString = "Jet OLEDB:Global Partial Bulk Ops=2;" & _ "Jet OLEDB:Registry Path=;Jet OLEDB:" & _ "Database Locking Mode=0;" & _ "Data Source=""E:\SPIRE.mdb"";" & _ "Mode=Share Deny None;" & _ "Jet OLEDB:Engine Type=5;" & _ "Provider=""Microsoft.Jet.OLEDB.4.0"";" & _ "Jet OLEDB:System database=;" & _ "Jet OLEDB:SFP=False" & _ ";persist security info=False;" & _ "Extended Properties=;" & _ "Jet OLEDB:Compact Without Replica Repair=False;" & _ "Jet OLEDB:Encrypt Database=False;" & _ "Jet OLEDB:Create System Database=False;" & _ "Jet OLEDB:Don't Copy Locale on Compact=False;" & _ "User ID=Admin;" & _ "Jet OLEDB:Global Bulk Transactions=1" 
0
source

ops, skipped one of your STEPHBU questions ... I allow the indexed identifier to automatically increment, rather than trying to assign it in an INSERT statement. Good point though!

0
source

Doofledorfer: here are 5 lines of sample input, as you requested, although I really think I'm on the right track now with an insert block suggestion and Primary key not indexed

 INSERT INTO Ticks (Symbol, TickDate, TickTime, TickPRice, TickVolume) VALUES ('SPY', #11/28/2008#, #09:30:00#, 88.63, 200); INSERT INTO Ticks (Symbol, TickDate, TickTime, TickPRice, TickVolume) VALUES ('SPY', #11/28/2008#, #09:30:00#, 88.62, 400); INSERT INTO Ticks (Symbol, TickDate, TickTime, TickPRice, TickVolume) VALUES ('SPY', #11/28/2008#, #09:30:00#, 88.62, 100); INSERT INTO Ticks (Symbol, TickDate, TickTime, TickPRice, TickVolume) VALUES ('SPY', #11/28/2008#, #09:30:00#, 88.62, 300); INSERT INTO Ticks (Symbol, TickDate, TickTime, TickPRice, TickVolume) VALUES ('SPY', #11/28/2008#, #09:30:00#, 88.62, 127); 
0
source

1-2 million records of 5 fields (plus an indexed identifier) ​​to the MSAccess table every day.

risking to state what should be obvious. You are solving the wrong problem.

Dump MS-Access and pull them to the MS-SQL server. If you REALLY need to access the data later through MS-Access, simply create a table link for the MS-SQL server table.

Even with 5 bytes per record, you will exceed the MS Access 2003 2GB Table / DB limit in less than a year (unfortunately, this is the same story for MS Access 2007).

0
source

CodeSlave .... This is a valid moment, and if all this works, I will have to spend money on SQL Server or something similar, not to mention a few more computers. So far I do not want to invest money or a learning curve.

Mr. Anderson ... I have not tried it yet, and I will do it. But at the moment, other offers have my 10-20 hour time savings of up to 15 minutes, so I'm a pretty happy tourist.

0
source

I suggest using the indexed sequential access method - ISAM. It allows you to directly transfer data from a connected recording source to a second dynamically connected data source.

  • Use OleDbConnection or a similar object to open a connection

  • Run a connection request using ISAM format instead of FROM

Sytax is as follows:

 private void PopulateMDB(string ExportPath, int iID) { string cnnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + ExportPath; OleDbConnection oConn = new OleDbConnection(cnnStr); string q = @"INSERT INTO PensionData ([ID] ,[Recipient Name] ,[Gross Amt] ,[Retirement Date] ,[Plan]) select id as [ID] ,name as [Recipient Name] ,gross_amt as [Gross Amt] ,eff_dt as [Retirement Date] ,pln as [Plan] FROM [ODBC;Driver=SQL Server;SERVER=euddbms.d;DATABASE=DBName;UID=;PWD=;].tableName WHERE id = " + iID; oConn.Open(); try { OleDbCommand oCmd = new OleDbCommand(q, oConn); oCmd.ExecuteNonQuery(); } catch (Exception ex) { throw ex; } finally { oConn.Close(); oConn = null; } } 

ISAM Material ::

CSV [Text; Database = C: _path \; HDR = Yes]. [FILE.CSV]

Access [MS Access; Database = C: \ Path \ File.mdb]. [AccessTableName]

Excel [Excel 8.0; HDR = Yes; IMEX = 1; MaxScanRows = 16; Database = C: \ Path \ File.xls]. [Table $]

SQL Server [ODBC; Driver = SQL Server; SERVER =; DATABASE =; UID =; PWD =;]. [Table]

http://support.microsoft.com/kb/321686 http://support.microsoft.com/kb/200427

0
source

All Articles