Why is this code slowing down?

I am currently trying to convert several Access databases to XML files. I have done this before, and I still have the code from the previous project, but that code does not let me structure the XML the way I need to this time. I use XDocument with for loops to achieve this, but it gets incredibly slow after a few thousand rows of data.

From what I have read about how XDocument works, XElement.Add actually copies all the existing XML and re-inserts everything when a new element is added. If that is true, it is probably where the problem is.

This is the part that reads the data from Access and writes it to XML; please have a look and see if there is a way to salvage it. Converting a database with 27 columns and 12,256 rows takes almost 30 minutes, while a smaller one with only 500 rows takes about 5 seconds.

    private void ReadWrite(string file)
    {
        using (_Connection = new OleDbConnection(string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Mode=12;Data Source={0}", pathAccess)))
        {
            _Connection.Open();
            // Gives me values from the Access DB: tableName, columnName,
            // colCount, rowCount and listOfTimestamps.
            GetValues(pathAccess);
            XDocument doc = new XDocument(new XDeclaration("1.0", "utf-8", "true"),
                                          new XElement(tableName));
            for (int rowInt = 0; rowInt < rowCount; rowInt++)
            {
                XElement item = new XElement("Item",
                    new XAttribute("Time", listOfTimestamps[rowInt].ToString().Replace(" ", "_")));
                doc.Root.Add(item);
                // colCount - 1 prevents the timestamp from being written again.
                for (int colInt = 0; colInt < colCount - 1; colInt++)
                {
                    using (OleDbCommand cmnd = new OleDbCommand(string.Format(
                        "SELECT {0} FROM {1} Where TimeStamp = #{2}#",
                        columnName[colInt], tableName, listOfTimestamps[rowInt]), _Connection))
                    {
                        XElement value = new XElement(columnName[colInt],
                                                      cmnd.ExecuteScalar().ToString());
                        item.Add(value);
                    }
                }
                // Updates the progress bar.
                backgroundWorker1.ReportProgress(rowInt);
            }
            backgroundWorker1.ReportProgress(0);
            doc.Save(file);
        }
    }
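(For scale: the nested loop above issues one single-value SELECT per cell. A quick back-of-the-envelope count, using the figures from the question, shows how many round trips to the database that is.)

```csharp
using System;

class QueryCount
{
    static void Main()
    {
        int rows = 12256, cols = 27;
        // One ExecuteScalar per cell; the timestamp column is skipped.
        int queries = rows * (cols - 1);
        Console.WriteLine(queries + " single-value SELECTs for the large database");
    }
}
```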

This is the code from my old converter. It is largely independent of the size of the database; even the 12,256-row database takes just a second to convert. Could there be a way to merge the two?

    public void ReadWrite2(string file)
    {
        DataSet dataSet = new DataSet();
        using (_Connection = new OleDbConnection(string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Mode=12;Data Source={0}", file)))
        {
            _Connection.Open();
            DataTable schemaTable = _Connection.GetOleDbSchemaTable(
                OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
            foreach (DataRow dataTableRow in schemaTable.Rows)
            {
                string tableName = dataTableRow["Table_Name"].ToString();
                DataTable dataTable = dataSet.Tables.Add(tableName);
                using (OleDbCommand readRows = new OleDbCommand(
                    "SELECT * from " + tableName, _Connection))
                {
                    OleDbDataAdapter adapter = new OleDbDataAdapter(readRows);
                    adapter.Fill(dataTable);
                }
            }
        }
        dataSet.WriteXml(file.Replace(".mdb", ".xml"));
    }

EDIT: To clarify, the application slows down as it runs. The first 500 rows take about 5 seconds no matter how large the database is.

UPDATE: Okay, I came back after the weekend and made a small adjustment to the code to separate reading and writing: filling a jagged array with values in one loop and writing them out in another. This proved my theory wrong; it is the reading that takes so long. Any ideas on how to populate an array with values without querying the database inside the loop?

UPDATE 2: This is the end result after switching to a DataReader.Read() loop and collecting all the data in one pass.

    public void ReadWrite3(string Save, string Load)
    {
        using (_Connection = new OleDbConnection(string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Mode=12;Data Source={0}", Load)))
        {
            _Connection.Open();
            GetValues(_Connection);
            _Command = new OleDbCommand(String.Format(
                "SELECT {0} FROM {1}", strColumns, tables), _Connection);
            XDocument doc = new XDocument(new XDeclaration("1.0", "utf-8", "true"),
                new XElement("plmslog", new XAttribute("machineid", root)));
            using (_DataReader = _Command.ExecuteReader())
            {
                for (int rowInt = 0; _DataReader.Read(); rowInt++)
                {
                    for (int logInt = 0; logInt < colCount; logInt++)
                    {
                        XElement log = new XElement("log");
                        doc.Root.Add(log);
                        elementValues = updateElementValues(rowInt, logInt);
                        for (int valInt = 0; valInt < elements.Length; valInt++)
                        {
                            XElement value = new XElement(elements[valInt], elementValues[valInt]);
                            log.Add(value);
                        }
                    }
                }
            }
            doc.Save(Save);
        }
    }
3 answers

Forgive me, but I think you are making your life more complicated than it needs to be. If you use an OleDbDataReader object, you can simply open it and read through the Access table row by row, without caching the row data in an array (since you already have it in the DataReader).

For example, I have some sample data

    dbID  dbName  dbCreated
    ----  ------  -------------------
    bar   barDB   2013-04-08 14:19:27
    foo   fooDB   2013-04-05 11:23:02

and the following code goes through the table ...

    static void Main(string[] args)
    {
        OleDbConnection conn = new OleDbConnection(
            @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Documents and Settings\Administrator\Desktop\Database1.accdb;");
        conn.Open();
        OleDbCommand cmd = new OleDbCommand("SELECT * FROM myTable", conn);
        OleDbDataReader rdr = cmd.ExecuteReader();
        int rowNumber = 0;
        while (rdr.Read())
        {
            rowNumber++;
            Console.WriteLine("Row " + rowNumber.ToString() + ":");
            for (int colIdx = 0; colIdx < rdr.FieldCount; colIdx++)
            {
                string colName = rdr.GetName(colIdx);
                Console.WriteLine("  rdr[\"" + colName + "\"]: " + rdr[colName].ToString());
            }
        }
        rdr.Close();
        conn.Close();
        Console.WriteLine("Done.");
    }
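Applied to the question's goal, the same single-pass reader loop can build the XDocument directly. Here is a sketch under the question's assumptions (the timestamp is the first column, and the remaining column names are valid XML element names); the connection string and table name are placeholders to adapt:

```csharp
using System;
using System.Data.OleDb;
using System.Xml.Linq;

class Converter
{
    // Sketch: one SELECT for the whole table, one pass over the reader,
    // instead of one SELECT per cell as in the original ReadWrite.
    public static void ConvertToXml(string connStr, string tableName, string xmlFile)
    {
        var doc = new XDocument(new XDeclaration("1.0", "utf-8", "true"),
                                new XElement(tableName));
        using (var conn = new OleDbConnection(connStr))
        {
            conn.Open();
            using (var cmd = new OleDbCommand("SELECT * FROM " + tableName, conn))
            using (var rdr = cmd.ExecuteReader())
            {
                while (rdr.Read())
                {
                    // Assumes column 0 is the timestamp, as in the question.
                    var item = new XElement("Item",
                        new XAttribute("Time", rdr.GetValue(0).ToString().Replace(" ", "_")));
                    for (int col = 1; col < rdr.FieldCount; col++)
                        item.Add(new XElement(rdr.GetName(col), rdr.GetValue(col).ToString()));
                    doc.Root.Add(item);
                }
            }
        }
        doc.Save(xmlFile);
    }
}
```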

... and produces the result ...

    Row 1:
      rdr["dbID"]: foo
      rdr["dbName"]: fooDB
      rdr["dbCreated"]: 2013-04-05 11:23:02
    Row 2:
      rdr["dbID"]: bar
      rdr["dbName"]: barDB
      rdr["dbCreated"]: 2013-04-08 14:19:27
    Done.

You are hitting the database from inside a nested loop, once for every row and column:

    using (OleDbCommand cmnd = new OleDbCommand(string.Format(
        "SELECT {0} FROM {1} Where TimeStamp = #{2}#",
        columnName[colInt], tableName, listOfTimestamps[rowInt]), _Connection))

You might be better off reading the data into an array or collection first, and only then building the XML, instead of accessing the database inside the loop.
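A sketch of that idea: pull everything into a DataTable with one query, then build the XML entirely from memory. The names here (connStr, tableName, xmlFile) are placeholders, not the question's exact variables:

```csharp
using System.Data;
using System.Data.OleDb;
using System.Xml.Linq;

class BufferedConverter
{
    // Read once into memory, then write: no database access inside the XML loop.
    public static void Convert(string connStr, string tableName, string xmlFile)
    {
        DataTable table = new DataTable();
        using (var conn = new OleDbConnection(connStr))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM " + tableName, conn))
        {
            adapter.Fill(table);   // a single round trip for all rows and columns
        }

        var doc = new XDocument(new XElement(tableName));
        foreach (DataRow row in table.Rows)
        {
            var item = new XElement("Item");
            foreach (DataColumn col in table.Columns)
                item.Add(new XElement(col.ColumnName, row[col].ToString()));
            doc.Root.Add(item);
        }
        doc.Save(xmlFile);
    }
}
```

The trade-off is memory: the whole table is buffered before any XML is written, which is fine at this scale (a few hundred thousand cells) but worth noting for much larger tables.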


A simple calculation will tell you why.

Amount of data:

27 * 12,256 = 330,912 cells

27 * 500 = 13,500 cells

330,912 / 13,500 ≈ 24.5

so the big database holds about 24.5 times as much data.

Time:

30 * 60 = 1,800 seconds

1,800 / 5 = 360

so the conversion takes 360 times longer. Since 360 is far more than 24.5, you can see that your code scales much worse than linearly with the amount of data.
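The same check as a quick snippet, with the figures taken from the question:

```csharp
using System;

class ScalingCheck
{
    static void Main()
    {
        double dataRatio = (27.0 * 12256) / (27.0 * 500);  // how much more data
        double timeRatio = (30.0 * 60) / 5.0;              // how much more time
        Console.WriteLine($"data: {dataRatio:F1}x, time: {timeRatio:F0}x");
        // Time grows far faster than the data, so the cost per row
        // is not constant: a sign of per-row database round trips.
    }
}
```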

