Very poor performance for batch insertion with SQL Server CE 4.0 and Entity Framework 4.2

I am inserting a lot of data into SQL Server CE 4.0 using Entity Framework 4.2 (with code at the beginning), and the performance is excessive compared to directly inserting SQL.

The model is very simple:

public class DocMember { public DocMember() { this.Items = new List<DocItem>(); } public int Id { get; set; } public string Name { get; set; } public string MemberType { get; set; } public string AssemblyName { get; set; } public virtual IList<DocItem> Items { get; set; } } public class DocItem { public int Id { get; set; } public DocMember Member { get; set; } public string PartType { get; set; } public string PartName { get; set; } public string Text { get; set; } } 

I have 2623 DocMembers and a total of 7747 DocItems to insert, and I get the following runtime:

 With SQL: 00:00:02.8 With EF: 00:03:02.2 

I can understand that with EF there is a bit of overhead, but 65 times slower than SQL!

Maybe there is a problem in my code, but it's pretty simple, and I don't see what might be wrong:

  private TimeSpan ImportMembersEF(IList<DocMember> members) { using (var db = new DocEntities()) { db.Database.CreateIfNotExists(); var sw = Stopwatch.StartNew(); foreach (var m in members) { db.Members.Add(m); } db.SaveChanges(); sw.Stop(); return sw.Elapsed; } } 

I also tried calling SaveChanges for every element inserted, or every 100 or 200 elements to no avail (this actually makes it worse).

Is there a way to improve performance, or do I need to use SQL for batch inserts?


EDIT: for completeness, here's the code for inserting SQL: http://pastebin.com/aeaC1KcB

+8
performance entity-framework sql-server-ce-4
source share
2 answers

You can use my SqlCeBulkCopy library to load massive data, it mimics the SqlBulkCopy api: http://sqlcebulkcopy.codeplex.com

+6
source share

It is slow because it does not dose the insert.

When you insert an identifier on db, it must select a result identifier after each element to assign the model to. This makes it very slow.

Your adhoc SQL address does not select an identifier, therefore, after loading, you can immediately send all the statements.

Altho is written from NHibernate:

http://www.philliphaydon.com/2011/09/the-benefits-of-letting-the-orm-generate-the-identity-part-1/

I wrote about using ORM generated identifiers and SQL generated identifiers.

+1
source share

All Articles