How to improve performance when deleting entities from a database?

I started an ASP.NET project with Entity Framework 4 for my DAL, using SQL Server 2008. My database has a Users table that is expected to hold many rows (for example, 5,000,000).

Initially, my Users table was designed as follows:

 Id         uniqueidentifier
 Name       nvarchar(128)
 Password   nvarchar(128)
 Email      nvarchar(128)
 Role_Id    int
 Status_Id  int

I changed the table and added the MarkedForDeletion column:

 Id                 uniqueidentifier
 Name               nvarchar(128)
 Password           nvarchar(128)
 Email              nvarchar(128)
 Role_Id            int
 Status_Id          int
 MarkedForDeletion  bit

Should I delete each object immediately every time, or use the MarkedForDeletion column? The latter means that I update the value and, at some later point, delete all users with the value set to true using a stored procedure or something similar.

Will updating the MarkedForDeletion column cost the same as a delete operation?

2 answers

Depending on the requirements, needs, and future needs of your system, consider moving your "deleted" objects to a new table. Set up an audit table to store the deleted rows. Consider the case where someone wants to "restore" something.

To your performance question: will the update cost the same as the delete? No. The update would be much cheaper, especially with an index on the PK (errr, it is a GUID, not an int). The point is that updating a bit field is much cheaper. A (bulk) delete forces the data to be rearranged. Perhaps that work belongs in a downtime window or a quiet period.

As for performance: benchmark it and see what happens! With 5 million rows in the table, it would be good to see how your SQL Server behaves with its current indexes, paging, and so on, under each scenario. Back up your database and restore it as a new database, which you can then use as a sandbox however you like. Script and time each of the following:

  • a bulk delete, vs.
  • updating a bit or smalldatetime field, vs.
  • moving the rows to an audit table.
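
The comparison above can be scripted along these lines. This is only a rough sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in so that it runs anywhere, and the UsersAudit table, the function names, and the "every even Id" selection rule are all made up for illustration. Timings from SQLite say nothing about SQL Server 2008, so the same three statements would need to be timed against a restored copy of the real database.

```python
import sqlite3
import time


def make_db(n_rows=10_000):
    """Build a throwaway in-memory Users table (simplified, hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Users (Id INTEGER PRIMARY KEY, Name TEXT, "
        "MarkedForDeletion INTEGER NOT NULL DEFAULT 0)"
    )
    # UsersAudit is an assumed name for the audit table.
    conn.execute("CREATE TABLE UsersAudit (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.executemany(
        "INSERT INTO Users (Id, Name) VALUES (?, ?)",
        ((i, f"user{i}") for i in range(n_rows)),
    )
    conn.commit()
    return conn


def bulk_delete(conn):
    # Scenario 1: physically remove the rows.
    conn.execute("DELETE FROM Users WHERE Id % 2 = 0")


def flag_update(conn):
    # Scenario 2: only flip the flag; the rows stay in place.
    conn.execute("UPDATE Users SET MarkedForDeletion = 1 WHERE Id % 2 = 0")


def move_to_audit(conn):
    # Scenario 3: copy the rows to the audit table, then remove them.
    conn.execute(
        "INSERT INTO UsersAudit (Id, Name) "
        "SELECT Id, Name FROM Users WHERE Id % 2 = 0"
    )
    conn.execute("DELETE FROM Users WHERE Id % 2 = 0")


def time_strategy(strategy, n_rows=10_000):
    """Run one scenario on a fresh database; return (seconds, rows left in Users)."""
    conn = make_db(n_rows)
    start = time.perf_counter()
    strategy(conn)
    conn.commit()
    elapsed = time.perf_counter() - start
    remaining = conn.execute("SELECT COUNT(*) FROM Users").fetchone()[0]
    conn.close()
    return elapsed, remaining


if __name__ == "__main__":
    for strategy in (bulk_delete, flag_update, move_to_audit):
        elapsed, remaining = time_strategy(strategy)
        print(f"{strategy.__name__}: {elapsed:.4f}s, {remaining} rows left in Users")
```

Each scenario runs against a freshly built table so that one run cannot influence the next, which mirrors the advice to restore a clean copy of the database before each measurement.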

In terms of books, try:

  • this answer regarding books
  • a recommendation for a book by Adam Machanic
  • another question about database books.

This may depend on what you want to do with the information. For example, you may want to mark a user as deleted but not delete all of his child records (say, something like forum posts), in which case you should use a marked-for-deletion flag or a deleted-date field. If you do this, create a view of all active users (called ActiveUsers), and then make sure that view is used in any login query or anywhere else you want to see only active users. This helps avoid query errors where you forget to exclude the inactive ones. If your system is already live, do not make this change without going through and adjusting all the queries that should use the new view.
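
A minimal sketch of the ActiveUsers view idea, again using Python's sqlite3 module as a runnable stand-in rather than SQL Server. The simplified schema and the helper names soft_delete and find_active_user are assumptions made for this example; only the view name ActiveUsers and the MarkedForDeletion column come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        Id INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        MarkedForDeletion INTEGER NOT NULL DEFAULT 0
    );
    -- The "not deleted" filter is written once, inside the view.
    CREATE VIEW ActiveUsers AS
        SELECT Id, Name FROM Users WHERE MarkedForDeletion = 0;
    INSERT INTO Users (Id, Name) VALUES (1, 'alice'), (2, 'bob');
""")


def soft_delete(conn, user_id):
    """Flag the user instead of deleting the row (child records stay intact)."""
    conn.execute("UPDATE Users SET MarkedForDeletion = 1 WHERE Id = ?", (user_id,))
    conn.commit()


def find_active_user(conn, name):
    """Login-style lookup that goes through the view, so flagged users are invisible."""
    return conn.execute(
        "SELECT Id, Name FROM ActiveUsers WHERE Name = ?", (name,)
    ).fetchone()


soft_delete(conn, 2)
active_names = [row[0] for row in conn.execute("SELECT Name FROM ActiveUsers ORDER BY Id")]
```

Because the lookup reads from ActiveUsers rather than Users, a query author cannot forget the MarkedForDeletion filter, while the flagged row itself remains in the base table.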

Another reason to use the second approach is to avoid slowdowns when a large number of child records would have to be processed. They no longer need to be deleted if you use a deleted flag. This can help performance because fewer resources are required. In addition, you can mark records for deletion and then delete them in the middle of the night (or move them to a history table) to shrink the main tables without affecting performance during peak hours.
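
The off-peak cleanup could look roughly like the sketch below. It is an illustration under stated assumptions, not a SQL Server implementation: Python's sqlite3 module stands in for the real database, and the UsersHistory table, the purge_marked function, and the batch size are hypothetical. The point it shows is working in small batches, each in its own transaction, so that no single statement touches every flagged row at once.

```python
import sqlite3


def purge_marked(conn, batch_size=500):
    """Move flagged rows to UsersHistory and delete them, one transaction per batch.

    Returns the total number of rows purged.
    """
    total = 0
    while True:
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT Id FROM Users WHERE MarkedForDeletion = 1 LIMIT ?",
                (batch_size,),
            )
        ]
        if not ids:
            return total
        placeholders = ",".join("?" * len(ids))
        with conn:  # commits on success, rolls back if either statement fails
            conn.execute(
                "INSERT INTO UsersHistory (Id, Name) "
                f"SELECT Id, Name FROM Users WHERE Id IN ({placeholders})",
                ids,
            )
            conn.execute(f"DELETE FROM Users WHERE Id IN ({placeholders})", ids)
        total += len(ids)


# Demo data: every third user out of 1200 is flagged for deletion.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        Id INTEGER PRIMARY KEY,
        Name TEXT,
        MarkedForDeletion INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE UsersHistory (Id INTEGER PRIMARY KEY, Name TEXT);
""")
conn.executemany(
    "INSERT INTO Users (Id, Name, MarkedForDeletion) VALUES (?, ?, ?)",
    [(i, f"user{i}", 1 if i % 3 == 0 else 0) for i in range(1, 1201)],
)
conn.commit()

purged = purge_marked(conn, batch_size=100)
```

Keeping each batch in its own transaction means an interrupted run leaves the tables consistent and can simply be restarted, since the remaining flagged rows are picked up by the next call.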

