Improving Sql Delete Performance

Question

Improving Sql Delete Performance

We have a request to delete some rows from a table based on the id field (primary key). This is a pretty simple request:

delete all from OUR_TABLE where ID in (123, 345, ...)

The problem is that identifiers can be huge (e.g. 70k), so the request takes a lot of time. Is there any way to optimize this? (We use sybase - if that matters).

+6

sql database sybase query-optimization

amit Feb 23 '09 at 10:44

source share

8 answers

There are two ways to make statements like this:

Create a new table and copy everything except the rows you want to delete. Then change the tables ( alter table name ... ). I suggest trying, even if that sounds silly. Some databases are much faster when copying than when deleting.
Separate your tables. Create N tables and use the view to combine them into one. Sort rows into different tables, grouped by deletion criteria. The idea is to delete the whole table instead of deleting individual rows.

+4

Aaron digulla Feb 23 '09 at 10:59

source share

I am wondering if parsing an IN clause with 70K elements. Have you tried a temporary table with a join?

+3

Johnoppincar Feb 23 '09 at 10:55

source share

Can Sybase handle 70K arguments in an IN section? All the databases that I worked with have some limitation on the number of arguments for the IN clause. For example, Oracle has a limit of about 1000.

Can you create a subselect instead of an IN clause? This will reduce sql. Maybe this can help for so many values in the IN section. Something like that:

  DELETE FROM OUR_TABLE WHERE ID IN (SELECT ID FROM somewhere WHERE some_condition)

Deleting a large number of records can be accelerated with some database interventions, if the database model allows. Here are a few strategies:

you can speed up the process by lowering the indexes, deleting the records again and re-creating the indexes. This will remove the revaluation of index trees when deleting records.
- delete all indexes in the table
- delete entries
- recreate indices
- If you have a lot of relationships with this table, try disabling constraints if you are absolutely sure that the delete command will not violate the integrity constraint. Delete will be much faster because the database will not check integrity. Enable restrictions after deletion.
- disable integrity constraints, disable control constraints
- delete entries
- enable restrictions
- disable triggers in the table, if any, and if your business rules allow it. Delete the entries, then turn on the triggers.
- last, do as the others suggested - make a copy of the table that contains the rows that should not be deleted, then cancel the original, rename the copy and recreate the integrity constraints, if any.

I would try a combination of 1, 2 and 3. If this does not work, then 4. If everything is slow, I will look for a larger box - more memory, faster disks.

+2

zendar Feb 23 '09 at 11:12

source share

Find out what performance is using!

In many cases, you can use one of the proposed solutions. But there may be others (based on Oracle knowledge, so everything will be different in other databases) Editing: just saw that you mentioned sybase):

Do you have foreign keys on this table? Make sure link identifiers are indexed.
Do you have indexes on this table? It is possible that deletion before deletion and reconstitution after deletion may be faster.
check the execution plan. Is an index used where scanning a full table can be faster? Or vice versa? HINTS can help
Instead of making a selection in new_table, as suggested above, create a table, because the selection can be even faster.

But remember: first find out what uses performance.

When you use DDL statements, make sure that you understand and accept the consequences that may have implications for transactions and backups.

+2

Jens schauder Feb 23 '09 at 11:30

source share

Try sorting the identifier you pass in to in the same order as the table, or the index is stored. Then you can get more disk cache hits.

Entering an identifier to be deleted in a temporary table that has identifiers sorted in the same order as the main table can allow the database to perform a simple check in the main table.

You can try to use more than one connection and swing the work on the connections to use all the processors on the database server, however, consider which locks will be removed, etc. at first.

+1

Ian ringrose Feb 23 '09 at 11:59

source share

I also think that a temporary table is probably the best solution.

If you were to do "delete from .. where ID in (select id from ...)", it can still be slow with large queries. Therefore, I suggest deleting with a connection - many people are not aware of this functionality.

So, given this table of examples:

  -- set up tables for this example if exists (select id from sysobjects where name = 'OurTable' and type = 'U') drop table OurTable go create table OurTable (ID integer primary key not null) go insert into OurTable (ID) values (1) insert into OurTable (ID) values (2) insert into OurTable (ID) values (3) insert into OurTable (ID) values (4) go

Then we can write our removal code as follows:

  create table #IDsToDelete (ID integer not null) go insert into #IDsToDelete (ID) values (2) insert into #IDsToDelete (ID) values (3) go -- ... etc ... -- Now do the delete - notice that we aren't using 'from' -- in the usual place for this delete delete OurTable from #IDsToDelete where OurTable.ID = #IDsToDelete.ID go drop table #IDsToDelete go -- This returns only items 1 and 4 select * from OurTable order by ID go

+1

oskarpearson Mar 09 '11 at 17:04

source share

Does our_table have a link to delete the cascade?

0

Peter Miehle Feb 23 '09 at 10:58

source share

Hlgem · Accepted Answer · 2009-02-23T14:35:23+0000

Consider running this packages. A cycle with 1000 records at a time can be much faster than a single query that does everything and, in addition, will not keep the table locked for other users for a long time.

If you have cascading deletions (and many affected foreign keys) or triggers involved, you may need to perform even smaller batches. You will need to experience which one is best for your situation. I had tables where I had to delete batches of 100 and others, where 50,000 worked (good luck in this case, when I deleted a million records).

But in any case, I would put my key values, which I intend to delete, in the temp table and delete from there.

Improving Sql Delete Performance

More articles: