I am testing a process that deletes many, many records at once. It cannot TRUNCATE TABLE , because there are records that must remain there.
Due to the volume, I broke delete into a loop like this:
-- Do not block if records are locked. SET LOCK_TIMEOUT 0 -- This process should be chosen as a deadlock victim in the case of a deadlock. SET DEADLOCK_PRIORITY LOW SET NOCOUNT ON DECLARE @Count SET @Count = 1 WHILE @Count > 0 BEGIN TRY BEGIN TRANSACTION -- added per comment below DELETE TOP (1000) FROM MyTable WITH (ROWLOCK, READPAST) WHERE MyField = SomeValue SET @Count == @@ROWCOUNT COMMIT END TRY BEGIN CATCH exec sp_lock -- added to display the open locks after the timeout exec sp_who2 -- shows the active processes IF @@TRANCOUNT > 0 ROLLBACK RETURN -- ignoring this error for brevity END CATCH
MyTable is a clustered table. MyField is in the first column in the clustered index. It indicates a logical grouping of records, so MyField = SomeValue often selects many records. I donโt care in which order they are deleted if one group is processed at a time. There are no other indexes in this table.
I added a ROWLOCK hint to try to avoid escalating the locks we saw during production. I added a READPAST hint to avoid deleting records blocked by other processes. It should never be, but I try to be safe.
Problem: sometimes this cycle hits the lock timeout 1222 "Request lock time out" when it only works.
I am sure that there is no other activity in this system while I am testing this process, because it is my own developer block, no one is connected, there are no other processes on it, and the profiler does not show any activity.
I can restart the same script after a second, and it picks up the place where it stopped, happily deleting entries - until the next lock timeout.
I tried a BEGIN TRY / BEGIN CATCH ignore error 1222 and try to delete again, but it again crashes again with the same lock timeout error. It also fails if I add a short delay before retrying.
I assume blocking timeouts are due to something like page splitting, but I'm not sure why this would contradict the current loop iteration. The previous delete statement should already be completed, and I thought that meant that all page separators were also completed.
Why does the DELETE loop hit the lock timeout?
Is there a way that a process can avoid this lockout timeout or find that it can be resumed?
This is on SQL Server 2005.
- EDIT -
I added a lock event: timeout in the profiler. It disables PAGELOCK during uninstall:
Event Class: Lock:Timeout TextData: 1:15634 (one example of several) Mode: 7 - IU Type: 6 - PAGE
DBCC PAGE reports that these pages are outside the range of the main database (ID 1).
- EDIT 2 -
I added a BEGIN TRY / BEGIN CATCH and ran exec sp_lock in a catch block. Here is what I saw:
spid dbid ObjId IndId Type Resource Mode Status 19 2 1401108082 1 PAG 1:52841 X GRANT (tempdb.dbo.MyTable) 19 2 1401108082 0 TAB IX GRANT (tempdb.dbo.MyTable) Me 2 1401108082 0 TAB IX GRANT (tempdb.dbo.MyTable) Me 1 1115151018 0 TAB IS GRANT (master..spt_values) (?)
SPID 19 is the manager of TASK MANAGER. Why will one of these task managers get locks on MyTable?