Sql Query Pervious Row Optimization

Question

Sql Query Pervious Row Optimization

Here is my table structure

MyTable ----------- ObjectID int (Identity), -- Primary Key FileName varchar(10), CreatedDate datetime ........... ........... ...........

I need to spend time creating an entry in a file ... i.e. .... Time elapsed between a previous entry in the same file and the current entry in the same file

ie ... If the entries

 ObjectID FileName CreatedDate (just showing the time part here) -------- -------- ----------- 1 ABC 10:23 2 ABC 10:25 3 DEF 10:26 4 ABC 10:30 5 DEF 10:31 6 DEF 10:35

Required Conclusion ...

 ObjectID FileName CreatedDate PrevRowCreatedDate -------- -------- ----------- --------------- 1 ABC 10:23 NULL 2 ABC 10:25 10:23 3 DEF 10:26 NULL 4 ABC 10:30 10:25 5 DEF 10:31 10:26 6 DEF 10:35 10:31

So far I have received this request, but it takes a very long time than expected ... Is there a better way to do this ...

  Select A.ObjectID, A.FileName A.CreatedDate as CreatedDate, B.PrevRowCreatedDate, datediff("SS", '1900-01-01 00:00:00', Coalesce((A.CreatedDate - B.PrevRowCreatedDate),0)) as secondsTaken from MyTable as A Cross Apply ( (Select PrevRowCreatedDate = Max(CreatedDate) from MyTable as BB where BB.FileName = A.FileName and BB.CreatedDate < A.CreatedDate ) ) as B

Please let me know that you need more information.

thank

+2

sql sql-server-2005 query-optimization

The King Aug 6 '10 at 14:19

source share

2 answers

I think Michael's answer really should be more efficient. Evaluating the effectiveness, I just want to pay attention to the problem with the cost of the request (relative to the package) shown in Management Studio.

I set up a test table with 23174 rows and fulfilled the query in the question and Michael. Considering the “request cost (relative to the batch)” in the actual execution plan, the initial request was 1%, and Michael's cost was 99%, and therefore it seems to be ineffective.

However, actual statistics tell a completely different story.

Intersection Approach

SQL Server Runtime: CPU time = 0 ms, elapsed time = 0 ms.
The table is "MyTable". Scan number 23175 , logical read 49335 , physical read 0, read-ahead read 0, logical read lob 0, physical read lob 0, lob read-ahead read 0.

approach ROW_NUMBER

SQL Server Runtime: CPU time = 391 ms, elapsed time = 417 ms.
Table "Desktop". Number of scans 0, logical read 0, physical read 0, read-ahead reads 0, logical reads lob 0, physical reads lob 0, lob read-ahead reads 0.
The table is "MyTable". Scan number 2, logical readings 148, physical read 0, read forward 0, logical read forehead 0, physical read 0, read read 0.

In the ROW_NUMBER plan ROW_NUMBER Merge Join on rownumber=rownumber+1 has 23174 rows going in both directions. This value is unique, and the actual rows are 23.174. However, SQL Server estimates that the rows created from this connection will be 34,812,000, and therefore its estimated cost for insertion later in the plan is wildly inaccurate.

Script test

 BEGIN TRAN CREATE TABLE MyTable ( [ObjectID] [INT] IDENTITY(1, 1) NOT NULL PRIMARY KEY CLUSTERED, [FileName] [VARCHAR](50) NULL, [CreatedDate] [DATETIME] NULL ) GO INSERT INTO MyTable SELECT ISNULL(type, NEWID()), DATEADD(DAY, CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10000 AS INT), GETDATE()) FROM master.dbo.spt_values, (SELECT TOP 10 1 AS X FROM master.dbo.spt_values) V DELETE FROM MyTable WHERE EXISTS(SELECT * FROM MyTable m2 WHERE MyTable.CreatedDate = m2.CreatedDate AND MyTable.FileName = m2.FileName AND MyTable.ObjectID < m2.ObjectID) CREATE UNIQUE NONCLUSTERED INDEX [IX_MyTable] ON MyTable ([FileName] ASC, [CreatedDate] ASC) SET STATISTICS IO ON SET STATISTICS TIME ON SELECT A.ObjectID, A.FileName, A.CreatedDate AS CreatedDate, B.PrevRowCreatedDate, DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( A.CreatedDate - B.PrevRowCreatedDate ), 0)) AS secondsTaken INTO #A FROM MyTable AS A CROSS APPLY ((SELECT PrevRowCreatedDate = MAX(CreatedDate) FROM MyTable AS BB WHERE BB.FileName = A.FileName AND BB.CreatedDate < A.CreatedDate)) AS B; WITH t(ObjectID, FileName, CreatedDate, OrderNo) AS (SELECT ObjectID, FileName, CreatedDate, RANK() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo FROM MyTable) SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate, DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE(( t1.CreatedDate - t2.CreatedDate ), 0)) AS secondsTaken INTO #B FROM t t1 LEFT JOIN t t2 ON ( t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1 ) /*Test the 2 queries give the same result*/ SELECT * FROM #A EXCEPT SELECT * FROM #B SELECT * FROM #B EXCEPT SELECT * FROM #A ROLLBACK

+3

Martin Smith Aug 6 '10 at 17:50

source share

Michael Pakhantsov · Accepted Answer · 2010-08-06 14:28

 SELECT t1.FileName, t1.CreatedDate, t2.CreatedDate as PrevCreatedDate FROM (SELECT FileName, CreateDate, ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo FROM MyTable) t1 LEFT JOIN (SELECT FileName, CreateDate, ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo FROM MyTable) t2 ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo - 1)

Or it is better to use 'WITH' because the queries are identical:

 WITH t(ObjectID, FileName, CreatedDate, OrderNo) AS (SELECT ObjectID, FileName, CreatedDate, ROW_NUMBER() OVER(PARTITION BY FileName ORDER BY CreatedDate) AS OrderNo FROM MyTable) SELECT t1.ObjectID, t1.FileName, t1.CreatedDate, t2.CreatedDate AS PrevCreatedDate, DATEDIFF("SS", '1900-01-01 00:00:00', COALESCE((t1.CreatedDate - t2.CreatedDate),0)) AS secondsTaken FROM t t1 LEFT JOIN t t2 ON (t1.FileName = t2.FileName AND t1.OrderNo = t2.OrderNo + 1)

Sql Query Pervious Row Optimization

Intersection Approach

approach ROW_NUMBER

Script test

More articles: