Inconsistent results with NEWID() and a PERSISTED computed column

Question

I get odd results when using NEWID() in combination with a persisted computed column. Am I using some function incorrectly?

If I do not use PERSISTED when creating the column, so that the values are calculated when they are selected, the correct values are returned. Updating the column (Col1) also returns the correct values.

DECLARE @test TABLE (
    Col1 INT,
    Contains2 AS CASE WHEN 2 IN (Col1) THEN 1 ELSE 0 END PERSISTED)

INSERT INTO @test (Col1) VALUES
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5))

SELECT * FROM @test
UPDATE @test SET Col1 = Col1*1
SELECT * FROM @test

/*
Col1    Contains2
2       0
2       0
0       1
4       0
3       0

Col1    Contains2
2       1
2       1
0       0
4       0
3       0
*/
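For comparison, here is a minimal sketch of the non-persisted variant described above. It is the same script with only the PERSISTED keyword removed; since Contains2 is then calculated from the stored Col1 at select time, it should agree with Col1 on every row.

```sql
-- Same table as above, but Contains2 is NOT persisted,
-- so it is evaluated from the stored Col1 whenever it is selected.
DECLARE @test TABLE (
    Col1 INT,
    Contains2 AS CASE WHEN 2 IN (Col1) THEN 1 ELSE 0 END
);

INSERT INTO @test (Col1) VALUES
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5));

-- Contains2 should be 1 only on rows where Col1 = 2.
SELECT * FROM @test;
```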


sql-server tsql case calculated-columns newid

Kristofer Jun 30 '16 at 10:38


2 answers

While testing, I removed the functions that are unrelated to NEWID() and also showed the results when the NEWID() values are calculated in advance, before the insert. This may be helpful to others.

DECLARE @test TABLE (
    InsertType VARCHAR(30),
    Col1 VARCHAR(5),
    Contains2 AS CASE WHEN (Col1) LIKE '%2%' THEN 1 ELSE 0 END) --depends on Col1

INSERT INTO @test (InsertType, Col1) VALUES
    ('Compute With Insert', LEFT(NEWID(), 5)),
    ('Compute With Insert', LEFT(NEWID(), 5)),
    ('Compute With Insert', LEFT(NEWID(), 5)),
    ('Compute With Insert', LEFT(NEWID(), 5)),
    ('Compute With Insert', LEFT(NEWID(), 5))

SELECT * FROM @test

DECLARE @A VARCHAR(5) = LEFT(NEWID(), 5);
DECLARE @B VARCHAR(5) = LEFT(NEWID(), 5);
DECLARE @C VARCHAR(5) = LEFT(NEWID(), 5);
DECLARE @D VARCHAR(5) = LEFT(NEWID(), 5);
DECLARE @E VARCHAR(5) = LEFT(NEWID(), 5);

SELECT @A, @B, @C, @D, @E;

INSERT INTO @test (InsertType, Col1) VALUES
    ('Compute Before Insert', @A),
    ('Compute Before Insert', @B),
    ('Compute Before Insert', @C),
    ('Compute Before Insert', @D),
    ('Compute Before Insert', @E)

SELECT * FROM @test

InsertType               Col1     Contains2
Compute With Insert      C5507    0
Compute With Insert      C17D7    0
Compute With Insert      D9087    1
Compute With Insert      E2DB0    0
Compute With Insert      7D1AF    1
Compute Before Insert    31050    0
Compute Before Insert    2954C    1
Compute Before Insert    9E205    1
Compute Before Insert    DDF05    0
Compute Before Insert    ED708    0


John Jun 30 '16 at 16:42


Vladimir Baranov · Accepted Answer · 2016-06-30T13:17:18+0000

Apparently, the query engine computes a random number twice for each row.

The first time is for Col1, the second time is for the CASE expression of the persisted column.

In this case the optimizer either does not know or does not care that NEWID() is a non-deterministic function, so it calls the function twice.

Actually, it may not even have a choice. Would you want the optimizer to create a temporary table behind the scenes, populate it with the Col1 results of the expression that generates random numbers, then read that temporary table back and use the stored intermediate results to calculate the CASE expression, and only then do the final INSERT? In this case, it is cheaper for the optimizer to calculate the expression twice without writing intermediate results to disk. In other cases (for example, with 5 billion rows instead of 5, or with additional indexes), the estimated costs may be different and this behavior may change.

I don’t think you can do much. Just be aware of this behavior. Always explicitly save the generated set of random numbers to a table, and then perform further calculations based on them.
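A minimal sketch of that advice, reusing the table from the question. The staging table variable @staging is a hypothetical name introduced here for illustration. The idea is that once the random values are stored in a plain table with no computed columns, the second INSERT only reads stored values, so Col1 and Contains2 should be derived from the same number.

```sql
-- Step 1: materialize the random values in a plain staging table.
-- (@staging is a hypothetical name; it has no computed columns.)
DECLARE @staging TABLE (Col1 INT);

INSERT INTO @staging (Col1) VALUES
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5)),
    (ABS(CHECKSUM(NEWID()) % 5));

-- Step 2: copy the stored values into the table with the persisted column.
-- NEWID() is no longer part of this statement, so there is nothing to re-evaluate.
DECLARE @test TABLE (
    Col1 INT,
    Contains2 AS CASE WHEN 2 IN (Col1) THEN 1 ELSE 0 END PERSISTED
);

INSERT INTO @test (Col1)
SELECT Col1 FROM @staging;

SELECT * FROM @test;
```

The cost is one extra write and read of the intermediate rows, which is the trade-off discussed above.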

I reproduced it in SQL Server 2008 and 2014. Here is the execution plan that I got in SQL Server 2008, but it is not very interesting. In 2014, the plan is the same, except for the Top operator.

[Image: execution plan from SQL Server 2008]

The Constant Scan outputs a list, Union1009, which the Compute Scalar uses later. I think it comes down to the implementation details of the Constant Scan and/or Compute Scalar operators.

The observed behavior tells us that NEWID() is called twice per row here.
