T-SQL CASE statement: weird behavior with NEWID() as a source of randomness

I am using SQL Server 2012.

If I do the following to get a list of random numbers in the range [1,3], it works fine:

SELECT TOP 100 ABS(CHECKSUM(NEWID()))%3 + 1 [value_of_rand] FROM sys.objects 

and I get nice results like these (all between 1 and 3):

 3 2 2 2 1 ....etc. 

But if I then feed the same random expression into a CASE statement, it apparently does not produce only the values 1, 2, 3.

 SELECT TOP 100 CASE (ABS(CHECKSUM(NEWID()))%3 + 1) WHEN 1 THEN 'one' WHEN 2 THEN 'two' WHEN 3 THEN 'three' ELSE 'that is strange' END [value_of_case] FROM sys.objects 

It outputs:

 three that is strange that is strange one two ...etc 

What am I doing wrong here?

sql-server tsql
3 answers

Your query

 SELECT TOP 100 CASE (ABS(CHECKSUM(NEWID()))%3 + 1) WHEN 1 THEN 'one' WHEN 2 THEN 'two' WHEN 3 THEN 'three' ELSE 'that is strange' END [value_of_case] FROM sys.objects 

is actually executed as:

 SELECT TOP 100
     CASE
         WHEN (ABS(CHECKSUM(NEWID()))%3 + 1) = 1 THEN 'one'
         WHEN (ABS(CHECKSUM(NEWID()))%3 + 1) = 2 THEN 'two'
         WHEN (ABS(CHECKSUM(NEWID()))%3 + 1) = 3 THEN 'three'
         ELSE 'that is strange'
     END [value_of_case]
 FROM sys.objects;

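Basically, your expression is non-deterministic and is re-evaluated for every WHEN branch. Each comparison therefore sees a fresh random value, all three comparisons can fail, and you end up in the ELSE clause. So there is no bug here: you are using a simple CASE with an expression whose value changes between evaluations, and this is expected behavior.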
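To see the effect in numbers, here is a rough sketch that tallies how often each branch fires. It assumes that every WHEN comparison gets its own independent draw, as the expanded form suggests. The cross join of sys.all_columns is only a convenient row source, and the aliases (a, b, t, hits) are made up for illustration.

```sql
-- Tally how often each CASE branch fires over many rows.
-- sys.all_columns cross-joined with itself is just a handy source of rows.
SELECT value_of_case, COUNT(*) AS hits
FROM (
    SELECT TOP (100000)
        CASE (ABS(CHECKSUM(NEWID())) % 3 + 1)
            WHEN 1 THEN 'one'
            WHEN 2 THEN 'two'
            WHEN 3 THEN 'three'
            ELSE 'that is strange'
        END AS value_of_case
    FROM sys.all_columns AS a
    CROSS JOIN sys.all_columns AS b
) AS t
GROUP BY value_of_case
ORDER BY value_of_case;
```

If the draws really are independent, 'one' should come up about 1/3 of the time, 'two' about 2/3 * 1/3 = 2/9, 'three' about 4/27, and the ELSE branch about 8/27 (roughly 30%).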

This is the same class of problem as with COALESCE, which is syntactic sugar for CASE. From the SQL Server documentation:

The COALESCE expression is a syntactic shortcut for the CASE expression. That is, the code COALESCE(expression1, ...n) is rewritten by the query optimizer as the following CASE expression:

 CASE
     WHEN (expression1 IS NOT NULL) THEN expression1
     WHEN (expression2 IS NOT NULL) THEN expression2
     ...
     ELSE expressionN
 END

This means that the input values (expression1, expression2, expressionN, etc.) are evaluated multiple times. Also, in compliance with the SQL standard, a value expression that contains a subquery is considered non-deterministic and the subquery is evaluated twice. In either case, different results can be returned between the first evaluation and subsequent evaluations.
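A small sketch of what that can look like in practice, under the assumption that the optimizer really does evaluate the subquery once per reference. The subquery and the column aliases are made up for illustration; the ISNULL column is included for contrast because, being a function, ISNULL evaluates its argument only once.

```sql
-- The subquery yields either 1 or NULL at random.
-- COALESCE expands to CASE WHEN (subquery) IS NOT NULL THEN (subquery) ELSE 0 END,
-- so the second evaluation may return NULL even if the first one did not.
SELECT TOP (100)
    COALESCE((SELECT CASE WHEN ABS(CHECKSUM(NEWID())) % 2 = 0 THEN 1 END), 0) AS via_coalesce,
    ISNULL((SELECT CASE WHEN ABS(CHECKSUM(NEWID())) % 2 = 0 THEN 1 END), 0) AS via_isnull
FROM sys.objects;
```

If the double evaluation happens, via_coalesce may occasionally show NULL despite the fallback of 0, while via_isnull should only ever contain 0 or 1.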

EDIT:

Solution: SqlFiddle

 SELECT TOP 100
     CASE t.col
         WHEN 1 THEN 'one'
         WHEN 2 THEN 'two'
         WHEN 3 THEN 'three'
         ELSE 'that is strange'
     END [value_of_case]
 FROM sys.objects
 CROSS APPLY ( SELECT ABS(CHECKSUM(NEWID()))%3 + 1 ) AS t(col)

I think the problem you are facing here is that (ABS(CHECKSUM(NEWID()))%3 + 1) is not a value; it is an expression, and SQL Server is free to re-evaluate it whenever it wants. You can try various syntactic tricks, such as removing the extra parentheses or using a CTE. That may make the problem go away (for now), but it may not, because logically it still looks like the same query to the optimizer.

I think the only reliable, supported way to stop this is to first save it (to a temporary table or a real one), and then use the second query to refer to the stored values.
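A minimal sketch of that two-step approach, using a temporary table. The table name #rnd is hypothetical; everything else is taken from the question.

```sql
-- Step 1: materialize the random values so each row holds one fixed number.
SELECT TOP (100) ABS(CHECKSUM(NEWID())) % 3 + 1 AS value_of_rand
INTO #rnd   -- hypothetical temp table name
FROM sys.objects;

-- Step 2: CASE now compares a stored column, not a re-evaluated expression.
SELECT
    CASE value_of_rand
        WHEN 1 THEN 'one'
        WHEN 2 THEN 'two'
        WHEN 3 THEN 'three'
        ELSE 'that is strange'
    END AS value_of_case
FROM #rnd;

DROP TABLE #rnd;
```

Because the second query only reads stored values, the ELSE branch should no longer be reachable.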


I can't say why, this is really strange, but I can give you a workaround. Select the random values in a CTE before trying to use them.

 ;WITH rndsrc(value_of_rand) AS (
     SELECT TOP 100 ABS(CHECKSUM(NEWID()))%3 + 1
     FROM sys.objects
 )
 SELECT TOP 100
     CASE value_of_rand
         WHEN 1 THEN 'one'
         WHEN 2 THEN 'two'
         WHEN 3 THEN 'three'
         ELSE 'that is strange'
     END [value_of_case]
 FROM rndsrc

No more "that is strange".
