Is there a way to get different results for the same SQL query if the data remains unchanged?

Question

I intermittently get a different number of results for this query when I run it: sometimes it returns 1363 rows, sometimes 1365, and sometimes 1366. The data doesn't change. What could be causing this, and is there a way to prevent it? The query looks something like this:

    SELECT *
    FROM (
        SELECT RC.UserGroupId,
               RC.UserGroup,
               RC.ClientId AS CLID,
               CASE WHEN T1.MultipleClients = 1 THEN RC.Salutation1 ELSE RC.DisplayName1 END AS szDisplayName,
               T1.MultipleClients,
               RC.IsPrimaryRecord,
               RC.RecordTypeId,
               RC.ClientTypeId,
               RC.ClientType,
               RC.IsDeleted,
               RC.IsCompany,
               RC.KnownAs,
               RC.Salutation1,
               RC.FirstName,
               RC.Surname,
               Relationship,
               C.DisplayName Client,
               RC.DisplayName RelatedClient,
               E.Email,
               RC.DisplayName + ' is the ' + R.Relationship + ' of ' + C.DisplayName Description,
               ROW_NUMBER() OVER (PARTITION BY E.Email ORDER BY Relationship DESC) AS sequence_id
        FROM SSDS.Client.ClientExtended C
        INNER JOIN SSDS.Client.ClientRelationship R WITH (NOLOCK) ON C.ClientId = R.ClientID
        INNER JOIN SSDS.Client.ClientExtended RC WITH (NOLOCK) ON R.RelatedClientId = RC.ClientId
        LEFT OUTER JOIN SSDS.Client.Email E WITH (NOLOCK) ON RC.ClientId = E.ClientId
        LEFT OUTER JOIN SSDS.Client.UserDefinedData UD WITH (NOLOCK) ON C.ClientId = UD.ClientId AND C.UserGroupId = UD.UserGroupId
        INNER JOIN (
            SELECT E.Email,
                   CASE WHEN (COUNT(DISTINCT RC.DisplayName) > 1) THEN 1 ELSE 0 END AS MultipleClients
            FROM SSDS.Client.ClientExtended C
            INNER JOIN SSDS.Client.ClientRelationship R WITH (NOLOCK) ON C.ClientId = R.ClientID
            INNER JOIN SSDS.Client.ClientExtended RC WITH (NOLOCK) ON R.RelatedClientId = RC.ClientId
            LEFT OUTER JOIN SSDS.Client.Email E WITH (NOLOCK) ON RC.ClientId = E.ClientId
            LEFT OUTER JOIN SSDS.Client.UserDefinedData UD WITH (NOLOCK) ON C.ClientId = UD.ClientId AND C.UserGroupId = UD.UserGroupId
            WHERE Relationship IN ('z-Group Principle', 'z-Group Member ')
              AND E.Email IS NOT NULL
            GROUP BY E.Email
        ) T1 ON E.Email = T1.Email
        WHERE Relationship IN ('z-Group Principle', 'z-Group Member ')
          AND E.Email IS NOT NULL
    ) T
    WHERE sequence_id = 1
      AND T.UserGroupId IN (Select * from iCentral.dbo.GetSubUserGroups('471b9cbd-2312-4a8a-bb20-35ea53d30340',0))
      AND T.IsDeleted = 0
      AND T.RecordTypeId = 1
      AND T.ClientTypeId IN (
          '1',           --Client
          '-1652203805'  --NTU
      )
      AND T.CLID NOT IN (
          SELECT DISTINCT UDDF.CLID
          FROM SLacsis_SLM.dbo.T_UserDef UD WITH (NOLOCK)
          INNER JOIN SLacsis_SLM.dbo.T_UserDefData UDDF WITH (NOLOCK) ON UD.UserDef_ID = UDDF.UserDef_ID
          INNER JOIN SLacsis_SLM.dbo.T_Client CLL WITH (NOLOCK) ON CLL.CLID = UDDF.CLID AND CLL.UserGroup_CLID = UD.UserID
          WHERE UD.UserDef_ID in (
              'F68F31CE-525B-4455-9D50-6DA77C66FEE5',
              'A7CECB03-866C-4F1F-9E1A-CEB09474FE47'
          )
            AND UDDF.Data = 'NO'
      )
    ORDER BY T.Surname

EDIT:

I removed all the NOLOCK hints (including those in the views and the UDF) and I still have the same problem. I get the same results every time for the nested select (T), and if I put T's result set into a temporary table at the beginning of the query and join onto the temp table instead of the nested select, then the final result set is the same every time I run the query.

EDIT2:

Another thing about ROW_NUMBER(): I am partitioning by Email (of which there are duplicates) and ordering by Relationship (which takes only 1 of 2 values). Could this make the query non-deterministic, and is there a way to fix that?

EDIT3:

Here are the actual execution plans, if anyone is interested: http://www.mediafire.com/?qo5gkh5dftxf0ml . Is it possible to tell from the execution plan whether the query is running as read committed? I compared the files using WinMerge, and the only differences seem to be the counts (ActualRows = "").

EDIT4:

This works:

 SELECT * FROM ( SELECT *, ROW_NUMBER() OVER (PARTITION BY B.Email ORDER BY Relationship DESC) AS sequence_id FROM ( SELECT DISTINCT RC.UserGroupId, ... ) B...

EDIT5:

When I run the same ROW_NUMBER() query (T in the original question, selecting just RC.DisplayName and the ROW_NUMBER() value) twice in a row, I get different rankings for some people.

Does anyone have a good explanation or example of why and how ROW_NUMBER() over a result set that contains duplicates can rank rows differently each time it is executed, and ultimately change the number of results?

EDIT6:

OK, I think this makes sense to me now. It happens when 2 people have the same email address (for example, a husband and wife) and the same relationship. I think in that case their ROW_NUMBER() ranking is arbitrary and can change every time the query runs.

+4

sql tsql sql-server-2008

woggles Aug 19 '11 at 14:25


6 answers

Your use of NOLOCK everywhere means that you are doing dirty reads and will see uncommitted data, data that will be rolled back, transient and inconsistent data, etc.

Take the hints off, try again, and please report back.

Edit: some possibilities once the NOLOCK hints are removed:

The data really is changing.
Some parameter or filter is changing (e.g. GETDATE).
Some float comparisons are running on different cores each time (see this question on dba.se: https://dba.stackexchange.com/q/4810/630).
There are embedded NOLOCK hints in UDFs or views (e.g. iCentral.dbo.GetSubUserGroups).
...

+10

gbn Aug 19 '11 at 14:30


I think your problem is that the first row of each partition is not deterministic. I suspect that Email and Relationship together are not unique.

  ROW_NUMBER() OVER (PARTITION BY E.Email ORDER BY Relationship DESC) AS sequence_id

Later you filter on the first row of each partition.

  WHERE T.sequence_id = 1 AND T.UserGroupId ...

If that first row is arbitrary, the rows that survive the filter are arbitrary too. You need to add columns to the ORDER BY so that it covers a complete unique key. If there is no unique key, you need to create one or live with arbitrary results. Even on a table with a clustered PK, the row order of a SELECT is not guaranteed unless the entire PK is in the sort clause.
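One way to check the suspicion that Email and Relationship are not unique is to count the rows per (Email, Relationship) pair. The following is only a sketch: the Candidates rows are hypothetical sample data standing in for the joined result set of the question, and ClientId is assumed to be a unique column.

```sql
-- Hypothetical sample rows; in practice, replace the VALUES list
-- with the joined result set that feeds ROW_NUMBER().
WITH Candidates AS (
    SELECT v.ClientId, v.Email, v.Relationship
    FROM (VALUES (1, 'smith@example.com', 'z-Group Member'),
                 (2, 'smith@example.com', 'z-Group Member'),
                 (3, 'jones@example.com', 'z-Group Principle')
         ) AS v (ClientId, Email, Relationship)
)
-- Every group returned here is a tie that ROW_NUMBER() is free
-- to number in any order when only Relationship is in the ORDER BY.
SELECT Email, Relationship, COUNT(*) AS tied_rows
FROM Candidates
GROUP BY Email, Relationship
HAVING COUNT(*) > 1;
```

With this sample data the query returns a single group, smith@example.com with z-Group Member and tied_rows = 2. If the real data returns no groups at all, the ordering is already unique and the cause lies elsewhere.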

+4

paparazzo Aug 24 '11 at 14:48


This is probably down to the ordering. You have sequence_id defined as a row_number ordered by Relationship. The rows will always be ordered correctly by Relationship, but beyond that the row_number is arbitrary. So you can get a different row with sequence_id 1 each time. That, in turn, affects your where clause, so you can get a different number of results. To get a consistent result, add another column to the row_number ordering. Use a primary key to guarantee consistent results.

+3

user12861 Aug 24 '11 at 14:05


There is a recent KB that fixes a problem with ROW_NUMBER(); see "FIX: you get the wrong result when you run a query that uses the row_number function in SQL Server 2008" for details.

However, that KB says the problem occurs when parallelism is invoked for execution, and looking at your execution plans I cannot see that happening. Still, the fact that MS found a problem with it in one situation makes me a little wary: could the same problem arise for a sufficiently complex query (and your execution plan does look fairly large)?

So it might be worth checking your SQL Server 2008 patch level.

+2

Chris j Aug 24 '11 at 14:03


Use only ORDER BY, without PARTITION BY:

 ROW_NUMBER() OVER (ORDER BY Relationship DESC) AS sequence_id

0

mohsen Mar 29 '13 at 8:22


Martin smith · Accepted Answer · 2011-08-24T14:11:57+0000

As I said yesterday in the comments, the row numbering for rows with duplicate E.Email, Relationship values will be arbitrary.

To make it deterministic, you need to do PARTITION BY B.Email ORDER BY Relationship DESC, SomeUniqueColumn. It is interesting that it changes between runs even with the same execution plan. I assume this is a consequence of the hash join.
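A minimal sketch of the tie-breaker, assuming a hypothetical @People table variable in which ClientId plays the role of SomeUniqueColumn (neither the table nor the sample rows come from the question):

```sql
-- Hypothetical sample data: two people share one email address
-- and one relationship, as in the husband-and-wife case.
DECLARE @People TABLE (
    ClientId     INT          NOT NULL PRIMARY KEY,  -- stands in for SomeUniqueColumn
    DisplayName  VARCHAR(50)  NOT NULL,
    Email        VARCHAR(100) NOT NULL,
    Relationship VARCHAR(50)  NOT NULL
);

INSERT INTO @People (ClientId, DisplayName, Email, Relationship)
VALUES (1, 'Mr Smith',  'smith@example.com', 'z-Group Member'),
       (2, 'Mrs Smith', 'smith@example.com', 'z-Group Member'),
       (3, 'Ms Jones',  'jones@example.com', 'z-Group Principle');

SELECT DisplayName,
       Email,
       -- Ties on (Email, Relationship): the engine may number them in any order.
       ROW_NUMBER() OVER (PARTITION BY Email
                          ORDER BY Relationship DESC) AS arbitrary_id,
       -- ClientId breaks the tie, so the numbering is the same on every run.
       ROW_NUMBER() OVER (PARTITION BY Email
                          ORDER BY Relationship DESC, ClientId) AS deterministic_id
FROM @People;
```

Here deterministic_id is always 1 for ClientId 1 within the shared address, whereas arbitrary_id may give 1 to either person. A filter such as sequence_id = 1 therefore only keeps a stable row when it is built on the second form.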
