How is this script update table when using LEFT JOINs?

Question

How is this script update table when using LEFT JOINs?

The result in this script is correct, but I don't seem to understand why the BetaStatus column remains "NOK"

As for Beta, the first line (Beta = NOK) will update @Summary.BetaStatus to NOK. But then I thought that the last two lines in @testTable would update BetaStatus from NOK to OK. I just want to make sure that it is not a coincidence that "NOK" is actually the last processed string and therefore the value.

 declare @testTable table ( id int, Pgroup varchar(10), Pstatus varchar(3) ) insert into @testTable select 3, 'Alpha', 'OK' insert into @testTable select 3, 'Beta', 'NOK' insert into @testTable select 3, 'Gamma', 'OK' insert into @testTable select 3, 'Beta', 'OK' insert into @testTable select 3, 'Beta', 'OK' declare @Summary table ( id int, AlphaStatus varchar(3), BetaStatus varchar(3), GammaStatus varchar(3) ) insert into @Summary (id) select 3 update @Summary set AlphaStatus = ISNULL(rA.Pstatus, AlphaStatus), BetaStatus = ISNULL(rB.Pstatus, BetaStatus), GammaStatus = ISNULL(rG.Pstatus, GammaStatus) from @Summary t left join @testTable rA on rA.id = t.ID AND rA.Pgroup = 'Alpha' left join @testTable rB on rB.id = t.ID AND rB.Pgroup = 'Beta' left join @testTable rG on rG.id = t.ID AND rG.Pgroup = 'Gamma' select * from @summary

The reason I ask is because for each id, AlphaStatus, BetaStatus, GammaStatus should not be changed back to “OK” if it was previously “NOK”. When it is upgraded to "NOK", it remains so no matter what comes next.

An alternative was to upgrade @Summary with "OK" values, and then re-upgrade with "NOK". Thus, I know that "NOK" will not be replaced. But if that works, I would rather use that.

And as a second question, why does UPDATE not work correctly if I use INNER JOIN?

Thanks.

+4

sql tsql

fdkgfosfskjdlsjdlkfsf Sep 03 '15 at 21:24

source share

1 answer

Lukasz Szozda · Answer 1 · 2015-09-03T22:24:28+0000

Let's see that return select instead of update

 select AlphaStatus = ISNULL(rA.Pstatus, AlphaStatus), BetaStatus = ISNULL(rB.Pstatus, BetaStatus), GammaStatus = ISNULL(rG.Pstatus, GammaStatus) from @Summary t left join @testTable rA on rA.id = t.ID AND rA.Pgroup = 'Alpha' left join @testTable rB on rB.id = t.ID AND rB.Pgroup = 'Beta' left join @testTable rG on rG.id = t.ID AND rG.Pgroup = 'Gamma'

Result:

 AlphaStatus BetaStatus GammaStatus OK NOK OK OK OK OK OK OK OK

Now you are trying to do UPDATE

 update @Summary set AlphaStatus = ISNULL(rA.Pstatus, AlphaStatus), BetaStatus = ISNULL(rB.Pstatus, BetaStatus), GammaStatus = ISNULL(rG.Pstatus, GammaStatus) from @Summary t left join @testTable rA on rA.id = t.ID AND rA.Pgroup = 'Alpha' left join @testTable rB on rB.id = t.ID AND rB.Pgroup = 'Beta' left join @testTable rG on rG.id = t.ID AND rG.Pgroup = 'Gamma'

After the update, the @Summary table contains:

 id AlphaStatus BetaStatus GammaStatus 3 OK NOK OK

I suppose you wanted to get:

 id AlphaStatus BetaStatus GammaStatus 3 OK OK OK

But UPDATE does not work this way when they match several times, the result may be inconsistent and partly based on the ordering of the tables or the actual execution plan.

See also: DO NOT UPDATE UPDATED FROM! Hugo Cornelis

Correctness? Bach, who cares?
Well, most do. That is why we are testing.
If I ruin the join criteria in a SELECT query so that too many rows from the second table match, I will see it as soon as I check because I get more rows than expected. If you spoil the criteria of a subquery in a standard UPDATE ANSI query in this way, I will see it even earlier, because SQL Server will throw an error if the subquery returns more than one value. But with the proprietary UPDATE FROM syntax, I can ruin the connection and never notice - SQL Server will happily update the same row again and again if it matches more than one row in the joined table, with the result of only the last of these updates sticking out. And there is no way to find out which row it will be, since it depends on the query execution plan that will be selected. In the worst case, a situation may arise when the execution plan will lead to the expected result in all tests on a single-processor development server, and then, after deployment on a four-processor dual-core production server, our valuable data suddenly hits the fan ...

Also look at this inconsistency instead of using a table variable and create clustered indexes:

SqlFiddleDemo

 CREATE TABLE testTable(id int, Pgroup varchar(10), Pstatus varchar(3)); CREATE CLUSTERED INDEX clx_name ON testTable(PStatus DESC); /* or */ CREATE CLUSTERED INDEX clx_name ON testTable(PStatus ASC);

If you are using MERGE, for example:

 ;WITH cte as (SELECT ra.id ,AlphaStatus = rA.Pstatus ,BetaStatus = rB.Pstatus ,GammaStatus = rG.Pstatus from @Summary t left join @testTable rA on rA.id = t.ID AND rA.Pgroup = 'Alpha' left join @testTable rB on rB.id = t.ID AND rB.Pgroup = 'Beta' left join @testTable rG on rG.id = t.ID AND rG.Pgroup = 'Gamma' ) MERGE @Summary AS TGT USING (SELECT * FROM cte ) AS SRC ON TGT.id = SRC.id WHEN MATCHED THEN UPDATE SET AlphaStatus = ISNULL(src.AlphaStatus, tgt.AlphaStatus), BetaStatus = ISNULL(src.BetaStatus, tgt.BetaStatus), GammaStatus = ISNULL(src.GammaStatus, tgt.GammaStatus);

You will get a clear error message that this is not allowed:

The MERGE statement tried to UPDATE or DELETE the same line more than once . This happens when the target line matches more than one source line. The MERGE statement cannot UPDATE / DELETE the same row of the target table several times. Refine the ON clause to ensure that the target row matches at most one source row, or use the GROUP BY clause to group the source rows.

How is this script update table when using LEFT JOINs?

More articles: