CASE vs Multiple UPDATE Queries for Large Datasets - Performance

For performance, which option would be better for large datasets that need to be updated?

Using a CASE statement or individual update requests?

CASE example:

    UPDATE tbl_name
    SET field_name = CASE
            WHEN condition_1 THEN 'Blah'
            WHEN condition_2 THEN 'Foo'
            WHEN condition_x THEN 123
            ELSE 'bar'
        END;

Individual request example:

    UPDATE tbl_name SET field_name = 'Blah' WHERE field_name = condition_1;
    UPDATE tbl_name SET field_name = 'Foo' WHERE field_name = condition_2;
    UPDATE tbl_name SET field_name = 123 WHERE field_name = condition_x;
    UPDATE tbl_name SET field_name = 'bar' WHERE field_name = condition_y;

NOTE: About 300,000 records will be updated, and the CASE statement will have about 10,000 WHEN clauses. With individual statements, there would also be about 10,000 of them.

+8
performance sql sql-update postgresql case
3 answers

CASE version.

This is because there is a good chance that separate statements will change the same row several times. If row 10 matches both condition_1 and condition_y, it has to be read and modified twice. If the table has a clustered index, that means two clustered index updates on top of whatever other fields were changed.

If you can do this as a single statement, each row is read only once, and it should run much faster.

About a year ago I changed a similar process, replacing dozens of sequential UPDATE statements with a single UPDATE with CASE, and processing time dropped by about 80%.
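The double-write effect described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite stands in for the real database, and the column `n` and the predicates `n <= 5` and `n <= 3` are hypothetical stand-ins for condition_1 and condition_2.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; the column n and the
# two overlapping predicates are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name (id INTEGER PRIMARY KEY, n INTEGER, field_name TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_name (id, n) VALUES (?, ?)", [(i, i) for i in range(1, 11)]
)

# Separate statements: rows with n <= 3 match both predicates and are written twice.
separate_writes = 0
for value, predicate in [("Blah", "n <= 5"), ("Foo", "n <= 3")]:
    cur = conn.execute(
        f"UPDATE tbl_name SET field_name = ? WHERE {predicate}", (value,)
    )
    separate_writes += cur.rowcount
after_separate = conn.execute(
    "SELECT field_name FROM tbl_name WHERE id = 1"
).fetchone()[0]

# Reset, then do the same work as one statement: each matching row is written once.
conn.execute("UPDATE tbl_name SET field_name = NULL")
case_writes = conn.execute(
    """
    UPDATE tbl_name
    SET field_name = CASE
            WHEN n <= 5 THEN 'Blah'
            WHEN n <= 3 THEN 'Foo'
        END
    WHERE n <= 5 OR n <= 3
    """
).rowcount
after_case = conn.execute(
    "SELECT field_name FROM tbl_name WHERE id = 1"
).fetchone()[0]

print(separate_writes, case_writes)  # row writes: separate statements vs. CASE
print(after_separate, after_case)    # final value of row 1 under each approach
```

One thing to watch when converting: with overlapping conditions the two approaches do not give the same result. Separate statements leave the value from the last matching statement, while CASE takes the first matching WHEN, so the WHEN order may need adjusting.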

+13

It seems logical to me that with the first version, SQL Server will go through the table only once and, for each row, evaluate the conditions.

In the second case, it will have to go through the entire table 4 times, once per statement.

So, for a table with 1000 rows, the first option means 1000 condition evaluations in the best case (every row matches the first WHEN) and 3000 in the worst case (every row falls through all three WHEN clauses). With the second option, we will always have 4000 evaluations.

So option 1 will be faster.
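The arithmetic behind those numbers can be written out as a rough sketch. It assumes every statement does a full table scan with no usable index, and the function names are hypothetical.

```python
def case_evaluations(rows, when_clauses):
    """Return (best, worst) condition evaluations for one UPDATE ... CASE pass."""
    # Best case: every row matches the first WHEN.
    # Worst case: every row falls through all WHEN clauses.
    return rows, rows * when_clauses


def separate_evaluations(rows, statements):
    """Assumes each statement checks its WHERE condition against every row."""
    return rows * statements


best, worst = case_evaluations(rows=1000, when_clauses=3)
separate = separate_evaluations(rows=1000, statements=4)
print(best, worst, separate)  # prints: 1000 3000 4000
```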

0

As Mitch pointed out, try creating a temp table filled with all the necessary data, with a separate temp table for each column (field) that you want to change. You should also add an index to each temp table to further improve performance.

Thus, your update statement becomes (more or less):

    UPDATE tbl_name
    SET field_name = COALESCE(
            (SELECT value FROM temp_tbl
             WHERE tbl_name.conditional_field = temp_tbl.condition_value),
            field_name),
        field_name2 = COALESCE(
            (SELECT value FROM temp_tbl2
             WHERE tbl_name.conditional_field2 = temp_tbl2.condition_value),
            field_name2);

etc.

This should give you good performance while scaling for large volumes of updates.
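A minimal end-to-end sketch of this approach, again using Python's built-in sqlite3 module as a stand-in for the real database. The sample rows are made up, and the index on the temp table is provided here by declaring condition_value as its primary key, which is just one way to do it.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; the sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_name (
        id INTEGER PRIMARY KEY,
        conditional_field INTEGER,
        field_name TEXT
    );
    INSERT INTO tbl_name (conditional_field, field_name)
    VALUES (1, 'old'), (2, 'old'), (3, 'old');

    -- Lookup table: one row per condition value; the primary key acts as the index.
    CREATE TEMP TABLE temp_tbl (condition_value INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO temp_tbl VALUES (1, 'Blah'), (2, 'Foo');
    """
)

# One pass over tbl_name; rows with no match in temp_tbl keep their current value.
conn.execute(
    """
    UPDATE tbl_name
    SET field_name = COALESCE(
            (SELECT value FROM temp_tbl
             WHERE temp_tbl.condition_value = tbl_name.conditional_field),
            field_name)
    """
)

rows = conn.execute(
    "SELECT conditional_field, field_name FROM tbl_name ORDER BY id"
).fetchall()
print(rows)  # prints: [(1, 'Blah'), (2, 'Foo'), (3, 'old')]
```

With about 10,000 condition values, the lookup table replaces the 10,000 WHEN clauses with 10,000 rows. Without a WHERE clause, this statement still writes every row of tbl_name, matched or not.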

0