How does an IN clause affect Oracle performance?

UPDATE table1 SET col1 = 'Y' WHERE col2 in (select col2 from table2) 

In the above query, suppose the subquery returns 10,000 rows. Does using an IN clause here hurt performance?

If so, what can be done for faster execution?

3 answers

If the subquery returns a large number of rows compared to the number of rows in TABLE1, the optimizer will most likely produce a plan like this:

    --------------------------------------------------------------------------------
    | Id  | Operation           | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time
    --------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT    |        |   300K|    24M|       |  1581   (1)| 00:0
    |   1 |  UPDATE             | TABLE1 |       |       |       |            |
    |*  2 |   HASH JOIN SEMI    |        |   300K|    24M|  9384K|  1581   (1)| 00:0
    |   3 |    TABLE ACCESS FULL| TABLE1 |   300K|  5860K|       |   355   (2)| 00:0
    |   4 |    TABLE ACCESS FULL| TABLE2 |   168K|    10M|       |   144   (2)| 00:0
    --------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("COL2"="COL2")

It scans both tables once and updates only the rows in TABLE1 whose col2 value also appears in TABLE2. This is a very efficient plan if you need to update many rows.

Sometimes the subquery will return only a few rows compared to the number of rows in TABLE1. If you have an index on TABLE1(col2), you can get a plan similar to this:

    -------------------------------------------------------------------------------
    | Id  | Operation             | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------
    |   0 | UPDATE STATEMENT      |        |    93 |  4557 |   247   (1)| 00:00:03 |
    |   1 |  UPDATE               | TABLE1 |       |       |            |          |
    |   2 |   NESTED LOOPS        |        |    93 |  4557 |   247   (1)| 00:00:03 |
    |   3 |    SORT UNIQUE        |        |    51 |  1326 |   142   (0)| 00:00:02 |
    |   4 |     TABLE ACCESS FULL | TABLE2 |    51 |  1326 |   142   (0)| 00:00:02 |
    |*  5 |    INDEX RANGE SCAN   | IDX1   |     2 |    46 |     2   (0)| 00:00:01 |
    -------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------
       5 - access("T1"."COL2"="T2"."COL2")

In this case, Oracle reads the rows from TABLE2 and, for each unique col2 value, probes the index on TABLE1.

Which access path is faster depends on the selectivity of the subquery and on the clustering of the index on TABLE1 (are rows with the same col2 value stored next to each other in TABLE1, or scattered randomly?). Either way, if you need to perform this update, this query is one of the fastest ways to do it.
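A rough sketch of how both factors could be checked, assuming the tables live in the current schema and the index is named IDX1 as in the plan above (the index name and the choice to gather statistics first are assumptions, not something the question states):

```sql
-- Assumption: no index on TABLE1(col2) exists yet; skip this if it does.
CREATE INDEX idx1 ON table1 (col2);

-- Refresh optimizer statistics so the figures below reflect the current data.
-- The trailing "/" runs the PL/SQL block in SQL*Plus-style clients.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TABLE1', cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'TABLE2');
END;
/

-- Clustering: a clustering_factor near the table's block count suggests rows
-- with equal col2 sit close together; a value near num_rows suggests they are
-- scattered across the table.
SELECT i.index_name, i.clustering_factor, t.blocks, t.num_rows
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  i.index_name = 'IDX1';

-- Selectivity: distinct keys produced by the subquery versus rows in TABLE1.
SELECT (SELECT COUNT(DISTINCT col2) FROM table2) AS subquery_keys,
       (SELECT COUNT(*) FROM table1)             AS table1_rows
FROM   dual;
```

Few subquery keys combined with a low clustering factor tend to favour the nested-loops plan; many keys or a high clustering factor tend to favour the hash semi-join.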

 UPDATE table1 outer SET col1 = 'Y' WHERE EXISTS (select null from table2 WHERE col2 = outer.col2) 

This may perform better.

To find out, compare the execution plans of the two statements.
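One way to make that comparison, sketched here under the assumption that the default PLAN_TABLE is available; the statement IDs and the aliases t1 and t2 are arbitrary labels chosen for illustration. EXPLAIN PLAN only records the plan and does not run the update.

```sql
-- Record the plan for the IN version.
EXPLAIN PLAN SET STATEMENT_ID = 'upd_in' FOR
UPDATE table1 SET col1 = 'Y'
WHERE  col2 IN (SELECT col2 FROM table2);

-- Record the plan for the EXISTS version.
EXPLAIN PLAN SET STATEMENT_ID = 'upd_exists' FOR
UPDATE table1 t1 SET col1 = 'Y'
WHERE  EXISTS (SELECT NULL FROM table2 t2 WHERE t2.col2 = t1.col2);

-- Display each recorded plan by its statement ID.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'upd_in'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'upd_exists'));
```

If both outputs show the same operations and cost, the optimizer has most likely transformed the two forms into the same join, and the choice between IN and EXISTS will not matter for this statement.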


From the Oracle documentation:

11.5.3.4 Using EXISTS vs. IN for Subqueries

In certain circumstances, it is better to use IN rather than EXISTS. In general, if the selective predicate is in the subquery, then use IN. If the selective predicate is in the parent query, then use EXISTS.

In my experience, EXISTS has produced better plans when the subquery returns a large number of rows.

See here for a more in-depth discussion with Oracle.
