Optimizing an Oracle query by removing EXISTS and NOT EXISTS

I recently moved some code into production on an Oracle database. One of the more experienced developers who reviewed it said that I had too many EXISTS and NOT EXISTS clauses and that there should be a way to remove them, but it had been too long since he had needed to do that and he did not remember much about how it works. I am now going back to make part of the code easier to maintain, since it is likely to change several times over the coming years as the business logic / requirements change, and I would like to optimize it while I am at it.

I tried looking it up, but all I can find are recommendations to replace NOT IN with NOT EXISTS so as not to return incorrect results.

So I wonder what can be done to optimize EXISTS / NOT EXISTS, or whether there is a way to write EXISTS / NOT EXISTS so that Oracle optimizes it internally (probably better than I can).

For example, how can you optimize the following?

    UPDATE SCOTT.TABLE_N N
       SET N.VALUE_1 = 'Data!'
     WHERE N.VALUE_2 = 'Y'
       AND EXISTS
           ( SELECT 1
               FROM SCOTT.TABLE_Q Q
              WHERE N.ID = Q.N_ID )
       AND NOT EXISTS
           ( SELECT 1
               FROM SCOTT.TABLE_W W
              WHERE N.ID = W.N_ID )
3 answers

Your statement looks perfectly fine to me.

In any optimization task, do not think in patterns. Do not assume that "(NOT) EXISTS is bad and slow, (NOT) IN is super cool and fast."

Think about how much work the database does at each step and how you can measure it.

A simple example:

- NOT IN:

    23:59:41 HR@sandbox > alter system flush buffer_cache;

    System altered.

    Elapsed: 00:00:00.03
    23:59:43 HR@sandbox > set autotrace traceonly explain statistics
    23:59:49 HR@sandbox > select country_id from countries where country_id not in (select country_id from locations);

    11 rows selected.

    Elapsed: 00:00:00.02

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 1748518851

    ------------------------------------------------------------------------------------------
    | Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT       |                 |     1 |     6 |     4   (0)| 00:00:01 |
    |*  1 |  FILTER                |                 |       |       |            |          |
    |   2 |   NESTED LOOPS ANTI SNA|                 |    11 |    66 |     4  (75)| 00:00:01 |
    |   3 |    INDEX FULL SCAN     | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
    |*  4 |    INDEX RANGE SCAN    | LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
    |*  5 |   TABLE ACCESS FULL    | LOCATIONS       |     1 |     3 |     3   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       1 - filter( NOT EXISTS (SELECT 0 FROM "LOCATIONS" "LOCATIONS" WHERE "COUNTRY_ID" IS NULL))
       4 - access("COUNTRY_ID"="COUNTRY_ID")
       5 - filter("COUNTRY_ID" IS NULL)

    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
             11  consistent gets
              8  physical reads
              0  redo size
            446  bytes sent via SQL*Net to client
            363  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
             11  rows processed

- NOT EXISTS:

    23:59:57 HR@sandbox > alter system flush buffer_cache;

    System altered.

    Elapsed: 00:00:00.17
    00:00:02 HR@sandbox > select country_id from countries c where not exists (select 1 from locations l where l.country_id = c.country_id );

    11 rows selected.

    Elapsed: 00:00:00.30

    Execution Plan
    ----------------------------------------------------------
    Plan hash value: 840074837

    -------------------------------------------------------------------------------------
    | Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                 |    11 |    66 |     1   (0)| 00:00:01 |
    |   1 |  NESTED LOOPS ANTI|                 |    11 |    66 |     1   (0)| 00:00:01 |
    |   2 |   INDEX FULL SCAN | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
    |*  3 |   INDEX RANGE SCAN| LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
    -------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

       3 - access("L"."COUNTRY_ID"="C"."COUNTRY_ID")

    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
              5  consistent gets
              2  physical reads
              0  redo size
            446  bytes sent via SQL*Net to client
            363  bytes received via SQL*Net from client
              2  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
             11  rows processed

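In this example, NOT IN reads about twice as many database blocks (11 consistent gets versus 5) and performs more complex filtering: because LOCATIONS.COUNTRY_ID can be NULL, the plan adds a FILTER step that checks for NULLs in the subquery. Ask yourself why you would choose it over NOT EXISTS.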
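The same kind of measurement can be applied to the UPDATE from the question. This is only a minimal sketch, assuming the tables from the question exist and that your session can use EXPLAIN PLAN; it shows the optimizer's estimated plan and does not execute the statement.

```sql
-- Ask the optimizer how it would run the UPDATE (nothing is modified).
EXPLAIN PLAN FOR
UPDATE SCOTT.TABLE_N N
   SET N.VALUE_1 = 'Data!'
 WHERE N.VALUE_2 = 'Y'
   AND EXISTS     (SELECT 1 FROM SCOTT.TABLE_Q Q WHERE N.ID = Q.N_ID)
   AND NOT EXISTS (SELECT 1 FROM SCOTT.TABLE_W W WHERE N.ID = W.N_ID);

-- Display the plan that was just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the optimizer unnests the subqueries, you would typically see SEMI and ANTI join operations in the output, much like the NESTED LOOPS ANTI step above. What you actually get depends on your data, indexes and statistics, so compare plans on your own system.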


There is no reason to avoid using EXISTS or NOT EXISTS when this is what you need. In the example you provided, this is probably exactly what you want to use.

A typical dilemma is whether to use IN / NOT IN or EXISTS / NOT EXISTS. They are evaluated in completely different ways, and each one can be faster or slower depending on your specific circumstances.
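The two forms also differ in what they return when NULLs are involved, which is one reason they cannot always be swapped freely. A small sketch using only DUAL (the inline values are made up purely for illustration):

```sql
-- NOT IN: 1 <> 2 is TRUE, but 1 <> NULL is UNKNOWN, so the whole
-- predicate is UNKNOWN and no row is returned.
SELECT 'found' AS result
  FROM dual
 WHERE 1 NOT IN (SELECT 2 FROM dual
                 UNION ALL
                 SELECT CAST(NULL AS NUMBER) FROM dual);

-- NOT EXISTS: no subquery row satisfies s.x = 1, so the row is returned.
SELECT 'found' AS result
  FROM dual
 WHERE NOT EXISTS (SELECT 1
                     FROM (SELECT 2 AS x FROM dual
                           UNION ALL
                           SELECT CAST(NULL AS NUMBER) FROM dual) s
                    WHERE s.x = 1);
```

The first query returns no rows and the second returns one, so converting between the two forms is only safe when the compared columns cannot be NULL.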

See here for more details than you probably want.


I don't know if this is much faster, but here is a way to write it without EXISTS / NOT EXISTS:

    MERGE INTO SCOTT.TABLE_N T
    USING (
        -- DISTINCT: TABLE_Q may hold several rows per N_ID, and MERGE
        -- must not match the same target row more than once
        SELECT DISTINCT N.ID, 'Data!' AS NEW_VALUE_1
          FROM SCOTT.TABLE_N N
         INNER JOIN SCOTT.TABLE_Q Q ON Q.N_ID = N.ID
          LEFT JOIN SCOTT.TABLE_W W ON W.N_ID = N.ID
         WHERE N.VALUE_2 = 'Y'
           AND W.N_ID IS NULL   -- anti-join: no matching row in TABLE_W
    ) X
    ON ( T.ID = X.ID )
    WHEN MATCHED THEN
        UPDATE SET T.VALUE_1 = X.NEW_VALUE_1;
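Before swapping one form for the other, it may be worth checking that both select the same rows. A rough sketch, assuming the table and column names from the question and that ID identifies a row in TABLE_N:

```sql
-- IDs chosen by the EXISTS / NOT EXISTS form but missed by the join form.
SELECT N.ID
  FROM SCOTT.TABLE_N N
 WHERE N.VALUE_2 = 'Y'
   AND EXISTS     (SELECT 1 FROM SCOTT.TABLE_Q Q WHERE Q.N_ID = N.ID)
   AND NOT EXISTS (SELECT 1 FROM SCOTT.TABLE_W W WHERE W.N_ID = N.ID)
MINUS
-- IDs chosen by the join form.
SELECT N.ID
  FROM SCOTT.TABLE_N N
 INNER JOIN SCOTT.TABLE_Q Q ON Q.N_ID = N.ID
  LEFT JOIN SCOTT.TABLE_W W ON W.N_ID = N.ID
 WHERE N.VALUE_2 = 'Y'
   AND W.N_ID IS NULL;
```

An empty result means the join form misses nothing; swap the two halves around the MINUS to check the other direction as well.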