Why does this query perform a full table scan?

Query:

SELECT tbl1.*
  FROM tbl1
  JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk
            AND tbl2.t2_strt_dt <= sysdate
            AND tbl2.t2_end_dt >= sysdate)
  JOIN tbl3 ON (tbl3.t3_pk = tbl2.t2_fk_t3_pk
            AND tbl3.t3_lkup_1 = 2577304
            AND tbl3.t3_lkup_2 = 1220833)
 WHERE tbl2.t2_lkup_1 = 1020000002981587;

Data:

  • Oracle XE
  • tbl1.t1_pk is the primary key.
  • tbl2.t2_fk_t1_pk is a foreign key referencing that t1_pk column.
  • tbl2.t2_lkup_1 is indexed.
  • tbl3.t3_pk is the primary key.
  • tbl2.t2_fk_t3_pk is a foreign key referencing that t3_pk column.

The explain plan on a database with 11,000 rows in tbl1 and 3,500 rows in tbl2 shows that it performs a full table scan on tbl1. It seems to me that it would be faster if it could do an index lookup on tbl1.

Update: I tried the hint some of you suggested, and the explain plan cost got much worse! Now I'm really confused.

Further update: I finally got access to a copy of the production database, and the explain plan showed it using the indexes, with a much lower query cost. I guess having more data (over 100,000 rows in tbl1 and 50,000 rows in tbl2) is what it took for the optimizer to decide that the indexes are worth it. Thanks to everyone who helped. I still think Oracle performance tuning is a black art, but I'm glad that some of you understand it.

Further update: I've updated the question at the request of my former employer. They don't like their table names showing up in Google searches. I should have known better.

8 answers

It would be helpful to see the optimizer's row count estimates, which are not in the SQL Developer output you posted.

I note that the two index accesses it performs are RANGE SCANs, not UNIQUE SCANs. So its estimates of how many rows are returned could easily be far off (whether or not the statistics are up to date).

My guess is that its estimate of the number of rows coming out of the TABLE ACCESS of TBL2 is rather high, so it thinks it will find a large number of matches in TBL1 and therefore chooses a full scan / hash join rather than a nested loop / index scan.

For some real fun, you can run a query with event 10053 turned on and get a trace showing the calculations performed by the optimizer.
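
As a rough sketch of what that could look like for the query in the question: the two ALTER SESSION statements are the standard way to switch event 10053 on and off, while the leading comment in the SELECT is only an assumed trick to make the statement text new, since the trace is written when the statement is hard-parsed. Where the trace file ends up depends on the instance configuration (typically the user dump directory).

```sql
-- Turn on the optimizer trace for the current session only.
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';

-- Run the statement to be traced. The comment is an arbitrary marker
-- (an assumption for this sketch) that makes the SQL text unique so that
-- Oracle parses it afresh instead of reusing a cached cursor.
SELECT /* trace_10053_run_1 */ tbl1.*
  FROM tbl1
  JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk
            AND tbl2.t2_strt_dt <= sysdate
            AND tbl2.t2_end_dt >= sysdate)
  JOIN tbl3 ON (tbl3.t3_pk = tbl2.t2_fk_t3_pk
            AND tbl3.t3_lkup_1 = 2577304
            AND tbl3.t3_lkup_2 = 1220833)
 WHERE tbl2.t2_lkup_1 = 1020000002981587;

-- Turn the trace off again.
ALTER SESSION SET EVENTS '10053 trace name context off';
```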

+3

Easy answer: because the optimizer expects to find more rows than it actually does.

Check the statistics: are they up to date? Check the expected cardinality in the explain plan: is it consistent with the actual results? If not, fix the statistics relevant to that step.

Histograms on the joined columns might help. Oracle will use them to estimate the cardinality resulting from a join.

Of course, you can always force the use of an index with a hint.
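
One way to compare expected and actual cardinality side by side, as a sketch only: it assumes a version that supports the gather_plan_statistics hint and DBMS_XPLAN.DISPLAY_CURSOR (10gR2 or later), that the session can read the underlying V$ views, and that SERVEROUTPUT is off in SQL*Plus so the last statement is the query itself.

```sql
-- Run the query once while collecting row-source statistics.
SELECT /*+ gather_plan_statistics */ tbl1.*
  FROM tbl1
  JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk
            AND tbl2.t2_strt_dt <= sysdate
            AND tbl2.t2_end_dt >= sysdate)
  JOIN tbl3 ON (tbl3.t3_pk = tbl2.t2_fk_t3_pk
            AND tbl3.t3_lkup_1 = 2577304
            AND tbl3.t3_lkup_2 = 1220833)
 WHERE tbl2.t2_lkup_1 = 1020000002981587;

-- Show the plan of the last statement executed in this session.
-- E-Rows is the optimizer's estimate, A-Rows is what each step really returned.
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

A plan step where E-Rows and A-Rows differ by an order of magnitude or more is usually the place to look at the statistics first.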
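
A minimal sketch of gathering a histogram on one of the join columns with DBMS_STATS. The choice of column and the bucket count of 254 are assumptions for illustration, and the table is assumed to live in the current schema; the same call could be repeated for t2_fk_t3_pk.

```sql
BEGIN
  -- Gather table statistics for TBL2 with a histogram (up to 254 buckets)
  -- on the join column t2_fk_t1_pk, and refresh its index statistics too.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TBL2',
    method_opt => 'FOR COLUMNS t2_fk_t1_pk SIZE 254',
    cascade    => TRUE);
END;
/
```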

+5

Oracle tries to return the result set with the least amount of I/O (this usually makes sense because I/O is slow). An index access takes at least 2 I/O calls: one to the index and one to the table. Usually it takes more, depending on the size of the index and the table, the number of records returned, where they are in the data file, and so on.

This is where statistics come in. Suppose your query is estimated to return 10 records. The optimizer may calculate that using the index will require 10 I/O calls. Let's say that your table, according to the statistics, occupies 6 blocks in the data file. It is then faster for Oracle to do a full scan (6 I/O) than to read the index, read the table, read the index for the next matching key, read the table, and so on.

So, in your case, the table may simply be very small. The statistics may also be off.

I use the following to gather statistics, and customize it for my exact needs:
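
To see what the optimizer currently believes about table and index sizes, something along these lines can be run against the data dictionary. It is a sketch that assumes the tables are in the current schema and were created with unquoted (upper-case) names.

```sql
-- Size of each table as recorded in the statistics, and when they were gathered.
SELECT table_name, num_rows, blocks, last_analyzed
  FROM user_tables
 WHERE table_name IN ('TBL1', 'TBL2', 'TBL3');

-- Depth and size of the indexes on those tables. A high clustering factor
-- (close to num_rows) makes index range scans look expensive to the optimizer.
SELECT table_name, index_name, blevel, leaf_blocks, clustering_factor, last_analyzed
  FROM user_indexes
 WHERE table_name IN ('TBL1', 'TBL2', 'TBL3');
```

If BLOCKS for tbl1 is only a handful, a full scan is likely the cheaper plan; if LAST_ANALYZED is empty or old, the numbers the optimizer is working from may not match the data.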

 begin
   DBMS_STATS.GATHER_TABLE_STATS(ownname => '&owner', tabname => '&table_name',
     estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, granularity => 'ALL', cascade => TRUE);
   -- Alternative: gather statistics for a single partition.
   -- DBMS_STATS.GATHER_TABLE_STATS(ownname => '&owner', tabname => '&table_name',
   --   partname => '&partion_name', granularity => 'PARTITION',
   --   estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, cascade => TRUE);
   -- Alternative: single partition, with histograms on all indexed columns.
   -- DBMS_STATS.GATHER_TABLE_STATS(ownname => '&owner', tabname => '&table_name',
   --   partname => '&partion_name', granularity => 'PARTITION',
   --   estimate_percent => dbms_stats.AUTO_SAMPLE_SIZE, cascade => TRUE,
   --   method_opt => 'for all indexed columns size 254');
 end;
+2

You can only tell by looking at the query plan that the SQL optimizer/executor creates. It will be based, at least in part, on index statistics, which cannot be predicted from the definition alone (and can therefore change over time).

SQL Management Studio for SQL Server 2005/2008, Query Analyzer for earlier versions.

(I can't recall the right tool names for Oracle.)
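
For Oracle, the plan can be produced from any SQL client with EXPLAIN PLAN and DBMS_XPLAN. The sketch below uses the query from the question and assumes the default PLAN_TABLE is available to the session.

```sql
-- Store the optimizer's plan for the statement without executing it.
EXPLAIN PLAN FOR
SELECT tbl1.*
  FROM tbl1
  JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk
            AND tbl2.t2_strt_dt <= sysdate
            AND tbl2.t2_end_dt >= sysdate)
  JOIN tbl3 ON (tbl3.t3_pk = tbl2.t2_fk_t3_pk
            AND tbl3.t3_lkup_1 = 2577304
            AND tbl3.t3_lkup_2 = 1220833)
 WHERE tbl2.t2_lkup_1 = 1020000002981587;

-- Display the stored plan, including estimated rows and cost per step.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```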

+1

Try adding an index hint.

 SELECT /*+ index(tbl1 tbl1_index_name) */ ..... 

Sometimes Oracle just does not know which index to use.

+1

Apparently this query gives the same plan:

 SELECT tbl1.*
   FROM tbl1
   JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk)
   JOIN tbl3 ON (tbl3.t3_pk = tbl2.t2_fk_t3_pk)
  WHERE tbl2.t2_lkup_1 = 1020000002981587
    AND tbl2.t2_strt_dt <= sysdate
    AND tbl2.t2_end_dt >= sysdate
    AND tbl3.t3_lkup_1 = 2577304
    AND tbl3.t3_lkup_2 = 1220833;

What happens if you rewrite the query like this:

 SELECT tbl1.*
   FROM tbl1, tbl2, tbl3
  WHERE tbl2.t2_lkup_1 = 1020000002981587
    AND tbl1.t1_pk = tbl2.t2_fk_t1_pk
    AND tbl3.t3_pk = tbl2.t2_fk_t3_pk
    AND tbl2.t2_strt_dt <= sysdate
    AND tbl2.t2_end_dt >= sysdate
    AND tbl3.t3_lkup_1 = 2577304
    AND tbl3.t3_lkup_2 = 1220833;
0

It looks like the index on tbl1 is not being picked up. Make sure you have an index on the t2_lkup_1 column, and that it is not a multi-column index in which t2_lkup_1 is only a trailing column; otherwise the index may not be applicable.

(In addition to Matt's comment) Looking at your query, I believe you are joining because you want to filter rows, not because you need a JOIN, which can increase the cardinality of the result set from tbl1 if there are multiple matches. See Jeff Atwood's comment.

Try this, which uses EXISTS together with a join (which is very fast in Oracle):

  select *
   from tbl1 
  where exists (
          select *
            from tbl2, tbl3 
           where tbl2.t2_lkup_1 = 1020000002981587 and
                 tbl2.t2_fk_t1_pk = tbl1.t1_pk and
                 tbl2.t2_fk_t3_pk = tbl3.t3_pk and
                 sysdate between tbl2.t2_strt_dt and tbl2.t2_end_dt and
                 tbl3.t3_lkup_1 = 2577304 and
                 tbl3.t3_lkup_2 = 1220833);

0

Depending on the expected size of your result set, you can play around with some session parameters:

 SHOW PARAMETER optimizer_index_cost_adj;
 [...]
 ALTER SESSION SET optimizer_index_cost_adj = 10;
 SHOW PARAMETER OPTIMIZER_MODE;
 [...]
 ALTER SESSION SET OPTIMIZER_MODE=FIRST_ROWS_100;

And do not forget to check the actual execution time; sometimes the plan does not reflect the real world ;)
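
A minimal sketch of measuring the real execution in SQL*Plus. It assumes a SQL*Plus session and, for AUTOTRACE, that the user has been granted the PLUSTRACE role; other clients have their own equivalents.

```sql
-- SQL*Plus commands: report elapsed time and run-time statistics
-- (logical and physical reads) without printing the result rows.
SET TIMING ON
SET AUTOTRACE TRACEONLY STATISTICS

SELECT tbl1.*
  FROM tbl1
  JOIN tbl2 ON (tbl1.t1_pk = tbl2.t2_fk_t1_pk
            AND tbl2.t2_strt_dt <= sysdate
            AND tbl2.t2_end_dt >= sysdate)
  JOIN tbl3 ON (tbl3.t3_pk = tbl2.t2_fk_t3_pk
            AND tbl3.t3_lkup_1 = 2577304
            AND tbl3.t3_lkup_2 = 1220833)
 WHERE tbl2.t2_lkup_1 = 1020000002981587;

-- Restore the defaults afterwards.
SET AUTOTRACE OFF
SET TIMING OFF
```

Running the query before and after changing a session parameter, and comparing elapsed time and the "consistent gets" figure, gives a fairer picture than the plan cost alone.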

0
