oracle中not exists对外层查询的影响详解

前言

最近同事发现了一个问题，在12c中跑的buffer get很高，但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。

这个10g的环境和12c的环境，数据量大致一样，只是有很少部分的不同，但是就是这个很少部分不同，造成了not exists中的子查询返回不同的值，进而对外层查询产生不同的影响。

我们来用如下的代码模拟一下。

初始化数据：

								
									 --10g 

									 drop   table   t1; 

									 drop   table   t2; 

									 create   table   t1 (id number,  name   varchar2(20),dep_id varchar2(10)); 

									 create   table   t2 (id number,  name   varchar2(20),dep_id varchar2(10)); 

									 insert   into   t1   select   rownum,  'a'  ,  'kk'   from   dual   connect   by   level   <=3000000; 

									 insert   into   t2   select   rownum,  'a'  ,  'kk'   from   dual   connect   by   level   <=1000000; 

									 insert   into   t2   select   rownum,  'a'  ,  'mm'   from   dual; 

									 commit  ; 

									 --12c 

									 drop   table   t1; 

									 drop   table   t2; 

									 create   table   t1 (id number,  name   varchar2(20),dep_id varchar2(10)); 

									 create   table   t2 (id number,  name   varchar2(20),dep_id varchar2(10)); 

									 insert   into   t1   select   rownum,  'a'  ,  'kk'   from   dual   connect   by   level   <=3000000; 

									 insert   into   t2   select   rownum,  'a'  ,  'kk'   from   dual   connect   by   level   <=1000000; 

									 commit  ;

我们看到，12c的数据和10g只是有很少的差别，t1表12c和10g都一样，t2表在12c只是少了一行数据。

								
									 --10g 

									 SQL>   select   dep_id,  count  (*)   from   t1   group   by   dep_id; 

									 DEP_ID       COUNT  (*) 

									 -------------------- ---------- 

									 kk      3000000 

									 SQL>   select   dep_id,  count  (*)   from   t2   group   by   dep_id; 

									 DEP_ID       COUNT  (*) 

									 -------------------- ---------- 

									 mm       1 

									 kk      1000000 

									 SQL> 

									 --12c 

									 SQL>   select   dep_id,  count  (*)   from   t1   group   by   dep_id; 

									 DEP_ID       COUNT  (*) 

									 -------------------- ---------- 

									 kk      3000000 

									 SQL>   select   dep_id,  count  (*)   from   t2   group   by   dep_id; 

									 DEP_ID       COUNT  (*) 

									 -------------------- ---------- 

									 kk      1000000 

									 SQL>

我们将要执行的sql语句是：

select count (*)

from t1, t2

where t1.id = t2.id

and t1.dep_id = 'kk'

and not exists ( select 1

from t1, t2

where t1.id = t2.id

and t2.dep_id = 'mm' );

我们先来看执行情况的差距，10g的bufferget小，12c多：

								
									 --10g 

									 SQL>   select   /*+ gather_plan_statistics */   count  (*)   from   t1,t2   where   t1.id=t2.id   and   t1.dep_id=  'kk'   and   not   exists (  select   1   from   t1,t2   where   t1.id=t2.id   and   t2.dep_id=  'mm'  ); 

									    COUNT  (*) 

									 ---------- 

									      0 

									 SQL>   select  *   from   table  (dbms_xplan.display_cursor(  null  ,  null  ,  'ALLSTATS LAST'  )); 

									 PLAN_TABLE_OUTPUT 

									 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 

									 SQL_ID 22t5mb43w55pr, child number 0 

									 ------------------------------------- 

									 select   /*+ gather_plan_statistics */   count  (*)   from   t1,t2   where   t1.id=t2.id   and   t1.dep_id=  'kk'   and   not 

									 exists (  select   1   from   t1,t2   where   t1.id=t2.id   and   t2.dep_id=  'mm'  ) 

									 Plan hash value: 3404612428 

									 ------------------------------------------------------------------------------------------------------------------ 

									 | Id | Operation   |   Name   | Starts | E-  Rows   | A-  Rows   | A-  Time   | Buffers | OMem | 1Mem | Used-Mem | 

									 ------------------------------------------------------------------------------------------------------------------ 

									 | 0 |   SELECT   STATEMENT  |  |  1 |  |  1 |0.02 | 2086 |  |  |   | 

									 | 1 | SORT AGGREGATE  |  |  1 |  1 |  1 |0.02 | 2086 |  |  |   | 

									 |* 2 | FILTER    |  |  1 |  |  0 |0.02 | 2086 |  |  |   | 

									 |* 3 | HASH   JOIN     |  |  0 | 901K|  0 |0.01 |  0 | 39M| 5518K|   | 

									 | 4 |    TABLE   ACCESS   FULL  | T2 |  0 | 901K|  0 |0.01 |  0 |  |  |   | 

									 |* 5 |    TABLE   ACCESS   FULL  | T1 |  0 | 2555K|  0 |0.01 |  0 |  |  |   | 

									 |* 6 | HASH   JOIN     |  |  1 |  23 |  1 |0.02 | 2086 | 1517K| 1517K| 612K (0)| 

									 |* 7 |    TABLE   ACCESS   FULL  | T2 |  1 |  23 |  1 |0.02 | 2082 |  |  |   | 

									 | 8 |    TABLE   ACCESS   FULL  | T1 |  1 | 2555K|  1 |0.01 |  4 |  |  |   | 

									 ------------------------------------------------------------------------------------------------------------------ 

									 Predicate Information (identified   by   operation id): 

									 --------------------------------------------------- 

									    2 - filter(   IS   NULL  ) 

									    3 - access(  "T1"  .  "ID"  =  "T2"  .  "ID"  ) 

									    5 - filter(  "T1"  .  "DEP_ID"  =  'kk'  ) 

									    6 - access(  "T1"  .  "ID"  =  "T2"  .  "ID"  ) 

									    7 - filter(  "T2"  .  "DEP_ID"  =  'mm'  ) 

									 Note 

									 ----- 

									    -   dynamic   sampling used   for   this statement 

									 34   rows   selected. 

									 SQL> 

									 --12c 

									 SQL>   select   /*+ gather_plan_statistics */   count  (*)   from   t1,t2   where   t1.id=t2.id   and   t1.dep_id=  'kk'   and   not   exists (  select   1   from   t1,t2   where   t1.id=t2.id   and   t2.dep_id=  'mm'  ); 

									    COUNT  (*) 

									 ---------- 

									    1000000 

									 SQL>   select  *   from   table  (dbms_xplan.display_cursor(  null  ,  null  ,  'ALLSTATS LAST'  )); 

									 PLAN_TABLE_OUTPUT 

									 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 

									 SQL_ID 22t5mb43w55pr, child number 0 

									 ------------------------------------- 

									 select   /*+ gather_plan_statistics */   count  (*)   from   t1,t2   where 

									 t1.id=t2.id   and   t1.dep_id=  'kk'   and   not   exists (  select   1   from   t1,t2 

									 where   t1.id=t2.id   and   t2.dep_id=  'mm'  ) 

									 Plan hash value: 1692274438 

									 -------------------------------------------------------------------------------------------------------------------- 

									 | Id | Operation    |   Name   | Starts | E-  Rows   | A-  Rows   | A-  Time   | Buffers | OMem | 1Mem | Used-Mem | 

									 -------------------------------------------------------------------------------------------------------------------- 

									 | 0 |   SELECT   STATEMENT  |  |  1 |  |  1 |0.79 | 10662 |  | |  | 

									 | 1 | SORT AGGREGATE  |  |  1 |  1 |  1 |0.79 | 10662 |  | |  | 

									 |* 2 | FILTER    |  |  1 |  | 1000K|0.74 | 10662 |  | |  | 

									 |* 3 | HASH   JOIN     |  |  1 | 1215K| 1000K|0.52 | 8579 | 43M| 6111K| 42M (0)| 

									 | 4 |    TABLE   ACCESS   FULL   | T2 |  1 | 1215K| 1000K|0.01 | 2083 |  | |  | 

									 |* 5 |    TABLE   ACCESS   FULL   | T1 |  1 | 2738K| 3000K|0.07 | 6496 |  | |  | 

									 |* 6 | HASH   JOIN   RIGHT   SEMI|  |  1 |  35 |  0 |0.02 | 2083 | 1245K| 1245K| 461K (0)| 

									 |* 7 |    TABLE   ACCESS   FULL   | T2 |  1 |  23 |  0 |0.02 | 2083 |  | |  | 

									 | 8 |    TABLE   ACCESS   FULL   | T1 |  0 | 2738K|  0 |0.01 |  0 |  | |  | 

									 -------------------------------------------------------------------------------------------------------------------- 

									 Predicate Information (identified   by   operation id): 

									 --------------------------------------------------- 

									    2 - filter(   IS   NULL  ) 

									    3 - access(  "T1"  .  "ID"  =  "T2"  .  "ID"  ) 

									    5 - filter(  "T1"  .  "DEP_ID"  =  'kk'  ) 

									    6 - access(  "T1"  .  "ID"  =  "T2"  .  "ID"  ) 

									    7 - filter(  "T2"  .  "DEP_ID"  =  'mm'  ) 

									 Note 

									 ----- 

									    -   dynamic   statistics   used:   dynamic   sampling (  level  =2) 

									 35   rows   selected. 

									 SQL> 

									 SQL>

可以看到第23，24行，在10g中运行时，buffers是0，而在12c中，即78，79行，buffer是2083+6496。

也就是说在10g中，外层查询不进行t1和t2的扫描，直接返回结果了，而在12c中，外层查询还要进行t1表和t2表层扫描才返回结果。

这其实不是10g和12c的差别，而是not exists的返回数据对外层的影响。子查询要返回0行记录，才满足not exist的条件，从而返回外层查询结果。

在10g中，子查询返回了一行记录

--10g

SQL> select 1 from t1,t2 where t1.id=t2.id and t2.dep_id= 'mm' ;

----------

SQL>

不满足not exists（即0行才满足），所以，也就不用在外层继续查询了。直接返回记录0行。

在12c中，子查询返回0行记录，满足not exist的条件，所以还需要在外层查询中继续查询。

								
									 --12c 

									 SQL>   select   count  (*)   from   t1,t2   where   t1.id=t2.id   and   t2.dep_id=  'kk'  ; 

									    COUNT  (*) 

									 ---------- 

									    1000000 

									 SQL>   set   line 1000 

									 SQL>   set   pages 1000 

									 SQL> col PLAN_TABLE_OUTPUT   for   a250 

									 SQL> 

									 SQL> 

									 SQL>   select   /*+ gather_plan_statistics */   count  (*)   from   t1,t2   where   t1.id=t2.id   and   t1.dep_id=  'kk'   and   not   exists (  select   1   from   t1,t2   where   t1.id=t2.id   and   t2.dep_id=  'kk'  ); 

									    COUNT  (*) 

									 ---------- 

									      0 

									 SQL>   select  *   from   table  (dbms_xplan.display_cursor(  null  ,  null  ,  'ALLSTATS LAST'  )); 

									 PLAN_TABLE_OUTPUT 

									 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 

									 SQL_ID c5hj2p2jt1fxf, child number 0 

									 ------------------------------------- 

									 select   /*+ gather_plan_statistics */   count  (*)   from   t1,t2   where 

									 t1.id=t2.id   and   t1.dep_id=  'kk'   and   not   exists (  select   1   from   t1,t2 

									 where   t1.id=t2.id   and   t2.dep_id=  'kk'  ) 

									 Plan hash value: 1692274438 

									 -------------------------------------------------------------------------------------------------------------------- 

									 | Id | Operation    |   Name   | Starts | E-  Rows   | A-  Rows   | A-  Time   | Buffers | OMem | 1Mem | Used-Mem | 

									 -------------------------------------------------------------------------------------------------------------------- 

									 | 0 |   SELECT   STATEMENT  |  |  1 |  |  1 |0.28 | 2087 |  | |  | 

									 | 1 | SORT AGGREGATE  |  |  1 |  1 |  1 |0.28 | 2087 |  | |  | 

									 |* 2 | FILTER    |  |  1 |  |  0 |0.28 | 2087 |  | |  | 

									 |* 3 | HASH   JOIN     |  |  0 | 1215K|  0 |0.01 |  0 | 69M| 7428K|   | 

									 | 4 |    TABLE   ACCESS   FULL   | T2 |  0 | 1215K|  0 |0.01 |  0 |  | |  | 

									 |* 5 |    TABLE   ACCESS   FULL   | T1 |  0 | 2738K|  0 |0.01 |  0 |  | |  | 

									 |* 6 | HASH   JOIN   RIGHT   SEMI|  |  1 | 2738K|  1 |0.28 | 2087 | 43M| 6111K| 42M (0)| 

									 |* 7 |    TABLE   ACCESS   FULL   | T2 |  1 | 1215K| 1000K|0.12 | 2083 |  | |  | 

									 | 8 |    TABLE   ACCESS   FULL   | T1 |  1 | 2738K|  1 |0.01 |  4 |  | |  | 

									 -------------------------------------------------------------------------------------------------------------------- 

									 Predicate Information (identified   by   operation id): 

									 --------------------------------------------------- 

									    2 - filter(   IS   NULL  ) 

									    3 - access(  "T1"  .  "ID"  =  "T2"  .  "ID"  ) 

									    5 - filter(  "T1"  .  "DEP_ID"  =  'kk'  ) 

									    6 - access(  "T1"  .  "ID"  =  "T2"  .  "ID"  ) 

									    7 - filter(  "T2"  .  "DEP_ID"  =  'kk'  ) 

									 Note 

									 ----- 

									    -   dynamic   statistics   used:   dynamic   sampling (  level  =2) 

									 35   rows   selected. 

									 SQL>

可以看到第38，39行的buffer为0.

总结

以上就是这篇文章的全部内容了，希望本文的内容对大家的学习或者工作能带来一定的帮助，如果有疑问大家可以留言交流。

原文链接：https://oracleblog.org/study-note/not-exists-effect-outer-side-select/

声明：本文来自网络，不代表【好得很程序员自学网】立场，转载请注明出处：http://www.haodehen.cn/did227521

更新时间：2023-05-14 阅读：43次