Thread: RE:Re: BUG #7556 addition info
I downloaded the latest version,and the cost of "not in" is much higher than that of "not exist". please see attachment for detail.
As the time of query is very long,I didn't get the explain analyze result.
As the time of query is very long,I didn't get the explain analyze result.
I think the id columns of table a and b are not null, so the query of "not in" and "not exists" are equal,they should use similar plans.
I notice there is a "Materialize" step in my bad plans of "not in",but there isn't in your following plan. I wonder how to get the plan you posted here, are there any configure argument or command?
egress=# explain select max(a.info)from sli_test a where a.id not in(select
regress(# b.id from sli_test2 b where b.id<50000);
egress=# explain select max(a.info)from sli_test a where a.id not in(select
regress(# b.id from sli_test2 b where b.id<50000);
QUERY PLAN
---------------------------------------------------------------------------------
Aggregate (cost=38050.82..38050.83 rows=1 width=12)
---------------------------------------------------------------------------------
Aggregate (cost=38050.82..38050.83 rows=1 width=12)
-> Seq Scan on sli_test a (cost=18026.82..36800.82 rows=500000 width=12)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Seq Scan on sli_test2 b (cost=0.00..17906.00 rows=48329 width=4)
Filter: (id < 50000)
(6 rows)
Filter: (id < 50000)
(6 rows)
It runs in about 500ms here.
and I got following plan on kingbase 7.1,a modified version of postgres product.
TEST=# explain select max(a.name)from a where a.id not in(select b.id from b where b.id<50000);
QUERY PLAN
TEST=# explain select max(a.name)from a where a.id not in(select b.id from b where b.id<50000);
QUERY PLAN
--------------------------------------------------------------------------------
-----------------------
Aggregate (cost=72541.71..72541.72 rows=1 width=12)
-> Merge Anti Join (cost=0.00..70041.71 rows=1000000 width=12)
Merge Cond: (A.ID = B.ID)
-> Index Scan using A_PKEY on A (cost=0.00..32257.36 rows=1000000 width=16)
-> Index Scan using B_ID on B (cost=0.00..31117.69 rows=333333 width=4)
Index Cond: (ID < 50000)
(6 行)
-----------------------
Aggregate (cost=72541.71..72541.72 rows=1 width=12)
-> Merge Anti Join (cost=0.00..70041.71 rows=1000000 width=12)
Merge Cond: (A.ID = B.ID)
-> Index Scan using A_PKEY on A (cost=0.00..32257.36 rows=1000000 width=16)
-> Index Scan using B_ID on B (cost=0.00..31117.69 rows=333333 width=4)
Index Cond: (ID < 50000)
(6 行)
时间: 20.095 ms
l1t@tom.com writes: > I think the id columns of table a and b are not null, so the query of "not in" and "not exists" are equal,they shoulduse similar plans. NOT IN and NOT EXISTS are *not* equivalent. Per SQL standard, NOT IN has different (and usually not very desirable) behavior with NULL values. The spec-mandated behavior makes it a lot harder to optimize nicely --- and since you usually don't want what it does anyway, we haven't tried hard. regards, tom lane