IF I run the following with the a < 2900 condition first, the more
expensive EXISTS only gets executed when needed, but if I change the
order of the OR's, the EXISTS is always executed. It would be good if
the optimizer could re-order the OR conditions based on estimated
cost (granted, this wouldn't work very well if you've got functions
in the OR, but it'd still be useful):
select * from a where a < 2900 or exists (select * from b where b.a =
a.a);
Here's a full example. Note the loops count for the Subplan between
both cases:
decibel=# create table a as select * from generate_series(1,3000) a;
SELECT
decibel=# create table b as select a,b from a, generate_series(1,100)
b where a > 10;
SELECT
decibel=# create index b__a on b(a);
CREATE INDEX
decibel=# explain analyze select * from a where a < 2900 or exists
(select * from b where b.a = a.a); QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------
Seq Scan on a (cost=0.00..8080.41 rows=1997 width=4) (actual
time=0.014..1.784 rows=3000 loops=1) Filter: ((a < 2900) OR (subplan)) SubPlan -> Index Scan using b__a on b
(cost=0.00..4006.44rows=1495
width=8) (actual time=0.009..0.009 rows=1 loops=101) Index Cond: (a = $0)
Total runtime: 2.151 ms
(6 rows)
decibel=# explain analyze select * from a where exists (select * from
b where b.a = a.a) or a < 2000; QUERY PLAN
------------------------------------------------------------------------
-------------------------------------------------
Seq Scan on a (cost=0.00..8080.41 rows=1997 width=4) (actual
time=0.067..37.011 rows=3000 loops=1) Filter: ((subplan) OR (a < 2000)) SubPlan -> Index Scan using b__a on b
(cost=0.00..4006.44rows=1495
width=8) (actual time=0.011..0.011 rows=1 loops=3000) Index Cond: (a = $0)
Total runtime: 37.497 ms
(6 rows)
decibel=#
(This is on HEAD as of a few minutes ago)
--
Jim Nasby jim.nasby@enterprisedb.com
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)