On Thu, 4 Apr 2019 at 06:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > I think we'd have to redesign those cost calculations completely to
> > get ideal answers, and I don't especially want to do that right now.
> > So I'm wondering about the attached as a stopgap. It seems like it's
> > an improvement over what we have, even if not a full fix.
>
> > Thoughts?
>
> Ping? If nobody has any further ideas here, I'm going to clean up
> the regression test issue and push this.
Going back to the original example:
create function expensive_func(int) returns int as $$ begin return 1; end $$
language plpgsql stable cost 10000;
create table unique_inner(a int primary key);
insert into unique_inner select generate_series(1, 10000);
analyze unique_inner;
The problem query was:
explain verbose select * from unique_inner gs1(i) join
generate_series(1, 10) gs2(i) using (i) where expensive_func(gs1.i +
gs2.i) > 0;
It is my expectation that the plan for the above should be the same as
the plan for the query below.
explain verbose select * from generate_series(1,10) gs2(i) where
exists (select i from unique_inner gs1(i) where gs1.i = gs2.i and
expensive_func(gs1.i + gs2.i) > 0);
and it is.
However, I'm a bit surprised at the following not having the plan come
in cheaper when the unique rel is on the inner side:
create table t1 (a int primary key);
create table t2 (a int);
create index on t2(a);
insert into t1 select generate_Series(1,10000);
insert into t2 select generate_Series(1,10000);
analyze t1,t2;
explain verbose select * from t1 inner join t2 on t1.a=t2.a;
QUERY PLAN
---------------------------------------------------------------------------
Hash Join (cost=270.00..552.50 rows=10000 width=8)
Output: t1.a, t2.a
Hash Cond: (t1.a = t2.a)
-> Seq Scan on public.t1 (cost=0.00..145.00 rows=10000 width=4)
Output: t1.a
-> Hash (cost=145.00..145.00 rows=10000 width=4)
Output: t2.a
-> Seq Scan on public.t2 (cost=0.00..145.00 rows=10000 width=4)
Output: t2.a
(9 rows)
-- again but with the relations written in the opposite order.
explain verbose select * from t2 inner join t1 on t1.a=t2.a;
QUERY PLAN
---------------------------------------------------------------------------
Hash Join (cost=270.00..552.50 rows=10000 width=8)
Output: t2.a, t1.a
Inner Unique: true
Hash Cond: (t2.a = t1.a)
-> Seq Scan on public.t2 (cost=0.00..145.00 rows=10000 width=4)
Output: t2.a
-> Hash (cost=145.00..145.00 rows=10000 width=4)
Output: t1.a
-> Seq Scan on public.t1 (cost=0.00..145.00 rows=10000 width=4)
Output: t1.a
(10 rows)
If I try the same on master it always prefers t1 on the inside of the
join. It would be nice if the costs reflected that since there are
savings to be had during execution with the unique rel on the inner
side.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services