Home > mailing lists

Re: BUG #15383: Join Filter cost estimation problem in 10.5 - Mailing list pgsql-bugs

From	David Rowley
Subject	Re: BUG #15383: Join Filter cost estimation problem in 10.5
Date	April 4, 2019 04:37:03
Msg-id	CAKJS1f9tt2onS5uVrxi1G6E8M7ket+f+0pfx_=3v5jCr0RpsaA@mail.gmail.com Whole thread
In response to	Re: BUG #15383: Join Filter cost estimation problem in 10.5 (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-bugs

Tree view

On Thu, 4 Apr 2019 at 06:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > I think we'd have to redesign those cost calculations completely to
> > get ideal answers, and I don't especially want to do that right now.
> > So I'm wondering about the attached as a stopgap.  It seems like it's
> > an improvement over what we have, even if not a full fix.
>
> > Thoughts?
>
> Ping?  If nobody has any further ideas here, I'm going to clean up
> the regression test issue and push this.

Going back to the original example:

create function expensive_func(int) returns int as $$ begin return 1; end $$
language plpgsql stable cost 10000;
create table unique_inner(a int primary key);
insert into unique_inner select generate_series(1, 10000);
analyze unique_inner;

The problem query was:

explain verbose select * from unique_inner gs1(i) join
generate_series(1, 10) gs2(i) using (i) where expensive_func(gs1.i +
gs2.i) > 0;

It is my expectation that the plan for the above should be the same as
the plan for the query below.

explain verbose select * from generate_series(1,10) gs2(i) where
exists (select i from unique_inner gs1(i) where gs1.i = gs2.i and
expensive_func(gs1.i + gs2.i) > 0);

and it is.

However, I'm a bit surprised at the following not having the plan come
in cheaper when the unique rel is on the inner side:

create table t1 (a int primary key);
create table t2 (a int);
create index on t2(a);
insert into t1 select generate_Series(1,10000);
insert into t2 select generate_Series(1,10000);
analyze t1,t2;
explain verbose select * from t1 inner join t2 on t1.a=t2.a;
                                QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=270.00..552.50 rows=10000 width=8)
   Output: t1.a, t2.a
   Hash Cond: (t1.a = t2.a)
   ->  Seq Scan on public.t1  (cost=0.00..145.00 rows=10000 width=4)
         Output: t1.a
   ->  Hash  (cost=145.00..145.00 rows=10000 width=4)
         Output: t2.a
         ->  Seq Scan on public.t2  (cost=0.00..145.00 rows=10000 width=4)
               Output: t2.a
(9 rows)

-- again but with the relations written in the opposite order.
explain verbose select * from t2 inner join t1 on t1.a=t2.a;
                                QUERY PLAN
---------------------------------------------------------------------------
 Hash Join  (cost=270.00..552.50 rows=10000 width=8)
   Output: t2.a, t1.a
   Inner Unique: true
   Hash Cond: (t2.a = t1.a)
   ->  Seq Scan on public.t2  (cost=0.00..145.00 rows=10000 width=4)
         Output: t2.a
   ->  Hash  (cost=145.00..145.00 rows=10000 width=4)
         Output: t1.a
         ->  Seq Scan on public.t1  (cost=0.00..145.00 rows=10000 width=4)
               Output: t1.a
(10 rows)

If I try the same on master it always prefers t1 on the inside of the
join. It would be nice if the costs reflected that since there are
savings to be had during execution with the unique rel on the inner
side.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-bugs by date:

From: Michael Paquier
Date: 04 April 2019, 01:35:10
Subject: Re: BUG #15733: An insert destined at partition created after acolumn has been dropped from the parent table fails

From: Jeremy Evans
Date: 04 April 2019, 06:16:13
Subject: Re: BUG #15722: 9000x performance regression on query startingPostgreSQL 10 due to bad query plan

Re: BUG #15383: Join Filter cost estimation problem in 10.5 - Mailing list pgsql-bugs

Previous

Next