Re: Improve choose_custom_plan for initial partition prune case - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Improve choose_custom_plan for initial partition prune case
Date
Msg-id CAKU4AWoTUpg4YUSEAPHppkhqiJqtex05F0mvRWn74Lf=G2nYkg@mail.gmail.com
Whole thread Raw
In response to Re: Improve choose_custom_plan for initial partition prune case  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
Thank you Amit and Ashutosh for your reply!

On Wed, Oct 7, 2020 at 8:41 PM Amit Langote <amitlangote09@gmail.com> wrote:
Hi Ashutosh,

On Wed, Oct 7, 2020 at 8:40 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> On Wed, Oct 7, 2020 at 11:20 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> > On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> >> What happens when we
> >> execute plans with values that have estimates similar to the generic
> >> plan later when we moderate generic plan costs based on the custom
> >> plans?
> >>
> >
> > The example at the beginning of this thread,  I used the exact same values
> > every time, the custom plan will be chosen all the time, which is bad,
> > The main reason is the custom plan knows the exact value in Execute
> > message, so it run plan time partition prune, then the total cost is low, however
> > for the generic plan the partition prune happens at Runtime initial_partition prune
> > stage,  so the cost of the partitions which can be pruned at that stage is still
> > included the total cost,  so generic plans can't be chosen. that would be the
> > thing I want to fix.
>
> Something is wrong in the generic plan costing then.

That's right.

> IIUC, the
> selectivity estimate for only a single partition should come out >= 1.
> For all the other partitions, it should be 1 and that too because we
> clamp the row counts.

If the planner knows the exact parameter,  the planner can estimate 
other partitions as 1 row (actually 0 row).  However if it doesn't know that,
it will be  rows / ndistinctVals for  "partkey = $1" case. 
 
So the final costs for generic and custom plans
> shouldn't be far off unless there's large deviation in the selectivity
> of a partition key. I am assuming that there's an equality condition
> on a partition key. That's what I meant by the paragraph below.
> >
> >> If the table has good distribution of a partition key, which also
> >> results in good distribution of data across partitions, generic plan
> >> cost will be similar to the custom plan costs. If not that's something
> >> we want to fix.
>
> Can you please investigate on these lines?

The planner currently costs a generic plan assuming that *all*
partitions will be scanned, which is not the case, because run-time
pruning will ensure that scan nodes of all but one partition will be
discarded, assuming a simple query like `select * from parted_table
where a = $1`.  With N partitions, a generic plan for such a query is
N times as expensive as a custom plan, so will never win.

For example:

create table bar (a int, b int, c int) partition by range (a);
create table bar_1 partition of bar for values from (1) to (100001);
create table bar_2 partition of bar for values from (100001) to (200001);
create table bar_3 partition of bar for values from (200001) to (300001);
create table bar_4 partition of bar for values from (300001) to (400001);
create table bar_5 partition of bar for values from (400001) to (500001);
create table bar_6 partition of bar for values from (500001) to (600001);
create table bar_7 partition of bar for values from (600001) to (700001);
create table bar_8 partition of bar for values from (700001) to (800001);
create table bar_9 partition of bar for values from (800001) to (900001);
create table bar_10 partition of bar for values from (900001) to (1000001);
insert into bar select generate_series(1, 1000000);
create index on bar (a);
analyze bar;

set plan_cache_mode to force_custom_plan ;
prepare q as select * from bar where a = $1;
explain execute q (1);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using bar_1_a_idx on bar_1 bar  (cost=0.29..8.31 rows=1 width=12)
   Index Cond: (a = 1)
(2 rows)

deallocate q;
set plan_cache_mode to force_generic_plan ;
prepare q as select * from bar where a = $1;
explain execute q (1);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Append  (cost=0.29..83.15 rows=10 width=12)
   Subplans Removed: 9
   ->  Index Scan using bar_1_a_idx on bar_1  (cost=0.29..8.31 rows=1 width=12)
         Index Cond: (a = $1)
(4 rows)

As you can see, the cost of the generic plan is 83.13 which is 8.31 *
10, even though only one partition is actually scanned, so the actual
cost is the same as a custom plan.  The planner fails to consider that
the those 9 subplans will be removed during execution.

--
Amit Langote
EDB: http://www.enterprisedb.com

I also had some replies at the original runtime partition prune thread [1],   
I am very sorry that I mess up this topic with 2 threads,  I have some new 
understanding about this (might be old ideas for others),  I'd like to 
continue this discussion in that thread and add you to the cc list there. 

Thanks!

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: pg_dump bug for extension owned tables
Next
From: Anastasia Lubennikova
Date:
Subject: Re: [PATCH] Automatic HASH and LIST partition creation