Re: Improve choose_custom_plan for initial partition prune case - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Improve choose_custom_plan for initial partition prune case
Date
Msg-id CAExHW5tq+rgecxVLx=i+auqYX1tA2bye_kpTY8z8GdnjOHUjmg@mail.gmail.com
Whole thread Raw
In response to Re: Improve choose_custom_plan for initial partition prune case  (Andy Fan <zhihui.fan1213@gmail.com>)
Responses Re: Improve choose_custom_plan for initial partition prune case  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Wed, Oct 7, 2020 at 11:20 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Hi Ashutosh:
>
> Thanks for coming.
>
> On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>>
>> On Thu, Oct 1, 2020 at 9:34 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> >
>> > Given the plan example:
>> >
>> > CREATE TABLE measurement (
>> >     city_id         int not null,
>> >     logdate         date not null,
>> >     peaktemp        int,
>> >     unitsales       int
>> > ) PARTITION BY RANGE (logdate);
>> >
>> > CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>> >     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>> >
>> > CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>> >     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>> >
>> > prepare s as select * from measurement where logdate = $1;
>> > execute s('2006-02-01').
>> >
>> > The generic plan will probably not be chosen because it doesn't reduce the cost
>> > which can be reduced at initial_prune while the custom plan reduces such cost
>> > at  planning time. which makes the cost comparison not fair.  I'm thinking if we can
>> > get an estimated cost reduction of initial_prunne for generic plan based on the
>> > partition pruned at plan time from custom plan and then reducing
>> > such costs from the generic plan.  I just went through the related code but
>> > didn't write anything now.  I'd like to see if this is a correct direction to go.
>>
>> What happens when we
>> execute plans with values that have estimates similar to the generic
>> plan later when we moderate generic plan costs based on the custom
>> plans?
>>
>
> The example at the beginning of this thread,  I used the exact same values
> every time, the custom plan will be chosen all the time, which is bad,
> The main reason is the custom plan knows the exact value in Execute
> message, so it run plan time partition prune, then the total cost is low, however
> for the generic plan the partition prune happens at Runtime initial_partition prune
> stage,  so the cost of the partitions which can be pruned at that stage is still
> included the total cost,  so generic plans can't be chosen. that would be the
> thing I want to fix.

Something is wrong in the generic plan costing then. IIUC, the
selectivity estimate for only a single partition should come out >= 1.
For all the other partitions, it should be 1 and that too because we
clamp the row counts. So the final costs for generic and custom plans
shouldn't be far off unless there's large deviation in the selectivity
of a partition key. I am assuming that there's an equality condition
on a partition key. That's what I meant by the paragraph below.

>
>> If the table has good distribution of a partition key, which also
>> results in good distribution of data across partitions, generic plan
>> cost will be similar to the custom plan costs. If not that's something
>> we want to fix.

Can you please investigate on these lines?

-- 
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Maksim Kita
Date:
Subject: [PATCH] ecpg: fix progname memory leak
Next
From: "iwata.aya@fujitsu.com"
Date:
Subject: RE: libpq debug log