Thread: Calculate total_table_pages after set_base_rel_sizes()

Calculate total_table_pages after set_base_rel_sizes()

From
David Rowley
Date:
I believe that we should be delaying the PlannerInfo's
total_table_pages calculation until after constraint exclusion and
partition pruning have taken place. Doing this calculation before we
determine which relations we don't need to scan can lead to
incorrectly applying random_page_cost to too many pages processed
during an Index Scan.

We already don't count relations removed by join removals from this
calculation, so counting pruned partitions seems like an omission.

The attached patch moves the calculation to after set_base_rel_sizes()
is called and before set_base_rel_pathlists() is called, where the
information is actually used.

I am considering this a bug fix, but I'm proposing this for PG12 only
as I don't think destabilising plans in the back branches is a good
idea. I'll add this to the September commitfest.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Calculate total_table_pages after set_base_rel_sizes()

From
Edmund Horner
Date:
David Rowley said:
> I believe that we should be delaying the PlannerInfo's
> total_table_pages calculation until after constraint exclusion and
> partition pruning have taken place. Doing this calculation before we
> determine which relations we don't need to scan can lead to
> incorrectly applying random_page_cost to too many pages processed
> during an Index Scan.
> 
> We already don't count relations removed by join removals from this
> calculation, so counting pruned partitions seems like an omission.
> 
> The attached patch moves the calculation to after set_base_rel_sizes()
> is called and before set_base_rel_pathlists() is called, where the
> information is actually used.
> 
> I am considering this a bug fix, but I'm proposing this for PG12 only
> as I don't think destabilising plans in the back branches is a good
> idea. I'll add this to the September commitfest.

Hi David, I had a quick look at this.  (I haven't tested it so this isn't a full review.)

It looks like a fairly straightforward code move.  And I think it's correct to exclude the pages from partitions that
won'tbe read.
 

I have a small tweak.  In make_one_rel, we currently have:

    /*
     * Compute size estimates and consider_parallel flags for each base rel,
     * then generate access paths.
     */
    set_base_rel_sizes(root);
    set_base_rel_pathlists(root);

Your patch inserts code between the two lines.  I think the comment should be split too.

    /* Compute size estimates and consider_parallel flags for each base rel. */
    set_base_rel_sizes(root);

    // NEW CODE

    /* Generate access paths. */
    set_base_rel_pathlists(root);

Cheers,
Edmund

Re: Calculate total_table_pages after set_base_rel_sizes()

From
David Rowley
Date:
On Tue, 25 Sep 2018 at 10:23, Edmund Horner <ejrh00@gmail.com> wrote:
> I have a small tweak.  In make_one_rel, we currently have:
>
>     /*
>      * Compute size estimates and consider_parallel flags for each base rel,
>      * then generate access paths.
>      */
>     set_base_rel_sizes(root);
>     set_base_rel_pathlists(root);
>
> Your patch inserts code between the two lines.  I think the comment should be split too.
>
>     /* Compute size estimates and consider_parallel flags for each base rel. */
>     set_base_rel_sizes(root);
>
>     // NEW CODE
>
>     /* Generate access paths. */
>     set_base_rel_pathlists(root);

Thanks for looking at this.

I've changed that in the attached updated patch.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: Calculate total_table_pages after set_base_rel_sizes()

From
Edmund Horner
Date:
David Rowley said:
> I believe that we should be delaying the PlannerInfo's
> total_table_pages calculation until after constraint exclusion and
> partition pruning have taken place. Doing this calculation before we
> determine which relations we don't need to scan can lead to
> incorrectly applying random_page_cost to too many pages processed
> during an Index Scan.
>
> We already don't count relations removed by join removals from this
> calculation, so counting pruned partitions seems like an omission.
>
> The attached patch moves the calculation to after set_base_rel_sizes()
> is called and before set_base_rel_pathlists() is called, where the
> information is actually used.
>
> I am considering this a bug fix, but I'm proposing this for PG12 only
> as I don't think destabilising plans in the back branches is a good
> idea. I'll add this to the September commitfest.

I played with the new patched today with a set of large partitions.
It seems to work, though the effect is subtle.  The expected behaviour
of index_pages_fetched is a bit hard to fathom when the cache share
pushes it between cases A,B and C of the formula, but that's not
really down to this patch.

Basically I think it's ready for a committer to look at.  Should I
update the CF entry?

Edmund


Re: Calculate total_table_pages after set_base_rel_sizes()

From
David Rowley
Date:
On 6 October 2018 at 18:20, Edmund Horner <ejrh00@gmail.com> wrote:
> David Rowley said:
>> I am considering this a bug fix, but I'm proposing this for PG12 only
>> as I don't think destabilising plans in the back branches is a good
>> idea. I'll add this to the September commitfest.
>
> I played with the new patched today with a set of large partitions.
> It seems to work, though the effect is subtle.  The expected behaviour
> of index_pages_fetched is a bit hard to fathom when the cache share
> pushes it between cases A,B and C of the formula, but that's not
> really down to this patch.

Thanks for looking at this and testing it too.

The primary purpose of this patch was step 1 in delaying the creation
of RangeTblEntrys for partitions until after partition pruning has
taken place.  The code I had to do this caused problems around the
total_table_pages calculation due to the lack of RangeTblEntry for the
partitions at the time it was being calculated. But regardless of
that, I still believe where we currently calculate this number is
subtlely broken as it counts all partitions, even ones that will later
be pruned, thus decreasing the likelihood of an index being used as
random_page_cost would be applied to a higher number of index pages.

Amit Langote has since posted a patch to delay the RangeTblEntry
creation until after pruning. His patch happens to also move the
total_table_pages calculation, but I believe this change should be
made as an independent commit to anything else.  I've kept it in the
commitfest for that reason.

> Basically I think it's ready for a committer to look at.  Should I
> update the CF entry?

That sounds good, please do.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Calculate total_table_pages after set_base_rel_sizes()

From
Amit Langote
Date:
On 2018/10/07 17:43, David Rowley wrote:
> On 6 October 2018 at 18:20, Edmund Horner <ejrh00@gmail.com> wrote:
>> David Rowley said:
>>> I am considering this a bug fix, but I'm proposing this for PG12 only
>>> as I don't think destabilising plans in the back branches is a good
>>> idea. I'll add this to the September commitfest.
>>
>> I played with the new patched today with a set of large partitions.
>> It seems to work, though the effect is subtle.  The expected behaviour
>> of index_pages_fetched is a bit hard to fathom when the cache share
>> pushes it between cases A,B and C of the formula, but that's not
>> really down to this patch.
> 
> Thanks for looking at this and testing it too.
> 
> The primary purpose of this patch was step 1 in delaying the creation
> of RangeTblEntrys for partitions until after partition pruning has
> taken place.  The code I had to do this caused problems around the
> total_table_pages calculation due to the lack of RangeTblEntry for the
> partitions at the time it was being calculated. But regardless of
> that, I still believe where we currently calculate this number is
> subtlely broken as it counts all partitions, even ones that will later
> be pruned, thus decreasing the likelihood of an index being used as
> random_page_cost would be applied to a higher number of index pages.
> 
> Amit Langote has since posted a patch to delay the RangeTblEntry
> creation until after pruning. His patch happens to also move the
> total_table_pages calculation, but I believe this change should be
> made as an independent commit to anything else.  I've kept it in the
> commitfest for that reason.

Yeah, if this patch is a win independent of the other project of delaying
partition RTE creation, which seems to be true, I think we should go ahead
with applying this patch.

>> Basically I think it's ready for a committer to look at.  Should I
>> update the CF entry?
> 
> That sounds good, please do.

+1

Thanks,
Amit



Re: Calculate total_table_pages after set_base_rel_sizes()

From
Amit Langote
Date:
On 2018/10/11 13:45, Amit Langote wrote:
> On 2018/10/07 17:43, David Rowley wrote:
>> Amit Langote has since posted a patch to delay the RangeTblEntry
>> creation until after pruning. His patch happens to also move the
>> total_table_pages calculation, but I believe this change should be
>> made as an independent commit to anything else.  I've kept it in the
>> commitfest for that reason.
> 
> Yeah, if this patch is a win independent of the other project of delaying
> partition RTE creation, which seems to be true, I think we should go ahead
> with applying this patch.

Fwiw, I've incorporated David's patch in my own series, so that one of my
patches no longer has the code movement hunks that are in his.  I will
post the new version of my patch series like that.

Thanks,
Amit



Re: Calculate total_table_pages after set_base_rel_sizes()

From
David Rowley
Date:
On 12 October 2018 at 23:35, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2018/10/11 13:45, Amit Langote wrote:
>> On 2018/10/07 17:43, David Rowley wrote:
>>> Amit Langote has since posted a patch to delay the RangeTblEntry
>>> creation until after pruning. His patch happens to also move the
>>> total_table_pages calculation, but I believe this change should be
>>> made as an independent commit to anything else.  I've kept it in the
>>> commitfest for that reason.
>>
>> Yeah, if this patch is a win independent of the other project of delaying
>> partition RTE creation, which seems to be true, I think we should go ahead
>> with applying this patch.
>
> Fwiw, I've incorporated David's patch in my own series, so that one of my
> patches no longer has the code movement hunks that are in his.  I will
> post the new version of my patch series like that.

Thanks. I'll keep this open here anyway so the change can be
considered independently. I think counting pages of pruned partitions
is a bug that should be fixed. You can just drop that patch from your
set if this gets committed.


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Calculate total_table_pages after set_base_rel_sizes()

From
Amit Langote
Date:
On Sat, Oct 13, 2018 at 7:36 David Rowley <david.rowley@2ndquadrant.com> wrote:
On 12 October 2018 at 23:35, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2018/10/11 13:45, Amit Langote wrote:
>> On 2018/10/07 17:43, David Rowley wrote:
>>> Amit Langote has since posted a patch to delay the RangeTblEntry
>>> creation until after pruning. His patch happens to also move the
>>> total_table_pages calculation, but I believe this change should be
>>> made as an independent commit to anything else.  I've kept it in the
>>> commitfest for that reason.
>>
>> Yeah, if this patch is a win independent of the other project of delaying
>> partition RTE creation, which seems to be true, I think we should go ahead
>> with applying this patch.
>
> Fwiw, I've incorporated David's patch in my own series, so that one of my
> patches no longer has the code movement hunks that are in his.  I will
> post the new version of my patch series like that.

Thanks. I'll keep this open here anyway so the change can be
considered independently. I think counting pages of pruned partitions
is a bug that should be fixed. You can just drop that patch from your
set if this gets committed.

Yeah, that was my plan anyway.

Thanks,
Amit

Re: Calculate total_table_pages after set_base_rel_sizes()

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> [ v2-0001-Calculate-total_table_pages-after-set_base_rel_si.patch ]

Pushed with cosmetic adjustments.  The reason it's okay to not check for
appendrels in this loop is now quite different from the reason it was okay
before, so I didn't like the fact that you'd just cut-and-pasted the same
comment.  Also, you did s/brel/rel/ in the transferred code, which I would
not have objected to except you were inserting it into a function that
(a) already had a local variable "rel", causing a shadowing situation,
and (b) had an identical loop just above this that used the "brel"
notation.  So I changed that back.

            regards, tom lane


Re: Calculate total_table_pages after set_base_rel_sizes()

From
David Rowley
Date:
On 8 November 2018 at 06:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Pushed with cosmetic adjustments.

Thanks for adjusting and pushing.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services