Thread: Calculate total_table_pages after set_base_rel_sizes()
I believe that we should be delaying the PlannerInfo's total_table_pages calculation until after constraint exclusion and partition pruning have taken place. Doing this calculation before we determine which relations we don't need to scan can lead to incorrectly applying random_page_cost to too many pages processed during an Index Scan. We already don't count relations removed by join removals from this calculation, so counting pruned partitions seems like an omission. The attached patch moves the calculation to after set_base_rel_sizes() is called and before set_base_rel_pathlists() is called, where the information is actually used. I am considering this a bug fix, but I'm proposing this for PG12 only as I don't think destabilising plans in the back branches is a good idea. I'll add this to the September commitfest. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley said: > I believe that we should be delaying the PlannerInfo's > total_table_pages calculation until after constraint exclusion and > partition pruning have taken place. Doing this calculation before we > determine which relations we don't need to scan can lead to > incorrectly applying random_page_cost to too many pages processed > during an Index Scan. > > We already don't count relations removed by join removals from this > calculation, so counting pruned partitions seems like an omission. > > The attached patch moves the calculation to after set_base_rel_sizes() > is called and before set_base_rel_pathlists() is called, where the > information is actually used. > > I am considering this a bug fix, but I'm proposing this for PG12 only > as I don't think destabilising plans in the back branches is a good > idea. I'll add this to the September commitfest. Hi David, I had a quick look at this. (I haven't tested it so this isn't a full review.) It looks like a fairly straightforward code move. And I think it's correct to exclude the pages from partitions that won'tbe read. I have a small tweak. In make_one_rel, we currently have: /* * Compute size estimates and consider_parallel flags for each base rel, * then generate access paths. */ set_base_rel_sizes(root); set_base_rel_pathlists(root); Your patch inserts code between the two lines. I think the comment should be split too. /* Compute size estimates and consider_parallel flags for each base rel. */ set_base_rel_sizes(root); // NEW CODE /* Generate access paths. */ set_base_rel_pathlists(root); Cheers, Edmund
On Tue, 25 Sep 2018 at 10:23, Edmund Horner <ejrh00@gmail.com> wrote: > I have a small tweak. In make_one_rel, we currently have: > > /* > * Compute size estimates and consider_parallel flags for each base rel, > * then generate access paths. > */ > set_base_rel_sizes(root); > set_base_rel_pathlists(root); > > Your patch inserts code between the two lines. I think the comment should be split too. > > /* Compute size estimates and consider_parallel flags for each base rel. */ > set_base_rel_sizes(root); > > // NEW CODE > > /* Generate access paths. */ > set_base_rel_pathlists(root); Thanks for looking at this. I've changed that in the attached updated patch. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley said: > I believe that we should be delaying the PlannerInfo's > total_table_pages calculation until after constraint exclusion and > partition pruning have taken place. Doing this calculation before we > determine which relations we don't need to scan can lead to > incorrectly applying random_page_cost to too many pages processed > during an Index Scan. > > We already don't count relations removed by join removals from this > calculation, so counting pruned partitions seems like an omission. > > The attached patch moves the calculation to after set_base_rel_sizes() > is called and before set_base_rel_pathlists() is called, where the > information is actually used. > > I am considering this a bug fix, but I'm proposing this for PG12 only > as I don't think destabilising plans in the back branches is a good > idea. I'll add this to the September commitfest. I played with the new patched today with a set of large partitions. It seems to work, though the effect is subtle. The expected behaviour of index_pages_fetched is a bit hard to fathom when the cache share pushes it between cases A,B and C of the formula, but that's not really down to this patch. Basically I think it's ready for a committer to look at. Should I update the CF entry? Edmund
On 6 October 2018 at 18:20, Edmund Horner <ejrh00@gmail.com> wrote: > David Rowley said: >> I am considering this a bug fix, but I'm proposing this for PG12 only >> as I don't think destabilising plans in the back branches is a good >> idea. I'll add this to the September commitfest. > > I played with the new patched today with a set of large partitions. > It seems to work, though the effect is subtle. The expected behaviour > of index_pages_fetched is a bit hard to fathom when the cache share > pushes it between cases A,B and C of the formula, but that's not > really down to this patch. Thanks for looking at this and testing it too. The primary purpose of this patch was step 1 in delaying the creation of RangeTblEntrys for partitions until after partition pruning has taken place. The code I had to do this caused problems around the total_table_pages calculation due to the lack of RangeTblEntry for the partitions at the time it was being calculated. But regardless of that, I still believe where we currently calculate this number is subtlely broken as it counts all partitions, even ones that will later be pruned, thus decreasing the likelihood of an index being used as random_page_cost would be applied to a higher number of index pages. Amit Langote has since posted a patch to delay the RangeTblEntry creation until after pruning. His patch happens to also move the total_table_pages calculation, but I believe this change should be made as an independent commit to anything else. I've kept it in the commitfest for that reason. > Basically I think it's ready for a committer to look at. Should I > update the CF entry? That sounds good, please do. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 2018/10/07 17:43, David Rowley wrote: > On 6 October 2018 at 18:20, Edmund Horner <ejrh00@gmail.com> wrote: >> David Rowley said: >>> I am considering this a bug fix, but I'm proposing this for PG12 only >>> as I don't think destabilising plans in the back branches is a good >>> idea. I'll add this to the September commitfest. >> >> I played with the new patched today with a set of large partitions. >> It seems to work, though the effect is subtle. The expected behaviour >> of index_pages_fetched is a bit hard to fathom when the cache share >> pushes it between cases A,B and C of the formula, but that's not >> really down to this patch. > > Thanks for looking at this and testing it too. > > The primary purpose of this patch was step 1 in delaying the creation > of RangeTblEntrys for partitions until after partition pruning has > taken place. The code I had to do this caused problems around the > total_table_pages calculation due to the lack of RangeTblEntry for the > partitions at the time it was being calculated. But regardless of > that, I still believe where we currently calculate this number is > subtlely broken as it counts all partitions, even ones that will later > be pruned, thus decreasing the likelihood of an index being used as > random_page_cost would be applied to a higher number of index pages. > > Amit Langote has since posted a patch to delay the RangeTblEntry > creation until after pruning. His patch happens to also move the > total_table_pages calculation, but I believe this change should be > made as an independent commit to anything else. I've kept it in the > commitfest for that reason. Yeah, if this patch is a win independent of the other project of delaying partition RTE creation, which seems to be true, I think we should go ahead with applying this patch. >> Basically I think it's ready for a committer to look at. Should I >> update the CF entry? > > That sounds good, please do. +1 Thanks, Amit
On 2018/10/11 13:45, Amit Langote wrote: > On 2018/10/07 17:43, David Rowley wrote: >> Amit Langote has since posted a patch to delay the RangeTblEntry >> creation until after pruning. His patch happens to also move the >> total_table_pages calculation, but I believe this change should be >> made as an independent commit to anything else. I've kept it in the >> commitfest for that reason. > > Yeah, if this patch is a win independent of the other project of delaying > partition RTE creation, which seems to be true, I think we should go ahead > with applying this patch. Fwiw, I've incorporated David's patch in my own series, so that one of my patches no longer has the code movement hunks that are in his. I will post the new version of my patch series like that. Thanks, Amit
On 12 October 2018 at 23:35, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2018/10/11 13:45, Amit Langote wrote: >> On 2018/10/07 17:43, David Rowley wrote: >>> Amit Langote has since posted a patch to delay the RangeTblEntry >>> creation until after pruning. His patch happens to also move the >>> total_table_pages calculation, but I believe this change should be >>> made as an independent commit to anything else. I've kept it in the >>> commitfest for that reason. >> >> Yeah, if this patch is a win independent of the other project of delaying >> partition RTE creation, which seems to be true, I think we should go ahead >> with applying this patch. > > Fwiw, I've incorporated David's patch in my own series, so that one of my > patches no longer has the code movement hunks that are in his. I will > post the new version of my patch series like that. Thanks. I'll keep this open here anyway so the change can be considered independently. I think counting pages of pruned partitions is a bug that should be fixed. You can just drop that patch from your set if this gets committed. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Oct 13, 2018 at 7:36 David Rowley <david.rowley@2ndquadrant.com> wrote:
On 12 October 2018 at 23:35, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2018/10/11 13:45, Amit Langote wrote:
>> On 2018/10/07 17:43, David Rowley wrote:
>>> Amit Langote has since posted a patch to delay the RangeTblEntry
>>> creation until after pruning. His patch happens to also move the
>>> total_table_pages calculation, but I believe this change should be
>>> made as an independent commit to anything else. I've kept it in the
>>> commitfest for that reason.
>>
>> Yeah, if this patch is a win independent of the other project of delaying
>> partition RTE creation, which seems to be true, I think we should go ahead
>> with applying this patch.
>
> Fwiw, I've incorporated David's patch in my own series, so that one of my
> patches no longer has the code movement hunks that are in his. I will
> post the new version of my patch series like that.
Thanks. I'll keep this open here anyway so the change can be
considered independently. I think counting pages of pruned partitions
is a bug that should be fixed. You can just drop that patch from your
set if this gets committed.
Yeah, that was my plan anyway.
Thanks,
Amit
David Rowley <david.rowley@2ndquadrant.com> writes: > [ v2-0001-Calculate-total_table_pages-after-set_base_rel_si.patch ] Pushed with cosmetic adjustments. The reason it's okay to not check for appendrels in this loop is now quite different from the reason it was okay before, so I didn't like the fact that you'd just cut-and-pasted the same comment. Also, you did s/brel/rel/ in the transferred code, which I would not have objected to except you were inserting it into a function that (a) already had a local variable "rel", causing a shadowing situation, and (b) had an identical loop just above this that used the "brel" notation. So I changed that back. regards, tom lane
On 8 November 2018 at 06:17, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pushed with cosmetic adjustments. Thanks for adjusting and pushing. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services