Re: Question about some changes in 11.3 - Mailing list pgsql-hackers
From | Mat Arye |
---|---|
Subject | Re: Question about some changes in 11.3 |
Date | |
Msg-id | CADsUR0CWmncsDxANvfe8Kzcvbfc1-wRsrpmQZTC7YuwwLWwRnQ@mail.gmail.com Whole thread Raw |
In response to | Re: Question about some changes in 11.3 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Question about some changes in 11.3
|
List | pgsql-hackers |
Thanks for taking a look at this Tom.
On Mon, Jun 3, 2019 at 4:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mat Arye <mat@timescale.com> writes:
> On Fri, May 24, 2019 at 5:05 PM Mat Arye <mat@timescale.com> wrote:
>> 11.3 included some change to partition table planning. Namely
>> commit 925f46f ("Fix handling of targetlist SRFs when scan/join relation is
>> known empty.") seems to redo all paths for partitioned tables
>> in apply_scanjoin_target_to_paths. It clears the paths in:
>>
>> ```
>> if (rel_is_partitioned)
>> rel->pathlist = NIL
>> ```
>>
>> Then the code rebuild the paths. However, the rebuilt append path never
>> gets the
>> set_rel_pathlist_hook called. Thus, the work that hook did previously gets
>> thrown away and the rebuilt append path can never be influenced by this
>> hook. Is this intended behavior? Am I missing something?
Hm. I'd say this was already broken by the invention of
apply_scanjoin_target_to_paths; perhaps 11-before-11.3 managed to
still work for you, but it's not hard to envision applications of
set_rel_pathlist_hook for which it would not have worked. The contract
for set_rel_pathlist_hook is supposed to be that it gets to editorialize
on all normal (non-Gather) paths created by the core code, and that's
no longer the case now that apply_scanjoin_target_to_paths can add more.
Yeah it worked for our cases because (I guess) out paths turned out to be lower cost,
but I see your point.
> I've attached a small patch to address this discrepancy for when the
> set_rel_pathlist_hook is called so that's it is called for actual paths
> used for partitioned rels. Please let me know if I am misunderstanding how
> this should be handled.
I'm not very happy with this patch either, as it makes the situation
even more confused, not less so. The best-case scenario is that the
set_rel_pathlist_hook runs twice and does useless work; the worst case
is that it gets confused completely by being called twice for the same
rel. I think we need to maintain the invariant that that hook is
called exactly once per baserel.
Yeah getting called once per baserel is a nice invariant to have.
I wonder whether we could fix matters by postponing the
set_rel_pathlist_hook call till later in the same cases where
we postpone generate_gather_paths, ie, it's the only baserel.
That would make its name pretty misleading, though.
How would simply delaying the hook make the name misleading? I am also wondering if
using the condition `rel->reloptkind == RELOPT_BASEREL &&
bms_membership(root->all_baserels) != BMS_SINGLETON` is sufficient. Is it really guaranteed that `apply_scanjoin_target_to_paths` will not be called in other cases?
Maybe we
should leave it alone and invent a separate hook to be called
by/after apply_scanjoin_target_to_paths? Although I don't
know if it'd be useful to add a new hook to v11 at this point.
Extensions would have a hard time knowing if they could use it.
I think for us, either approach would work. We just need a place to add/modify
some paths. FWIW, I think delaying the hook is easier to deal with on our end if it could work
since we don't have to deal with two different code paths but either is workable.
Certainly if we go with the new hook approach I think it should be added to v11 as well.
That way extensions that need the functionality can hook into it and deal with patch level
differences instead of having no way at all to get at this functionality.
I am more than happy to work on a new patch once we settle on an approach.
regards, tom lane
pgsql-hackers by date: