Home > mailing lists

Re: Expression errors with "FOR UPDATE" and postgres_fdw with partitionwise join enabled. - Mailing list pgsql-hackers

From	Etsuro Fujita
Subject	Re: Expression errors with "FOR UPDATE" and postgres_fdw with partitionwise join enabled.
Date	July 24, 2018 14:51:36
Msg-id	5B5712C8.9030404@lab.ntt.co.jp Whole thread Raw
In response to	Re: Expression errors with "FOR UPDATE" and postgres_fdw withpartition wise join enabled. (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Expression errors with "FOR UPDATE" and postgres_fdw withpartition wise join enabled.
List	pgsql-hackers

Tree view

(2018/07/24 3:09), Robert Haas wrote:
> On Mon, Jul 23, 2018 at 3:49 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp>  wrote:
>> I have to admit that that is not a good idea.  So, I'll update the patch so
>> that we don't assume the projection capability of the subplan anymore, if we
>> go this way.
>
> Isn't that assumption fundamental to your whole approach?

I don't think so.  What I mean here is: currently the subplan would be a 
scan/join node, but in future we might have eg, a Sort node atop the 
scan/join node, so it would be better to update the patch to handle such 
a case as well.

>>> Also, even today, this would fail if the subplan happened to be
>>> a Sort, and it's not very obvious why that couldn't happen.
>>
>> You mean the MergeAppend case?  In that case, the patch will adjust the
>> subplan before adding a Sort above the subplan, so that could not happen.
>
> That would only help if create_merge_append_plan() itself inserted a
> Sort node.  It wouldn't help if the Sort node came from a child path
> manufactured by create_sort_path().

As I mentioned above, I think we could handle such a case as well.

>>> I think that's a bad idea.  The target list affects lots
>>> of things, like costing.  If we don't insert a ConvertRowTypeExpr into
>>> the child's target list, the costing will be wrong to the extent that
>>> ConvertRowTypeExpr has any cost, which it does.
>>
>> Actually, this is not true at least currently, because set_append_rel_size
>> doesn't do anything about the costing:
>
> Why would it?  Append can't project, so the cost of any expressions
> that appear in its target list is irrelevant.  What is affected is the
> cost of the scans below the Append -- see e.g. cost_seqscan(), which
> uses the data produced by set_pathtarget_cost_width().

By set_rel_size()?

>> Some createplan.c routines already change the tlists of their nodes. For
>> example, create_merge_append_plan would add sort columns to each of its
>> subplans if necessary.  I think it would be similar to that to add a
>> ConvertRowtypeExpr above a child whole-row Var in the subplan's tlist.
>
> You have a point, but I think that code is actually not a very good
> idea, and I'd like to see us do less of that sort of thing, not more.
> Any case in which we change the plan while creating it has many of the
> same problems that I discussed in the previous email.  For example,
> create_merge_append_path() has to know that a Sort node might get
> inserted and set the costing accordingly.  If the callers guaranteed
> that the necessary sort path had already been inserted, then we
> wouldn't need that special handling.

I'm not sure that's a good idea, because I think we have a trade-off 
relation; the more we make create_plan simple, the more we need to make 
earlier states of the planner complicated.

And it looks to me like the partitionwise join code is making earlier 
(and later) stages of the planner too complicated, to make create_plan 
simple.

> Also, that code is adding additional columns, computed from the
> columns we have available, so that we can sort.  Those extra columns
> then get discarded at the next level of the Plan tree.  What you're
> trying to do is different.  Perhaps this is too harsh a judgement, but
> it looks to me like you're using a deliberately-wrong representation
> of the value that you ultimately want to produce and then patching it
> up after the fact.

When considering paritionwise joining, it would make things complicated 
to have a ConvertRowtypeExpr in a partrel's targetlist, because as 
discussed upthread, it deviates from the planner's assumption that a 
rel's targetlist would only include Vars and PHVs.  So, I propose to 
include a child whole-row Var in the targetlist instead, in which case, 
we need to fix the Var after the fact, but can avoid making many other 
parts of the planner complicated.

> That seems quite a bit worse than what the
> existing code is doing.

One idea to address that concern would be to adjust the pathtarget of 
each path created in generate_partitionwise_join_paths accordingly.  I'm 
missing something, though.

>> I reviewed his patch, and thought that that would fix the issue, but this is
>> about the current design on the child tlist handling in partitionise join
>> rather than that patch: it deviates from the assumption that we had for the
>> scan/join planning till PG10 that "a rel's targetlist would only include
>> Vars and PlaceHolderVars", and turns out that there are a lot of places
>> where we need to modify code to suppress that deviation, as partly shown in
>> that patch.  So, ISTM that the current design in itself is not that
>> localized.
>
> I used to think that was the assumption, too, but it seems to me that
> Tom told me a while back that it was never really true, and that
> assumption was in my head.  Unfortunately, I don't have a link to that
> email handy.  Either way, I think the solution is to get the tlist
> right from the start and cope with the consequences downstream, not
> start with the wrong thing and try to fix it later.

I'm not fully convinced that's really the right solution, but I think 
you know better about partitioning (as well as many other things) than 
me, so I think it's better for you to decide what to do for this issue.

Thanks for working on this!

Best regards,
Etsuro Fujita

pgsql-hackers by date:

From: Masahiko Sawada
Date: 24 July 2018, 14:38:30
Subject: Re: [WIP] [B-Tree] Retail IndexTuple deletion

From: Phil Florent
Date: 24 July 2018, 15:50:05
Subject: RE: Re: [Proposal] Add accumulated statistics for wait event

Re: Expression errors with "FOR UPDATE" and postgres_fdw with partitionwise join enabled. - Mailing list pgsql-hackers

Previous

Next