Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] UPDATE of partition key
Date
Msg-id c36905be-27be-bb4a-11cc-0be4b83aca4a@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] UPDATE of partition key  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] UPDATE of partition key
List pgsql-hackers
On 2017/06/21 3:53, Robert Haas wrote:
> On Tue, Jun 20, 2017 at 2:54 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>>> I guess I don't see why it should work like this.  In the INSERT case,
>>> we must build withCheckOption objects for each partition because those
>>> partitions don't appear in the plan otherwise -- but in the UPDATE
>>> case, they're already there, so why do we need to build anything at
>>> all?  Similarly for RETURNING projections.  How are the things we need
>>> for those cases not already getting built, associated with the
>>> relevant resultRelInfos?  Maybe there's a concern if some children got
>>> pruned - they could turn out later to be the children into which
>>> tuples need to be routed. But the patch makes no distinction
>>> between possibly-pruned children and any others.
>>
>> Yes, only a subset of the partitions appear in the UPDATE subplans. I
>> think typically for updates, a very small subset of the total leaf
>> partitions will be there in the plans, others would get pruned. IMHO,
>> it would not be worth having an optimization where it opens only those
>> leaf partitions which are not already there in the subplans. Without
>> the optimization, we are able to re-use the INSERT infrastructure
>> without additional changes.
> 
> Well, that is possible, but certainly not guaranteed.  I mean,
> somebody could do a whole-table UPDATE, or an UPDATE that hits a
> smattering of rows in every partition; e.g. the table is partitioned
> on order number, and you do UPDATE lineitem SET product_code = 'K372B'
> WHERE product_code = 'K372'.
> 
> Leaving that aside, the point here is that you're rebuilding
> withCheckOptions and returningLists that have already been built in
> the planner.  That's bad for two reasons.  First, it's inefficient,
> especially if there are many partitions.  Second, it will amount to a
> functional bug if you get a different answer than the planner did.
> Note this comment in the existing code:
> 
>     /*
>      * Build WITH CHECK OPTION constraints for each leaf partition rel. Note
>      * that we didn't build the withCheckOptionList for each partition within
>      * the planner, but simple translation of the varattnos for each partition
>      * will suffice.  This only occurs for the INSERT case; UPDATE/DELETE
>      * cases are handled above.
>      */
> 
> The comment "UPDATE/DELETE cases are handled above" is referring to
> the code that initializes the WCOs generated by the planner.  You've
> modified the comment in your patch, but the associated code: your
> updated comment says that only "DELETEs and local UPDATES are handled
> above", but in reality, *all* updates are still handled above.  And
> then they are handled again here.  Similarly for returning lists.
> It's certainly not OK for the comment to be inaccurate, but I think
> it's also bad to redo the work which the planner has already done,
> even if it makes the patch smaller.

I guess this has to do with the UPDATE turning into DELETE+INSERT.  So, it
seems like WCOs are being initialized for the leaf partitions
(ResultRelInfos in the mt_partitions array) that are in turn are
initialized for the aforementioned INSERT.  That's why the term "...local
UPDATEs" in the new comment text.

If that's true, I wonder if it makes sense to apply what would be
WCO_RLS_UPDATE_CHECK to a leaf partition that the tuple will be moved into
by calling ExecInsert()?

> Also, I feel like it's probably not correct to use the first result
> relation as the nominal relation for building WCOs and returning lists
> anyway.  I mean, if the first result relation has a different column
> order than the parent relation, isn't this just broken?  If it works
> for some reason, the comments don't explain what that reason is.

Yep, it's more appropriate to use
ModifyTableState->rootResultRelationInfo->ri_RelationDesc somehow.  That
is, if answer to the question I raised above is positive.

Thanks,
Amit




pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] Useless code in ExecInitModifyTable
Next
From: sanyam jain
Date:
Subject: Re: [HACKERS] Logical decoding on standby