Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [HACKERS] UPDATE of partition key |
Date | |
Msg-id | c36905be-27be-bb4a-11cc-0be4b83aca4a@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] UPDATE of partition key (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] UPDATE of partition key
|
List | pgsql-hackers |
On 2017/06/21 3:53, Robert Haas wrote: > On Tue, Jun 20, 2017 at 2:54 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote: >>> I guess I don't see why it should work like this. In the INSERT case, >>> we must build withCheckOption objects for each partition because those >>> partitions don't appear in the plan otherwise -- but in the UPDATE >>> case, they're already there, so why do we need to build anything at >>> all? Similarly for RETURNING projections. How are the things we need >>> for those cases not already getting built, associated with the >>> relevant resultRelInfos? Maybe there's a concern if some children got >>> pruned - they could turn out later to be the children into which >>> tuples need to be routed. But the patch makes no distinction >>> between possibly-pruned children and any others. >> >> Yes, only a subset of the partitions appear in the UPDATE subplans. I >> think typically for updates, a very small subset of the total leaf >> partitions will be there in the plans, others would get pruned. IMHO, >> it would not be worth having an optimization where it opens only those >> leaf partitions which are not already there in the subplans. Without >> the optimization, we are able to re-use the INSERT infrastructure >> without additional changes. > > Well, that is possible, but certainly not guaranteed. I mean, > somebody could do a whole-table UPDATE, or an UPDATE that hits a > smattering of rows in every partition; e.g. the table is partitioned > on order number, and you do UPDATE lineitem SET product_code = 'K372B' > WHERE product_code = 'K372'. > > Leaving that aside, the point here is that you're rebuilding > withCheckOptions and returningLists that have already been built in > the planner. That's bad for two reasons. First, it's inefficient, > especially if there are many partitions. Second, it will amount to a > functional bug if you get a different answer than the planner did. > Note this comment in the existing code: > > /* > * Build WITH CHECK OPTION constraints for each leaf partition rel. Note > * that we didn't build the withCheckOptionList for each partition within > * the planner, but simple translation of the varattnos for each partition > * will suffice. This only occurs for the INSERT case; UPDATE/DELETE > * cases are handled above. > */ > > The comment "UPDATE/DELETE cases are handled above" is referring to > the code that initializes the WCOs generated by the planner. You've > modified the comment in your patch, but the associated code: your > updated comment says that only "DELETEs and local UPDATES are handled > above", but in reality, *all* updates are still handled above. And > then they are handled again here. Similarly for returning lists. > It's certainly not OK for the comment to be inaccurate, but I think > it's also bad to redo the work which the planner has already done, > even if it makes the patch smaller. I guess this has to do with the UPDATE turning into DELETE+INSERT. So, it seems like WCOs are being initialized for the leaf partitions (ResultRelInfos in the mt_partitions array) that are in turn are initialized for the aforementioned INSERT. That's why the term "...local UPDATEs" in the new comment text. If that's true, I wonder if it makes sense to apply what would be WCO_RLS_UPDATE_CHECK to a leaf partition that the tuple will be moved into by calling ExecInsert()? > Also, I feel like it's probably not correct to use the first result > relation as the nominal relation for building WCOs and returning lists > anyway. I mean, if the first result relation has a different column > order than the parent relation, isn't this just broken? If it works > for some reason, the comments don't explain what that reason is. Yep, it's more appropriate to use ModifyTableState->rootResultRelationInfo->ri_RelationDesc somehow. That is, if answer to the question I raised above is positive. Thanks, Amit
pgsql-hackers by date: