Re: a misbehavior of partition row movement (?) - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: a misbehavior of partition row movement (?) |
Date | |
Msg-id | CA+HiwqFOQN17X+PsuQrA_BCe5OJF=Ep-um8MdzaSoqDVh_QNqA@mail.gmail.com Whole thread Raw |
In response to | Re: a misbehavior of partition row movement (?) (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: a misbehavior of partition row movement (?)
|
List | pgsql-hackers |
On Wed, Jan 19, 2022 at 4:13 PM Amit Langote <amitlangote09@gmail.com> wrote: > On Wed, Jan 19, 2022 at 7:29 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Jan-18, Alvaro Herrera wrote: > > > On 2022-Jan-18, Amit Langote wrote: > > > > > > > Would you like me to update the patch with the above renumbering or > > > > are you working on a new version yourself? > > > > > > I have a few very minor changes apart from that. Let me see if I can > > > get this pushed soon; if I'm unable to (there are parts I don't fully > > > grok yet), I'll post what I have. > > > > Here's v13, a series with your patch as 0001 and a few changes from me; > > the bulk is just pgindent, and there are a few stylistic changes and an > > unrelated typo fix and I added a couple of comments to your new code. > > Thank you. > > > I don't like the API change to ExecInsert(). You're adding two output > > arguments: > > - the tuple being inserted. But surely you have this already, because > > it's in the 'slot' argument you passed in. ExecInsert is even careful to > > set the ->tts_tableOid argument there. So ExecInsert's caller doesn't > > need to receive the inserted tuple as an argument, it can just read > > 'slot'. > > Hmm, ExecInsert()'s input slot belongs to either the source partition > or the "root" target relation, the latter due to the following stanza > in ExecCrossPartitionUpdate(): > > /* > * resultRelInfo is one of the per-relation resultRelInfos. So we should > * convert the tuple into root's tuple descriptor if needed, since > * ExecInsert() starts the search from root. > */ > tupconv_map = ExecGetChildToRootMap(resultRelInfo); > if (tupconv_map != NULL) > slot = execute_attr_map_slot(tupconv_map->attrMap, > slot, > mtstate->mt_root_tuple_slot); > > Though the slot whose tuple ExecInsert() ultimately inserts may be > destination partition's ri_PartitionTupleSlot due to the following > stanza in it: > > /* > * If the input result relation is a partitioned table, find the leaf > * partition to insert the tuple into. > */ > if (proute) > { > ResultRelInfo *partRelInfo; > > slot = ExecPrepareTupleRouting(mtstate, estate, proute, > resultRelInfo, slot, > &partRelInfo); > resultRelInfo = partRelInfo; > } > > It's not great that ExecInsert()'s existing header comment doesn't > mention that the slot whose tuple is actually inserted may not be the > slot it receives from the caller :-(. > > > - the relation to which the tuple was inserted. Can't this be obtained > > by looking at slot->tts_tableOid? We should be able to use > > ExecLookupResultRelByOid() to obtain it, no? (I suppose you may claim > > that this is wasteful, but I think this is not a common case anyway and > > it's worth keeping ExecInsert()'s API clean for the sake of the 99.99% > > of its other calls). > > ExecLookupResultRelByOid() is only useful when *all* relevant leaf > partitions are present in the ModifyTableState.resultRelInfo array > (due to being present in ModifyTable.resultRelations). Leaf > partitions that are only initialized by tuple routing (such as > destination partitions of cross-partition updates) are only present in > ModifyTableState.mt_partition_tuple_routing.partitions[] that are not > discoverable by ExecLookupResultRelByOid(). > > > I think the argument definition of ExecCrossPartitionUpdateForeignKey is > > a bit messy. I propose to move mtstate,estate as two first arguments; > > IIRC the whole executor does it that way. > > Okay, done. > > > AfterTriggerSaveEvent determines maybe_crosspart_update (by looking at > > mtstate->operation -- why doesn't it look at 'event' instead?) and later > > it determines new_event.ate_flags. Why can't it use > > maybe_crosspart_update to simplify part of that? Or maybe the other way > > around, not sure. It looks like something could be made simpler there. > > Actually, I remember disliking maybe_crosspart_update for sounding a > bit fuzzy and also having to add mtstate to a bunch of trigger.c > interface functions only to calculate that. > > I now wonder if passing an is_crosspart_update through > ExecAR{Update|Delete}Triggers() would not be better. Both > ExecDelete() and ExecCrossPartitionUpdateForeignKey() know for sure if > their ExecAR{Update|Delete}Triggers() invocation is for a > cross-partition update, so better to just pass that information down > to AfterTriggerSaveEvent() than pass 'mtstate' and have the latter > reverse-engineer only a fuzzy guess of whether that's the case. > > I like that interface better and have implemented it in the updated version. > > I've also merged your changes and made some of my own as mentioned > above to end up with the attached v14. I'm also attaching a delta > patch over v13 (0001+0002) to easily see the changes I made to get > v14. Oops, broke the cfbot's patch-applier again. Delta-diff reattached as txt. -- Amit Langote EDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: