Re: partition routing layering in nodeModifyTable.c - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: partition routing layering in nodeModifyTable.c |
Date | |
Msg-id | CA+HiwqHtCWLdK-LO=NEsvOdHx+7yv4mE_zYK0i3BH7dXb-wxog@mail.gmail.com Whole thread Raw |
In response to | Re: partition routing layering in nodeModifyTable.c (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: partition routing layering in nodeModifyTable.c
|
List | pgsql-hackers |
On Thu, Oct 15, 2020 at 11:59 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > On Wed, Oct 14, 2020 at 6:04 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > I'll continue with the last couple of patches in this thread. > > I committed the move of the cross-partition logic to new > ExecCrossPartitionUpdate() function, with just minor comment editing and > pgindent. I left out the refactoring around the calls to AFTER ROW > INSERT/DELETE triggers. I stared at the change for a while, and wasn't > sure if I liked the patched or the unpatched new version better, so I > left it alone. Okay, thanks for committing that. > Looking at the last patch, "Revise child-to-root tuple conversion map > management", that's certainly an improvement. However, I find it > confusing that sometimes the mapping from child to root is in > relinfo->ri_ChildToRootMap, and sometimes in > relinfo->ri_PartitionInfo->pi_PartitionToRootMap. When is each of those > filled in? If both are set, is it well defined which one is initialized > first? It is ri_ChildToRootMap that is set first, because it's only set in child UPDATE target relations which are initialized in ExecInitModifyTable(), that is way before partition tuple routing comes into picture. ri_PartitionInfo and hence pi_PartitionToRootMap is set in tuple routing target partition's ResultRelInfos, which are lazily initialized when tuples land into them. If a tuple routing target partition happens to be an UPDATE target relation and we need to initialize the partition-to-root map, which for a tuple routing target partition is to be saved in pi_PartitionToRootMap, with the patch, we will try to reuse ri_ChildToRootMap because it would already be initialized. But as you say below, maybe we don't need to have two fields for the same thing, which I agree with. Having only ri_ChildToRootMap as you suggest below suffices. > In general, I'm pretty confused by the initialization of > ri_PartitionInfo. Where is initialized, and when? In execnodes.h, the > definition of ResultRelInfo says: > > > /* info for partition tuple routing (NULL if not set up yet) */ > > struct PartitionRoutingInfo *ri_PartitionInfo; > > That implies that the field is initialized lazily. But in > ExecFindPartition, we have this: > > > if (partidx == partdesc->boundinfo->default_index) > > { > > PartitionRoutingInfo *partrouteinfo = rri->ri_PartitionInfo; > > > > /* > > * The tuple must match the partition's layout for the constraint > > * expression to be evaluated successfully. If the partition is > > * sub-partitioned, that would already be the case due to the code > > * above, but for a leaf partition the tuple still matches the > > * parent's layout. > > * > > * Note that we have a map to convert from root to current > > * partition, but not from immediate parent to current partition. > > * So if we have to convert, do it from the root slot; if not, use > > * the root slot as-is. > > */ > > if (partrouteinfo) > > { > > TupleConversionMap *map = partrouteinfo->pi_RootToPartitionMap; > > > > if (map) > > slot = execute_attr_map_slot(map->attrMap, rootslot, > > partrouteinfo->pi_PartitionTupleSlot); > > else > > slot = rootslot; > > } > > > > ExecPartitionCheck(rri, slot, estate, true); > > } > > That check implies that it's not just lazily initialized, the code will > work differently if ri_PartitionInfo is set or not. > > I think all this would be more clear if ri_PartitionInfo and > ri_ChildToRootMap were both truly lazily initialized, the first time > they're needed. So, we initialize these maps when we initialize a partition's ResultRelInfo. I mean if the partition has a different tuple descriptor than root, we know we are going to need to convert tuples between them (in either direction), so we might as well initialize the maps when the ResultRelInfo is built, which we do lazily for tuple routing target relations at least. In that sense, at least root-to-partition maps are initialized lazily, that is only when a partition receives a tuple via routing. Partition-to-root maps' initialization though is not always lazy, because they are also needed by UPDATE target relations, whose ResultRelInfo are initialized in ExecInitModifyTable(), which is not lazy enough. That will change with my other patch though. :) > And if we removed > ri_PartitionInfo->pi_PartitionToRootMap, and always used > ri_ChildToRootMap for it. Done in the attached. > Maybe remove PartitionRoutingInfo struct altogether, and just move its > fields directly to ResultRelInfo. If we do that, we'll end up with 3 notations for the same thing across releases: In v10 and v11, PartitionRoutingInfos members are saved in arrays in ModifyTableState, totally detached from the partition ResultRelInfos. In v12 (3f2393edef), we moved them into ResultRelInfo but chose to add them into a sub-struct (PartitionRoutingInfo), which in retrospect was not a great decision. Now if we pull them into ResultRelInfo, we'll have invented the 3rd notation. Maybe that makes things hard when back-patching bug-fixes? Attached updated patch. -- Amit Langote EDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: