Home > mailing lists
Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

From	Amit Langote
Subject	Re: ModifyTable overheads in generic plans
Date	November 12, 2020 11:04:43
Msg-id	CA+HiwqGSeyQ0-AVzOo_4C9=X0BkZ8y2w6ucS8O4fTrRy-4BDsA@mail.gmail.com Whole thread Raw
In response to	Re: ModifyTable overheads in generic plans (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: ModifyTable overheads in generic plans
List	pgsql-hackers
Tree view
On Wed, Nov 11, 2020 at 10:14 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> I'm still a bit confused and unhappy about the initialization of
> ResultRelInfos and the various fields in them. We've made progress in
> the previous patches, but it's still a bit messy.
>
> >               /*
> >                * If transition tuples will be captured, initialize a map to convert
> >                * child tuples into the format of the table mentioned in the query
> >                * (root relation), because the transition tuple store can only store
> >                * tuples in the root table format.  However for INSERT, the map is
> >                * only initialized for a given partition when the partition itself is
> >                * first initialized by ExecFindPartition.  Also, this map is also
> >                * needed if an UPDATE ends up having to move tuples across
> >                * partitions, because in that case the child tuple to be moved first
> >                * needs to be converted into the root table's format.  In that case,
> >                * we use GetChildToRootMap() to either create one from scratch if
> >                * we didn't already create it here.
> >                *
> >                * Note: We cannot always initialize this map lazily, that is, use
> >                * GetChildToRootMap(), because AfterTriggerSaveEvent(), which needs
> >                * the map, doesn't have access to the "target" relation that is
> >                * needed to create the map.
> >                */
> >               if (mtstate->mt_transition_capture && operation != CMD_INSERT)
> >               {
> >                       Relation        relation = resultRelInfo->ri_RelationDesc;
> >                       Relation        targetRel = mtstate->rootResultRelInfo->ri_RelationDesc;
> >
> >                       resultRelInfo->ri_ChildToRootMap =
> >                               convert_tuples_by_name(RelationGetDescr(relation),
> >                                                                          RelationGetDescr(targetRel));
> >                       /* First time creating the map for this result relation. */
> >                       Assert(!resultRelInfo->ri_ChildToRootMapValid);
> >                       resultRelInfo->ri_ChildToRootMapValid = true;
> >               }
>
> The comment explains that AfterTriggerSaveEvent() cannot use
> GetChildToRootMap(), because it doesn't have access to the root target
> relation. But there is a field for that in ResultRelInfo:
> ri_PartitionRoot. However, that's only set up when we do partition routing.
>
> How about we rename ri_PartitionRoot to e.g ri_RootTarget, and set it
> always, even for non-partition inheritance? We have that information
> available when we initialize the ResultRelInfo, so might as well.

Yeah, I agree it's better to use ri_PartitionRoot more generally like
you describe here.

> Some code currently checks ri_PartitionRoot, to determine if a tuple
> that's been inserted, has been routed. For example:
>
> >               /*
> >                * Also check the tuple against the partition constraint, if there is
> >                * one; except that if we got here via tuple-routing, we don't need to
> >                * if there's no BR trigger defined on the partition.
> >                */
> >               if (resultRelationDesc->rd_rel->relispartition &&
> >                       (resultRelInfo->ri_PartitionRoot == NULL ||
> >                        (resultRelInfo->ri_TrigDesc &&
> >                         resultRelInfo->ri_TrigDesc->trig_insert_before_row)))
> >                       ExecPartitionCheck(resultRelInfo, slot, estate, true);
>
> So if we set ri_PartitionRoot always, we would need some other way to
> determine if the tuple at hand has actually been routed or not. But
> wouldn't that be a good thing anyway? Isn't it possible that the same
> ResultRelInfo is sometimes used for routed tuples, and sometimes for
> tuples that have been inserted/updated "directly"?
> ExecLookupUpdateResultRelByOid() sets that field lazily, so I think it
> would be possible to get here with ri_PartitionRoot either set or not,
> depending on whether an earlier cross-partition update was routed to the
> table.

ri_RelationDesc != ri_PartitionRoot gives whether the result relation
is the original target relation of the query or not, so checking that
should be enough here.

> The above check is just an optimization, to skip unnecessary
> ExecPartitionCheck() calls, but I think this snippet in
> ExecConstraints() needs to get this right:
>
> >                               /*
> >                                * If the tuple has been routed, it's been converted to the
> >                                * partition's rowtype, which might differ from the root
> >                                * table's.  We must convert it back to the root table's
> >                                * rowtype so that val_desc shown error message matches the
> >                                * input tuple.
> >                                */
> >                               if (resultRelInfo->ri_PartitionRoot)
> >                               {
> >                                       AttrMap    *map;
> >
> >                                       rel = resultRelInfo->ri_PartitionRoot;
> >                                       tupdesc = RelationGetDescr(rel);
> >                                       /* a reverse map */
> >                                       map = build_attrmap_by_name_if_req(orig_tupdesc,
> >                                                                                                          tupdesc);
> >
> >                                       /*
> >                                        * Partition-specific slot's tupdesc can't be changed, so
> >                                        * allocate a new one.
> >                                        */
> >                                       if (map != NULL)
> >                                               slot = execute_attr_map_slot(map, slot,
> >
MakeTupleTableSlot(tupdesc,&TTSOpsVirtual));
 
> >                               }
>
> Is that an existing bug, or am I missing?

What it's doing is converting a routed tuple in the partition's tuple
format back into the original target relation's format before showing
the tuple in the error message.  Note that we do this reverse
conversion only for tuple routing target relations, not all child
result relations, so in that sense it's a bit inconsistent.  Maybe we
don't need to be too pedantic about showing the exact same tuple as
the user inserted (that is, one matching the "root" table's column
order), so it seems okay to just remove these reverse-conversion
blocks that are repeated in a number of places that show an error
message after failing a constraint check.

Attached new 0002 which does these adjustments.  I went with
ri_RootTargetDesc to go along with ri_RelationDesc.

Also, I have updated the original 0002 (now 0003) to make
GetChildToRootMap() use ri_RootTargetDesc instead of
ModifyTableState.rootResultRelInfo.ri_RelationDesc, so that even
AfterTriggerSaveEvent() can now use that function.  This allows us to
avoid having to initialize ri_ChildToRootMap anywhere but inside
GetChildRootMap(), with that long comment defending doing so. :-)

-- 
Amit Langote
EDB: http://www.enterprisedb.com
Attachment

pgsql-hackers by date:
From: Simon Riggs
Date: 12 November 2020, 10:52:59
Subject: Re: Detecting File Damage & Inconsistencies
From: Michael Paquier
Date: 12 November 2020, 11:19:09
Subject: Re: Skip ExecCheckRTPerms in CTAS with no data
Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

Attachment

Previous

Next