Re: ModifyTable overheads in generic plans - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: ModifyTable overheads in generic plans
Date
Msg-id ac2b8afd-e51f-e61b-1a41-ea675d480cc8@iki.fi
Whole thread Raw
In response to Re: ModifyTable overheads in generic plans  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: ModifyTable overheads in generic plans
List pgsql-hackers
I'm still a bit confused and unhappy about the initialization of 
ResultRelInfos and the various fields in them. We've made progress in 
the previous patches, but it's still a bit messy.

>         /*
>          * If transition tuples will be captured, initialize a map to convert
>          * child tuples into the format of the table mentioned in the query
>          * (root relation), because the transition tuple store can only store
>          * tuples in the root table format.  However for INSERT, the map is
>          * only initialized for a given partition when the partition itself is
>          * first initialized by ExecFindPartition.  Also, this map is also
>          * needed if an UPDATE ends up having to move tuples across
>          * partitions, because in that case the child tuple to be moved first
>          * needs to be converted into the root table's format.  In that case,
>          * we use GetChildToRootMap() to either create one from scratch if
>          * we didn't already create it here.
>          *
>          * Note: We cannot always initialize this map lazily, that is, use
>          * GetChildToRootMap(), because AfterTriggerSaveEvent(), which needs
>          * the map, doesn't have access to the "target" relation that is
>          * needed to create the map.
>          */
>         if (mtstate->mt_transition_capture && operation != CMD_INSERT)
>         {
>             Relation    relation = resultRelInfo->ri_RelationDesc;
>             Relation    targetRel = mtstate->rootResultRelInfo->ri_RelationDesc;
> 
>             resultRelInfo->ri_ChildToRootMap =
>                 convert_tuples_by_name(RelationGetDescr(relation),
>                                        RelationGetDescr(targetRel));
>             /* First time creating the map for this result relation. */
>             Assert(!resultRelInfo->ri_ChildToRootMapValid);
>             resultRelInfo->ri_ChildToRootMapValid = true;
>         }

The comment explains that AfterTriggerSaveEvent() cannot use 
GetChildToRootMap(), because it doesn't have access to the root target 
relation. But there is a field for that in ResultRelInfo: 
ri_PartitionRoot. However, that's only set up when we do partition routing.

How about we rename ri_PartitionRoot to e.g ri_RootTarget, and set it 
always, even for non-partition inheritance? We have that information 
available when we initialize the ResultRelInfo, so might as well.

Some code currently checks ri_PartitionRoot, to determine if a tuple 
that's been inserted, has been routed. For example:

>         /*
>          * Also check the tuple against the partition constraint, if there is
>          * one; except that if we got here via tuple-routing, we don't need to
>          * if there's no BR trigger defined on the partition.
>          */
>         if (resultRelationDesc->rd_rel->relispartition &&
>             (resultRelInfo->ri_PartitionRoot == NULL ||
>              (resultRelInfo->ri_TrigDesc &&
>               resultRelInfo->ri_TrigDesc->trig_insert_before_row)))
>             ExecPartitionCheck(resultRelInfo, slot, estate, true);

So if we set ri_PartitionRoot always, we would need some other way to 
determine if the tuple at hand has actually been routed or not. But 
wouldn't that be a good thing anyway? Isn't it possible that the same 
ResultRelInfo is sometimes used for routed tuples, and sometimes for 
tuples that have been inserted/updated "directly"? 
ExecLookupUpdateResultRelByOid() sets that field lazily, so I think it 
would be possible to get here with ri_PartitionRoot either set or not, 
depending on whether an earlier cross-partition update was routed to the 
table.

The above check is just an optimization, to skip unnecessary 
ExecPartitionCheck() calls, but I think this snippet in 
ExecConstraints() needs to get this right:

>                 /*
>                  * If the tuple has been routed, it's been converted to the
>                  * partition's rowtype, which might differ from the root
>                  * table's.  We must convert it back to the root table's
>                  * rowtype so that val_desc shown error message matches the
>                  * input tuple.
>                  */
>                 if (resultRelInfo->ri_PartitionRoot)
>                 {
>                     AttrMap    *map;
> 
>                     rel = resultRelInfo->ri_PartitionRoot;
>                     tupdesc = RelationGetDescr(rel);
>                     /* a reverse map */
>                     map = build_attrmap_by_name_if_req(orig_tupdesc,
>                                                        tupdesc);
> 
>                     /*
>                      * Partition-specific slot's tupdesc can't be changed, so
>                      * allocate a new one.
>                      */
>                     if (map != NULL)
>                         slot = execute_attr_map_slot(map, slot,
>                                                      MakeTupleTableSlot(tupdesc, &TTSOpsVirtual));
>                 }

Is that an existing bug, or am I missing?

- Heikki



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Clean up optional rules in grammar
Next
From: Greg Sabino Mullane
Date:
Subject: Re: Prefer TG_TABLE_NAME over TG_RELNAME in tests