Re: ON CONFLICT DO UPDATE for partitioned tables - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: ON CONFLICT DO UPDATE for partitioned tables
Date
Msg-id 5AB4DEB6.2020901@lab.ntt.co.jp
Whole thread Raw
In response to Re: ON CONFLICT DO UPDATE for partitioned tables  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
(2018/03/22 18:31), Amit Langote wrote:
> On 2018/03/20 20:53, Etsuro Fujita wrote:
>> Here are comments on executor changes in (the latest version of) the patch:
>>
>> @@ -421,8 +424,18 @@ ExecInsert(ModifyTableState *mtstate,
>>               ItemPointerData conflictTid;
>>               bool        specConflict;
>>               List       *arbiterIndexes;
>> +            PartitionTupleRouting *proute =
>> +                                        mtstate->mt_partition_tuple_routing;
>>
>> -            arbiterIndexes = node->arbiterIndexes;
>> +            /* Use the appropriate list of arbiter indexes. */
>> +            if (mtstate->mt_partition_tuple_routing != NULL)
>> +            {
>> +                Assert(partition_index>= 0&&  proute != NULL);
>> +                arbiterIndexes =
>> +                        proute->partition_arbiter_indexes[partition_index];
>> +            }
>> +            else
>> +                arbiterIndexes = node->arbiterIndexes;
>>
>> To handle both cases the same way, I wonder if it would be better to have
>> the arbiterindexes list in ResultRelInfo as well, as mentioned by Alvaro
>> upthread, or to re-add mt_arbiterindexes as before and set it to
>> proute->partition_arbiter_indexes[partition_index] before we get here,
>> maybe in ExecPrepareTupleRouting, in the case of tuple routing.
>
> It's a good idea.  I somehow missed that Alvaro had already mentioned it.
>
> In HEAD, we now have ri_onConflictSetProj and ri_onConflictSetWhere.  I
> propose we name the field ri_onConflictArbiterIndexes as done in the
> updated patch.

I like that naming.

>> @@ -2368,9 +2419,13 @@ ExecInitModifyTable(ModifyTable *node, EState
>> *estate, int eflags)
>>           econtext = mtstate->ps.ps_ExprContext;
>>           relationDesc = resultRelInfo->ri_RelationDesc->rd_att;
>>
>> -        /* initialize slot for the existing tuple */
>> -        mtstate->mt_existing =
>> -            ExecInitExtraTupleSlot(mtstate->ps.state, relationDesc);
>> +        /*
>> +         * Initialize slot for the existing tuple.  We determine which
>> +         * tupleDesc to use for this after we have determined which relation
>> +         * the insert/update will be applied to, possibly after performing
>> +         * tuple routing.
>> +         */
>> +        mtstate->mt_existing = ExecInitExtraTupleSlot(mtstate->ps.state,
>> NULL);
>>
>>           /* carried forward solely for the benefit of explain */
>>           mtstate->mt_excludedtlist = node->exclRelTlist;
>> @@ -2378,8 +2433,16 @@ ExecInitModifyTable(ModifyTable *node, EState
>> *estate, int eflags)
>>           /* create target slot for UPDATE SET projection */
>>           tupDesc = ExecTypeFromTL((List *) node->onConflictSet,
>>                                    relationDesc->tdhasoid);
>> +        PinTupleDesc(tupDesc);
>> +        mtstate->mt_conflproj_tupdesc = tupDesc;
>> +
>> +        /*
>> +         * Just like the "existing tuple" slot, we'll defer deciding which
>> +         * tupleDesc to use for this slot to a point where tuple routing has
>> +         * been performed.
>> +         */
>>           mtstate->mt_conflproj =
>> -            ExecInitExtraTupleSlot(mtstate->ps.state, tupDesc);
>> +            ExecInitExtraTupleSlot(mtstate->ps.state, NULL);
>>
>> If we do ExecInitExtraTupleSlot for mtstate->mt_existing and
>> mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple
>> routing, as said above, we wouldn't need this changes.  I think doing that
>> only in the case of tuple routing and keeping this as-is would be better
>> because that would save cycles in the normal case.
>
> Hmm, I think we shouldn't be doing ExecInitExtraTupleSlot in
> ExecPrepareTupleRouting, because we shouldn't have more than one instance
> of mtstate->mt_existing and mtstate->mt_conflproj slots.

Yeah, I think so too.  What I was going to say here is 
ExecSetSlotDescriptor, not ExecInitExtraTupleSlot, as you said below. 
Sorry about the incorrectness.  I guess I was too tired when writing 
that comments.

> As you also said above, I think you meant to say here that we do
> ExecInitExtraTupleSlot only once for both mtstate->mt_existing and
> mtstate->mt_conflproj in ExecInitModifyTable and only do
> ExecSetSlotDescriptor in ExecPrepareTupleRouting.

That's right.

> I have changed it so
> that ExecInitModifyTable now both creates the slot and sets the descriptor
> for non-tuple-routing cases and only creates but doesn't set the
> descriptor in the tuple-routing case.

IMHO I don't see much value in modifying code as such, because we do 
ExecSetSlotDescriptor for mt_existing and mt_conflproj in 
ExecPrepareTupleRouting for every inserted tuple.  So, I would leave 
that as-is, to keep that simple.

> For ExecPrepareTupleRouting to be able to access the tupDesc of the
> onConflictSet target list, I've added ri_onConflictSetProjTupDesc which is
> set by ExecInitPartitionInfo on first call for a give partition.  This is
> also suggested by Pavan in his review.

Seems like a good idea.

Here are some comments on the latest version of the patch:

+               /*
+                * Caller must set mtstate->mt_conflproj's tuple 
descriptor to
+                * this one before trying to use it for projection.
+                */
+               tupDesc = ExecTypeFromTL(onconflset, partrelDesc->tdhasoid);
+               leaf_part_rri->ri_onConflictSet->proj =
+                       ExecBuildProjectionInfo(onconflset, econtext,
+                                               mtstate->mt_conflproj,
+                                               &mtstate->ps, partrelDesc);

ExecBuildProjectionInfo is called without setting the tuple descriptor 
of mtstate->mt_conflproj to tupDesc.  That might work at least for now, 
but I think it's a good thing to set it appropriately to make that 
future proof.

+            * This corresponds to a dropped attribute in the partition, for
+            * which we enerate a dummy entry with resno matching the
+            * partition's attno.

s/enerate/generate/

+ * OnConflictSetState
+ *
+ * Contains execution time state of a ON CONFLICT DO UPDATE operation, 
which
+ * includes the state of projection, tuple descriptor of the 
projection, and
+ * WHERE quals if any.

s/a ON/an ON/

+typedef struct OnConflictSetState
+{  /* for computing ON CONFLICT DO UPDATE SET */

This is nitpicking, but this wouldn't follow the project style, so I 
think that needs re-formatting.

I'll look at the patch a little bit more early next week.

Thanks for updating the patch!

Best regards,
Etsuro Fujita


pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Odd procedure resolution
Next
From: Jeevan Chalke
Date:
Subject: Re: [HACKERS] Partition-wise aggregation/grouping