Re: ON CONFLICT DO UPDATE for partitioned tables - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: ON CONFLICT DO UPDATE for partitioned tables
Date
Msg-id 5AB0F633.2030007@lab.ntt.co.jp
Whole thread Raw
In response to Re: ON CONFLICT DO UPDATE for partitioned tables  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: ON CONFLICT DO UPDATE for partitioned tables  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
(2018/03/20 13:30), Amit Langote wrote:
> On 2018/03/19 21:59, Etsuro Fujita wrote:
>> (2018/03/18 13:17), Alvaro Herrera wrote:
>>> Alvaro Herrera wrote:
>>> The only thing that I remain unhappy about this patch is the whole
>>> adjust_and_expand_partition_tlist() thing.  I fear we may be doing
>>> redundant and/or misplaced work.  I'll look into it next week.
>>
>> I'm still reviewing the patches, but I really agree on that point.  As
>> Pavan mentioned upthread, the onConflictSet tlist for the root parent,
>> from which we create a translated onConflictSet tlist for a partition,
>> would have already been processed by expand_targetlist() to contain all
>> missing columns as well, so I think we could create the tlist for the
>> partition by simply re-ordering the expression-converted tlist (ie,
>> conv_setproj) based on the conversion map for the partition.  The Attached
>> defines a function for that, which could be called, instead of calling
>> adjust_and_expand_partition_tlist().  This would allow us to get rid of
>> planner changes from the patches.  Maybe I'm missing something, though.
>
> Thanks for the patch.  I can confirm your proposed
> adjust_onconflictset_tlist() is enough to replace adjust_inherited_tlist()
> + expand_targetlist() combination (aka
> adjust_and_expand_partition_tlist()), thus rendering the planner changes
> in this patch unnecessary.  I tested it with a partition tree involving
> partitions of varying attribute numbers (dropped columns included) and it
> seems to work as expected (as also exercised in regression tests) as shown
> below.

Thanks for testing!

> I have incorporated your patch in the main patch after updating the
> comments a bit.  Also, now that 6666ee49f49 is in [1], the transition
> table related tests I proposed yesterday pass nicely.  Instead of posting
> as a separate patch, I have merged it with the main patch.  So now that
> planner refactoring is unnecessary, attached is just one patch.

Here are comments on executor changes in (the latest version of) the patch:

@@ -421,8 +424,18 @@ ExecInsert(ModifyTableState *mtstate,
              ItemPointerData conflictTid;
              bool        specConflict;
              List       *arbiterIndexes;
+            PartitionTupleRouting *proute =
+                                        mtstate->mt_partition_tuple_routing;

-            arbiterIndexes = node->arbiterIndexes;
+            /* Use the appropriate list of arbiter indexes. */
+            if (mtstate->mt_partition_tuple_routing != NULL)
+            {
+                Assert(partition_index >= 0 && proute != NULL);
+                arbiterIndexes =
+                        proute->partition_arbiter_indexes[partition_index];
+            }
+            else
+                arbiterIndexes = node->arbiterIndexes;

To handle both cases the same way, I wonder if it would be better to 
have the arbiterindexes list in ResultRelInfo as well, as mentioned by 
Alvaro upthread, or to re-add mt_arbiterindexes as before and set it to 
proute->partition_arbiter_indexes[partition_index] before we get here, 
maybe in ExecPrepareTupleRouting, in the case of tuple routing.

  ExecOnConflictUpdate(ModifyTableState *mtstate,
                       ResultRelInfo *resultRelInfo,
+                     TupleDesc onConflictSetTupdesc,
                       ItemPointer conflictTid,
                       TupleTableSlot *planSlot,
                       TupleTableSlot *excludedSlot,
@@ -1419,6 +1459,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
      ExecCheckHeapTupleVisible(estate, &tuple, buffer);

      /* Store target's existing tuple in the state's dedicated slot */
+    ExecSetSlotDescriptor(mtstate->mt_existing, RelationGetDescr(relation));
      ExecStoreTuple(&tuple, mtstate->mt_existing, buffer, false);

      /*
@@ -1462,6 +1503,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
      }

      /* Project the new tuple version */
+    ExecSetSlotDescriptor(mtstate->mt_conflproj, onConflictSetTupdesc);
      ExecProject(resultRelInfo->ri_onConflictSetProj);

Can we do ExecSetSlotDescriptor for mtstate->mt_existing and 
mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple 
routing?  That would make the API changes to ExecOnConflictUpdate 
unnecessary.

@@ -2368,9 +2419,13 @@ ExecInitModifyTable(ModifyTable *node, EState 
*estate, int eflags)
          econtext = mtstate->ps.ps_ExprContext;
          relationDesc = resultRelInfo->ri_RelationDesc->rd_att;

-        /* initialize slot for the existing tuple */
-        mtstate->mt_existing =
-            ExecInitExtraTupleSlot(mtstate->ps.state, relationDesc);
+        /*
+         * Initialize slot for the existing tuple.  We determine which
+         * tupleDesc to use for this after we have determined which relation
+         * the insert/update will be applied to, possibly after performing
+         * tuple routing.
+         */
+        mtstate->mt_existing = ExecInitExtraTupleSlot(mtstate->ps.state, NULL);

          /* carried forward solely for the benefit of explain */
          mtstate->mt_excludedtlist = node->exclRelTlist;
@@ -2378,8 +2433,16 @@ ExecInitModifyTable(ModifyTable *node, EState 
*estate, int eflags)
          /* create target slot for UPDATE SET projection */
          tupDesc = ExecTypeFromTL((List *) node->onConflictSet,
                                   relationDesc->tdhasoid);
+        PinTupleDesc(tupDesc);
+        mtstate->mt_conflproj_tupdesc = tupDesc;
+
+        /*
+         * Just like the "existing tuple" slot, we'll defer deciding which
+         * tupleDesc to use for this slot to a point where tuple routing has
+         * been performed.
+         */
          mtstate->mt_conflproj =
-            ExecInitExtraTupleSlot(mtstate->ps.state, tupDesc);
+            ExecInitExtraTupleSlot(mtstate->ps.state, NULL);

If we do ExecInitExtraTupleSlot for mtstate->mt_existing and 
mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple 
routing, as said above, we wouldn't need this changes.  I think doing 
that only in the case of tuple routing and keeping this as-is would be 
better because that would save cycles in the normal case.

I'll look at other parts of the patch next.

Best regards,
Etsuro Fujita


pgsql-hackers by date:

Previous
From: ilmari@ilmari.org (Dagfinn Ilmari Mannsåker)
Date:
Subject: Re: configure's checks for --enable-tap-tests are insufficient
Next
From: Etsuro Fujita
Date:
Subject: Re: [HACKERS] Add support for tuple routing to foreign partitions