Re: ON CONFLICT DO UPDATE for partitioned tables - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: ON CONFLICT DO UPDATE for partitioned tables
Date
Msg-id CABOikdOQ3iLmSWpgmYnOmMXo05cG8rRDWpajVmEE8jnXeQF-=A@mail.gmail.com
Whole thread Raw
In response to Re: ON CONFLICT DO UPDATE for partitioned tables  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers


On Tue, Mar 20, 2018 at 10:09 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2018/03/20 13:30, Amit Langote wrote:
> I have incorporated your patch in the main patch after updating the
> comments a bit.  Also, now that 6666ee49f49 is in [1], the transition
> table related tests I proposed yesterday pass nicely.  Instead of posting
> as a separate patch, I have merged it with the main patch.  So now that
> planner refactoring is unnecessary, attached is just one patch.

Sorry, I forgot to remove a hunk in the patch affecting
src/include/optimizer/prep.h.  Fixed in the attached updated version.


Thanks for writing a new version. A few comments:

 
      <listitem>
       <para>
-       Using the <literal>ON CONFLICT</literal> clause with partitioned tables
-       will cause an error if the conflict target is specified (see
-       <xref linkend="sql-on-conflict" /> for more details on how the clause
-       works).  Therefore, it is not possible to specify
-       <literal>DO UPDATE</literal> as the alternative action, because
-       specifying the conflict target is mandatory in that case.  On the other
-       hand, specifying <literal>DO NOTHING</literal> as the alternative action
-       works fine provided the conflict target is not specified.  In that case,
-       unique constraints (or exclusion constraints) of the individual leaf
-       partitions are considered.
-      </para>
-     </listitem>

We should document it somewhere that partition key update is not supported by
ON CONFLICT DO UPDATE

 /*
  * get_partition_parent
+ * Obtain direct parent or topmost ancestor of given relation
  *
- * Returns inheritance parent of a partition by scanning pg_inherits
+ * Returns direct inheritance parent of a partition by scanning pg_inherits;
+ * or, if 'getroot' is true, the topmost parent in the inheritance hierarchy.
  *
  * Note: Because this function assumes that the relation whose OID is passed
  * as an argument will have precisely one parent, it should only be called
  * when it is known that the relation is a partition.
  */

Given that most callers only look for immediate parent, I wonder if it makes
sense to have a new function, get_partition_root(), instead of changing
signature of the current function. That will reduce foot-print of this patch
quite a lot.


@@ -36,6 +38,7 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
  Datum *values,
  bool *isnull,
  int maxfieldlen);
+static List *adjust_onconflictset_tlist(List *tlist, TupleConversionMap *map);
 
We should name this function in a more generic way, given that it's going to be
used for other things too. What about adjust_partition_tlist?

+
+ /*
+ * If partition's tuple descriptor differs from the root parent,
+ * we need to adjust the onConflictSet target list to account for
+ * differences in attribute numbers.
+ */
+ if (map != NULL)
+ {
+ /*
+ * First convert Vars to contain partition's atttribute
+ * numbers.
+ */
+
+ /* Convert Vars referencing EXCLUDED pseudo-relation. */
+ onconflset = map_partition_varattnos(node->onConflictSet,
+ INNER_VAR,
+ partrel,
+ firstResultRel, NULL);

Are we not modifying node->onConflictSet in place? Or does
map_partition_varattnos() create a fresh copy before scribbling on the input?
If it's former then I guess that's a problem. If it's latter then we ought to
have comments explaining that.


+ tupDesc = ExecTypeFromTL(onconflset, partrelDesc->tdhasoid);
+ PinTupleDesc(tupDesc);

Why do we need to pin the descriptor? If we do need, why don't we need
corresponding ReleaseTupDesc() call?

I am still confused if the partition_conflproj_tdescs really belongs to
PartitionTupleRouting or should it be a part of the ResultRelInfo. FWIW for the
MERGE patch, I moved everything to a new struct and made it part of the
ResultRelInfo. If no re-mapping is necessary, we can just point to the struct
in the root relation's ResultRelInfo. Otherwise create and populate a new one
in the partition relation's ResultRelInfo.

+ leaf_part_rri->ri_onConflictSetWhere =
+ ExecInitQual(onconflwhere, &mtstate->ps);
+ }

So ri_onConflictSetWhere and ri_onConflictSetProj are part of the
ResultRelInfo. Why not move mt_conflproj_tupdesc, partition_conflproj_tdescs,
partition_arbiter_indexes etc to the ResultRelInfo as well? 

+
+/*
+ * Adjust the targetlist entries of an inherited ON CONFLICT DO UPDATE
+ * operation for a given partition
+ *

As I said above, we should disassociate this function from ON CONFLICT DO
UPDATE and rather have it as a general purpose facility.

+ * The expressions have already been fixed, but we have to make sure that the
+ * target resnos match the partition.  In some cases, this can force us to
+ * re-order the tlist to preserve resno ordering.
+ *

Can we have some explanation regarding how the targetlist is reordered? I know
the function does that by updating the resno in place, but some explanation
would help. Also, should we add an assertion-build check to confirm that the
resultant list is actually ordered?

@@ -66,7 +67,8 @@ static TupleTableSlot *ExecPrepareTupleRouting(ModifyTableState *mtstate,
  EState *estate,
  PartitionTupleRouting *proute,
  ResultRelInfo *targetRelInfo,
- TupleTableSlot *slot);
+ TupleTableSlot *slot,
+ int *partition_index);
 static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
 static void ExecSetupChildParentMapForTcs(ModifyTableState *mtstate);
 static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
@@ -264,6 +266,7 @@ ExecInsert(ModifyTableState *mtstate,
     TupleTableSlot *slot,
     TupleTableSlot *planSlot,
     EState *estate,
+    int partition_index,
     bool canSetTag)
 {
  HeapTuple tuple;

If we move the list of arbiter indexes and the tuple desc to ResultRelInfo, as
suggested above, I think we can avoid making any API changes to
ExecPrepareTupleRouting and ExecInsert.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: [HACKERS] path toward faster partition pruning
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Partition-wise aggregation/grouping