On Thu, Jun 15, 2017 at 1:36 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> Attached patch v10 fixes the above. In the existing code, where it
> builds WCO constraints for each leaf partition; with the patch, that
> code now is applicable to row-movement-updates as well.
I guess I don't see why it should work like this. In the INSERT case,
we must build withCheckOption objects for each partition because those
partitions don't appear in the plan otherwise -- but in the UPDATE
case, they're already there, so why do we need to build anything at
all? Similarly for RETURNING projections. How are the things we need
for those cases not already getting built, associated with the
relevant resultRelInfos? Maybe there's a concern if some children got
pruned - they could turn out later to be the children into which
tuples need to be routed. But the patch makes no distinction between
possibly-pruned children and any others.
> There is another issue I discovered. The row-movement works fine if
> the destination leaf partition has different attribute ordering than
> the root : the existing insert-tuple-routing mapping handles that. But
> if the source partition has different ordering w.r.t. the root, it has
> a problem : there is no mapping in the opposite direction, i.e. from
> the leaf to root. And we require that because the tuple of source leaf
> partition needs to be converted to root partition tuple descriptor,
> since ExecFindPartition() starts with root.
Seems reasonable, but...
> To fix this, I have introduced another mapping array
> mtstate->mt_resultrel_maps[]. This corresponds to the
> mtstate->resultRelInfo[]. We don't require per-leaf-partition mapping,
> because the update result relations are pruned subset of the total
> leaf partitions.
... I don't understand how you can *not* need a per-leaf-partition
mapping. I mean, maybe you only need the mapping for the *unpruned*
leaf partitions but you certainly need a separate mapping for each one
of those.
It's possible to imagine driving the tuple routing off of just the
partition key attributes, extracted from wherever they are inside the
tuple at the current level, rather than converting to the root's tuple
format. However, that's not totally straightforward because there
could be multiple levels of partitioning throughout the tree and
different attributes might be needed at different levels. Moreover,
in most cases, the mappings are going to end up being no-ops because
the column order will be the same, so it's probably not worth
complicating the code to try to avoid a double conversion that usually
won't happen.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company