Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] UPDATE of partition key
Date
Msg-id 032a04c8-6eca-ba4a-f36f-7cfad908ef13@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] UPDATE of partition key  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
Thanks Amit.

Looking at the latest v25 patch.

On 2017/11/16 23:50, Amit Khandekar wrote:
> Below has the responses for both Amit's and David's comments, starting
> with Amit's ....
> On 2 November 2017 at 12:40, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2017/10/24 0:15, Amit Khandekar wrote:
>>> On 16 October 2017 at 08:28, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>>>
>>>> +           (event == TRIGGER_EVENT_UPDATE && ((oldtup == NULL) ^ (newtup
>>>> == NULL))))
>>>>
>>>> Is there some reason why a bitwise operator is used here?
>>>
>>> That exact condition means that the function is called for transition
>>> capture for updated rows being moved to another partition. For this
>>> scenario, either the oldtup or the newtup is NULL. I wanted to exactly
>>> capture that condition there. I think the bitwise operator is more
>>> user-friendly in emphasizing the point that it is indeed an "either a
>>> or b, not both" condition.
>>
>> I see.  In that case, since this patch adds the new condition, a note
>> about it in the comment just above would be good, because the situation
>> you describe here seems to arise only during update-tuple-routing, IIUC.
> 
> Done. Please check.

Looks fine.

>> + * 'update_rri' has the UPDATE per-subplan result rels. These are re-used
>> + *      instead of allocating new ones while generating the array of all leaf
>> + *      partition result rels.
>>
>> Instead of:
>>
>> "These are re-used instead of allocating new ones while generating the
>> array of all leaf partition result rels."
>>
>> how about:
>>
>> "There is no need to allocate a new ResultRellInfo entry for leaf
>> partitions for which one already exists in this array"
> 
> Ok. I have made it like this :
> 
> + * 'update_rri' contains the UPDATE per-subplan result rels. For the
> output param
> + *             'partitions', we don't allocate new ResultRelInfo objects for
> + *             leaf partitions for which they are already available
> in 'update_rri'.

Sure.

>>> It looks like the interface does not much simplify, and above that, we
>>> have more number of lines in that function. Also, the caller anyway
>>> has to be aware whether the map_index is the index into the leaf
>>> partitions or the update subplans. So it is not like the caller does
>>> not have to be aware about whether the mapping should be
>>> mt_persubplan_childparent_maps or mt_perleaf_parentchild_maps.
>>
>> Hmm, I think we should try to make it so that the caller doesn't have to
>> be aware of that.  And by caller I guess you mean ExecInsert(), which
>> should not be a place, IMHO, where to try to introduce a lot of new logic
>> specific to update tuple routing.
> 
> I think, for ExecInsert() since we have already given the job of
> routing the tuple from root partitioned table to a partition, it makes
> sense to give the function the additional job of routing the tuple
> from any partition to any partition. ExecInsert() can be looked at as
> doing this job : "insert a tuple into the right partition; the
> original tuple can belong to any partition"

Yeah, that's one way of looking at that.  But I think ExecInsert() as it
is today thinks it's got a *new* tuple and that's it.  I think the newly
introduced code in it to find out that it is not so (that the tuple
actually comes from some other partition), that it's really the
update-turned-into-delete-plus-insert, and then switch to the root
partitioned table's ResultRelInfo, etc. really belongs outside of it.
Maybe in its caller, which is ExecUpdate().  I mean why not add this code
to the block in ExecUpdate() that handles update-row-movement.

Just before calling ExecInsert() to do the re-routing seems like a good
place to do all that.  For example, try the attached incremental patch
that applies on top of yours.  I can see after applying it that diffs to
ExecInsert() are now just some refactoring ones and there are no
significant additions making it look like supporting update-row-movement
required substantial changes to how ExecInsert() itself works.

> After doing the changes for the int[] array map in the previous patch
> version, I still feel that ConvertPartitionTupleSlot() should be
> retained. We save some repeated lines of code saved.

OK.

>> You may be right, but I see for WithCheckOptions initialization
>> specifically that the non-tuple-routing code passes the actual sub-plan
>> when initializing the WCO for a given result rel.
> 
> Yes that's true. The problem with WithCheckOptions for newly allocated
> partition result rels is : we can't use a subplan for the parent
> parameter because there is no subplan for it. But I will still think
> on it a bit more (TODO).

Alright.

>>> I think you are suggesting we do it like how it's done in
>>> is_partition_attr(). Can you please let me know other places we do
>>> this same way ? I couldn't find.
>>
>> OK, not as many as I thought there would be, but there are following
>> beside is_partition_attrs():
>>
>> partition.c: get_range_nulltest()
>> partition.c: get_qual_for_range()
>> relcache.c: RelationBuildPartitionKey()
>>
> 
> Ok, I think I will first address Robert's suggestion of re-using
> is_partition_attrs() for pull_child_partition_columns(). If I do that,
> this discussion won't be applicable, so I am deferring this one.
> (TODO)

Sure, no problem.

Thanks,
Amit

Attachment

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: default range partition and constraint exclusion
Next
From: Antonin Houska
Date:
Subject: Re: [HACKERS] [PATCH] Incremental sort