Re: [HACKERS] UPDATE of partition key - Mailing list pgsql-hackers

From Amit Khandekar
Subject Re: [HACKERS] UPDATE of partition key
Date
Msg-id CAJ3gD9fM_EQdWXcz0iHRPS=0k_v902tVke8LYDksRX2TTV7PdQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] UPDATE of partition key  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On 9 November 2017 at 09:27, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Wed, Nov 8, 2017 at 5:57 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>> On 8 November 2017 at 07:55, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>> On Tue, Nov 7, 2017 at 8:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> The changes to trigger.c still make me super-nervous.  Hey THOMAS
>>>> MUNRO, any chance you could review that part?
>
> At first, it seemed quite strange to me that row triggers and
> statement triggers fire different events for the same modification.
> Row triggers see DELETE +  INSERT (necessarily because different
> tables are involved), but this fact is hidden from the target table's
> statement triggers.
>
> The alternative would be for all triggers to see consistent events and
> transitions.  Instead of having your special case code in ExecInsert
> and ExecDelete that creates the two halves of a 'synthetic' UPDATE for
> the transition tables, you'd just let the existing ExecInsert and
> ExecDelete code do its thing, and you'd need a flag to record that you
> should also fire INSERT/DELETE after statement triggers if any rows
> moved.

Yeah I also had thought about that. But thought that change was too
invasive. For e.g. letting ExecARInsertTriggers() do the transition
capture even when transition_capture->tcs_update_new_table is set.

I was also thinking of having a separate function to *only* add the
transition table rows. So in ExecInsert, call this one instead of
ExecARUpdateTriggers(). But realized that the existing
ExecARUpdateTriggers() looks like a better, robust interface with all
its checks. Just that calling ExecARUpdateTriggers() sounds like we
are also firing trigger; we are not firing any trigger or saving any
event, we are just adding the transition row.

>
> After sleeping on this question, I am coming around to the view that
> the way you have it is right.  The distinction isn't really between
> row triggers and statement triggers, it's between triggers at
> different levels in the hierarchy.  It just so happens that we
> currently only fire target table statement triggers and leaf table row
> triggers.

Yes. And rows are there only in leaf partitions. So we have to
simulate as though the target table has these rows. Like you
mentioned, the user has to get the impression of a normal table. So we
have to do something extra to capture the rows.

> Future development ideas that seem consistent with your choice:
>
> 1.  If we ever allow row triggers with transition tables on child
> tables, then I think *their* transition tables should certainly see
> the deletes and inserts, otherwise OLD TABLE and NEW TABLE would be
> inconsistent with the OLD and NEW variables in a single trigger
> invocation.  (These were prohibited mainly due to lack of time and
> (AFAIK) limited usefulness; I think they would need probably need
> their own separate tuplestores, or possibly some kind of filtering.)

As we know, for row triggers on leaf partitions, we treat them as
normal tables, so a trigger written on a leaf partition sees only the
local changes. The trigger is unaware whether the insert is part of an
UPDATE row movement. Similarly, the transition table referenced by
that row trigger function should see only the NEW table, not the old
table.

>
> 2.  If we ever allow row triggers on partitioned tables (ie that fire
> when its children are modified), then I think their UPDATE trigger
> should probably fire when a row moves between any two (grand-)*child
> tables, just as you have it for target table statement triggers.

Yes I agree.

> It doesn't matter that the view from parent tables' triggers is
> inconsistent with the view from leaf table triggers: it's a feature
> that we 'hide' partitioning from the user to the extent we can so that
> you can treat the partitioned table just like a table.
>
> Any other views?

I think because because there is no provision for a row trigger on
partitioned table, users who want to have a common trigger on a
partition subtree, has no choice but to create the same trigger
individually on the leaf partitions. And that's the reason we cannot
handle an update row movement with triggers without anomalies.

Thanks
-Amit Khandekar


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [HACKERS] path toward faster partition pruning
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Aggregates push-down to partitions