Re: a misbehavior of partition row movement (?) - Mailing list pgsql-hackers

From Amit Langote
Subject Re: a misbehavior of partition row movement (?)
Date
Msg-id CA+HiwqFVguJ=ENM6MFhosZrpwEthfNsjVzZvmeiPCe62+Vk+1A@mail.gmail.com
Whole thread Raw
In response to Re: a misbehavior of partition row movement (?)  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: a misbehavior of partition row movement (?)  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers
On Wed, Jan 20, 2021 at 7:03 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Wed, Jan 20, 2021 at 4:13 PM Peter Eisentraut
> > Could you summarize here what you are trying to do with respect to what
> > was decided before?  I'm a bit confused, looking through the patches you
> > have posted.  The first patch you posted hard-coded FK trigger OIDs
> > specifically, other patches talk about foreign key triggers in general
> > or special case internal triggers or talk about all triggers.
>
> The original problem statement is this: the way we generally fire
> row-level triggers of a partitioned table can lead to some unexpected
> behaviors of the foreign keys pointing to that partitioned table
> during its cross-partition updates.
>
> Let's start with an example that shows how triggers are fired during a
> cross-partition update:
>
> create table p (a numeric primary key) partition by list (a);
> create table p1 partition of p for values in (1);
> create table p2 partition of p for values in (2);
> create or replace function report_trig_details() returns trigger as $$
> begin raise notice '% % on %', tg_when, tg_op, tg_relname; if tg_op =
> 'DELETE' then return old; end if; return new; end; $$ language
> plpgsql;
> create trigger trig1 before update on p for each row execute function
> report_trig_details();
> create trigger trig2 after update on p for each row execute function
> report_trig_details();
> create trigger trig3 before delete on p for each row execute function
> report_trig_details();
> create trigger trig4 after delete on p for each row execute function
> report_trig_details();
> create trigger trig5 before insert on p for each row execute function
> report_trig_details();
> create trigger trig6 after insert on p for each row execute function
> report_trig_details();
>
> insert into p values (1);
> update p set a = 2;
> NOTICE:  BEFORE UPDATE on p1
> NOTICE:  BEFORE DELETE on p1
> NOTICE:  BEFORE INSERT on p2
> NOTICE:  AFTER DELETE on p1
> NOTICE:  AFTER INSERT on p2
> UPDATE 1
>
> (AR update triggers are not fired.)
>
> For contrast, here is an intra-partition update:
>
> update p set a = a;
> NOTICE:  BEFORE UPDATE on p2
> NOTICE:  AFTER UPDATE on p2
> UPDATE 1
>
> Now, the trigger machinery makes no distinction between user-defined
> and internal triggers, which has implications for the foreign key
> enforcing triggers on partitions.  Consider the following example:
>
> create table q (a bigint references p);
> insert into q values (2);
> update p set a = 1;
> NOTICE:  BEFORE UPDATE on p2
> NOTICE:  BEFORE DELETE on p2
> NOTICE:  BEFORE INSERT on p1
> ERROR:  update or delete on table "p2" violates foreign key constraint
> "q_a_fkey2" on table "q"
> DETAIL:  Key (a)=(2) is still referenced from table "q".
>
> So the RI delete trigger (NOT update) on p2 prevents the DELETE that
> occurs as part of the row movement.  One might make the updates
> cascade and expect that to prevent the error:
>
> drop table q;
> create table q (a bigint references p on update cascade);
> insert into q values (2);
> update p set a = 1;
> NOTICE:  BEFORE UPDATE on p2
> NOTICE:  BEFORE DELETE on p2
> NOTICE:  BEFORE INSERT on p1
> ERROR:  update or delete on table "p2" violates foreign key constraint
> "q_a_fkey2" on table "q"
> DETAIL:  Key (a)=(2) is still referenced from table "q".
>
> No luck, because again it's the RI delete trigger on p2 that gets
> fired.  If you make deletes cascade too, an even worse thing happens:
>
> drop table q;
> create table q (a bigint references p on update cascade on delete cascade);
> insert into q values (2);
> update p set a = 1;
> NOTICE:  BEFORE UPDATE on p2
> NOTICE:  BEFORE DELETE on p2
> NOTICE:  BEFORE INSERT on p1
> NOTICE:  AFTER DELETE on p2
> NOTICE:  AFTER INSERT on p1
> UPDATE 1
> select * from q;
>  a
> ---
> (0 rows)
>
> The RI delete trigger deleted 2 from q, whereas the expected result
> would've been for q to be updated to change 2 to 1.
>
> This shows that the way we've made these triggers behave in general
> can cause some unintended behaviors for foreign keys during
> cross-partition updates.  I started this thread to do something about
> that and sent a patch to prevent cross-partition updates at all when
> there are foreign keys pointing to it.  As others pointed out, that's
> not a great long-term solution to the problem, but that's what we may
> have to do in the back-branches if anything at all.
>
> So I wrote another patch targeting the dev branch to make the
> cross-partition updates work while producing a sane foreign key
> behavior.  The idea of the patch is to tweak the firing of AFTER
> triggers such that unintended RI triggers don't get fired, that is,
> those corresponding to DELETE and INSERT occurring internally as part
> of a cross-partition update.  Instead we now fire the AFTER UPDATE
> triggers, passing the root table as the target result relation (not
> the source partition because the new row doesn't belong to it).  This
> results in the same foreign key behavior as when no partitioning is
> involved at all.
>
> Then, Arne came along and suggested that we do this kind of firing for
> *all* triggers, not just the internal RI triggers, or at least that's
> what I understood Arne as saying.  That however would be changing the
> original design of cross-partition updates and will change the
> documented user-visible trigger behavior.  Changing this for internal
> triggers like the patch does changes no user-visible behavior, AFAIK,
> other than fixing the foreign key annoyance.  So I said if we do want
> to go the way that Arne is suggesting, it should be its own discussion
> and that's that.
>
> Sorry for a long "summary", but I hope it helps clarify things somewhat.

Here is an updated version of the patch with some cosmetic changes
from the previous version.  I moved the code being added to
AfterTriggerSaveEvent() and ExecUpdate() into separate subroutines to
improve readability, hopefully.

-- 
Amit Langote
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Stronger safeguard for archive recovery not to miss data
Next
From: Keisuke Kuroda
Date:
Subject: Re: simplifying foreign key/RI checks