Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
Date
Msg-id 20240902230147.0d3958bc@karst
Whole thread Raw
In response to Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
List pgsql-hackers
Hi,

On Tue, 20 Aug 2024 23:09:27 -0400
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:
>
> > I'm back on this issue as well. I start poking at this patch to review it,
> > test it, challenge it and then report here.
> >
> > I'll try to check if some other issues might have lost/forgot on they way as
> > well.
>
> Thanks, much appreciated, looking forward to your feedback.

Sorry, it took me a while to come back to you on this topic. It has been hard to
untangle subjects, reproductions and patch…

There's three distinct issues/thread:

* Constraint & trigger catalog cleanup [1] (this thread)
* FK broken after DETACH [2]
* Maintenance consideration about self referencing FK between partitions [3]

0. Splitting in two commits

  Your patch addresses two bugs:

    * one for the constraint & trigger catalog cleanup;
    * one for the FK broken after DETACH.

  These issues are unrelated, therefore I am wondering if it would be better
  to split their resolution in two different patches.

  Last year, I reported them in two different threads [1][2]. The first with
  implementation consideration, the second with a demo/proposal/draft fix.

  Unfortunately, this discussion about the first bug slipped to the second one
  when Tender stumbled on this bug as well and reported it. But, both bugs can
  be triggered independently, and have distinct fixes.

  Finally, splitting the patch might help setting finer patch co-authoring. I
  know my patch for [2] was a draft and somewhat trivial, but I spend a fair
  amount of time to report, then produce a draft patch, so I was wondering if
  it would be candidate to a co-author flag on this (small, humble and
  refactored by you) patch?

  I'm definitely not involved (yet) in the second part though.

1. Constraint & trigger catalog cleanup [1]

  I have been focusing on the current master branch and haven't taken into
  consideration backpatching related issues yet.

  When I first studied this bug and reported it, I held on writing a patch
  because it seemed it would duplicate some existing code. I wrote:

  > I poked around DetachPartitionFinalize() to try to find a way to fix this,
  > but it looks like it would duplicate a bunch of code from other code path
  > (eg. from CloneFkReferenced).

  My proposal was to clean everything related to the old FK and use some
  existing code path to create a fresh and cleaner one. This requires some
  refactoring in existing code, but we would win a common path of code between
  create/attach/detach, a cleaner catalog and easier code maintenance.

  I've finally been able to write a PoC that implement this by calling
  addFkRecurseReferenced() from DetachPartitionFinalize(). I can't join
  it here because it is currently an ugly draft and I still have some work
  to do. But I would really like to have a little more time (one or two days) to
  explore this avenue further before you commit yours, if you don't mind? Or
  maybe you already have considered this avenue and rejected it?


2. FK broken after DETACH [2]

  Comparing your patch to my draft from [2], I just have a question about the
  refactoring.

  Fencing the constraint/trigger removal inside a conditional
  RELKIND_PARTITIONED_TABLE block of code was obvious. It avoids some useless
  catalog scan compared to my draft patch.

  Also, the "contype == CONSTRAINT_FOREIGN" I had sounds safe to remove.

  However, is it clean/light enough to add the "conparentid == fk->conoid" in
  the scan key as I did? I'm not sure it saves anything else but the small
  conditional block you inserted inside the loop, but I wonder if there's a
  serious concern about this anyway?

  Last, considering the tests, I think we should add some rows in the tables,
  to make sure the FK is correctly enforced after DETACH. Something like:

    CREATE SCHEMA fkpart12
      CREATE TABLE fk_p ( id bigint PRIMARY KEY ) PARTITION BY list (id)
      CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
      CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
      CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
      CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
      CREATE TABLE fk_r   ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
         FOREIGN KEY (p_id) REFERENCES fk_p (id)
      ) PARTITION BY list (id);
    SET search_path TO fkpart12;

    INSERT INTO fk_p VALUES (1);

    ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);

    ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
    \d fk_r_1

    INSERT INTO fk_r VALUES (1,1);

    ALTER TABLE fk_r DETACH PARTITION fk_r_1;
    \d fk_r_1

    INSERT INTO c_1 VALUES (2,2); -- fails as EXPECTED
    DELETE FROM p; -- should fails but was buggy

    ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
    \d fk_r_1


3. Self referencing FK between partitions [3]

  You added to your commit message:

    verify: 20230707175859.17c91538@karst

  I'm not sure what the "verify" flag means. Unfortunately, your patch doesn't
  help on this topic.

  This bug really needs more discussion and design consideration. I have
  thought about this problem and haven't found any solution that don't involve
  breaking the current core behavior. It really looks like an impossible bug to
  fix without dropping the constraint itself. On both side. Maybe the only sane
  behavior would be to forbid detaching the partition if it would break the
  constraint.

  But let's discuss this on the related thread, should we?


Thank you for reading me all the way down to the bottom!

Regards,

[1] https://www.postgresql.org/message-id/20230705233028.2f554f73%40karst
[2] https://www.postgresql.org/message-id/20230420144344.40744130%40karst
[3] https://www.postgresql.org/message-id/20230707175859.17c91538%40karst




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: thread-safety: strerror_r()
Next
From: Andrew Dunstan
Date:
Subject: Re: Improving tracking/processing of buildfarm test failures