Thread: [BUG] Fix DETACH with FK pointing to a partitioned table fails

[BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
Hi,

(patch proposal below).

Consider a table with a FK pointing to a partitioned table.

  CREATE TABLE p ( id bigint PRIMARY KEY )
    PARTITION BY list (id);
  CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);

  CREATE TABLE r_1 (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
  );

Now, attach this table "refg_1" as partition of another one having the same FK:

  CREATE TABLE r (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
  ) PARTITION BY list (id);

  ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);

The old sub-FKs (below 18289) created in this table to enforce the action
triggers on referenced partitions are not deleted when the table becomes a
partition. Because of this, we have additional and useless triggers on the
referenced partitions and we can not DETACH this partition on the referencing
side anymore:

  => ALTER TABLE r DETACH PARTITION r_1;
  ERROR:  could not find ON INSERT check triggers of foreign key
          constraint 18289

  => SELECT c.oid, conparentid,
       conrelid::regclass,
       confrelid::regclass,
       t.tgfoid::regproc
     FROM pg_constraint c
     JOIN pg_trigger t ON t.tgconstraint = c.oid
     WHERE confrelid::regclass = 'p_1'::regclass;
    oid  │ conparentid │ conrelid │ confrelid │         tgfoid
  ───────┼─────────────┼──────────┼───────────┼────────────────────────
   18289 │       18286 │ r_1      │ p_1       │ "RI_FKey_noaction_del"
   18289 │       18286 │ r_1      │ p_1       │ "RI_FKey_noaction_upd"
   18302 │       18299 │ r        │ p_1       │ "RI_FKey_noaction_del"
   18302 │       18299 │ r        │ p_1       │ "RI_FKey_noaction_upd"
  (4 rows)

The legitimate constraint and triggers here are 18302. The old sub-FK
18289 having 18286 as parent should have gone during the ATTACH PARTITION.

Please, find in attachment a patch dropping old "sub-FK" during the ATTACH
PARTITION command and adding a regression test about it. At the very least, it
help understanding the problem and sketch a possible solution.

Regards,

Attachment

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2023-Jul-05, Jehan-Guillaume de Rorthais wrote:

>   ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1); 
> 
> The old sub-FKs (below 18289) created in this table to enforce the action
> triggers on referenced partitions are not deleted when the table becomes a
> partition. Because of this, we have additional and useless triggers on the
> referenced partitions and we can not DETACH this partition on the referencing
> side anymore:

Oh, hm, interesting.  Thanks for the report and patch.  I found a couple
of minor issues with it (most serious one: nkeys should be 3, not 2;
also sysscan should use conrelid index), but I'll try and complete it so
that it's ready for 2023-08-10's releases.

Regards

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
I think  old "sub-FK" should not be dropped, that will be violates foreign key constraint. For example :
postgres=# insert into r values(1,1);
INSERT 0 1
postgres=# ALTER TABLE r DETACH PARTITION r_1;
ALTER TABLE
postgres=# delete from p_1 where id = 1;
DELETE 1
postgres=# select * from r_1;
 id | p_id
----+------
  1 |    1
(1 row)

If I run above SQLs on pg12.12, it will report error below:
postgres=# delete from p_1 where id = 1;
ERROR:  update or delete on table "p_1" violates foreign key constraint "r_1_p_id_fkey1" on table "r_1"
DETAIL:  Key (id)=(1) is still referenced from table "r_1".

Alvaro Herrera <alvherre@alvh.no-ip.org> 于2023年7月31日周一 20:58写道:
On 2023-Jul-05, Jehan-Guillaume de Rorthais wrote:

>   ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
>
> The old sub-FKs (below 18289) created in this table to enforce the action
> triggers on referenced partitions are not deleted when the table becomes a
> partition. Because of this, we have additional and useless triggers on the
> referenced partitions and we can not DETACH this partition on the referencing
> side anymore:

Oh, hm, interesting.  Thanks for the report and patch.  I found a couple
of minor issues with it (most serious one: nkeys should be 3, not 2;
also sysscan should use conrelid index), but I'll try and complete it so
that it's ready for 2023-08-10's releases.

Regards

--
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/


Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2023-Aug-03, tender wang wrote:

> I think  old "sub-FK" should not be dropped, that will be violates foreign
> key constraint.

Yeah, I've been playing more with the patch and it is definitely not
doing the right things.  Just eyeballing the contents of pg_trigger and
pg_constraint for partitions added by ALTER...ATTACH shows that the
catalog contents are inconsistent with those added by CREATE TABLE
PARTITION OF.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
I think the code to determine that fk of a partition is inherited or not is not enough. 
For example, in this case, foreign key r_1_p_id_fkey1  is not inherited from parent.

If conform->conparentid(in DetachPartitionFinalize func) is valid, we should recheck confrelid(pg_constraint) field.

I try to fix this problem in the attached patch.
Any thoughts.

Alvaro Herrera <alvherre@alvh.no-ip.org> 于2023年8月3日周四 17:02写道:
On 2023-Aug-03, tender wang wrote:

> I think  old "sub-FK" should not be dropped, that will be violates foreign
> key constraint.

Yeah, I've been playing more with the patch and it is definitely not
doing the right things.  Just eyeballing the contents of pg_trigger and
pg_constraint for partitions added by ALTER...ATTACH shows that the
catalog contents are inconsistent with those added by CREATE TABLE
PARTITION OF.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Attachment

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
Oversight the DetachPartitionFinalize(), I found another bug below:

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE
postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE
postgres=# CREATE TABLE r_1 (
postgres(#     id   bigint PRIMARY KEY,
postgres(#     p_id bigint NOT NULL
postgres(#   );
CREATE TABLE
postgres=#  CREATE TABLE r (
postgres(#     id   bigint PRIMARY KEY,
postgres(#     p_id bigint NOT NULL,
postgres(#     FOREIGN KEY (p_id) REFERENCES p (id)
postgres(#   ) PARTITION BY list (id);
CREATE TABLE
postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE
postgres=# ALTER TABLE r DETACH PARTITION r_1;
ALTER TABLE
postgres=# insert into r_1 values(1,1);
ERROR:  insert or update on table "r_1" violates foreign key constraint "r_p_id_fkey"
DETAIL:  Key (p_id)=(1) is not present in table "p".

After detach operation, r_1 is normal relation and the inherited foreign key 'r_p_id_fkey' should be removed.
 

tender wang <tndrwang@gmail.com> 于2023年8月3日周四 17:34写道:
I think the code to determine that fk of a partition is inherited or not is not enough. 
For example, in this case, foreign key r_1_p_id_fkey1  is not inherited from parent.

If conform->conparentid(in DetachPartitionFinalize func) is valid, we should recheck confrelid(pg_constraint) field.

I try to fix this problem in the attached patch.
Any thoughts.

Alvaro Herrera <alvherre@alvh.no-ip.org> 于2023年8月3日周四 17:02写道:
On 2023-Aug-03, tender wang wrote:

> I think  old "sub-FK" should not be dropped, that will be violates foreign
> key constraint.

Yeah, I've been playing more with the patch and it is definitely not
doing the right things.  Just eyeballing the contents of pg_trigger and
pg_constraint for partitions added by ALTER...ATTACH shows that the
catalog contents are inconsistent with those added by CREATE TABLE
PARTITION OF.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
Oversight the DetachPartitionFinalize() again, I found the root cause why 'r_p_id_fkey' wat not removed.

DetachPartitionFinalize() call the GetParentedForeignKeyRefs() func to get tuple from pg_constraint that will be delete but failed.
 according to the comments, the GetParentedForeignKeyRefs() func get the tuple reference me not I reference others.

I try to fix this bug :
i. ConstraintSetParentConstraint() should not be called in DetachPartitionFinalize(), because after conparentid was set to 0, 
we can not find inherited foreign keys.
ii. create another function like GetParentedForeignKeyRefs(), but the ScanKey should be conrelid field not confrelid.

I quickly test on my above solution in my env, can be solve above issue. 

tender wang <tndrwang@gmail.com> 于2023年8月4日周五 17:04写道:
Oversight the DetachPartitionFinalize(), I found another bug below:

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE
postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE
postgres=# CREATE TABLE r_1 (
postgres(#     id   bigint PRIMARY KEY,
postgres(#     p_id bigint NOT NULL
postgres(#   );
CREATE TABLE
postgres=#  CREATE TABLE r (
postgres(#     id   bigint PRIMARY KEY,
postgres(#     p_id bigint NOT NULL,
postgres(#     FOREIGN KEY (p_id) REFERENCES p (id)
postgres(#   ) PARTITION BY list (id);
CREATE TABLE
postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE
postgres=# ALTER TABLE r DETACH PARTITION r_1;
ALTER TABLE
postgres=# insert into r_1 values(1,1);
ERROR:  insert or update on table "r_1" violates foreign key constraint "r_p_id_fkey"
DETAIL:  Key (p_id)=(1) is not present in table "p".

After detach operation, r_1 is normal relation and the inherited foreign key 'r_p_id_fkey' should be removed.
 

tender wang <tndrwang@gmail.com> 于2023年8月3日周四 17:34写道:
I think the code to determine that fk of a partition is inherited or not is not enough. 
For example, in this case, foreign key r_1_p_id_fkey1  is not inherited from parent.

If conform->conparentid(in DetachPartitionFinalize func) is valid, we should recheck confrelid(pg_constraint) field.

I try to fix this problem in the attached patch.
Any thoughts.

Alvaro Herrera <alvherre@alvh.no-ip.org> 于2023年8月3日周四 17:02写道:
On 2023-Aug-03, tender wang wrote:

> I think  old "sub-FK" should not be dropped, that will be violates foreign
> key constraint.

Yeah, I've been playing more with the patch and it is definitely not
doing the right things.  Just eyeballing the contents of pg_trigger and
pg_constraint for partitions added by ALTER...ATTACH shows that the
catalog contents are inconsistent with those added by CREATE TABLE
PARTITION OF.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
The foreign key still works even though partition was detached. Is this behavior expected?
I can't find the answer in the document. If it is expected behavior , please ignore the bug I reported a few days ago. 

tender wang <tndrwang@gmail.com> 于2023年8月4日周五 17:04写道:
Oversight the DetachPartitionFinalize(), I found another bug below:

postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE
postgres=# CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE
postgres=# CREATE TABLE r_1 (
postgres(#     id   bigint PRIMARY KEY,
postgres(#     p_id bigint NOT NULL
postgres(#   );
CREATE TABLE
postgres=#  CREATE TABLE r (
postgres(#     id   bigint PRIMARY KEY,
postgres(#     p_id bigint NOT NULL,
postgres(#     FOREIGN KEY (p_id) REFERENCES p (id)
postgres(#   ) PARTITION BY list (id);
CREATE TABLE
postgres=# ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE
postgres=# ALTER TABLE r DETACH PARTITION r_1;
ALTER TABLE
postgres=# insert into r_1 values(1,1);
ERROR:  insert or update on table "r_1" violates foreign key constraint "r_p_id_fkey"
DETAIL:  Key (p_id)=(1) is not present in table "p".

After detach operation, r_1 is normal relation and the inherited foreign key 'r_p_id_fkey' should be removed.
 

tender wang <tndrwang@gmail.com> 于2023年8月3日周四 17:34写道:
I think the code to determine that fk of a partition is inherited or not is not enough. 
For example, in this case, foreign key r_1_p_id_fkey1  is not inherited from parent.

If conform->conparentid(in DetachPartitionFinalize func) is valid, we should recheck confrelid(pg_constraint) field.

I try to fix this problem in the attached patch.
Any thoughts.

Alvaro Herrera <alvherre@alvh.no-ip.org> 于2023年8月3日周四 17:02写道:
On 2023-Aug-03, tender wang wrote:

> I think  old "sub-FK" should not be dropped, that will be violates foreign
> key constraint.

Yeah, I've been playing more with the patch and it is definitely not
doing the right things.  Just eyeballing the contents of pg_trigger and
pg_constraint for partitions added by ALTER...ATTACH shows that the
catalog contents are inconsistent with those added by CREATE TABLE
PARTITION OF.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2023-Aug-07, tender wang wrote:

> The foreign key still works even though partition was detached. Is this
> behavior expected?

Well, there's no reason for it not to, right?  For example, if you
detach a partition and then attach it again, you don't have to scan the
partition on attach, because you know the constraint has remained valid
all along.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
On Thu, 3 Aug 2023 11:02:43 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2023-Aug-03, tender wang wrote:
> 
> > I think  old "sub-FK" should not be dropped, that will be violates foreign
> > key constraint.  
> 
> Yeah, I've been playing more with the patch and it is definitely not
> doing the right things.  Just eyeballing the contents of pg_trigger and
> pg_constraint for partitions added by ALTER...ATTACH shows that the
> catalog contents are inconsistent with those added by CREATE TABLE
> PARTITION OF.

Well, as stated in my orignal message, at the patch helps understanding the
problem and sketch a possible solution. It definitely is not complete.

After DETACHing the table, we surely needs to check everything again and
recreating what is needed to keep the FK consistent.

But should we keep the FK after DETACH? Did you check the two other discussions
related to FK, self-FK & partition? Unfortunately, as Tender experienced, the
more we dig the more we find bugs. Moreover, the second one might seems
unsolvable and deserve a closer look. See:

* FK broken after DETACHing referencing part
  https://www.postgresql.org/message-id/20230420144344.40744130%40karst
* Issue attaching a table to a partitioned table with an  auto-referenced
  foreign key
  https://www.postgresql.org/message-id/20230707175859.17c91538%40karst




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
Hi
   Is there any conclusion to this issue?

Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 于2023年8月10日周四 23:03写道:
On Thu, 3 Aug 2023 11:02:43 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2023-Aug-03, tender wang wrote:
>
> > I think  old "sub-FK" should not be dropped, that will be violates foreign
> > key constraint. 
>
> Yeah, I've been playing more with the patch and it is definitely not
> doing the right things.  Just eyeballing the contents of pg_trigger and
> pg_constraint for partitions added by ALTER...ATTACH shows that the
> catalog contents are inconsistent with those added by CREATE TABLE
> PARTITION OF.

Well, as stated in my orignal message, at the patch helps understanding the
problem and sketch a possible solution. It definitely is not complete.

After DETACHing the table, we surely needs to check everything again and
recreating what is needed to keep the FK consistent.

But should we keep the FK after DETACH? Did you check the two other discussions
related to FK, self-FK & partition? Unfortunately, as Tender experienced, the
more we dig the more we find bugs. Moreover, the second one might seems
unsolvable and deserve a closer look. See:

* FK broken after DETACHing referencing part
  https://www.postgresql.org/message-id/20230420144344.40744130%40karst
* Issue attaching a table to a partitioned table with an  auto-referenced
  foreign key
  https://www.postgresql.org/message-id/20230707175859.17c91538%40karst

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2023-Oct-25, tender wang wrote:

> Hi
>    Is there any conclusion to this issue?

None yet.  I intend to work on this at some point, hopefully soon.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
tender wang
Date:
Hi Alvaro,
I re-analyzed this issue, and here is my analysis process.
step 1: CREATE TABLE p ( id bigint PRIMARY KEY )
    PARTITION BY list (id);
 step2: CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
 step3: CREATE TABLE r_1 (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
  );
After above step 3 operations, we have below catalog tuples:
postgres=# select oid, relname from pg_class where relname = 'p';
oid | relname
-------+---------
16384 | p
(1 row)
postgres=# select oid, relname from pg_class where relname = 'p_1';
oid | relname
-------+---------
16389 | p_1
(1 row)
postgres=# select oid, relname from pg_class where relname = 'r_1';
oid | relname
-------+---------
16394 | r_1
(1 row)
postgres=# select oid, conname,conrelid,conparentid,confrelid from pg_constraint where conrelid = 16394;
oid | conname | conrelid | conparentid | confrelid
-------+-------------------+----------+-------------+-----------
16397 | r_1_p_id_not_null | 16394 | 0 | 0
16399 | r_1_pkey | 16394 | 0 | 0
16400 | r_1_p_id_fkey | 16394 | 0 | 16384
16403 | r_1_p_id_fkey1 | 16394 | 16400 | 16389
(4 rows)
postgres=# select oid, tgrelid, tgparentid, tgconstrrelid,tgconstrindid,tgconstraint from pg_trigger where tgconstraint = 16403;
oid | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
-------+---------+------------+---------------+---------------+--------------
16404 | 16389 | 16401 | 16394 | 16392 | 16403
16405 | 16389 | 16402 | 16394 | 16392 | 16403
(2 rows)
postgres=# select oid, tgrelid, tgparentid, tgconstrrelid,tgconstrindid,tgconstraint from pg_trigger where tgconstraint = 16400;
oid | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
-------+---------+------------+---------------+---------------+--------------
16401 | 16384 | 0 | 16394 | 16387 | 16400
16402 | 16384 | 0 | 16394 | 16387 | 16400
16406 | 16394 | 0 | 16384 | 16387 | 16400
16407 | 16394 | 0 | 16384 | 16387 | 16400
(4 rows)
Because table p is partitioned table and it has one child table p_1. So when r_1 add foreign key constraint, according to addFkRecurseReferenced(),
each partition should have one pg_constraint row(e.g. r_1_p_id_fkey1).
After called addFkRecurseReferenced() in ATAddForeignKeyConstraint(), addFkRecurseReferencing() will be called, in which
it will add INSERT check trigger and UPDATE check trigger for r_1_p_id_fkey but not for r_1_p_id_fkey1.
So when detach r_1 from r, according to DetachPartitionFinalize(), the inherited fks should unlink relationship from parent.
The created INSERT and UPDATE check triggers should unlink relationship link fks. But just like I said above, the r_1_p_id_fkey1
actually doesn't have INSERT check trigger.

I slightly modified the previous patch,but I didn't add test case, because I found another issue.
After done ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
I run the oidjoins.sql and has warnings as belwo:
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING:  FK VIOLATION IN pg_trigger({tgparentid}): ("(0,3)",16401)
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING:  FK VIOLATION IN pg_trigger({tgparentid}): ("(0,4)",16402)
postgres=# select oid, tgrelid, tgparentid, tgconstrrelid,tgconstrindid,tgconstraint from pg_trigger where oid >= 16384;
  oid  | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
-------+---------+------------+---------------+---------------+--------------
 16404 |   16389 |      16401 |         16394 |         16392 |        16403
 16405 |   16389 |      16402 |         16394 |         16392 |        16403
 16415 |   16384 |          0 |         16408 |         16387 |        16414
 16416 |   16384 |          0 |         16408 |         16387 |        16414
 16418 |   16389 |      16415 |         16408 |         16392 |        16417
 16419 |   16389 |      16416 |         16408 |         16392 |        16417
 16420 |   16408 |          0 |         16384 |         16387 |        16414
 16421 |   16408 |          0 |         16384 |         16387 |        16414
 16406 |   16394 |      16420 |         16384 |         16387 |        16400
 16407 |   16394 |      16421 |         16384 |         16387 |        16400
(10 rows)
oid = 16401 and oid = 16402 has been deleted.
The two trigger tuples are deleted in tryAttachPartitionForeignKey called by CloneFkReferencing.
/*
* Looks good!  Attach this constraint.  The action triggers in the new
* partition become redundant -- the parent table already has equivalent
* ones, and those will be able to reach the partition.  Remove the ones
* in the partition.  We identify them because they have our constraint
* OID, as well as being on the referenced rel.
*/
The attached patch can't fix above issue. I'm not sure about the impact of this issue. Maybe redundant triggers no need removed.

But it surely make oidjoings.sql fail if I add test case into v2 patch, so I don't add test case in v2 patch.
No test case is not good patch. I just share my idea about this issue. Hope to get your reply.




Alvaro Herrera <alvherre@alvh.no-ip.org> 于2023年10月25日周三 20:13写道:
On 2023-Oct-25, tender wang wrote:

> Hi
>    Is there any conclusion to this issue?

None yet.  I intend to work on this at some point, hopefully soon.

--
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Attachment
Hi Alvaro,

Recently, Alexander reported the same issue on [1]. And before that, another same issue was reported on [2].
So I try to re-work those  issues.  In my last email on this thread, I said that 
"
I slightly modified the previous patch,but I didn't add test case, because I found another issue.
After done ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
I run the oidjoins.sql and has warnings as belwo:
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING:  FK VIOLATION IN pg_trigger({tgparentid}): ("(0,3)",16401)
psql:/tender/postgres/src/test/regress/sql/oidjoins.sql:49: WARNING:  FK VIOLATION IN pg_trigger({tgparentid}): ("(0,4)",16402)
"

And I gave the explanation:
"
The two trigger tuples are deleted in tryAttachPartitionForeignKey called by CloneFkReferencing.
/*
* Looks good!  Attach this constraint.  The action triggers in the new
* partition become redundant -- the parent table already has equivalent
* ones, and those will be able to reach the partition.  Remove the ones
* in the partition.  We identify them because they have our constraint
* OID, as well as being on the referenced rel.
*/
"
I try to fix above fk violation. I have two ideas.
i. Do not remove redundant, but when detaching parittion, the action trigger on referenced side will be create again.
I have consider about this situation.

ii. We still remove redundant, and the remove the child action trigger, too.  If we do this way. 
Should we create action trigger recursively on referced side when detaching partition.

I can't decide which one is better. And I'm not sure that keep this FK VIOLATION will cause some problem.
I rebase and send v3 patch, which only fix NOT FOUND INSERT CHECK TRIGGER.



--
Tender Wang
Attachment

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
Hello,

I think the fix for the check triggers should be as the attached.  Very
close to what you did, but you were skipping some operations that needed
to be kept.  AFAICS this patch works correctly for the posted cases.

I haven't looked at the action triggers yet; I think we need to create
one trigger for each partition of the referenced side, so we need to
loop instead of doing a single one.



I find this pair of queries useful; they show which constraints exist
and which triggers belong to each.  We need to make the constraints and
triggers match after a detach right as things would be if the
just-detached partition were an individual table having the same foreign
key.


-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"People get annoyed when you try to debug them."  (Larry Wall)

Attachment


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年7月19日周五 21:18写道:
Hello,

I think the fix for the check triggers should be as the attached.  Very
close to what you did, but you were skipping some operations that needed
to be kept.  AFAICS this patch works correctly for the posted cases.

After applying the attached, the r_1_p_id_fkey1 will have redundant action
triggers, as below:
postgres=# select oid, conname, contype, conrelid, conindid,conparentid, confrelid,conislocal,coninhcount, connoinherit from pg_constraint where oid = 16402;
  oid  |    conname     | contype | conrelid | conindid | conparentid | confrelid | conislocal | coninhcount | connoinherit
-------+----------------+---------+----------+----------+-------------+-----------+------------+-------------+--------------
 16402 | r_1_p_id_fkey1 | f       |    16394 |    16392 |           0 |     16389 | t          |           0 | f
(1 row)

postgres=# select oid, tgrelid, tgparentid, tgconstrrelid, tgconstrindid, tgconstraint from pg_trigger where tgconstraint = 16402;
  oid  | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
-------+---------+------------+---------------+---------------+--------------
 16403 |   16389 |      16400 |         16394 |         16392 |        16402
 16404 |   16389 |      16401 |         16394 |         16392 |        16402
 16422 |   16389 |          0 |         16394 |         16392 |        16402
 16423 |   16389 |          0 |         16394 |         16392 |        16402
(4 rows)


--
Tender Wang


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年7月19日周五 21:18写道:

I find this pair of queries useful; they show which constraints exist
and which triggers belong to each.  We need to make the constraints and
triggers match after a detach right as things would be if the
just-detached partition were an individual table having the same foreign
key.

I don't find the useful queries in your last email. Can you provide them.
Thanks. 


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Junwang Zhao
Date:
On Mon, Jul 22, 2024 at 1:52 PM Tender Wang <tndrwang@gmail.com> wrote:
>
>
>
> Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年7月19日周五 21:18写道:
>>
>> Hello,
>>
>> I think the fix for the check triggers should be as the attached.  Very
>> close to what you did, but you were skipping some operations that needed
>> to be kept.  AFAICS this patch works correctly for the posted cases.
>
>
> After applying the attached, the r_1_p_id_fkey1 will have redundant action
> triggers, as below:
> postgres=# select oid, conname, contype, conrelid, conindid,conparentid, confrelid,conislocal,coninhcount,
connoinheritfrom pg_constraint where oid = 16402; 
>   oid  |    conname     | contype | conrelid | conindid | conparentid | confrelid | conislocal | coninhcount |
connoinherit
>
-------+----------------+---------+----------+----------+-------------+-----------+------------+-------------+--------------
>  16402 | r_1_p_id_fkey1 | f       |    16394 |    16392 |           0 |     16389 | t          |           0 | f
> (1 row)
>
> postgres=# select oid, tgrelid, tgparentid, tgconstrrelid, tgconstrindid, tgconstraint from pg_trigger where
tgconstraint= 16402; 
>   oid  | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
> -------+---------+------------+---------------+---------------+--------------
>  16403 |   16389 |      16400 |         16394 |         16392 |        16402
>  16404 |   16389 |      16401 |         16394 |         16392 |        16402
>  16422 |   16389 |          0 |         16394 |         16392 |        16402
>  16423 |   16389 |          0 |         16394 |         16392 |        16402
> (4 rows)
>

Yes, seems Alvaro has mentioned that he hasn't looked at the
action triggers, in the attached patch, I add some logic that
first check if there exists action triggers, if yes, just update
their Parent Trigger to InvalidOid.

>
> --
> Tender Wang



--
Regards
Junwang Zhao

Attachment

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Junwang Zhao
Date:
On Fri, Jul 26, 2024 at 2:36 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 1:52 PM Tender Wang <tndrwang@gmail.com> wrote:
> >
> >
> >
> > Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年7月19日周五 21:18写道:
> >>
> >> Hello,
> >>
> >> I think the fix for the check triggers should be as the attached.  Very
> >> close to what you did, but you were skipping some operations that needed
> >> to be kept.  AFAICS this patch works correctly for the posted cases.
> >
> >
> > After applying the attached, the r_1_p_id_fkey1 will have redundant action
> > triggers, as below:
> > postgres=# select oid, conname, contype, conrelid, conindid,conparentid, confrelid,conislocal,coninhcount,
connoinheritfrom pg_constraint where oid = 16402; 
> >   oid  |    conname     | contype | conrelid | conindid | conparentid | confrelid | conislocal | coninhcount |
connoinherit
> >
-------+----------------+---------+----------+----------+-------------+-----------+------------+-------------+--------------
> >  16402 | r_1_p_id_fkey1 | f       |    16394 |    16392 |           0 |     16389 | t          |           0 | f
> > (1 row)
> >
> > postgres=# select oid, tgrelid, tgparentid, tgconstrrelid, tgconstrindid, tgconstraint from pg_trigger where
tgconstraint= 16402; 
> >   oid  | tgrelid | tgparentid | tgconstrrelid | tgconstrindid | tgconstraint
> > -------+---------+------------+---------------+---------------+--------------
> >  16403 |   16389 |      16400 |         16394 |         16392 |        16402
> >  16404 |   16389 |      16401 |         16394 |         16392 |        16402
> >  16422 |   16389 |          0 |         16394 |         16392 |        16402
> >  16423 |   16389 |          0 |         16394 |         16392 |        16402
> > (4 rows)
> >
>
> Yes, seems Alvaro has mentioned that he hasn't looked at the
> action triggers, in the attached patch, I add some logic that
> first check if there exists action triggers, if yes, just update
> their Parent Trigger to InvalidOid.
>
> >
> > --
> > Tender Wang
>
>
>
> --
> Regards
> Junwang Zhao

There is a bug report[0] Tender comments might be the same
issue as this one, but I tried Alvaro's and mine patch, neither
could solve that problem, I did not tried Tender's earlier patch
thought. I post the test script below in case you are interested.

CREATE TABLE t1 (a int, PRIMARY KEY (a));
CREATE TABLE t (a int, PRIMARY KEY (a), FOREIGN KEY (a) REFERENCES t1)
PARTITION BY LIST (a);
ALTER TABLE t ATTACH PARTITION t1 FOR VALUES IN (1);
ALTER TABLE t DETACH PARTITION t1;
ALTER TABLE t ATTACH PARTITION t1 FOR VALUES IN (1);


[0] https://www.postgresql.org/message-id/18541-628a61bc267cd2d3@postgresql.org

--
Regards
Junwang Zhao



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Jul-26, Junwang Zhao wrote:

> There is a bug report[0] Tender comments might be the same
> issue as this one, but I tried Alvaro's and mine patch, neither
> could solve that problem, I did not tried Tender's earlier patch
> thought. I post the test script below in case you are interested.

Yeah, I've been looking at this whole debacle this week and after
looking at it more closely, I realized that the overall problem requires
a much more invasive solution -- namely, that on DETACH, if the
referenced table is partitioned, we need to create additional
pg_constraint entries from the now-standalone table (was partition) to
each of the partitions of the referenced table; and also add action
triggers to each of those.  Without that, the constraint is incomplete
and doesn't work (as reported multiple times already).

One thing I have not yet tried is what if the partition being detach is
also partitioned.  I mean, do we need to handle each sub-partition
explicitly in some way?  I think the answer is no, but it needs tests.

I have written the patch to do this on detach, and AFAICS it works well,
though it changes the behavior of some existing tests (IIRC related to
self-referencing FKs).  Also, the next problem is making sure that
ATTACH deals with it correctly.  I'm on this bit today.

Self-referencing FKs seem to have additional problems :-(

The queries I was talking about are these

\set tables ''''prim.*''',''forign.*''',''''lone''''

select oid, conparentid, contype, conname, conrelid::regclass, confrelid::regclass, conkey, confkey, conindid::regclass
frompg_constraint where contype = 'f' and (conrelid::regclass::text ~ any (array[:tables]) or confrelid::regclass::text
~any (array[:tables])) order by contype, conrelid, confrelid; select tgconstraint, oid, tgrelid::regclass,
tgconstrrelid::regclass,tgname, tgparentid, tgconstrindid::regclass, tgfoid::regproc from pg_trigger where tgconstraint
in(select oid from pg_constraint where conrelid::regclass::text ~ any (array[:tables]) or confrelid::regclass::text ~
any(array[:tables])) order by tgconstraint, tgrelid::regclass::text, tgfoid;
 

Written as a single line in psql they let you quickly see all the
constraints and their associated triggers, so for instance you can see
whether this sequence

create table prim (a int primary key) partition by list (a);
create table prim1 partition of prim for values in (1);
create table prim2 partition of prim for values in (2);
create table forign (a int references prim) partition by list (a);
create table forign1 partition of forign for values in (1);
create table forign2 partition of forign for values in (2);
alter table forign detach partition forign1;

produces the same set of constraints and triggers as this other sequence

create table prim (a int primary key) partition by list (a);
create table prim1 partition of prim for values in (1);
create table prim2 partition of prim for values in (2);
create table forign (a int references prim) partition by list (a);
create table forign2 partition of forign for values in (2);
create table forign1 (a int references prim);


The patch is more or less like the attached, far from ready.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Syntax error: function hell() needs an argument.
Please choose what hell you want to involve.

Attachment


Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:

There is a bug report[0] Tender comments might be the same
issue as this one, but I tried Alvaro's and mine patch, neither
could solve that problem, I did not tried Tender's earlier patch
thought. I post the test script below in case you are interested.

My earlier patch should handle Alexander reported case. But I did not do more
test. I'm not sure that wether or not has dismatching between pg_constraint and pg_trigger.

I aggred with Alvaro said that "requires a much more invasive solution".

--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Jul-26, Tender Wang wrote:

> Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:
> 
> > There is a bug report[0] Tender comments might be the same issue as
> > this one, but I tried Alvaro's and mine patch, neither could solve
> > that problem, I did not tried Tender's earlier patch thought. I post
> > the test script below in case you are interested.
> 
> My earlier patch should handle Alexander reported case. But I did not
> do more test. I'm not sure that wether or not has dismatching between
> pg_constraint and pg_trigger.
> 
> I aggred with Alvaro said that "requires a much more invasive
> solution".

Here's the patch which, as far as I can tell, fixes all the reported
problems (other than the one in bug 18541, for which I proposed an
unrelated fix in that thread[1]).  If you can double-check, I would very
much appreciate that.  Also, I think the test cases the patch adds
reflect the provided examples sufficiently, but if we're still failing
to cover some, please let me know.


As I understand, this fix needs to be applied all the way back to 12,
because the overall functionality is that old.  However, in branches 14
and back, the patch doesn't apply cleanly, because of the changes we
made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
16, 17, master now and potentially backpatch later, to avoid dragging
things along further.  It's taken long enough already.

[1]  https://postgr.es/m/202408072222.icgykv5yrql5@alvherre.pgsql

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"World domination is proceeding according to plan"        (Andrew Morton)

Attachment

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月8日周四 06:50写道:
On 2024-Jul-26, Tender Wang wrote:

> Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:
>
> > There is a bug report[0] Tender comments might be the same issue as
> > this one, but I tried Alvaro's and mine patch, neither could solve
> > that problem, I did not tried Tender's earlier patch thought. I post
> > the test script below in case you are interested.
>
> My earlier patch should handle Alexander reported case. But I did not
> do more test. I'm not sure that wether or not has dismatching between
> pg_constraint and pg_trigger.
>
> I aggred with Alvaro said that "requires a much more invasive
> solution".

Here's the patch which, as far as I can tell, fixes all the reported
problems (other than the one in bug 18541, for which I proposed an
unrelated fix in that thread[1]).  If you can double-check, I would very
much appreciate that.  Also, I think the test cases the patch adds
reflect the provided examples sufficiently, but if we're still failing
to cover some, please let me know.

Thanks for your work. I will take some time to look it in detail. 


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月8日周四 06:50写道:
On 2024-Jul-26, Tender Wang wrote:

> Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:
>
> > There is a bug report[0] Tender comments might be the same issue as
> > this one, but I tried Alvaro's and mine patch, neither could solve
> > that problem, I did not tried Tender's earlier patch thought. I post
> > the test script below in case you are interested.
>
> My earlier patch should handle Alexander reported case. But I did not
> do more test. I'm not sure that wether or not has dismatching between
> pg_constraint and pg_trigger.
>
> I aggred with Alvaro said that "requires a much more invasive
> solution".

Here's the patch which, as far as I can tell, fixes all the reported
problems (other than the one in bug 18541, for which I proposed an
unrelated fix in that thread[1]).  If you can double-check, I would very
much appreciate that.  Also, I think the test cases the patch adds
reflect the provided examples sufficiently, but if we're still failing
to cover some, please let me know.

I did a lot of tests, and did not report error and did not find any warnings using oidjoins.sql.
+1 


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月8日周四 06:50写道:
On 2024-Jul-26, Tender Wang wrote:

> Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:
>
> > There is a bug report[0] Tender comments might be the same issue as
> > this one, but I tried Alvaro's and mine patch, neither could solve
> > that problem, I did not tried Tender's earlier patch thought. I post
> > the test script below in case you are interested.
>
> My earlier patch should handle Alexander reported case. But I did not
> do more test. I'm not sure that wether or not has dismatching between
> pg_constraint and pg_trigger.
>
> I aggred with Alvaro said that "requires a much more invasive
> solution".

Here's the patch which, as far as I can tell, fixes all the reported
problems (other than the one in bug 18541, for which I proposed an
unrelated fix in that thread[1]).  If you can double-check, I would very
much appreciate that.  Also, I think the test cases the patch adds
reflect the provided examples sufficiently, but if we're still failing
to cover some, please let me know.


As I understand, this fix needs to be applied all the way back to 12,
because the overall functionality is that old.  However, in branches 14
and back, the patch doesn't apply cleanly, because of the changes we
made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
16, 17, master now and potentially backpatch later, to avoid dragging
things along further.  It's taken long enough already.

I haven't seen this patch on master.  Is there something that we fotget to handle?


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Aug-20, Tender Wang wrote:

> > As I understand, this fix needs to be applied all the way back to 12,
> > because the overall functionality is that old.  However, in branches 14
> > and back, the patch doesn't apply cleanly, because of the changes we
> > made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
> > 16, 17, master now and potentially backpatch later, to avoid dragging
> > things along further.  It's taken long enough already.
> 
> I haven't seen this patch on master.  Is there something that we fotget to
> handle?

I haven't pushed it yet, mostly because of being unsure about not doing
anything for the oldest branches (14 and back).

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming"  (A. Stepanov)



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月20日周二 10:25写道:
On 2024-Aug-20, Tender Wang wrote:

> > As I understand, this fix needs to be applied all the way back to 12,
> > because the overall functionality is that old.  However, in branches 14
> > and back, the patch doesn't apply cleanly, because of the changes we
> > made in commit f4566345cf40 :-(  I'm tempted to fix it in branches 15,
> > 16, 17, master now and potentially backpatch later, to avoid dragging
> > things along further.  It's taken long enough already.
>
> I haven't seen this patch on master.  Is there something that we fotget to
> handle?

I haven't pushed it yet, mostly because of being unsure about not doing
anything for the oldest branches (14 and back).

I only did codes and tests on master. I'm not sure how much complicated it would be
to fix this issue on branches 14 and back. 


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
On Wed, 7 Aug 2024 18:50:10 -0400
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Jul-26, Tender Wang wrote:
>
> > Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:
> >
> > > There is a bug report[0] Tender comments might be the same issue as
> > > this one, but I tried Alvaro's and mine patch, neither could solve
> > > that problem, I did not tried Tender's earlier patch thought. I post
> > > the test script below in case you are interested.
> >
> > My earlier patch should handle Alexander reported case. But I did not
> > do more test. I'm not sure that wether or not has dismatching between
> > pg_constraint and pg_trigger.
> >
> > I aggred with Alvaro said that "requires a much more invasive
> > solution".
>
> Here's the patch which, as far as I can tell, fixes all the reported
> problems (other than the one in bug 18541, for which I proposed an
> unrelated fix in that thread[1]).  If you can double-check, I would very
> much appreciate that.  Also, I think the test cases the patch adds
> reflect the provided examples sufficiently, but if we're still failing
> to cover some, please let me know.

I'm back on this issue as well. I start poking at this patch to review it,
test it, challenge it and then report here.

I'll try to check if some other issues might have lost/forgot on they way as
well.

In the meantime, thank you Alvaro, Tender and Junwang for your work, time,
research and patches!



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:

> I'm back on this issue as well. I start poking at this patch to review it,
> test it, challenge it and then report here.
> 
> I'll try to check if some other issues might have lost/forgot on they way as
> well.

Thanks, much appreciated, looking forward to your feedback.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月22日周四 06:00写道:
On 2024-Aug-19, Alvaro Herrera wrote:

> I haven't pushed it yet, mostly because of being unsure about not doing
> anything for the oldest branches (14 and back).

Last night, after much mulling on this, it occurred to me that one easy
way out of this problem for the old branches, without having to write
more code, is to simply remove the constraint from the partition when
it's detached (but only if they reference a partitioned relation).  It's
not a great solution, but at least we're no longer leaving bogus catalog
entries around.  That would be like the attached patch, which was cut
from 14 and applies cleanly to 12 and 13.  I'd throw in a couple of
tests and call it a day.

I apply the v14 patch on branch REL_14_STABLE. I run this thread issue and I
find below error.
postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE r_1 (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
);
CREATE TABLE r (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
) PARTITION BY list (id);
ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE r DETACH PARTITION r_1;
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
ERROR:  cache lookup failed for constraint 16400

I haven't look into details to find out where cause above error. 


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Tender Wang <tndrwang@gmail.com> 于2024年8月22日周四 11:19写道:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月22日周四 06:00写道:
On 2024-Aug-19, Alvaro Herrera wrote:

> I haven't pushed it yet, mostly because of being unsure about not doing
> anything for the oldest branches (14 and back).

Last night, after much mulling on this, it occurred to me that one easy
way out of this problem for the old branches, without having to write
more code, is to simply remove the constraint from the partition when
it's detached (but only if they reference a partitioned relation).  It's
not a great solution, but at least we're no longer leaving bogus catalog
entries around.  That would be like the attached patch, which was cut
from 14 and applies cleanly to 12 and 13.  I'd throw in a couple of
tests and call it a day.

I apply the v14 patch on branch REL_14_STABLE. I run this thread issue and I
find below error.
postgres=# CREATE TABLE p ( id bigint PRIMARY KEY ) PARTITION BY list (id);
CREATE TABLE p_1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE r_1 (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
);
CREATE TABLE r (
    id   bigint PRIMARY KEY,
    p_id bigint NOT NULL,
    FOREIGN KEY (p_id) REFERENCES p (id)
) PARTITION BY list (id);
ALTER TABLE r ATTACH PARTITION r_1 FOR VALUES IN (1);
ALTER TABLE r DETACH PARTITION r_1;
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
ERROR:  cache lookup failed for constraint 16400

I guess it is because cascade dropping, the oid=16400 has been deleted.
Adding a list that remember dropped constraint, if we find the parent of constraint
is in the list, we skip.

By the way, I run above SQL sequences on REL_14_STABLE without your partch.
I didn't find reporting error, and running oidjoins.sql  didn't report warnings.
Do I miss something?


--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月23日周五 02:41写道:
On 2024-Aug-22, Tender Wang wrote:

> I apply the v14 patch on branch REL_14_STABLE. I run this thread issue and I
> find below error.
> [...]
> ERROR:  cache lookup failed for constraint 16400
>
> I haven't look into details to find out where cause above error.

Right, we try to drop the constraint twice.  We can dodge this by
collecting all constraints to drop in the loop and process them in a
single performMultipleDeletions, as in the attached v14-2.

Can we move the  CommandCounterIncrement() in 
if (get_rel_relkind(fk->confrelid) == RELKIND_PARTITIONED_TABLE) block
to be close to performMultipleDeletions().

Others look good to me.

TBH I think it's a bit infuriating that we lose the constraint (which
was explicitly declared) because of ATTACH/DETACH. 
 
Agree. 
Do you think it is friendly to users if we add hints that inform them the constraint was dropped?

--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
Hi,

On Tue, 20 Aug 2024 23:09:27 -0400
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:
>
> > I'm back on this issue as well. I start poking at this patch to review it,
> > test it, challenge it and then report here.
> >
> > I'll try to check if some other issues might have lost/forgot on they way as
> > well.
>
> Thanks, much appreciated, looking forward to your feedback.

Sorry, it took me a while to come back to you on this topic. It has been hard to
untangle subjects, reproductions and patch…

There's three distinct issues/thread:

* Constraint & trigger catalog cleanup [1] (this thread)
* FK broken after DETACH [2]
* Maintenance consideration about self referencing FK between partitions [3]

0. Splitting in two commits

  Your patch addresses two bugs:

    * one for the constraint & trigger catalog cleanup;
    * one for the FK broken after DETACH.

  These issues are unrelated, therefore I am wondering if it would be better
  to split their resolution in two different patches.

  Last year, I reported them in two different threads [1][2]. The first with
  implementation consideration, the second with a demo/proposal/draft fix.

  Unfortunately, this discussion about the first bug slipped to the second one
  when Tender stumbled on this bug as well and reported it. But, both bugs can
  be triggered independently, and have distinct fixes.

  Finally, splitting the patch might help setting finer patch co-authoring. I
  know my patch for [2] was a draft and somewhat trivial, but I spend a fair
  amount of time to report, then produce a draft patch, so I was wondering if
  it would be candidate to a co-author flag on this (small, humble and
  refactored by you) patch?

  I'm definitely not involved (yet) in the second part though.

1. Constraint & trigger catalog cleanup [1]

  I have been focusing on the current master branch and haven't taken into
  consideration backpatching related issues yet.

  When I first studied this bug and reported it, I held on writing a patch
  because it seemed it would duplicate some existing code. I wrote:

  > I poked around DetachPartitionFinalize() to try to find a way to fix this,
  > but it looks like it would duplicate a bunch of code from other code path
  > (eg. from CloneFkReferenced).

  My proposal was to clean everything related to the old FK and use some
  existing code path to create a fresh and cleaner one. This requires some
  refactoring in existing code, but we would win a common path of code between
  create/attach/detach, a cleaner catalog and easier code maintenance.

  I've finally been able to write a PoC that implement this by calling
  addFkRecurseReferenced() from DetachPartitionFinalize(). I can't join
  it here because it is currently an ugly draft and I still have some work
  to do. But I would really like to have a little more time (one or two days) to
  explore this avenue further before you commit yours, if you don't mind? Or
  maybe you already have considered this avenue and rejected it?


2. FK broken after DETACH [2]

  Comparing your patch to my draft from [2], I just have a question about the
  refactoring.

  Fencing the constraint/trigger removal inside a conditional
  RELKIND_PARTITIONED_TABLE block of code was obvious. It avoids some useless
  catalog scan compared to my draft patch.

  Also, the "contype == CONSTRAINT_FOREIGN" I had sounds safe to remove.

  However, is it clean/light enough to add the "conparentid == fk->conoid" in
  the scan key as I did? I'm not sure it saves anything else but the small
  conditional block you inserted inside the loop, but I wonder if there's a
  serious concern about this anyway?

  Last, considering the tests, I think we should add some rows in the tables,
  to make sure the FK is correctly enforced after DETACH. Something like:

    CREATE SCHEMA fkpart12
      CREATE TABLE fk_p ( id bigint PRIMARY KEY ) PARTITION BY list (id)
      CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
      CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
      CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
      CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
      CREATE TABLE fk_r   ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
         FOREIGN KEY (p_id) REFERENCES fk_p (id)
      ) PARTITION BY list (id);
    SET search_path TO fkpart12;

    INSERT INTO fk_p VALUES (1);

    ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);

    ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
    \d fk_r_1

    INSERT INTO fk_r VALUES (1,1);

    ALTER TABLE fk_r DETACH PARTITION fk_r_1;
    \d fk_r_1

    INSERT INTO c_1 VALUES (2,2); -- fails as EXPECTED
    DELETE FROM p; -- should fails but was buggy

    ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
    \d fk_r_1


3. Self referencing FK between partitions [3]

  You added to your commit message:

    verify: 20230707175859.17c91538@karst

  I'm not sure what the "verify" flag means. Unfortunately, your patch doesn't
  help on this topic.

  This bug really needs more discussion and design consideration. I have
  thought about this problem and haven't found any solution that don't involve
  breaking the current core behavior. It really looks like an impossible bug to
  fix without dropping the constraint itself. On both side. Maybe the only sane
  behavior would be to forbid detaching the partition if it would break the
  constraint.

  But let's discuss this on the related thread, should we?


Thank you for reading me all the way down to the bottom!

Regards,

[1] https://www.postgresql.org/message-id/20230705233028.2f554f73%40karst
[2] https://www.postgresql.org/message-id/20230420144344.40744130%40karst
[3] https://www.postgresql.org/message-id/20230707175859.17c91538%40karst




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 于2024年9月3日周二 05:02写道:
Hi,

On Tue, 20 Aug 2024 23:09:27 -0400
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Aug-20, Jehan-Guillaume de Rorthais wrote:
>
> > I'm back on this issue as well. I start poking at this patch to review it,
> > test it, challenge it and then report here.
> >
> > I'll try to check if some other issues might have lost/forgot on they way as
> > well.
>
> Thanks, much appreciated, looking forward to your feedback.

Sorry, it took me a while to come back to you on this topic. It has been hard to
untangle subjects, reproductions and patch…

There's three distinct issues/thread:

* Constraint & trigger catalog cleanup [1] (this thread)
* FK broken after DETACH [2]
* Maintenance consideration about self referencing FK between partitions [3]

The third issue has been fixed, and codes have been pushed.  Because of my misunderstanding,
It should not be here.


0. Splitting in two commits

  Your patch addresses two bugs:

    * one for the constraint & trigger catalog cleanup;
    * one for the FK broken after DETACH.

  These issues are unrelated, therefore I am wondering if it would be better
  to split their resolution in two different patches.

  Last year, I reported them in two different threads [1][2]. The first with
  implementation consideration, the second with a demo/proposal/draft fix.

  Unfortunately, this discussion about the first bug slipped to the second one
  when Tender stumbled on this bug as well and reported it. But, both bugs can
  be triggered independently, and have distinct fixes.

It's ok that these two issues are fixed together.  It is  because current codes don't handle better 
when the referenced side is the partition table.


  Finally, splitting the patch might help setting finer patch co-authoring. I
  know my patch for [2] was a draft and somewhat trivial, but I spend a fair
  amount of time to report, then produce a draft patch, so I was wondering if
  it would be candidate to a co-author flag on this (small, humble and
  refactored by you) patch?

  I'm definitely not involved (yet) in the second part though.

1. Constraint & trigger catalog cleanup [1]

  I have been focusing on the current master branch and haven't taken into
  consideration backpatching related issues yet.

  When I first studied this bug and reported it, I held on writing a patch
  because it seemed it would duplicate some existing code. I wrote:

  > I poked around DetachPartitionFinalize() to try to find a way to fix this,
  > but it looks like it would duplicate a bunch of code from other code path
  > (eg. from CloneFkReferenced).

  My proposal was to clean everything related to the old FK and use some
  existing code path to create a fresh and cleaner one. This requires some
  refactoring in existing code, but we would win a common path of code between
  create/attach/detach, a cleaner catalog and easier code maintenance.

  I've finally been able to write a PoC that implement this by calling
  addFkRecurseReferenced() from DetachPartitionFinalize(). I can't join
  it here because it is currently an ugly draft and I still have some work
  to do. But I would really like to have a little more time (one or two days) to
  explore this avenue further before you commit yours, if you don't mind? Or
  maybe you already have considered this avenue and rejected it?


2. FK broken after DETACH [2]

  Comparing your patch to my draft from [2], I just have a question about the
  refactoring.

  Fencing the constraint/trigger removal inside a conditional
  RELKIND_PARTITIONED_TABLE block of code was obvious. It avoids some useless
  catalog scan compared to my draft patch.

  Also, the "contype == CONSTRAINT_FOREIGN" I had sounds safe to remove.

  However, is it clean/light enough to add the "conparentid == fk->conoid" in
  the scan key as I did? I'm not sure it saves anything else but the small
  conditional block you inserted inside the loop, but I wonder if there's a
  serious concern about this anyway?

  Last, considering the tests, I think we should add some rows in the tables,
  to make sure the FK is correctly enforced after DETACH. Something like:

    CREATE SCHEMA fkpart12
      CREATE TABLE fk_p ( id bigint PRIMARY KEY ) PARTITION BY list (id)
      CREATE TABLE fk_p_1 PARTITION OF fk_p FOR VALUES IN (1)
      CREATE TABLE fk_p_2 PARTITION OF fk_p FOR VALUES IN (2)
      CREATE TABLE fk_r_1 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
      CREATE TABLE fk_r_2 ( id bigint PRIMARY KEY, p_id bigint NOT NULL)
      CREATE TABLE fk_r   ( id bigint PRIMARY KEY, p_id bigint NOT NULL,
         FOREIGN KEY (p_id) REFERENCES fk_p (id)
      ) PARTITION BY list (id);
    SET search_path TO fkpart12;

    INSERT INTO fk_p VALUES (1);

    ALTER TABLE fk_r ATTACH PARTITION fk_r_2 FOR VALUES IN (2);

    ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
    \d fk_r_1

    INSERT INTO fk_r VALUES (1,1);

    ALTER TABLE fk_r DETACH PARTITION fk_r_1;
    \d fk_r_1

    INSERT INTO c_1 VALUES (2,2); -- fails as EXPECTED
    DELETE FROM p; -- should fails but was buggy

    ALTER TABLE fk_r ATTACH PARTITION fk_r_1 FOR VALUES IN (1);
    \d fk_r_1


3. Self referencing FK between partitions [3]

  You added to your commit message:

    verify: 20230707175859.17c91538@karst

  I'm not sure what the "verify" flag means. Unfortunately, your patch doesn't
  help on this topic.

  This bug really needs more discussion and design consideration. I have
  thought about this problem and haven't found any solution that don't involve
  breaking the current core behavior. It really looks like an impossible bug to
  fix without dropping the constraint itself. On both side. Maybe the only sane
  behavior would be to forbid detaching the partition if it would break the
  constraint.

  But let's discuss this on the related thread, should we?


Thank you for reading me all the way down to the bottom!

Regards,

[1] https://www.postgresql.org/message-id/20230705233028.2f554f73%40karst
[2] https://www.postgresql.org/message-id/20230420144344.40744130%40karst
[3] https://www.postgresql.org/message-id/20230707175859.17c91538%40karst





--
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
Hi Tender,

On Tue, 3 Sep 2024 10:16:44 +0800
Tender Wang <tndrwang@gmail.com> wrote:

> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 于2024年9月3日周二 05:02写道:
[…]
> > * Constraint & trigger catalog cleanup [1] (this thread)
> > * FK broken after DETACH [2]
> > * Maintenance consideration about self referencing FK between partitions
> > [3]
> >
>
> The third issue has been fixed, and codes have been pushed.  Because of my
> misunderstanding,
> It should not be here.

I just retried the SQL scenario Guillaume gave on both master and master with
Alvaro's patch. See:


https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com

It doesn't seem fixed at all. Maybe you are mixing up with another thread/issue?

> > 0. Splitting in two commits
> >
> >   […]
> >
> >   Unfortunately, this discussion about the first bug slipped to the second
> >   one when Tender stumbled on this bug as well and reported it. But, both
> >   bugs can be triggered independently, and have distinct fixes.
>
> It's ok that these two issues are fixed together.  It is  because current
> codes don't handle better when the referenced side is the partition table.

I don't feel the same. Mixing two discussions and fixes together in the same
thread and commit makes life harder.

Last year, when you found the other bug, I tried to point you to the
right thread to avoid mixing subjects:

https://www.postgresql.org/message-id/20230810170345.26e41b05%40karst

If I wrote about the third (non fixed) issue yesterday, it's just because
Alvaro included a reference to it in his commit message. But I think we should
really keep up with this issue on its own, dedicated discussion:


https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com

Regards



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 于2024年9月3日周二 17:26写道:
Hi Tender,

On Tue, 3 Sep 2024 10:16:44 +0800
Tender Wang <tndrwang@gmail.com> wrote:

> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> 于2024年9月3日周二 05:02写道:
[…]
> > * Constraint & trigger catalog cleanup [1] (this thread)
> > * FK broken after DETACH [2]
> > * Maintenance consideration about self referencing FK between partitions
> > [3]
> > 
>
> The third issue has been fixed, and codes have been pushed.  Because of my
> misunderstanding,
> It should not be here.

I just retried the SQL scenario Guillaume gave on both master and master with
Alvaro's patch. See:

https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com

It doesn't seem fixed at all. Maybe you are mixing up with another thread/issue?

Sorry, I mixed up the third issue with the Alexander reported  issue.  Please ignore the above noise. 

> > 0. Splitting in two commits
> >
> >   […]
> >
> >   Unfortunately, this discussion about the first bug slipped to the second
> >   one when Tender stumbled on this bug as well and reported it. But, both
> >   bugs can be triggered independently, and have distinct fixes.
>
> It's ok that these two issues are fixed together.  It is  because current
> codes don't handle better when the referenced side is the partition table.

I don't feel the same. Mixing two discussions and fixes together in the same
thread and commit makes life harder.

Hmm, these two issues have a close relationship. Anyway, I think it's ok to fix the two issues together.


Last year, when you found the other bug, I tried to point you to the
right thread to avoid mixing subjects:

https://www.postgresql.org/message-id/20230810170345.26e41b05%40karst

If I wrote about the third (non fixed) issue yesterday, it's just because
Alvaro included a reference to it in his commit message. But I think we should
really keep up with this issue on its own, dedicated discussion:

https://www.postgresql.org/message-id/flat/CAECtzeWHCA%2B6tTcm2Oh2%2Bg7fURUJpLZb-%3DpRXgeWJ-Pi%2BVU%3D_w%40mail.gmail.com

Thanks for the reminder. I didn't take the time to look into the third issue. Please give me some to analyze it.


--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年8月8日周四 06:50写道:
On 2024-Jul-26, Tender Wang wrote:

> Junwang Zhao <zhjwpku@gmail.com> 于2024年7月26日周五 14:57写道:
>
> > There is a bug report[0] Tender comments might be the same issue as
> > this one, but I tried Alvaro's and mine patch, neither could solve
> > that problem, I did not tried Tender's earlier patch thought. I post
> > the test script below in case you are interested.
>
> My earlier patch should handle Alexander reported case. But I did not
> do more test. I'm not sure that wether or not has dismatching between
> pg_constraint and pg_trigger.
>
> I aggred with Alvaro said that "requires a much more invasive
> solution".

Here's the patch which, as far as I can tell, fixes all the reported
problems (other than the one in bug 18541, for which I proposed an
unrelated fix in that thread[1]).  If you can double-check, I would very
much appreciate that.  Also, I think the test cases the patch adds
reflect the provided examples sufficiently, but if we're still failing
to cover some, please let me know.

When I review Jehan-Guillaume v2 patch, I found the below codes that need
a little tweak. In DetachPartitionFinalize()
/*
* If the referenced side is partitioned (which we know because our
* parent's constraint points to a different relation than ours) then
* we must, in addition to the above, create pg_constraint rows that
* point to each partition, each with its own action triggers.
*/
if (parentConForm->conrelid != conform->conrelid)

I found that the above IF was always true,  regardless of whether the referenced side is partitioned.
Although find_all_inheritors() can return an empty children list when the referenced side is not partitioned,  
we can avoid much useless work.
How about this way:
if (get_rel_relkind(conform->confrelid) == RELKIND_PARTITIONED_TABLE)


--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年10月18日周五 22:52写道:
On 2024-Sep-26, Jehan-Guillaume de Rorthais wrote:

> REL_14_STABLE backport doesn't seem trivial, so I'll wait for some
> feedback, review & decision before going further down in backpatching.

Hi, thanks for these patches.  I have made some edits of my own.  In the
end I decided that I quite liked the new structure of the
addFkConstraint() function and friends.  I added a bunch of comments and
changed names somewhat.  Also, I think the benefit of the refactoring is
more obvious when all patches are squashed together, so I did that.

For branch 14, I opted to make it delete the constraints on detach.
This isn't ideal but unless somebody wants to spend a lot more time on
this, it seems the best we can do.  Leaving broken constraints around
seems worse.  The patch for 14 doesn't apply to 13 sadly.  I didn't have
time to verify why, but it seems there's some rather larger conflict in
one spot.

I hope to be able to get these pushed over the weekend.  That would give
us a couple of weeks before the November releases (but my availability
in those two weeks might be spotty.)

I spent some time going through your test additions and ended up with
your functions in this form:

-- displays constraints in schema fkpart12
CREATE OR REPLACE FUNCTION
fkpart12_constraints(OUT conrel regclass, OUT conname name,
                     OUT confrelid regclass, OUT conparent text)
        RETURNS SETOF record AS $$
  WITH RECURSIVE arrh AS (
       SELECT oid, conrelid, conname, confrelid, NULL::name AS conparent
         FROM pg_constraint
        WHERE connamespace = 'fkpart12'::regnamespace AND
              contype = 'f' AND conparentid = 0
    UNION ALL
       SELECT c.oid, c.conrelid, c.conname, c.confrelid,
              (pg_identify_object('pg_constraint'::regclass, arrh.oid, 0)).identity
         FROM pg_constraint c
              JOIN arrh ON c.conparentid = arrh.oid
  ) SELECT conrelid::regclass, conname, confrelid::regclass, conparent
      FROM arrh
  ORDER BY conrelid::regclass::text, conname;
$$
LANGUAGE SQL;

-- displays triggers in tables in schema fkpart12
CREATE OR REPLACE FUNCTION
fkpart12_triggers(OUT tablename regclass, OUT constr text,
                  OUT samefunc boolean, OUT parent text)
        RETURNS SETOF record AS $$
  WITH RECURSIVE arrh AS (
       SELECT t.oid, t.tgrelid::regclass as tablename, tgname,
              t.tgfoid::regproc as trigfn,
              (pg_identify_object('pg_constraint'::regclass, c.oid, 0)).identity as constr,
              NULL::bool as samefunc,
              NULL::name AS parent
         FROM pg_trigger t
    LEFT JOIN pg_constraint c ON c.oid = t.tgconstraint
        WHERE (SELECT relnamespace FROM pg_class WHERE oid = t.tgrelid) = 'fkpart12'::regnamespace
              AND c.contype = 'f' AND t.tgparentid = 0
    UNION ALL
       SELECT t2.oid, t2.tgrelid::regclass as tablename, t2.tgname,
              t2.tgfoid::regproc as trigfn,
              (pg_identify_object('pg_constraint'::regclass, c2.oid, 0)).identity,
              arrh.trigfn = t2.tgfoid as samefunc,
              replace((pg_identify_object('pg_trigger'::regclass, t2.tgparentid, 0)).identity,
                      t2.tgparentid::text, 'TGOID')
         FROM pg_trigger t2
    LEFT JOIN pg_constraint c2 ON c2.oid = t2.tgconstraint
         JOIN arrh ON t2.tgparentid = arrh.oid
  ) SELECT tablename, constr, samefunc, parent
      FROM arrh
  ORDER BY tablename::text, constr;
$$
LANGUAGE SQL;

However, in the end I think this is a very good technique to verify that
the fix works correctly, but it's excessive to include these results in
the tests forevermore.  So I've left them out for now.  Maybe we should
reconsider on the older branches?

Hi,

I looked at the patch on master.   I gave a little comment in [1]
I reconsider the codes.  I suspect that we don't need the below if statement anymore.
/*
* If the referenced side is partitioned (which we know because our
* parent's constraint points to a different relation than ours) then
* we must, in addition to the above, create pg_constraint rows that
* point to each partition, each with its own action triggers.
*/
if (parentConForm->conrelid != conform->conrelid)
{
...
}

The above contidion is always true according to my test.
I haven't figured out an anti-case.
Any thoughts?


--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年10月22日周二 05:52写道:
On 2024-Oct-21, Tender Wang wrote:

> I suspect that we don't need the below if
> statement anymore.
> /*
> * If the referenced side is partitioned (which we know because our
> * parent's constraint points to a different relation than ours) then
> * we must, in addition to the above, create pg_constraint rows that
> * point to each partition, each with its own action triggers.
> */
> if (parentConForm->conrelid != conform->conrelid)
> {
> ...
> }
>
> The above contidion is always true according to my test.
> I haven't figured out an anti-case.

You're right, this is useless, we can remove the 'if' line.  I kept the
block though, to have a place for all those local variable declarations
(otherwise the code looks messier than it needs to).

Agree.
 

I also noticed that addFkRecurseReferenced() is uselessly taking a List
**wqueue argument but doesn't use it, so I removed it (as fallout, other
routines don't need it either, especially DetachPartitionFinalize).  I
added some commentary on the creation of dependencies in
addFkConstraint().

Yeah, I also noticed this before. 
 

I also include a few other cosmetic changes; just comment layout
changes.

This time I attach the patch for master only; the others have changed
identically.  14 is unchanged from before.  I figured that the conflict
from 14 to 13 was trivial to resolve -- it was just because of DETACH
CONCURRENTLY, so some code moves around, but that's all that happens.


I don't find any other problems with the newest patch. 
 
--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
On Fri, 18 Oct 2024 16:50:59 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Sep-26, Jehan-Guillaume de Rorthais wrote:
>
> > REL_14_STABLE backport doesn't seem trivial, so I'll wait for some
> > feedback, review & decision before going further down in backpatching.
>
> Hi, thanks for these patches.  I have made some edits of my own.  In the
> end I decided that I quite liked the new structure of the
> addFkConstraint() function and friends.

Yes, I especially like the fact that addFkRecurseReferencing() and
addFkRecurseReferenced() have now the same logic/behavior.

> I added a bunch of comments and changed names somewhat.

checked.

> Also, I think the benefit of the refactoring is
> more obvious when all patches are squashed together, so I did that.

OK.

> For branch 14, I opted to make it delete the constraints on detach.
> This isn't ideal but unless somebody wants to spend a lot more time on
> this, it seems the best we can do.  Leaving broken constraints around
> seems worse.

Keep that in mind, and move ahead to the next level: self-FK on partitioned
table! ;-)

https://www.postgresql.org/message-id/flat/20230707175859.17c91538%40karst#0dc7b8afd8b780899021bbb075598250

> […]
>
> I spent some time going through your test additions and ended up with
> your functions in this form:
> […]
>
> However, in the end I think this is a very good technique to verify that
> the fix works correctly, but it's excessive to include these results in
> the tests forevermore.  So I've left them out for now.  Maybe we should
> reconsider on the older branches?

The point here was to make sure futur work/refactoring don't forget/break
anything in the catalog representation of FK on partitions.

Regards,



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Oct-22, Jehan-Guillaume de Rorthais wrote:

> On Fri, 18 Oct 2024 16:50:59 +0200
> Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> > For branch 14, I opted to make it delete the constraints on detach.
> > This isn't ideal but unless somebody wants to spend a lot more time on
> > this, it seems the best we can do.  Leaving broken constraints around
> > seems worse. 
> 
> Keep that in mind, and move ahead to the next level: self-FK on partitioned
> table! ;-)
> 
> https://www.postgresql.org/message-id/flat/20230707175859.17c91538%40karst#0dc7b8afd8b780899021bbb075598250

Yeah.  I pushed these patches finally, thanks!

Now we can discuss what to do about self-referencing FKs ... maybe we
should consider dropping the FK on detach in problematic cases.

> > However, in the end I think this is a very good technique to verify that
> > the fix works correctly, but it's excessive to include these results in
> > the tests forevermore.  So I've left them out for now.  Maybe we should
> > reconsider on the older branches?
> 
> The point here was to make sure futur work/refactoring don't forget/break
> anything in the catalog representation of FK on partitions.

There is that ... but I think testing for user-visible symptoms is
better.  I'm not sure I want to bet against the odds that this will
become make-work to keep the output of internal catalog state up to date
for whatever other changes we need to do.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
 Are you not unsure you want to delete Firefox?
       [Not unsure]     [Not not unsure]    [Cancel]
                   http://smylers.hates-software.com/2008/01/03/566e45b2.html



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
On Mon, 21 Oct 2024 23:52:18 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Oct-21, Tender Wang wrote:
>
> > I suspect that we don't need the below if
> > statement anymore.
> > /*
> > * If the referenced side is partitioned (which we know because our
> > * parent's constraint points to a different relation than ours) then
> > * we must, in addition to the above, create pg_constraint rows that
> > * point to each partition, each with its own action triggers.
> > */
> > if (parentConForm->conrelid != conform->conrelid)
> > {
> > ...
> > }
> >
> > The above contidion is always true according to my test.
> > I haven't figured out an anti-case.
>
> You're right, this is useless, we can remove the 'if' line.  I kept the
> block though, to have a place for all those local variable declarations
> (otherwise the code looks messier than it needs to).

I'm confused the original patch was considering the trigger creation
on the referenced side ONLY when it was partitioned, which was the point of
this conditional block (as the comment said), no matter if the test was always
true or not.

Maybe this was a leftover of some refactoring…

> I also noticed that addFkRecurseReferenced() is uselessly taking a List
> **wqueue argument but doesn't use it, so I removed it (as fallout, other
> routines don't need it either, especially DetachPartitionFinalize).  I
> added some commentary on the creation of dependencies in
> addFkConstraint().

Good catch.

Looking at this, on a side note, I realize now addFkRecurseReferenced() and
addFkRecurseReferencing() are not exactly mirroring each other as the later
is doing more than the former.

Either this is fine, or maybe this is the sign something might need some
refactoring as addFkRecurseReferencing() carry more than it should. I'm not sure
it deserve more than a comment for futur work/study, if only this is justified.

Regards,



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Jehan-Guillaume de Rorthais
Date:
On Tue, 22 Oct 2024 16:32:33 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2024-Oct-22, Jehan-Guillaume de Rorthais wrote:
> 
> > On Fri, 18 Oct 2024 16:50:59 +0200
> > Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:  
> 
> > > For branch 14, I opted to make it delete the constraints on detach.
> > > This isn't ideal but unless somebody wants to spend a lot more time on
> > > this, it seems the best we can do.  Leaving broken constraints around
> > > seems worse.   
> > 
> > Keep that in mind, and move ahead to the next level: self-FK on partitioned
> > table! ;-)
> > 
> > https://www.postgresql.org/message-id/flat/20230707175859.17c91538%40karst#0dc7b8afd8b780899021bbb075598250  
> 
> Yeah.  I pushed these patches finally, thanks!

Great! Many thanks Alvaro and Tender for your work on this issue.



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alexander Lakhin
Date:
Hello Alvaro,

22.10.2024 17:32, Alvaro Herrera wrote:
> Yeah.  I pushed these patches finally, thanks!

Please look at a new anomaly introduced with 53af9491a. When running the
following script:
CREATE TABLE t (a int, b int, PRIMARY KEY (a, b));
CREATE TABLE pt (a int, b int, FOREIGN KEY (a, b) REFERENCES t(a, b))
  PARTITION BY LIST (a);

CREATE TABLE tp1 (x int, a int, b int);
ALTER TABLE tp1 DROP COLUMN x;

ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);

ALTER TABLE pt DETACH PARTITION tp1;

I get a memory access error detected by Valgrind:
2024-10-24 12:05:04.645 UTC [1079077] LOG:  statement: ALTER TABLE pt DETACH PARTITION tp1;
==00:00:00:07.887 1079077== Invalid read of size 2
==00:00:00:07.887 1079077==    at 0x4A61DD: DetachPartitionFinalize (tablecmds.c:19545)
==00:00:00:07.887 1079077==    by 0x4A5C11: ATExecDetachPartition (tablecmds.c:19386)
==00:00:00:07.887 1079077==    by 0x48561E: ATExecCmd (tablecmds.c:5540)
==00:00:00:07.887 1079077==    by 0x4845DE: ATRewriteCatalogs (tablecmds.c:5203)
==00:00:00:07.887 1079077==    by 0x4838EC: ATController (tablecmds.c:4758)
==00:00:00:07.887 1079077==    by 0x4834F1: AlterTable (tablecmds.c:4404)
==00:00:00:07.887 1079077==    by 0x7D6D52: ProcessUtilitySlow (utility.c:1318)
==00:00:00:07.887 1079077==    by 0x7D65F7: standard_ProcessUtility (utility.c:1067)
==00:00:00:07.887 1079077==    by 0x7D54F7: ProcessUtility (utility.c:523)
==00:00:00:07.887 1079077==    by 0x7D3D70: PortalRunUtility (pquery.c:1158)
==00:00:00:07.887 1079077==    by 0x7D3FE7: PortalRunMulti (pquery.c:1316)
==00:00:00:07.887 1079077==    by 0x7D3431: PortalRun (pquery.c:791)

Reproduced on REL_15_STABLE .. master.

Best regards,
Alexander



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alexander Lakhin <exclusion@gmail.com> 于2024年10月24日周四 22:00写道:
Hello Alvaro,

22.10.2024 17:32, Alvaro Herrera wrote:
> Yeah.  I pushed these patches finally, thanks!

Please look at a new anomaly introduced with 53af9491a. When running the
following script:
CREATE TABLE t (a int, b int, PRIMARY KEY (a, b));
CREATE TABLE pt (a int, b int, FOREIGN KEY (a, b) REFERENCES t(a, b))
  PARTITION BY LIST (a);

CREATE TABLE tp1 (x int, a int, b int);
ALTER TABLE tp1 DROP COLUMN x;

ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);

ALTER TABLE pt DETACH PARTITION tp1;

I get a memory access error detected by Valgrind:
2024-10-24 12:05:04.645 UTC [1079077] LOG:  statement: ALTER TABLE pt DETACH PARTITION tp1;
==00:00:00:07.887 1079077== Invalid read of size 2
==00:00:00:07.887 1079077==    at 0x4A61DD: DetachPartitionFinalize (tablecmds.c:19545)
==00:00:00:07.887 1079077==    by 0x4A5C11: ATExecDetachPartition (tablecmds.c:19386)
==00:00:00:07.887 1079077==    by 0x48561E: ATExecCmd (tablecmds.c:5540)
==00:00:00:07.887 1079077==    by 0x4845DE: ATRewriteCatalogs (tablecmds.c:5203)
==00:00:00:07.887 1079077==    by 0x4838EC: ATController (tablecmds.c:4758)
==00:00:00:07.887 1079077==    by 0x4834F1: AlterTable (tablecmds.c:4404)
==00:00:00:07.887 1079077==    by 0x7D6D52: ProcessUtilitySlow (utility.c:1318)
==00:00:00:07.887 1079077==    by 0x7D65F7: standard_ProcessUtility (utility.c:1067)
==00:00:00:07.887 1079077==    by 0x7D54F7: ProcessUtility (utility.c:523)
==00:00:00:07.887 1079077==    by 0x7D3D70: PortalRunUtility (pquery.c:1158)
==00:00:00:07.887 1079077==    by 0x7D3FE7: PortalRunMulti (pquery.c:1316)
==00:00:00:07.887 1079077==    by 0x7D3431: PortalRun (pquery.c:791)

Reproduced on REL_15_STABLE .. master.

Sorry, I can't reproduce this leak on master. 

--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Oct-25, Tender Wang wrote:

> Thanks for reporting. I can reproduce this memory invalid access on HEAD.
> After debuging codes, I found the root cause.
>  In DetachPartitionFinalize(), below code:
> att = TupleDescAttr(RelationGetDescr(partRel),
>                                attmap->attnums[conkey[i] - 1] - 1);
> 
> We should use confkey[i] -1 not conkey[i] - 1;

Wow, how embarrasing, now that's one _really_ stupid bug and it's
totally my own.  Thanks for the analysis!  I'll get this patched soon.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年10月25日周五 16:30写道:
On 2024-Oct-25, Tender Wang wrote:

> Thanks for reporting. I can reproduce this memory invalid access on HEAD.
> After debuging codes, I found the root cause.
>  In DetachPartitionFinalize(), below code:
> att = TupleDescAttr(RelationGetDescr(partRel),
>                                attmap->attnums[conkey[i] - 1] - 1);
>
> We should use confkey[i] -1 not conkey[i] - 1;

Wow, how embarrasing, now that's one _really_ stupid bug and it's
totally my own.  Thanks for the analysis!  I'll get this patched soon.


Hi,

When I debug codes, I find that the way to access AttrMap almost uses "attrmp_ptr->attnums[offset]."
The codes now usually don't check if the offset is out of bounds, which seems unsafe.
Can we wrap an access function? For example:
inline AttrNumber(attrmap_ptr, offset)
{
  Assert(offset >= 0 && offset < attrmap_ptr->maplen);
 return attrmap_ptr->attnums[offset];
}
Or a Macro.

I'm not sure if it's worth doing this.

--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alexander Lakhin
Date:
Hello Alvaro and Tender Wang,

24.10.2024 17:00, Alexander Lakhin wrote:
> Please look at a new anomaly introduced with 53af9491a. When running the
> following script:
> CREATE TABLE t (a int, b int, PRIMARY KEY (a, b));
> CREATE TABLE pt (a int, b int, FOREIGN KEY (a, b) REFERENCES t(a, b))
>  PARTITION BY LIST (a);
>
> CREATE TABLE tp1 (x int, a int, b int);
> ALTER TABLE tp1 DROP COLUMN x;
>
> ALTER TABLE pt ATTACH PARTITION tp1 FOR VALUES IN (1);
>
> ALTER TABLE pt DETACH PARTITION tp1;

I've also discovered another anomaly with a similar setup, but it's not
related to 53af9491a.

CREATE TABLE t (a int, PRIMARY KEY (a));
INSERT INTO t VALUES (1);

CREATE TABLE pt (b int, a int) PARTITION BY RANGE (a);
ALTER TABLE pt DROP COLUMN b;
ALTER TABLE pt ADD FOREIGN KEY (a) REFERENCES t ON DELETE SET DEFAULT (a);

CREATE TABLE tp1 PARTITION OF pt FOR VALUES FROM (1) TO (2);
ALTER TABLE pt DETACH PARTITION tp1;

DELETE FROM t;
\d+ t

This script ends up with:
ERROR:  invalid attribute number 2
ERROR:  cache lookup failed for attribute 2 of relation 16398

(Perhaps it deserves a separate discussion.)

Best regards,
Alexander




Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Oct-25, Alexander Lakhin wrote:

> I've also discovered another anomaly with a similar setup, but it's not
> related to 53af9491a.

Hmm, it may well be a preexisting problem, but I do think it involves
the same code.  As far as I can tell, the value "2" here

> This script ends up with:
> ERROR:  invalid attribute number 2
> ERROR:  cache lookup failed for attribute 2 of relation 16398

is coming from riinfo->confdelsetcols which was set up by
DetachPartitionFinalize during the last DETACH operation.

> (Perhaps it deserves a separate discussion.)

No need for that.

A backtrace from the errfinish gives me this (apologies for the
wide terminal):

#0  errfinish (filename=0x5610d93e4510 "../../../../../pgsql/source/master/src/backend/parser/parse_relation.c",
lineno=3618,
 
    funcname=0x5610d93e5518 <__func__.5> "attnumAttName") at
../../../../../../pgsql/source/master/src/backend/utils/error/elog.c:475
#1  0x00005610d8d68bcc in attnumAttName (rd=rd@entry=0x7f69dd0c1a90, attid=2) at
../../../../../pgsql/source/master/src/backend/parser/parse_relation.c:3618
#2  0x00005610d924a8c5 in ri_set (trigdata=0x7ffe38d924c0, is_set_null=<optimized out>, tgkind=<optimized out>)
    at ../../../../../../pgsql/source/master/src/backend/utils/adt/ri_triggers.c:1219
#3  0x00005610d8f897ea in ExecCallTriggerFunc (trigdata=trigdata@entry=0x7ffe38d924c0, tgindx=tgindx@entry=2,
finfo=0x5611153f5768,finfo@entry=0x5611153f5708, 
 
    instr=instr@entry=0x0, per_tuple_context=per_tuple_context@entry=0x561115471c30) at
../../../../../pgsql/source/master/src/backend/commands/trigger.c:2362
#4  0x00005610d8f8b646 in AfterTriggerExecute (trig_tuple_slot2=0x0, trig_tuple_slot1=0x0,
per_tuple_context=0x561115471c30,instr=0x0, finfo=0x5611153f5708, 
 
    trigdesc=0x5611153f53e8, dst_relInfo=<optimized out>, src_relInfo=<optimized out>, relInfo=0x5611153f51d8,
event=0x56111546fd4c,estate=0x5611153f4d50)
 
    at ../../../../../pgsql/source/master/src/backend/commands/trigger.c:4498
#5  afterTriggerInvokeEvents (events=events@entry=0x5611154292a0, firing_id=1, estate=estate@entry=0x5611153f4d50,
delete_ok=delete_ok@entry=false)
    at ../../../../../pgsql/source/master/src/backend/commands/trigger.c:4735
#6  0x00005610d8f8ff10 in AfterTriggerEndQuery (estate=estate@entry=0x5611153f4d50) at
../../../../../pgsql/source/master/src/backend/commands/trigger.c:5090
#7  0x00005610d8fb2457 in standard_ExecutorFinish (queryDesc=0x561115460810) at
../../../../../pgsql/source/master/src/backend/executor/execMain.c:442
#8  0x00005610d917df48 in ProcessQuery (plan=0x56111545ef50, sourceText=0x56111539cfb0 "DELETE FROM t;", params=0x0,
queryEnv=0x0,dest=0x56111545f0b0, 
 
    qc=0x7ffe38d92830) at ../../../../../pgsql/source/master/src/backend/tcop/pquery.c:193

and then

(gdb) frame 2
#2  0x00005610d924a8c5 in ri_set (trigdata=0x7ffe38d924c0, is_set_null=<optimized out>, tgkind=<optimized out>)
    at ../../../../../../pgsql/source/master/src/backend/utils/adt/ri_triggers.c:1219
1219                quoteOneName(attname, RIAttName(fk_rel, set_cols[i]));
(gdb) print *riinfo 
$5 = {constraint_id = 16400, valid = true, constraint_root_id = 16400, oidHashValue = 3001019032, rootHashValue =
3001019032,conname = {
 
    data = "pt_a_fkey", '\000' <repeats 54 times>}, pk_relid = 16384, fk_relid = 16397, confupdtype = 97 'a',
confdeltype= 100 'd', ndelsetcols = 1, 
 
  confdelsetcols = {2, 0 <repeats 31 times>}, confmatchtype = 115 's', hasperiod = false, nkeys = 1, pk_attnums = {1, 0
<repeats31 times>}, fk_attnums = {1, 
 
    0 <repeats 31 times>}, pf_eq_oprs = {96, 0 <repeats 31 times>}, pp_eq_oprs = {96, 0 <repeats 31 times>}, ff_eq_oprs
={96, 0 <repeats 31 times>}, 
 
  period_contained_by_oper = 0, agged_period_contained_by_oper = 0, valid_link = {prev = 0x5611153f4c30, next =
0x5610d96b51b0<ri_constraint_cache_valid_list>}}
 


-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Digital and video cameras have this adjustment and film cameras don't for the
same reason dogs and cats lick themselves: because they can."   (Ken Rockwell)



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Oct-25, Tender Wang wrote:

> When I debug codes, I find that the way to access AttrMap almost uses
> "attrmp_ptr->attnums[offset]."
> The codes now usually don't check if the offset is out of bounds, which
> seems unsafe.
> Can we wrap an access function? For example:
> inline AttrNumber(attrmap_ptr, offset)
> {
>   Assert(offset >= 0 && offset < attrmap_ptr->maplen);
>  return attrmap_ptr->attnums[offset];
> }

I don't see any reason not to do this, though it's not directly related
to the bugs in this thread.  I encourage you to submit a patch, opening
a new thread.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年10月27日周日 05:47写道:
On 2024-Oct-25, Alvaro Herrera wrote:

> On 2024-Oct-25, Tender Wang wrote:
>
> > Thanks for reporting. I can reproduce this memory invalid access on HEAD.
> > After debuging codes, I found the root cause.
> >  In DetachPartitionFinalize(), below code:
> > att = TupleDescAttr(RelationGetDescr(partRel),
> >                                attmap->attnums[conkey[i] - 1] - 1);
> >
> > We should use confkey[i] -1 not conkey[i] - 1;
>
> Wow, how embarrasing, now that's one _really_ stupid bug and it's
> totally my own.  Thanks for the analysis!  I'll get this patched soon.

Actually, now that I look at this again, I think this proposed fix is
wrong; conkey/confkey confusion is not what the problem is.  Rather, the
problem is that we're applying a tuple conversion map when none should
be applied.  So the fix is to remove "attmap" altogether.  One thing
that didn't appear correct to me was that the patch was changing one
constraint name so that it appeared that the constrained columns were
"id, p_id" but that didn't make sense: they are "p_id, p_jd" instead.

Yeah, The constrained name  change made me think about if the patch is correct.
After your explanation, I have understood it.

Then I realized that you're wrong that it's the referenced side that
we're processing: what we're doing there is generate the fk_attrs list,
which is the list of constrained columns (not the list of referenced
columns, which is pk_attrs).

I also felt like modifying the fkpart12 test rather than adding a
separate fkpart13, so I did that.

So here's a v2 for this.  (Commit message needs love still, but it's a
bit late here.)


The v2 LGTM.
BTW, while reviewing the v2 patch, I found the parentConTup in foreach(cell, fks) block
didn't need it anymore. We can remove the related codes. 

--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年10月25日周五 23:14写道:
On 2024-Oct-25, Alexander Lakhin wrote:

> I've also discovered another anomaly with a similar setup, but it's not
> related to 53af9491a.

Hmm, it may well be a preexisting problem, but I do think it involves
the same code.  As far as I can tell, the value "2" here

> This script ends up with:
> ERROR:  invalid attribute number 2
> ERROR:  cache lookup failed for attribute 2 of relation 16398

is coming from riinfo->confdelsetcols which was set up by
DetachPartitionFinalize during the last DETACH operation.

Hmm, actually, the confdelsetcols before detach and after detach is always {2}, as below:

postgres=# select oid, conname, conrelid,conparentid,confdelsetcols from pg_constraint where conrelid = 16397;
  oid  |  conname  | conrelid | conparentid | confdelsetcols
-------+-----------+----------+-------------+----------------
 16400 | pt_a_fkey |    16397 |       16392 | {2}
(1 row)

postgres=# ALTER TABLE pt DETACH PARTITION tp1;
ALTER TABLE
postgres=# select oid, conname, conrelid,conparentid,confdelsetcols from pg_constraint where conrelid = 16397;
  oid  |  conname  | conrelid | conparentid | confdelsetcols
-------+-----------+----------+-------------+----------------
 16400 | pt_a_fkey |    16397 |           0 | {2}
(1 row) 

Even though no detach, the confdelsetcols is {2} . But no error report. Because the rel->rd_att->natts of pt is 2.
It will not go into tp1 because tp1 is a partition of pt. But after detach,  the rel->rd_att->natts of tp1 is 1, 
so "ERROR:  invalid attribute number 2" will report.

CREATE TABLE tp1...  will ignore the dropped column of parent, so the natts of tp1 is 1, but its parent is 2.

--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tender Wang
Date:


Alvaro Herrera <alvherre@alvh.no-ip.org> 于2024年10月28日周一 17:16写道:
On 2024-Oct-27, Tender Wang wrote:

> BTW, while reviewing the v2 patch, I found the parentConTup in
> foreach(cell, fks) block
> didn't need it anymore. We can remove the related codes.

True.  Done so in this v3.

I noticed another problem here: we're grabbing the wrong lock type on
the referenced rel (AccessShareLock) during detach.  (What's more: we
release it afterwards, which is the wrong thing to do.  We need to keep
such locks until end of transaction).  I didn't try to construct a case
where this would be a problem, but if I change AccessShare to NoLock,
the assertion that says we don't hold _any_ lock on that relation fires,
which means that we're not taking any locks on those rels before this
point.  So this lock strength choice is definitely wrong.  I changed it
to ShareRowExclusive, which is what we're suppose to use when adding a
trigger. 

In CloneFKReferencing(), the constrForm->confrelid uses the same lock type.
I think you're right. I don't find any other problem.


--
Thanks,
Tender Wang

Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tom Lane
Date:
I'm trying to write release notes for commits 53af9491a et al,
and it seems to me that we need to explain how to get out of
the mess that would be left behind by the old DETACH code.
There's no hint about that in the commit message :-(

Clearly, if you have now-inconsistent data, there's little
help for that but to manually fix the inconsistencies.
What I am worried about is how to get to a state where you
have correct catalog entries for the constraint.

Will ALTER TABLE DROP CONSTRAINT on the now stand-alone table
work to clean out the old catalog entries for the constraint?
I'm worried that it will either fail, or go through but remove
triggers on the referenced table that we still need for the
original partitioned table.  If that doesn't work I think we had
better create a recipe for manually removing the detritus.

Once the old entries are gone it should be possible to do ALTER TABLE
ADD CONSTRAINT (with an updated server), and that would validate
your data.  It's the DROP CONSTRAINT part that worries me.

            regards, tom lane



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Nov-05, Tom Lane wrote:

> I'm trying to write release notes for commits 53af9491a et al,
> and it seems to me that we need to explain how to get out of
> the mess that would be left behind by the old DETACH code.
> There's no hint about that in the commit message :-(

> Clearly, if you have now-inconsistent data, there's little
> help for that but to manually fix the inconsistencies.
> What I am worried about is how to get to a state where you
> have correct catalog entries for the constraint.
> 
> Will ALTER TABLE DROP CONSTRAINT on the now stand-alone table
> work to clean out the old catalog entries for the constraint?

Yes -- as far as I can tell, a DROP CONSTRAINT of the offending
constraint is successful and leaves no unwanted detritus.

Perhaps one more task for me is to figure out a way to get a list of all
the constraints that are broken because of this ... let me see if I can
figure that out.

> I'm worried that it will either fail, or go through but remove
> triggers on the referenced table that we still need for the
> original partitioned table.  If that doesn't work I think we had
> better create a recipe for manually removing the detritus.

As as far as I can see, it works and no triggers are spuriously removed.

> Once the old entries are gone it should be possible to do ALTER TABLE
> ADD CONSTRAINT (with an updated server), and that would validate
> your data.  It's the DROP CONSTRAINT part that worries me.

Yeah, that's correct: adding the constraint again after removing its
broken self detects that there are values violating the RI.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Nov-06, Alvaro Herrera wrote:

> Perhaps one more task for me is to figure out a way to get a list of all
> the constraints that are broken because of this ... let me see if I can
> figure that out.

It's gotta be something like this,

SELECT conrelid::regclass AS "constrained table",
       conname as constraint, confrelid::regclass AS "references"
FROM pg_constraint
WHERE contype = 'f' and conparentid = 0 AND
   (SELECT count(*) FROM pg_constraint p2 WHERE conparentid = pg_constraint.oid) <>
   (SELECT count(*)
      FROM pg_inherits
     WHERE inhparent = pg_constraint.conrelid OR inhparent = pg_constraint.confrelid);

Essentially, top-level constraints should have as many children
constraint as direct partitions each partitioned table has.  Ideally
here you'd get an empty set, but you won't if the DETACH problem has
occurred.  I'll test this further later or maybe tomorrow as time
allows.

A quick test rig for this is:

create table pk (a int primary key) partition by list (a);
create table pk1 partition of pk for values in (1);
create table pk2367 partition of pk for values in (2, 3, 6, 7) partition by list (a);
create table pk67 partition of pk2367 for values in (6, 7) partition by list (a);
create table pk2 partition of pk2367 for values in (2);
create table pk3 partition of pk2367 for values in (3);
create table pk6 partition of pk67 for values in (6);
create table pk7 partition of pk67 for values in (7);
create table pk45 partition of pk for values in (4, 5) partition by list (a);
create table pk4 partition of pk45 for values in (4);
create table pk5 partition of pk45 for values in (5);

create table fk (a int references pk) partition by list (a);
create table fk1 partition of fk for values in (1);
create table fk2367 partition of fk for values in (2, 3, 6, 7) partition by list (a);
create table fk67 partition of fk2367 for values in (6, 7) partition by list (a);
create table fk2 partition of fk2367 for values in (2);
create table fk3 partition of fk2367 for values in (3);
create table fk6 partition of fk67 for values in (6);
create table fk7 partition of fk67 for values in (7);
create table fk45 partition of fk for values in (4, 5) partition by list (a);
create table fk4 partition of fk45 for values in (4);
create table fk5 partition of fk45 for values in (5);

alter table fk detach partition fk2367;



Before the fix, you get

 constrained table │ constraint │ references 
───────────────────┼────────────┼────────────
 fk2367            │ fk_a_fkey  │ pk
(1 fila)

which means you need to
ALTER TABLE fk2367 DROP CONSTRAINT fk_a_fkey;

and then put it back.  Maybe it'd be better to have the query emit the
commands to drop and reconstruct the FK?

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>> Perhaps one more task for me is to figure out a way to get a list of all
>> the constraints that are broken because of this ... let me see if I can
>> figure that out.

> It's gotta be something like this,

> SELECT conrelid::regclass AS "constrained table",
>        conname as constraint, confrelid::regclass AS "references"
> FROM pg_constraint
> WHERE contype = 'f' and conparentid = 0 AND
>    (SELECT count(*) FROM pg_constraint p2 WHERE conparentid = pg_constraint.oid) <>
>    (SELECT count(*)
>       FROM pg_inherits
>      WHERE inhparent = pg_constraint.conrelid OR inhparent = pg_constraint.confrelid);

Hmm ... interestingly, if I run this in HEAD's regression database,
I get

 constrained table |  constraint   | references
-------------------+---------------+-------------
 clstr_tst         | clstr_tst_con | clstr_tst_s
(1 row)

Digging a bit deeper, the sub-select for conparentid finds no rows,
but the sub-select on pg_inherits finds

regression=# SELECT inhrelid::regclass, inhparent::regclass, inhseqno,inhdetachpending from pg_inherits WHERE inhparent
='clstr_tst'::regclass or inhparent = 'clstr_tst_s'::regclass; 
   inhrelid    | inhparent | inhseqno | inhdetachpending
---------------+-----------+----------+------------------
 clstr_tst_inh | clstr_tst |        1 | f
(1 row)

So it looks like this query needs a guard to make it ignore
constraints on traditional-inheritance tables.

            regards, tom lane



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Alvaro Herrera
Date:
On 2024-Nov-08, Tom Lane wrote:

> Hmm ... interestingly, if I run this in HEAD's regression database,
> I get
> 
>  constrained table |  constraint   | references  
> -------------------+---------------+-------------
>  clstr_tst         | clstr_tst_con | clstr_tst_s
> (1 row)

Eeek.

> So it looks like this query needs a guard to make it ignore
> constraints on traditional-inheritance tables.

Hmm, looks tricky, the only thing I found was to only consider rows in
pg_inherit if there's a corresponding one in pg_partitioned_table.  This
should do it.  I added the DROP/ADD commands.  I also added some
pg_catalog schema quals, though that may be kinda useless.  Anyway, this
reports empty in the regression database.

SELECT conrelid::pg_catalog.regclass AS "constrained table",
       conname AS constraint,
       confrelid::pg_catalog.regclass AS "references",
       pg_catalog.format('ALTER TABLE %s DROP CONSTRAINT %I;', conrelid::regclass, conname),
       pg_catalog.format('ALTER TABLE %s ADD CONSTRAINT %I %s;', conrelid::regclass, conname,
                         pg_catalog.pg_get_constraintdef(oid))
FROM pg_catalog.pg_constraint
WHERE contype = 'f' and conparentid = 0 AND
   (SELECT count(*) FROM pg_catalog.pg_constraint p2 WHERE conparentid = pg_constraint.oid) <>
   (SELECT count(*)
      FROM pg_catalog.pg_inherits
     WHERE EXISTS (SELECT 1 FROM pg_catalog.pg_partitioned_table WHERE partrelid = inhparent) AND
     inhparent = pg_constraint.conrelid OR inhparent = pg_constraint.confrelid)
;


I would have loved to be able to add the constraint as NOT VALID
followed by a separate VALIDATE command, because if there are any RI
violations, the constraint would now be in place to prevent future ones.
However,

=# ALTER TABLE fk2367 ADD CONSTRAINT fk_a_fkey FOREIGN KEY (a) REFERENCES pk(a) NOT VALID;
ERROR:  cannot add NOT VALID foreign key on partitioned table "fk2367" referencing relation "pk"
DETAIL:  This feature is not yet supported on partitioned tables.


So it looks like we should suggest to save the output of the query,
execute each DROP followed by each ADD, and if the latter fails, fix the
violations and retry the ADD.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming"  (A. Stepanov)



Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> So it looks like we should suggest to save the output of the query,
> execute each DROP followed by each ADD, and if the latter fails, fix the
> violations and retry the ADD.

Right.  I'll make it so -- thanks for doing the legwork on
creating the query!

            regards, tom lane