On Mon, Sep 6, 2021 3:59 PM tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:
> I met a problem when using logical replication. Maybe it's a bug in logical
> replication.
> When publishing a partitioned table without a replica identity, an update
> or delete operation can succeed in some cases.
>
> For example:
> create table tbl1 (a int) partition by range ( a );
> create table tbl1_part1 partition of tbl1 for values from (1) to (101);
> create table tbl1_part2 partition of tbl1 for values from (101) to (200);
> insert into tbl1 select generate_series(1, 10);
> delete from tbl1 where a=1;
> create publication pub for table tbl1;
> delete from tbl1 where a=2;
>
> The last DELETE statement can be executed successfully, but it should report
> an error message about the missing replica identity.
>
> I found this problem on HEAD and I could reproduce this problem at PG13 and
> PG14. (Logical replication of partition table was introduced in PG13.)

I can reproduce this bug.

I think the cause is that adding a partitioned table to a publication does
not invalidate the relcache entries of its leaf partitions, so their cached
publication info is not rebuilt.

The following code invalidates only the target table:
---
PublicationAddTables
    publication_add_relation
        /* Invalidate relcache so that publication info is rebuilt. */
        CacheInvalidateRelcache(targetrel);
---

In addition, this problem can happen in all of the ADD TABLE, DROP TABLE,
and SET TABLE cases, so we need to invalidate the leaf partitions' relcache
in all of them.
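
To illustrate the stale-cache reasoning above, here is a toy model in C. It
is not PostgreSQL code and not the attached patch: ToyRel and all toy_*
functions are hypothetical names invented for this sketch, and the "cache"
is just a flag per relation. It only assumes what is described above: a leaf
partition caches whether it is published, and that cached answer is rebuilt
only after the leaf's own entry has been invalidated.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RELS 8

/* Toy stand-in for a relcache entry; NOT PostgreSQL's RelationData. */
typedef struct ToyRel
{
    int  parent;            /* index of the parent table, or -1 for a root */
    bool in_publication;    /* "catalog" state: explicitly added to a publication */
    bool cache_valid;       /* false means the cached answer must be rebuilt */
    bool cached_published;  /* cached answer to "is this relation published?" */
} ToyRel;

static ToyRel rels[MAX_RELS];
static int    nrels = 0;

/* Forget all relations so a new scenario can start from scratch. */
void toy_reset(void)
{
    nrels = 0;
}

/* Register a relation; parents must be added before their children. */
int toy_add_rel(int parent)
{
    assert(nrels < MAX_RELS);
    assert(parent < nrels);
    rels[nrels] = (ToyRel){ .parent = parent };
    return nrels++;
}

/* "Catalog" lookup: published if the relation or any ancestor is. */
static bool toy_catalog_published(int rel)
{
    for (; rel >= 0; rel = rels[rel].parent)
        if (rels[rel].in_publication)
            return true;
    return false;
}

/* Cached lookup: consults the catalog only when the entry is invalid. */
bool toy_is_published_cached(int rel)
{
    if (!rels[rel].cache_valid)
    {
        rels[rel].cached_published = toy_catalog_published(rel);
        rels[rel].cache_valid = true;
    }
    return rels[rel].cached_published;
}

/* Invalidate a single relation's cached entry. */
void toy_invalidate(int rel)
{
    rels[rel].cache_valid = false;
}

/* Invalidate a relation and, recursively, every partition below it. */
void toy_invalidate_with_descendants(int rel)
{
    toy_invalidate(rel);
    for (int i = 0; i < nrels; i++)
        if (rels[i].parent == rel)
            toy_invalidate_with_descendants(i);
}

/*
 * Add a relation to the publication. With invalidate_descendants = false
 * only the target is invalidated (the behaviour described above); with
 * true, the leaf partitions are invalidated as well.
 */
void toy_publish(int rel, bool invalidate_descendants)
{
    rels[rel].in_publication = true;
    if (invalidate_descendants)
        toy_invalidate_with_descendants(rel);
    else
        toy_invalidate(rel);
}
```

In this model, if a leaf's cached entry is built first (like the DELETE
before CREATE PUBLICATION in the example) and the parent is then published
with invalidate_descendants = false, the leaf keeps answering "not
published". With invalidate_descendants = true, the next lookup on the leaf
rebuilds its entry and sees the publication.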

I have attached patches to fix this bug. The 0001 patch is borrowed from
another thread[1]; it moves the existing relation cache invalidation code
into a common function. The 0002 patch invalidates the leaf partitions'
relcache.

[1] https://www.postgresql.org/message-id/CALDaNm27bs40Rxpy4oKfV97UgsPG%3DvVoZ5bj9pP_4BxnO-6DYA%40mail.gmail.com
Best regards,
Hou zj