Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Column Filtering in Logical Replication
Date
Msg-id CAA4eK1JyoVty_BZOMJdx7VMkBTadvv8oOYrh9xTQ5BY+k-3+OA@mail.gmail.com
Whole thread Raw
In response to Re: Column Filtering in Logical Replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Column Filtering in Logical Replication
List pgsql-hackers
On Wed, Mar 2, 2022 at 5:43 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Attached is an updated patch, addressing most of the issues reported so
> far. There are various minor tweaks, but the main changes are:
...
>
> 3) checks of column filter vs. publish_via_partition_root and replica
> identity, following the same logic as the row-filter patch (hopefully,
> it touches the same places, using the same logic, ...)
>
> That means - with "publish_via_partition_root=false" it's not allowed to
> specify column filters on partitioned tables, only for leaf partitions.
>
> And we check column filter vs. replica identity when adding tables to
> publications, or whenever we change the replica identity.
>

This handling is different from row filter work and I see problems
with it. The column list validation w.r.t primary key (default replica
identity) is missing. The handling of column list vs. partitions has
multiple problems: (a) In attach partition, the patch is just checking
ancestors for RI validation but what if the table being attached has
further subpartitions; (b) I think the current locking also seems to
have problems because it is quite possible that while it validates the
ancestors here, concurrently someone changes the column list. I think
it won't be enough to just change the locking mode because with the
current patch strategy during attach, we will be first taking locks
for child tables of current partition and then parent tables which can
pose deadlock hazards.

The columns list validation also needs to be done when we change
publication action.

There could be more similar problems which I might have missed. For
some of these (except for concurrency issues), my colleague Shi-San
has done testing and the results are below [1]. I feel we should do RI
vs. column list handling similar to row filter work (at one place) to
avoid all such hazards and possibly similar handling at various
places, there is a good chance that we will miss some places or make
mistakes that are not easy to catch. Do let me know if you think it
makes sense for me or one of the people who work on row filter patch
to try this (make the handling of RI checks similar to row filter
work) and then we can see if that turns out to be a simple way to deal
with all these problems?

Some other miscellaneous comments:
=============================
*
In get_rel_sync_entry(), the handling for partitioned tables doesn't
seem to be correct. It can publish a different set of columns based on
the order of publications specified in the subscription.

For example:
----
create table parent (a int, b int, c int) partition by range (a);
create table test_part1 (like parent);
alter table parent attach partition test_part1 for values from (1) to (10);

create publication pub for table parent(a) with (PUBLISH_VIA_PARTITION_ROOT);
create publication pub2 for table test_part1(b);
---

Now, depending on the order of publications in the list while defining
subscription, the column list will change
----
create subscription sub connection 'port=10000 dbname=postgres'
publication pub, pub2;

For the above, column list will be: (a)

create subscription sub connection 'port=10000 dbname=postgres'
publication pub2, pub;

For this one, the column list will be: (a, b)
----

To avoid this, the column list should be computed based on the final
publish_as_relid as we are doing for the row filter.

*
Fetching column filter info in tablesync.c is quite expensive. It
seems to be using four round-trips to get the complete info whereas
for row-filter we use just one round trip. I think we should try to
get both row filter and column filter info in just one round trip.

[1] -
Test-1:
The patch doesn't check when the primary key changes.

e.g.
-- publisher --
create table tbl(a int primary key, b int);
create publication pub for table tbl(a);
alter table tbl drop CONSTRAINT tbl_pkey;
alter table tbl add primary key (b);
insert into tbl values (1,1);

-- subscriber --
create table tbl(a int, b int);
create subscription sub connection 'port=5432 dbname=postgres' publication pub;
update tbl set b=1 where a=1;
alter table tbl add primary key (b);

-- publisher --
delete from tbl;

Column "b" is part of replica identity, but it is filtered, which
caused an error on the subscriber side.

ERROR:  publisher did not send replica identity column expected by the
logical replication target relation "public.tbl"
CONTEXT:  processing remote data during "DELETE" for replication
target relation "public.tbl" in transaction 724 at 2022-03-04
11:46:16.330892+08

Test-2: Partitioned table RI w.r.t column list.
2.1
Using "create table ... partition of".

e.g.
-- publisher --
create table parent (a int, b int) partition by range (a);
create publication pub for table parent(a)
with(publish_via_partition_root=true);
create table child partition of parent (primary key (a,b)) default;
insert into parent values (1,1);

-- subscriber --
create table parent (a int, b int) partition by range (a);
create table child partition of parent default;
create subscription sub connection 'port=5432 dbname=postgres'
publication pub; update child set b=1 where a=1;
alter table parent add primary key (a,b);

-- publisher --
delete from parent;

Column "b" is part of replica identity in the child table, but it is
filtered, which caused an error on the subscriber side.

ERROR:  publisher did not send replica identity column expected by the
logical replication target relation "public.parent"
CONTEXT:  processing remote data during "DELETE" for replication
target relation "public.parent" in transaction 723 at 2022-03-04
15:15:39.776949+08

2.2
It is likely that a table to be attached also has a partition.

e.g.
-- publisher --
create table t1 (a int, b int) partition by range (a);
create publication pub for table t1(b) with(publish_via_partition_root=true);
create table t2 (a int, b int) partition by range (a);
create table t3 (a int primary key, b int);
alter table t2 attach partition t3 default;
alter table t1 attach partition t2 default;
insert into t1 values (1,1);

-- subscriber --
create table t1 (a int, b int) partition by range (a);
create table t2 (a int, b int) partition by range (a);
create table t3 (a int, b int);
alter table t2 attach partition t3 default;
alter table t1 attach partition t2 default;
create subscription sub connection 'port=5432 dbname=postgres' publication pub;
update t1 set a=1 where b=1;
alter table t1 add primary key (a);

-- publisher --
delete from t1;

Column "a" is part of replica identity in table t3, but t3's ancestor
t1 is published with column "a" filtered, which caused an error on the
subscriber side.

ERROR:  publisher did not send replica identity column expected by the
logical replication target relation "public.t1"
CONTEXT:  processing remote data during "DELETE" for replication
target relation "public.t1" in transaction 726 at 2022-03-04
14:40:29.297392+08

3.
Using "alter publication pub set(publish='...'); "

e.g.
-- publisher --
create table tbl(a int primary key, b int); create publication pub for
table tbl(b) with(publish='insert'); insert into tbl values (1,1);

-- subscriber --
create table tbl(a int, b int);
create subscription sub connection 'port=5432 dbname=postgres' publication pub;

-- publisher --
alter publication pub set(publish='insert,update');

-- subscriber --
update tbl set a=1 where b=1;
alter table tbl add primary key (b);

-- publisher --
update tbl set a=2 where a=1;

Updates are replicated, and the column "a" is part of replica
identity, but it is filtered, which caused an error on the subscriber
side.

ERROR:  publisher did not send replica identity column expected by the
logical replication target relation "public.tbl"
CONTEXT:  processing remote data during "UPDATE" for replication
target relation "public.tbl" in transaction 723 at 2022-03-04
11:56:33.905843+08

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_tablespace_location() failure with allow_in_place_tablespaces
Next
From: Andrey Lepikhov
Date:
Subject: Re: Removing unneeded self joins