Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Column Filtering in Logical Replication
Date
Msg-id e4b0fb99-3d74-7333-1e71-74f990b641ee@enterprisedb.com
Whole thread Raw
In response to RE: Column Filtering in Logical Replication  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: Column Filtering in Logical Replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 3/9/22 11:12, houzj.fnst@fujitsu.com wrote:
> Hi,
> 
> Here are some tests and results about the table sync query of
> column filter patch and row filter.
> 
> 1) multiple publications which publish schema of parent table and partition.
> ----pub
> create schema s1;
> create table s1.t (a int, b int, c int) partition by range (a);
> create table t_1 partition of s1.t for values from (1) to (10);
> create publication pub1 for all tables in schema s1;
> create publication pub2 for table t_1(b);
> 
> ----sub
> - prepare tables
> CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub1, pub2;
> 
> When doing table sync for 't_1', the column list will be (b). I think it should
> be no filter because table t_1 is also published via ALL TABLES IN SCHMEA
> publication.
> 
> For Row Filter, it will use no filter for this case.
> 
> 
> 2) one publication publishes both parent and child
> ----pub
> create table t (a int, b int, c int) partition by range (a);
> create table t_1 partition of t for values from (1) to (10)
>        partition by range (a);
> create table t_2 partition of t_1 for values from (1) to (10);
> 
> create publication pub2 for table t_1(a), t_2
>   with (PUBLISH_VIA_PARTITION_ROOT);
> 
> ----sub
> - prepare tables
> CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub2;
> 
> When doing table sync for table 't_1', it has no column list. I think the
> expected column list is (a).
> 
> For Row Filter, it will use the row filter of the top most parent table(t_1) in
> this case.
> 
> 
> 3) one publication publishes both parent and child
> ----pub
> create table t (a int, b int, c int) partition by range (a);
> create table t_1 partition of t for values from (1) to (10)
>        partition by range (a);
> create table t_2 partition of t_1 for values from (1) to (10);
> 
> create publication pub2 for table t_1(a), t_2(b)
>   with (PUBLISH_VIA_PARTITION_ROOT);
> 
> ----sub
> - prepare tables
> CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub2;
> 
> When doing table sync for table 't_1', the column list would be (a, b). I think
> the expected column list is (a).
> 
> For Row Filter, it will use the row filter of the top most parent table(t_1) in
> this case.
> 

Attached is an updated patch version, addressing all of those issues.

0001 is a bugfix, reworking how we calculate publish_as_relid. The old
approach was unstable with multiple publications, giving different
results depending on order of the publications. This should be
backpatched into PG13 where publish_via_partition_root was introduced, I
think.

0002 is the main patch, merging the changes proposed by Peter and fixing
the issues reported here. In most cases this means adopting the code
used for row filters, and perhaps simplifying it a bit.


But I also tried to implement a row-filter test for 0001, and I'm not
sure I understand the behavior I observe. Consider this:

-- a chain of 3 partitions (on both publisher and subscriber)
CREATE TABLE test_part_rf (a int primary key, b int, c int)
       PARTITION BY LIST (a);

CREATE TABLE test_part_rf_1
       PARTITION OF test_part_rf FOR VALUES IN (1,2,3,4,5)
       PARTITION BY LIST (a);

CREATE TABLE test_part_rf_2
       PARTITION OF test_part_rf_1 FOR VALUES IN (1,2,3,4,5);

-- initial data
INSERT INTO test_part_rf VALUES (1, 5, 100);
INSERT INTO test_part_rf VALUES (2, 15, 200);

-- two publications, each adding a different partition
CREATE PUBLICATION test_pub_part_1 FOR TABLE test_part_rf_1
 WHERE (b < 10) WITH (publish_via_partition_root);

CREATE PUBLICATION test_pub_part_2 FOR TABLE test_part_rf_2
 WHERE (b > 10) WITH (publish_via_partition_root);

-- now create the subscription (also try opposite ordering)
CREATE SUBSCRIPTION test_part_sub CONNECTION '...'
       PUBLICATION test_pub_part_1, test_pub_part_2;

-- wait for sync

-- inert some more data
INSERT INTO test_part_rf VALUES (3, 6, 300);
INSERT INTO test_part_rf VALUES (4, 16, 400);

-- wait for catchup

Now, based on the discussion here, my expectation is that we'll use the
row filter from the top-most ancestor in any publication, which in this
case is test_part_rf_1. Hence the filter should be (b < 10).

So I'd expect these rows to be replicated:

1,5,100
3,6,300

But that's not what I get, unfortunately. I get different results,
depending on the order of publications:

1) test_pub_part_1, test_pub_part_2

1|5|100
2|15|200
3|6|300
4|16|400

2) test_pub_part_2, test_pub_part_1

3|6|300
4|16|400

That seems pretty bizarre, because it either means we're not enforcing
any filter or some strange combination of filters (notice that for (2)
we skip/replicate rows matching either filter).

I have to be missing something important, but this seems confusing.
There's a patch adding a simple test case to 028_row_filter.sql (named
.txt, so as not to confuse cfbot).


FWIW I'm not convinced applying just the filters (both row and column)
is the right approach. It might be OK for a single publication, but with
multiple publications not so much. If you list multiple publications for
a subscription, it seems natural to expect a union of all the data, a
bit as if there were multiple subscriptions. But what you actually get
is some subset, depending on what other relations the other publications
include.

Of course, this only happens if the publications include different
ancestors. If all include the same ancestor, everything works fine and
you get the "union" of data.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: role self-revocation
Next
From: Tomas Vondra
Date:
Subject: Re: Column Filtering in Logical Replication