RE: row filtering for logical replication - Mailing list pgsql-hackers

From tanghy.fnst@fujitsu.com
Subject RE: row filtering for logical replication
Date
Msg-id OS0PR01MB6113D82113AA081ACF710D0CFB6E9@OS0PR01MB6113.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: row filtering for logical replication
Re: row filtering for logical replication
List pgsql-hackers
On Friday, December 3, 2021 10:09 AM Peter Smith <smithpb2250@gmail.com> wrote:
> 
> On Thu, Dec 2, 2021 at 2:32 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Thursday, December 2, 2021 5:21 AM Peter Smith
> <smithpb2250@gmail.com> wrote:
> > >
> > > PSA the v44* set of patches.
> > >
> >
> > Thanks for the new patch. Few comments:
> >
> > 1. This is an example in publication doc, but in fact it's not allowed. Should we
> > change this example?
> >
> > +CREATE PUBLICATION active_departments FOR TABLE departments WHERE
> (active IS TRUE);
> >
> > postgres=# CREATE PUBLICATION active_departments FOR TABLE departments
> WHERE (active IS TRUE);
> > ERROR:  invalid publication WHERE expression for relation "departments"
> > HINT:  only simple expressions using columns, constants and immutable system
> functions are allowed
> >
> 
> Thanks for finding this. Actually, the documentation looks correct to
> me. The problem was the validation walker of patch 0002 was being
> overly restrictive. It needed to also allow a BooleanTest node.
> 
> Now it works (locally) for me. For example.
> 
> test_pub=# create table departments(depno int primary key, active boolean);
> CREATE TABLE
> test_pub=# create publication pdept for table departments where
> (active is true) with (publish="insert");
> CREATE PUBLICATION
> test_pub=# create publication pdept2 for table departments where
> (active is false) with (publish="insert");
> CREATE PUBLICATION
> 
> This fix will be available in v45*.
> 

Thanks for looking into it.

I have another problem with your patch. The document says:

... If the subscription has several publications in
+   which the same table has been published with different filters, those
+   expressions get OR'ed together so that rows satisfying any of the expressions
+   will be replicated. Notice this means if one of the publications has no filter
+   at all then all other filters become redundant.

Then, what if one of the publications is specified as 'FOR ALL TABLES' or 'FOR
ALL TABLES IN SCHEMA'.

For example:
create table tbl (a int primary key);"
create publication p1 for table tbl where (a > 10);
create publication p2 for all tables;
create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;

I think for "FOR ALL TABLE" publication(p2 in my case), table tbl should be
treated as no filter, and table tbl should have no filter in subscription sub. Thoughts?

But for now, the filter(a > 10) works both when copying initial data and later changes.

To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
publication or published as part of schema in function pgoutput_row_filter_init
(which was introduced in v44-0003 patch), also we need to make some changes in
tablesync.c.

Regards
Tang

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Do we need pre-allocate WAL files during end-of-recovery checkpoint?
Next
From: Alexander Lakhin
Date:
Subject: Re: MSVC SSL test failure