Re: row filtering for logical replication - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: row filtering for logical replication
Date
Msg-id CAH2L28sS8=jY6Lj2-xVu6exxQ9gK8uGZr9jf64zDw9hacMrkyA@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: row filtering for logical replication  ("Euler Taveira" <euler@eulerto.com>)
List pgsql-hackers
Hi Euler,

Please find below some review comments,

1. 
   +
   +     <row>
   +      <entry><structfield>prqual</structfield></entry>
   +      <entry><type>pg_node_tree</type></entry>
   +      <entry></entry>
   +      <entry>Expression tree (in <function>nodeToString()</function>
   +      representation) for the relation's qualifying condition</entry>
   +     </row>
I think the docs are being incorrectly updated to add a column to pg_partitioned_table
instead of pg_publication_rel.

2.   +typedef struct PublicationRelationQual
 +{
+       Oid                     relid;
+       Relation        relation;
+       Node       *whereClause;
+} PublicationRelationQual;

Can this be given a more generic name like PublicationRelationInfo, so that the same struct 
can be used to store additional relation information in future, for ex. column names, if column filtering is introduced.

3. Also, in the above structure, it seems that we can do with storing just relid and derive relation information from it
using table_open when needed. Am I missing something?

4.  Currently in logical replication, I noticed that an UPDATE is being applied on the subscriber even if the column values
 are unchanged. Can row-filtering feature be used to change it such that, when all the OLD.columns = NEW.columns, filter out 
the row from being sent to the subscriber. I understand this would need REPLICA IDENTITY FULL to work, but would be an
improvement from the existing state.

On subscriber:

postgres=# select xmin, * from tab_rowfilter_1;
 xmin | a |      b      
------+---+-------------
  555 | 1 | unfiltered
(1 row)

On publisher:
postgres=# ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;
ALTER TABLE
postgres=# update tab_rowfilter_1 SET b = 'unfiltered' where a = 1;
UPDATE 1

On Subscriber:  The xmin has changed indicating the update from the publisher was applied
even though nothing changed. 

postgres=# select xmin, * from tab_rowfilter_1;
 xmin | a |      b      
------+---+-------------
  556 | 1 | unfiltered
(1 row)

5. Currently, any existing rows that were not replicated, when updated to match the publication quals
using UPDATE tab SET pub_qual_column = 'not_filtered' where a = 1; won't be applied, as row 
does not exist on the subscriber.  It would be good if ALTER SUBSCRIBER REFRESH PUBLICATION
would help fetch such existing rows from publishers that match the qual now(either because the row changed
or the qual changed)

Thank you,
Rahila Syed






On Tue, Mar 9, 2021 at 8:35 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
Hi Euler,

Please find some comments below:

1. If the where clause contains non-replica identity columns, the delete performed on a replicated row
 using DELETE from pub_tab where repl_ident_col = n;
is not being replicated, as logical replication does not have any info whether the column has
to be filtered or not. 
Shouldn't a warning be thrown in this case to notify the user that the delete is not replicated.

2. Same for update, even if I update a row to match the quals on publisher, it is still not being replicated to 
the subscriber. (if the quals contain non-replica identity columns). I think for UPDATE at least, the new value
of the non-replicate identity column is available which can be used to filter and replicate the update.

3. 0001.patch, 
Why is the name of the existing ExclusionWhereClause node being changed, if the exact same definition is being used?

For 0002.patch,
4.   +
 +       memset(lrel, 0, sizeof(LogicalRepRelation));

Is this needed, apart from the above, patch does not use or update lrel at all in that function.

5.  PublicationRelationQual and PublicationTable have similar fields, can PublicationTable
be used in place of PublicationRelationQual instead of defining a new struct?

Thank you,
Rahila Syed

pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: hint Consider using pg_file_read()
Next
From: Thomas Munro
Date:
Subject: Re: fdatasync performance problem with large number of DB files