RE: row filtering for logical replication - Mailing list pgsql-hackers
From | houzj.fnst@fujitsu.com |
---|---|
Subject | RE: row filtering for logical replication |
Date | |
Msg-id | OS0PR01MB57168FD9932E3F42406EB13B94629@OS0PR01MB5716.jpnprd01.prod.outlook.com |
In response to | Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: row filtering for logical replication
Re: row filtering for logical replication |
List | pgsql-hackers |
On Wed, Nov 24, 2021 1:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Nov 24, 2021 at 6:51 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Tue, Nov 23, 2021 at 1:29 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > > On Thu, Nov 18, 2021 at 7:04 AM Peter Smith
> > > > > <smithpb2250@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > PSA new set of v40* patches.
> > > > >
> > > > > Few comments:
> > > > > 1) When a table is added to the publication, replica identity is
> > > > > checked. But while modifying the publish action to include
> > > > > delete/update, replica identity is not checked for the existing
> > > > > tables. I felt it should be checked for the existing tables too.
> > > >
> > > > In addition to this, I think we might also need some check to
> > > > prevent user from changing the REPLICA IDENTITY index which is used in
> > > > the filter expression.
> > > >
> > > > I was thinking is it possible do the check related to REPLICA
> > > > IDENTITY in function CheckCmdReplicaIdentity() or In
> > > > GetRelationPublicationActions(). If we move the REPLICA IDENTITY
> > > > check to this function, it would be consistent with the existing
> > > > behavior about the check related to REPLICA IDENTITY(see the
> > > > comments in CheckCmdReplicaIdentity) and seems can cover all the cases
> > > > mentioned above.
> > >
> > > Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
> > > would cover all the above cases but I think that would put a premium
> > > on each update/delete operation. I think traversing the expression
> > > tree (it could be multiple traversals if the relation is part of
> > > multiple publications) during each update/delete would be costly.
> > > Don't you think so?
> >
> > Yes, I agreed that traversing the expression every time would be costly.
> >
> > I thought maybe we can cache the columns used in row filter or cache
> > only the a
> > flag(can_update|delete) in the relcache. I think every operation that
> > affect the row-filter or replica-identity will invalidate the relcache
> > and the cost of check seems acceptable with the cache.
> >
>
> I think if we can cache this information especially as a bool flag then that should
> probably be better.

While researching and writing a top-up patch for this, I found a possible
issue that I'd like to confirm first.

It's possible for a table to be published in two publications A and B, where
publication A only publishes "insert" and publication B publishes "update".
On UPDATE, the row filters in both A and B will be executed. Is this behavior
expected?

For example:

---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');

---- Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=10000' PUBLICATION A,B;

---- Publication
update tbl1 set a = 2;

The publications can be created, and on UPDATE the row filter in A (b<2) will
also be executed, even though the column it references is not part of the
replica identity.

(I am not against this behavior; I just want to confirm it.)

Best regards,
Hou zj
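
To make the scenario above concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: the `Publication` class and the `filters_outside_replica_identity` helper are hypothetical names invented for illustration. It compares the behavior described above (every row filter is evaluated on UPDATE) with an assumed alternative in which only publications that publish the given action are considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """Toy stand-in for a publication entry on one table (hypothetical)."""
    name: str
    actions: frozenset         # published actions, e.g. {"insert"}
    filter_columns: frozenset  # columns referenced by the row filter


def filters_outside_replica_identity(pubs, replica_identity, action,
                                     only_matching_action):
    """Return the names of publications whose row filter would be evaluated
    for `action` and references a column outside the replica identity.

    only_matching_action=False models the behavior described in the mail:
    every publication's filter is evaluated regardless of its publish list.
    only_matching_action=True models an assumed alternative that skips
    publications which do not publish `action`.
    """
    offending = []
    for pub in pubs:
        if only_matching_action and action not in pub.actions:
            continue
        if not pub.filter_columns <= replica_identity:
            offending.append(pub.name)
    return offending


# The example from the mail: tbl1(a int primary key, b int).
pubs = [
    Publication("A", frozenset({"insert"}), frozenset({"b"})),  # where (b<2)
    Publication("B", frozenset({"update"}), frozenset({"a"})),  # where (a>1)
]
replica_identity = frozenset({"a"})  # the primary key column

# Every filter evaluated on UPDATE: A's filter uses non-identity column b.
print(filters_outside_replica_identity(pubs, replica_identity, "update", False))
# Only publications that publish UPDATE are considered: nothing offends.
print(filters_outside_replica_identity(pubs, replica_identity, "update", True))
```

In this model a cached per-relation bool flag would amount to storing whether the returned list is empty, recomputed whenever the filters or the replica identity change.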