RE: row filtering for logical replication - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: row filtering for logical replication
Date
Msg-id OS0PR01MB571645EA968376063E63BB60945B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: row filtering for logical replication
List pgsql-hackers
On Thur, Jan 20, 2022 10:26 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> 
> On 2022-Jan-20, Amit Kapila wrote:
> 
> > It returns an invalid column referenced in an RF if any but if not
> > then it helps to form pubactions which is anyway required at a later
> > point in the caller. The idea is that when we are already traversing
> > publications we should store/gather as much info as possible.
> 
> I think this design isn't quite awesome.
> 
> > I think probably the API name is misleading, maybe we should name it
> > something like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or
> > something along these lines?
> 
> Maybe RelationBuildReplicationPublicationDesc or just
> RelationBuildPublicationDesc are good names for a routine that fill in
> the publication aspect of the relcache entry, as a parallel to
> RelationBuildPartitionDesc.
> 
> > >  Maybe this was meant to be "validate RF
> > > expressions" and return, perhaps, a bitmapset of all invalid columns
> > > referenced?
> >
> > Currently, we stop as soon as we find the first invalid column.
> 
> That seems quite strange.  (And above you say "gather as much info as
> possible", so why stop at the first one?)
> 
> > >  (What is an invalid column in the first place?)
> >
> > A column that is referenced in the row filter but is not part of
> > Replica Identity.
> 
> I do wonder how do these invalid columns reach the table definition in
> the first place.  Shouldn't these be detected at DDL time and prohibited
> from getting into the definition?

Personally, I'm a little hesitant to put the check at DDL level, because
adding check at DDLs like ATTACH PARTITION/CREATE PARTITION OF ( [1]
explained why we need to check these DDLs) looks a bit restrictive and
user might also complain about that. Put the check in
CheckCmdReplicaIdentity seems more acceptable because it is consistent
with the existing behavior which has few complaints from users AFAIK.

[1] https://www.postgresql.org/message-id/CAA4eK1%2Bm45Xyzx7AUY9TyFnB6CZ7_%2B_uooPb7WHSpp7UE%3DYmKg%40mail.gmail.com

Best regards,
Hou zj

pgsql-hackers by date:

Previous
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Skipping logical replication transactions on subscriber side
Next
From: Amit Kapila
Date:
Subject: Re: Skipping logical replication transactions on subscriber side