Re: row filtering for logical replication - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: row filtering for logical replication
Date
Msg-id 202201201313.zaceiqi4qb6h@alvherre.pgsql
Whole thread Raw
In response to RE: row filtering for logical replication  ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses Re: row filtering for logical replication
Re: row filtering for logical replication
RE: row filtering for logical replication
List pgsql-hackers
I was skimming this and the changes in CheckCmdReplicaIdentity caught my
attention.  "Is this code running at the publisher side or the subscriber
side?" I wondered -- because the new error messages being added look
intended to be thrown at the publisher side; but the existing error
messages appear intended for the subscriber side.  Apparently there is
one caller at the publisher side (CheckValidResultRel) and three callers
at the subscriber side.  I'm not fully convinced that this is a problem,
but I think it's not great to have it that way.  Maybe it's okay with
the current coding, but after this patch adds this new errors it is
definitely weird.  Maybe it should split in two routines, and document
more explicitly which is one is for which side.

And while wondering about that, I stumbled upon
GetRelationPublicationActions(), which has a very weird API that it
always returns a palloc'ed block -- but without saying so.  And
therefore, its only caller leaks that memory.  Maybe not critical, but
it looks ugly.  I mean, if we're always going to do a memcpy, why not
use a caller-supplied stack-allocated memory?  Sounds like it'd be
simpler.

And the actual reason I was looking at this code, is that I had stumbled
upon the new GetRelationPublicationInfo() function, which has an even
weirder API:

>  * Get the publication information for the given relation.
>  *
>  * Traverse all the publications which the relation is in to get the
>  * publication actions and validate the row filter expressions for such
>  * publications if any. We consider the row filter expression as invalid if it
>  * references any column which is not part of REPLICA IDENTITY.
>  *
>  * To avoid fetching the publication information, we cache the publication
>  * actions and row filter validation information.
>  *
>  * Returns the column number of an invalid column referenced in a row filter
>  * expression if any, InvalidAttrNumber otherwise.
>  */
> AttrNumber
> GetRelationPublicationInfo(Relation relation, bool validate_rowfilter)

"Returns *an* invalid column referenced in a RF if any"?  That sounds
very strange.  And exactly what info is it getting, given that there is
no actual returned info?  Maybe this was meant to be "validate RF
expressions" and return, perhaps, a bitmapset of all invalid columns
referenced?  (What is an invalid column in the first place?)


In many function comments you see things like "Check, if foo is bar" or
"Returns true, if blah".  These commas there needs to be removed.

Thanks

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)



pgsql-hackers by date:

Previous
From: Dipesh Pandit
Date:
Subject: Re: refactoring basebackup.c
Next
From: James Coleman
Date:
Subject: Re: Add last commit LSN to pg_last_committed_xact()