Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Date
Msg-id CAA4eK1L3Cw4evSxRaqrwmWZQodQTkkDP9gfqj0N8FUQVDwhRcw@mail.gmail.com
In response to Re: row filtering for logical replication  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: row filtering for logical replication
List pgsql-hackers
On Thu, Jan 20, 2022 at 6:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> And the actual reason I was looking at this code, is that I had stumbled
> upon the new GetRelationPublicationInfo() function, which has an even
> weirder API:
>
> >  * Get the publication information for the given relation.
> >  *
> >  * Traverse all the publications which the relation is in to get the
> >  * publication actions and validate the row filter expressions for such
> >  * publications if any. We consider the row filter expression as invalid if it
> >  * references any column which is not part of REPLICA IDENTITY.
> >  *
> >  * To avoid fetching the publication information, we cache the publication
> >  * actions and row filter validation information.
> >  *
> >  * Returns the column number of an invalid column referenced in a row filter
> >  * expression if any, InvalidAttrNumber otherwise.
> >  */
> > AttrNumber
> > GetRelationPublicationInfo(Relation relation, bool validate_rowfilter)
>
> "Returns *an* invalid column referenced in a RF if any"?  That sounds
> very strange.  And exactly what info is it getting, given that there is
> no actual returned info?
>

It returns an invalid column referenced in a row filter, if there is
one. Otherwise, it helps build pubactions, which the caller needs at a
later point anyway. The idea is that since we are already traversing
the publications, we should gather as much information as possible. I
think the API name is probably misleading; maybe we should name it
something like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or
something along these lines?

>  Maybe this was meant to be "validate RF
> expressions" and return, perhaps, a bitmapset of all invalid columns
> referenced?
>

Currently, we stop as soon as we find the first invalid column.

>  (What is an invalid column in the first place?)
>

A column that is referenced in the row filter but is not part of
Replica Identity.
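
To make the definition concrete, here is a minimal standalone sketch of the
check, not the code from the patch. It assumes the columns referenced by the
row filter and the replica identity columns are already available as plain
arrays. The names AttrNum, INVALID_ATTR, in_replica_identity and
first_invalid_filter_column are hypothetical stand-ins; only the convention
that attribute number 0 means "invalid" mirrors InvalidAttrNumber.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for AttrNumber / InvalidAttrNumber. */
typedef short AttrNum;
#define INVALID_ATTR ((AttrNum) 0)

/* Return true if attno is one of the replica identity columns. */
bool
in_replica_identity(AttrNum attno, const AttrNum *ident, size_t nident)
{
    for (size_t i = 0; i < nident; i++)
    {
        if (ident[i] == attno)
            return true;
    }
    return false;
}

/*
 * Return the first column referenced by the row filter that is not part of
 * the replica identity, or INVALID_ATTR if every referenced column is
 * covered.  The scan stops at the first offending column.
 */
AttrNum
first_invalid_filter_column(const AttrNum *filter_cols, size_t nfilter,
                            const AttrNum *ident, size_t nident)
{
    /* A NULL column array is only acceptable for an empty filter. */
    assert(filter_cols != NULL || nfilter == 0);

    for (size_t i = 0; i < nfilter; i++)
    {
        if (!in_replica_identity(filter_cols[i], ident, nident))
            return filter_cols[i];
    }
    return INVALID_ATTR;
}
```

With replica identity columns {1, 2}, a filter that references columns
{1, 3, 4} is reported as invalid at column 3, and column 4 is never examined.
Returning a set of all invalid columns instead would mean continuing the loop
and collecting each offender.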

-- 
With Regards,
Amit Kapila.


