Re: Column Filtering in Logical Replication - Mailing list pgsql-hackers

From Ibrar Ahmed
Subject Re: Column Filtering in Logical Replication
Date
Msg-id CALtqXTeKbzT16ouuiKCDxWM8PWGaSpKnU3MM_5CCyXKGkW-bkQ@mail.gmail.com
Whole thread Raw
In response to Re: Column Filtering in Logical Replication  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-hackers


On Tue, Jul 13, 2021 at 7:44 PM Rahila Syed <rahilasyed90@gmail.com> wrote:

Hi Tomas,

Thank you for your comments.
 

>  
> Currently, this capability is not included in the patch. If the table on
> the subscriber
> server has lesser attributes than that on the publisher server, it
> throws an error at the 
> time of CREATE SUBSCRIPTION.
>

That's a bit surprising, to be honest. I do understand the patch simply
treats the filtered columns as "unchanged" because that's the simplest
way to filter the *data* of the columns. But if someone told me we can
"filter columns" I'd expect this to work without the columns on the
subscriber.

OK, I will look into adding this. 
 

> However, need to carefully consider situations in which a server
> subscribes to multiple 
> publications,  each publishing a different subset of columns of a table. 
Isn't that pretty much the same situation as for multiple subscriptions
each with a different set of I/U/D operations? IIRC we simply merge
those, so why not to do the same thing here and merge the attributes?

 
Yeah, I agree with the solution to merge the attributes, similar to how 
operations are merged. My concern was also from an implementation point 
of view, will it be a very drastic change. I now had a look at how remote relation
attributes are acquired for comparison with local attributes at the subscriber. 
It seems that the publisher will need to send the information about the filtered columns 
for each publication specified during CREATE SUBSCRIPTION.
This will be read at the subscriber side which in turn updates its cache accordingly. 
Currently, the subscriber expects all attributes of a published relation to be present. 
I will add code for this in the next version of the patch.

 To nitpick, I find "Bitmapset *att_list" a bit annoying, because it's  
not really a list ;-)
 
I will make this change with the next version
 
 
 FWIW "make check" fails for me with this version, due to segfault in
OpenTableLists. Apparenly there's some confusion - the code expects the
list to contain PublicationTable nodes, and tries to extract the
RangeVar from the elements. But the list actually contains RangeVar, so
this crashes and burns. See the attached backtrace.

 
Thank you for the report, This is fixed in the attached version, now all publication
function calls accept the PublicationTableInfo list.

Thank you,
Rahila Syed

    

The patch does not apply, and an rebase is required
Hunk #8 succeeded at 1259 (offset 99 lines).
Hunk #9 succeeded at 1360 (offset 99 lines).
1 out of 9 hunks FAILED -- saving rejects to file src/backend/replication/pgoutput/pgoutput.c.rej
patching file src/include/catalog/pg_publication.h

Changing the status to "Waiting on Author"

--
Ibrar Ahmed

pgsql-hackers by date:

Previous
From: Ibrar Ahmed
Date:
Subject: Re: Re[3]: On login trigger: take three
Next
From: Ibrar Ahmed
Date:
Subject: Re: shared-memory based stats collector