Re: row filtering for logical replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: row filtering for logical replication
Date
Msg-id CAFiTN-tLvpxkGMV9Vutv1bwV7-DpgRJaXL4b8h9Tjj9Aho4QXg@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: row filtering for logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Jul 16, 2021 at 8:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> > >
> > > Is there some reasonable rule which of the old/new tuples (or both) to
> > > use for the WHERE condition? Or maybe it'd be handy to allow referencing
> > > OLD/NEW as in triggers?
> >
> > I think for insert we are only allowing those rows to replicate which
> > are matching filter conditions, so if we updating any row then also we
> > should maintain that sanity right? That means at least on the NEW rows
> > we should apply the filter, IMHO.  Said that, now if there is any row
> > inserted which were satisfying the filter and replicated, if we update
> > it with the new value which is not satisfying the filter then it will
> > not be replicated,  I think that makes sense because if an insert is
> > not sending any row to a replica which is not satisfying the filter
> > then why update has to do that, right?
> >
>
> There is another theory in this regard which is what if the old row
> (created by the previous insert) is not sent to the subscriber as that
> didn't match the filter but after the update, we decide to send it
> because the updated row (new row) matches the filter condition. In
> this case, I think it will generate an update conflict on the
> subscriber as the old row won't be present. As of now, we just skip
> the update but in the future, we might have some conflict handling
> there. If this is true then even if the new row matches the filter,
> there is no guarantee that it will be applied on the subscriber-side
> unless the old row also matches the filter.

Yeah, it's a valid point.

 Sure, there could be a
> case where the user might have changed the filter between insert and
> update but maybe we can have a separate way to deal with such cases if
> required like providing some provision where the user can specify
> whether it would like to match old/new row in updates?

Yeah, I think the best way is that users should get an option whether
they want to apply the filter on the old row or on the new row, or
both, in fact, they should be able to apply the different filters on
old and new rows.  I have one more thought in mind: currently, we are
providing a filter for the publication table, doesn't it make sense to
provide filters for operations of the publication table?  I mean the
different filters for Insert, delete, and the old row of update and
the new row of the update.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: "r.takahashi_2@fujitsu.com"
Date:
Subject: RE: Speed up COMMIT PREPARED
Next
From: Michael Paquier
Date:
Subject: Re: Introduce pg_receivewal gzip compression tests