Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Msg-id CAA4eK1Jhp9=ZZZ2=ahpWbTiqrZcuu61FTD_moeq4AgQ4Z2Y4Qg@mail.gmail.com
In response to Re: row filtering for logical replication  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: row filtering for logical replication  (Greg Nancarrow <gregn4422@gmail.com>)
List pgsql-hackers
On Fri, Jul 16, 2021 at 10:11 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jul 16, 2021 at 8:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
> > > <tomas.vondra@enterprisedb.com> wrote:
> > > >
> > > > Is there some reasonable rule which of the old/new tuples (or both) to
> > > > use for the WHERE condition? Or maybe it'd be handy to allow referencing
> > > > OLD/NEW as in triggers?
> > >
> > > I think for insert we only replicate rows that match the filter
> > > condition, so when we update a row we should maintain the same
> > > rule, right? That means we should apply the filter at least to the
> > > NEW row, IMHO.  That said, if a row was inserted that satisfied the
> > > filter and was replicated, and we then update it to a value that
> > > does not satisfy the filter, the update will not be replicated.  I
> > > think that makes sense: if an insert does not send a row that fails
> > > the filter to the replica, then why should an update do so, right?
> > >
> >
> > There is another theory in this regard which is what if the old row
> > (created by the previous insert) is not sent to the subscriber as that
> > didn't match the filter but after the update, we decide to send it
> > because the updated row (new row) matches the filter condition. In
> > this case, I think it will generate an update conflict on the
> > subscriber as the old row won't be present. As of now, we just skip
> > the update but in the future, we might have some conflict handling
> > there. If this is true then even if the new row matches the filter,
> > there is no guarantee that it will be applied on the subscriber-side
> > unless the old row also matches the filter.
>
> Yeah, it's a valid point.
>
> > Sure, there could be a
> > case where the user might have changed the filter between insert and
> > update but maybe we can have a separate way to deal with such cases if
> > required like providing some provision where the user can specify
> > whether it would like to match old/new row in updates?
>
> Yeah, I think the best way is to give users an option to apply the
> filter on the old row, on the new row, or on both. In fact, they
> should be able to apply different filters to the old and new rows.
>

I am not so sure about different filters for old and new rows, but it
makes sense to apply the filter to both the old and new rows by
default. We could then also provide a way for the user to specify that
the filter be applied to just the old or just the new row.
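
To make the default described above concrete, here is a minimal Python
sketch of the decision for an UPDATE. It is not code from the patch:
the function name, the "mode" option, and the reading of "both" as
"both rows must match" are assumptions made purely for illustration.

```python
from typing import Callable, Dict

Row = Dict[str, int]
RowFilter = Callable[[Row], bool]


def should_replicate_update(old_row: Row, new_row: Row,
                            row_filter: RowFilter,
                            mode: str = "both") -> bool:
    """Decide whether an UPDATE is sent to the subscriber.

    `mode` is a hypothetical user option: "both" (assumed default,
    old AND new row must match), "old", or "new".
    """
    if mode == "old":
        return row_filter(old_row)
    if mode == "new":
        return row_filter(new_row)
    if mode == "both":
        return row_filter(old_row) and row_filter(new_row)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical filter standing in for a publication WHERE clause (a > 10).
def a_greater_than_10(row: Row) -> bool:
    return row["a"] > 10


# The old row (a = 5) was never sent; the new row (a = 20) matches.
old, new = {"a": 5}, {"a": 20}
send_if_new_only = should_replicate_update(old, new, a_greater_than_10, "new")
send_if_both = should_replicate_update(old, new, a_greater_than_10, "both")
```

With mode "new" the update would be sent even though the subscriber
never received the old row, which is the conflict case discussed
earlier in the thread; with "both" it is not sent.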

>  I have one more thought in mind: currently, we are
> providing a filter for the publication table, doesn't it make sense to
> provide filters for operations of the publication table?  I mean the
> different filters for Insert, delete, and the old row of update and
> the new row of the update.
>

Hmm, I think this sounds like a bit of a stretch, but if there is a
real use case in the field then we can consider it in the future.

-- 
With Regards,
Amit Kapila.


