Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Msg-id CAA4eK1Kp8TZ7PqzJBYMhfbQ+62wpYRh_E6eLSPAn=xsCEvOCOg@mail.gmail.com
In response to Re: row filtering for logical replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Tue, Jul 20, 2021 at 2:39 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 7/20/21 7:23 AM, Amit Kapila wrote:
> > On Mon, Jul 19, 2021 at 7:02 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
>
> >> So maybe the best thing is to stick to the simple approach already used
> >> e.g. by pglogical, which simply uses the new row when available (insert,
> >> update) and old one for deletes.
> >>
> >> I think that behaves more or less sensibly and it's easy to explain.
> >>
> >
> > Okay, if nothing better comes up, then we can fall back to this option.
> >
> >> All the other things (e.g. turning UPDATE to INSERT, advanced conflict
> >> resolution etc.) will require a lot of other stuff,
> >>
> >
> > I have not evaluated this yet but I think spending some time thinking
> > about turning Update to Insert/Delete (yesterday's suggestion by
> > Alvaro) might be worthwhile, especially as that approach seems to be
> > followed by some other replication solutions as well.
> >
>
> I think that requires quite a bit of infrastructure, and I'd bet we'll
> need to handle other types of conflicts too.
>

Hmm, I don't see why we need any additional infrastructure here if we
do this at the publisher. I think this could be done without many
changes to the patch as explained in one of my previous emails [1].

> I don't have a clear
> opinion if that's required to get this patch working - I'd try getting
> the simplest implementation with reasonable behavior, with those more
> advanced things as future enhancements.
>
> >> and I see them as
> >> improvements of this simple approach.
> >>
> >>>>> Maybe a second option is to have replication change any UPDATE into
> >>>>> either an INSERT or a DELETE, if the old or the new row do not pass the
> >>>>> filter, respectively.  That way, the databases would remain consistent.
> >>>
> >>> Yeah, I think this is the best way to keep the data consistent.
> >>>
> >>
> >> It'd also require REPLICA IDENTITY FULL, which seems like it'd add a
> >> rather significant overhead.
> >>
> >
> > Why? I think it would just need restrictions similar to those we are
> > planning for the Delete operation, namely that filter columns must be
> > present in either the primary key or the replica identity columns.
> >
>
> How else would you turn UPDATE to INSERT? For UPDATE we only send the
> identity columns and modified columns, and the decision happens on the
> subscriber.
>

Hmm, we log the entire new tuple and the replica identity columns of the
old tuple in WAL for an Update. And we are going to use the new tuple for
the Insert, so we have everything we need.
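
To make the cases concrete, here is a minimal Python sketch of the
publisher-side decision. It is only an illustration, not code from the
patch: the names (Action, transform_update, row_filter) are hypothetical,
and it assumes the filter references only replica identity columns, so it
can be evaluated on the old key tuple as well as on the new tuple. It also
assumes that when no old key is available the identity columns did not
change, so the old row matches the filter exactly when the new row does.

```python
from enum import Enum
from typing import Callable, Mapping, Optional

Row = Mapping[str, object]


class Action(Enum):
    SKIP = "skip"      # neither old nor new row passes the filter
    INSERT = "insert"  # row enters the published set: send the new tuple
    UPDATE = "update"  # row stays in the published set: send as-is
    DELETE = "delete"  # row leaves the published set: send the old key


def transform_update(
    old_key: Optional[Row],
    new_row: Row,
    row_filter: Callable[[Row], bool],
) -> Action:
    """Decide what to publish for an UPDATE (hypothetical helper).

    old_key holds the replica identity columns of the old tuple, or None
    if they were not logged (assumed here to mean they are unchanged).
    """
    new_match = row_filter(new_row)
    # With no old key, the filter columns are assumed unchanged, so the
    # old row matches exactly when the new row does.
    old_match = new_match if old_key is None else row_filter(old_key)

    if old_match and new_match:
        return Action.UPDATE
    if new_match:
        return Action.INSERT
    if old_match:
        return Action.DELETE
    return Action.SKIP
```

For example, with a filter equivalent to "id > 10", an update that moves id
from 5 to 20 would be published as an Insert of the new tuple, and the
reverse change would be published as a Delete using the old key.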


[1] - https://www.postgresql.org/message-id/CAA4eK1%2BAXEd5bO-qPp6L9Ptckk09nbWvP8V7q5UW4hg%2BkHjXwQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.


