Re: row filtering for logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: row filtering for logical replication
Date
Msg-id CAA4eK1LExOLLAM9eNLedfpLsmWR_0piRtAvSG7bUJjbL5UeHog@mail.gmail.com
Whole thread Raw
In response to Re: row filtering for logical replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sat, Sep 25, 2021 at 3:07 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 9/24/21 8:09 AM, Amit Kapila wrote:
> > On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> 13) turning update into insert
> >>
> >> I agree with Ajin Cherian [4] that looking at just old or new row for
> >> updates is not the right solution, because each option will "break" the
> >> replica in some case. So I think the goal "keeping the replica in sync"
> >> is the right perspective, and converting the update to insert/delete if
> >> needed seems appropriate.
> >>
> >> This seems a somewhat similar to what pglogical does, because that may
> >> also convert updates (although only to inserts, IIRC) when handling
> >> replication conflicts. The difference is pglogical does all this on the
> >> subscriber, while this makes the decision on the publisher.
> >>
> >> I wonder if this might have some negative consequences, or whether
> >> "moving" this to downstream would be useful for other purposes in the
> >> fuure (e.g. it might be reused for handling other conflicts).
> >>
> >
> > Apart from additional traffic, I am not sure how will we handle all
> > the conditions on subscribers, say if the new row doesn't match, how
> > will subscribers know about this unless we pass row_filter or some
> > additional information along with tuple. Previously, I have done some
> > research and shared in one of the emails above that IBM's InfoSphere
> > Data Replication [1] performs filtering in this way which also
> > suggests that we won't be off here.
> >
>
> I'm certainly not suggesting what we're doing is wrong. Given the design
> of built-in logical replication it makes sense doing it this way, I was
> just thinking aloud about what we might want to do in the future (e.g.
> pglogical uses this to deal with conflicts between multiple sources, and
> so on).
>

Fair enough.

> >>
> >>
> >> 15) pgoutput_row_filter initializing filter
> >>
> >> I'm not sure I understand why the filter initialization gets moved from
> >> get_rel_sync_entry. Presumably, most of what the replication does is
> >> replicating rows, so I see little point in not initializing this along
> >> with the rest of the rel_sync_entry.
> >>
> >
> > Sorry, IIRC, this has been suggested by me and I thought it was best
> > to do any expensive computation the first time it is required. I have
> > shared few cases like in [2] where it would lead to additional cost
> > without any gain. Unless I am missing something, I don't see any
> > downside of doing it in a delayed fashion.
> >
>
> Not sure, but the arguments presented there seem a bit wonky ...
>
> Yes, the work would be wasted if we discard the cached data without
> using it (it might happen for truncate, I'm not sure). But how likely is
> it that such operations happen *in isolation*? I'd bet the workload is
> almost never just a stream of truncates - there are always some
> operations in between that would actually use this.
>

It could also happen with a mix of truncate and other operations as we
decide whether to publish an operation or not after
get_rel_sync_entry.

> Similarly for the errors - IIRC hitting an error means the replication
> restarts, which is orders of magnitude more expensive than anything we
> can save by this delayed evaluation.
>
> I'd keep it simple, for the sake of simplicity of the whole patch.
>

The current version proposed by Peter is not reviewed yet and by
looking at it I have some questions too which I'll clarify in a
separate email. I am not sure if you are against delaying the
expression initialization because of the current code or concept as a
general because if it is later then we have other instances as well
when we don't do all the work in get_rel_sync_entry like building
tuple conversion map which is cached as well.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: Fixing WAL instability in various TAP tests
Next
From: Amit Kapila
Date:
Subject: Re: row filtering for logical replication