Re: row filtering for logical replication - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: row filtering for logical replication |
| Date | |
| Msg-id | CAA4eK1LExOLLAM9eNLedfpLsmWR_0piRtAvSG7bUJjbL5UeHog@mail.gmail.com |
| In response to | Re: row filtering for logical replication (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
| List | pgsql-hackers |
On Sat, Sep 25, 2021 at 3:07 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 9/24/21 8:09 AM, Amit Kapila wrote:
> > On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> 13) turning update into insert
> >>
> >> I agree with Ajin Cherian [4] that looking at just the old or new row
> >> for updates is not the right solution, because each option will "break"
> >> the replica in some case. So I think the goal "keeping the replica in
> >> sync" is the right perspective, and converting the update to
> >> insert/delete if needed seems appropriate.
> >>
> >> This seems somewhat similar to what pglogical does, because it may
> >> also convert updates (although only to inserts, IIRC) when handling
> >> replication conflicts. The difference is pglogical does all this on the
> >> subscriber, while this makes the decision on the publisher.
> >>
> >> I wonder if this might have some negative consequences, or whether
> >> "moving" this downstream would be useful for other purposes in the
> >> future (e.g. it might be reused for handling other conflicts).
> >>
> >
> > Apart from the additional traffic, I am not sure how we will handle all
> > the conditions on the subscriber; say, if the new row doesn't match, how
> > will the subscriber know about this unless we pass the row_filter or
> > some additional information along with the tuple? Previously, I did some
> > research and shared in one of the emails above that IBM's InfoSphere
> > Data Replication [1] performs filtering in this way, which also
> > suggests that we won't be off track here.
> >
>
> I'm certainly not suggesting what we're doing is wrong. Given the design
> of built-in logical replication it makes sense to do it this way; I was
> just thinking aloud about what we might want to do in the future (e.g.
> pglogical uses this to deal with conflicts between multiple sources, and
> so on).
>

Fair enough.

> >>
> >>
> >> 15) pgoutput_row_filter initializing filter
> >>
> >> I'm not sure I understand why the filter initialization gets moved from
> >> get_rel_sync_entry. Presumably, most of what the replication does is
> >> replicating rows, so I see little point in not initializing this along
> >> with the rest of the rel_sync_entry.
> >>
> >
> > Sorry, IIRC, this was suggested by me, and I thought it was best
> > to do any expensive computation the first time it is required. I have
> > shared a few cases in [2] where it would lead to additional cost
> > without any gain. Unless I am missing something, I don't see any
> > downside of doing it in a delayed fashion.
> >
>
> Not sure, but the arguments presented there seem a bit wonky ...
>
> Yes, the work would be wasted if we discard the cached data without
> using it (it might happen for truncate, I'm not sure). But how likely is
> it that such operations happen *in isolation*? I'd bet the workload is
> almost never just a stream of truncates - there are always some
> operations in between that would actually use this.
>

It could also happen with a mix of truncate and other operations, because
we decide whether or not to publish an operation only after
get_rel_sync_entry.

> Similarly for the errors - IIRC hitting an error means the replication
> restarts, which is orders of magnitude more expensive than anything we
> can save by this delayed evaluation.
>
> I'd keep it simple, for the sake of simplicity of the whole patch.
>

The current version proposed by Peter is not reviewed yet, and looking at
it I have some questions too, which I'll raise in a separate email. I am
not sure whether you are against delaying the expression initialization
because of the current code or because of the concept in general; if it is
the latter, then we have other instances as well where we don't do all the
work in get_rel_sync_entry, such as building the tuple conversion map,
which is cached in the same way.

--
With Regards,
Amit Kapila.
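To make the "keeping the replica in sync" argument from item 13 concrete, here is a minimal C sketch of the publisher-side decision being discussed: an UPDATE is published as-is, converted to an INSERT or DELETE, or skipped, depending on which tuple versions satisfy the row filter. The names below (`RowFilterAction`, `transform_update`) are illustrative only and are not taken from the patch.

```c
/*
 * Illustrative sketch, not the patch's actual code: decide how to publish
 * an UPDATE given whether the old and new tuple versions match the
 * publication's row filter, so the subscriber's filtered set stays in sync.
 */
#include <stdbool.h>
#include <stdio.h>

typedef enum RowFilterAction
{
    ACTION_SKIP,            /* neither version matches: publish nothing   */
    ACTION_PUBLISH_INSERT,  /* row enters the filtered set: send INSERT   */
    ACTION_PUBLISH_UPDATE,  /* row stays in the filtered set: send UPDATE */
    ACTION_PUBLISH_DELETE   /* row leaves the filtered set: send DELETE   */
} RowFilterAction;

static RowFilterAction
transform_update(bool old_matches, bool new_matches)
{
    if (old_matches && new_matches)
        return ACTION_PUBLISH_UPDATE;
    if (!old_matches && new_matches)
        return ACTION_PUBLISH_INSERT;   /* convert UPDATE to INSERT */
    if (old_matches && !new_matches)
        return ACTION_PUBLISH_DELETE;   /* convert UPDATE to DELETE */
    return ACTION_SKIP;
}

int
main(void)
{
    /* Example: old row fails the filter, new row passes -> publish INSERT */
    printf("%d\n", transform_update(false, true)); /* prints 1 (INSERT) */
    return 0;
}
```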
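Similarly, for item 15, this is a hedged sketch of the delayed-initialization idea under discussion: the per-relation cache entry builds its filter state only the first time a row is actually filtered, so changes that never evaluate the filter pay no setup cost. The types and functions here (`FilterCacheEntry`, `build_filter_state`, `row_filter_matches`) are hypothetical stand-ins, not the actual pgoutput code.

```c
/* Sketch of lazy (first-use) initialization of a cached row-filter state. */
#include <stdbool.h>
#include <stdio.h>

typedef struct FilterState
{
    int threshold;              /* stands in for a prepared filter expression */
} FilterState;

typedef struct FilterCacheEntry
{
    bool        filter_valid;   /* has the filter been built yet? */
    FilterState filter;         /* populated lazily on first use */
} FilterCacheEntry;

/* Pretend this parses/plans the publication's WHERE clause: expensive. */
static void
build_filter_state(FilterState *fs)
{
    fs->threshold = 100;
}

/*
 * Called per row.  Operations that never filter rows (or changes that turn
 * out not to be published at all) never reach this function, so they never
 * pay the initialization cost.
 */
static bool
row_filter_matches(FilterCacheEntry *entry, int row_value)
{
    if (!entry->filter_valid)
    {
        build_filter_state(&entry->filter);  /* done once, then cached */
        entry->filter_valid = true;
    }
    return row_value > entry->filter.threshold;
}

int
main(void)
{
    FilterCacheEntry entry = {0};

    printf("%d\n", row_filter_matches(&entry, 150)); /* builds filter, prints 1 */
    printf("%d\n", row_filter_matches(&entry, 50));  /* reuses cache, prints 0 */
    return 0;
}
```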