Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAA4eK1JTmo4OixT50ip6CJ-iL8zfwnHhNfHzKCDVU4Pgagsj+Q@mail.gmail.com
In response to Re: Skipping logical replication transactions on subscriber side  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 01.06.21 06:01, Amit Kapila wrote:
> > But, won't that be costly in cases where we have errors in the
> > processing of very large transactions? Subscription has to process all
> > the data before it gets an error. I think we can even imagine this
> > feature to be extended to use commitLSN as a skip candidate in which
> > case we can even avoid getting the data of that transaction from the
> > publisher. So if this information is persistent, the user can even set
> > the skip identifier after the restart before the publisher can send
> > all the data.
>
> At least in current practice, skipping parts of the logical replication
> stream on the subscriber is a rare, emergency-level operation when
> something that shouldn't have happened happened.  So it doesn't really
> matter how costly it is.  It's not going to be more costly than the
> error happening in the first place.  All you'd need is one shared memory
> slot per subscription to store a xid to skip.
>

Leaving aside the performance point, how can this work if we just store
the skip identifier (XID/commitLSN) in shared memory? How will the apply
worker know about that information after a restart? Do you expect the
user to set it again? If so, I think users might not like that. Also,
how will we prevent users from specifying an identifier that does not
belong to a failed transaction, and if they do, what should our
action be? Without such a check, if a user provides the XID of some
in-progress transaction, we might need to do more work (rollback) than
just skipping it.
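
To make the restart concern concrete, here is a toy model in Python. It
is not PostgreSQL code: the Subscriber class, its two stores, and
apply_transaction() are all hypothetical names used only for
illustration. It shows a skip identifier kept only in volatile memory
being lost on restart, so the failed transaction errors out again unless
the user re-sets the identifier.

```python
# Toy model only; none of these names exist in PostgreSQL.

class Subscriber:
    def __init__(self):
        self.shared_memory = {}  # volatile: cleared on restart
        self.catalog = {}        # stands in for persistent storage

    def set_skip_xid(self, subscription, xid, persistent):
        """Record the XID to skip, either persistently or in memory only."""
        store = self.catalog if persistent else self.shared_memory
        store[subscription] = xid

    def restart(self):
        """Simulate a server restart: volatile state is lost."""
        self.shared_memory = {}

    def skip_xid(self, subscription):
        """Look up the XID to skip, preferring the in-memory value."""
        return self.shared_memory.get(subscription,
                                      self.catalog.get(subscription))


def apply_transaction(node, subscription, xid, fails):
    """Return 'skipped', 'error', or 'applied' for one remote transaction."""
    if node.skip_xid(subscription) == xid:
        return "skipped"
    return "error" if fails else "applied"


def outcome_after_restart(persistent):
    """Set a skip XID, restart, then try to apply the failing transaction."""
    node = Subscriber()
    node.set_skip_xid("sub1", 740, persistent=persistent)
    node.restart()
    return apply_transaction(node, "sub1", 740, fails=True)
```

With these assumptions, outcome_after_restart(persistent=False) hits the
error again, while outcome_after_restart(persistent=True) skips the
transaction. The model says nothing about validating that the XID
belongs to a failed transaction, which is the second open question.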

> We will also want some proper conflict handling at some point.  But I
> think what is being discussed here is meant to be a repair tool, not a
> policy tool, and I'm afraid it might get over-engineered.
>

I see your point, but I am a bit concerned that handling all the
boundary cases might become tricky if we go with a simple shared
memory technique. OTOH, if we can handle all such cases, then it is
fine.

-- 
With Regards,
Amit Kapila.


