Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Msg-id CAA4eK1LgtDyayec1FBJ9MUfPmUxVFR8_Umj+xvUnnUrkt_hs7Q@mail.gmail.com
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> > <peter.eisentraut@enterprisedb.com> wrote:
> > >
> > > On 27.05.21 12:04, Amit Kapila wrote:
> > > >>> Also, I am thinking that instead of a stat view, do we need
> > > >>> to consider having a system table (pg_replication_conflicts or
> > > >>> something like that) for this because what if stats information is
> > > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > > >>> on stats view for this?
> > > >> Yeah, it seems better to use a catalog.
> > > >>
> > > > Okay.
> > >
> > > Could you store it in shared memory?  You don't need it to be crash safe,
> > > since the subscription will just run into the same error again after
> > > restart.  You just don't want it to be lost, like with the statistics
> > > collector.
> > >
> >
> > But, won't that be costly in cases where we have errors in the
> > processing of very large transactions? Subscription has to process all
> > the data before it gets an error.
>
> I had the same concern. Particularly, the approach we currently
> discussed is to skip the transaction based on the information written
> by the worker rather than require the user to specify the XID.
>

Yeah, but I was imagining that the user still needs to specify
something to indicate that we need to skip it; otherwise, we might try
to skip a transaction that the user wants to resolve themselves rather
than have us skip it. Another point: if we don't store this
information persistently, how will we prevent a user from specifying
some random XID that has not even errored after a restart?
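
To illustrate the check I have in mind, here is a minimal sketch in
Python rather than backend C. Everything in it is hypothetical: the
class SubscriptionErrorCatalog and its methods are stand-ins for
whatever persistent catalog we end up with, not an existing or proposed
API. It only shows that a skip request can be validated against the
recorded failure when that record survives a restart.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SubscriptionErrorCatalog:
    """Hypothetical stand-in for a persistent catalog of apply errors."""

    # subscription name -> XID of the remote transaction that last errored
    failed_xid: Dict[str, int] = field(default_factory=dict)
    # subscription name -> XID the user explicitly asked to skip
    skip_xid: Dict[str, int] = field(default_factory=dict)

    def record_error(self, subname: str, xid: int) -> None:
        """Called by the apply worker when a transaction fails."""
        self.failed_xid[subname] = xid

    def request_skip(self, subname: str, xid: int) -> None:
        """Called on behalf of the user; rejects XIDs that never errored."""
        if self.failed_xid.get(subname) != xid:
            raise ValueError(
                f"XID {xid} has no recorded error for subscription {subname}"
            )
        self.skip_xid[subname] = xid

    def should_skip(self, subname: str, xid: int) -> bool:
        """The worker skips only what the user explicitly requested."""
        return self.skip_xid.get(subname) == xid
```

Note that recording an error alone never causes a skip; the user still
has to ask for it, and the request is refused for an XID that was never
recorded as failed.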

> Therefore, we will always require the worker to process the same large
> transaction after the restart in order to skip the transaction.
>
> > I think we can even imagine this
> > feature to be extended to use commitLSN as a skip candidate in which
> > case we can even avoid getting the data of that transaction from the
> > publisher. So if this information is persistent, the user can even set
> > the skip identifier after the restart before the publisher can send
> > all the data.
>
> Another possible benefit of writing it to a catalog is that we can
> replicate it to the physical standbys. If we have failover slots in
> the future, the physical standby server also can resolve the conflict
> without processing a possibly large transaction.
>

Makes sense.

> > I think the XID (or say another identifier like commitLSN) which we
> > want to use for skipping the transaction as specified by the user has
> > to be stored in the catalog because otherwise, after the restart we
> > won't remember it and the user won't know that he needs to set it
> > again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> > ..), isn't it better to store all conflict-related information in a
> > separate catalog like pg_subscription_conflict or something like that.
> > I think it might be also better to later extend it for auto conflict
> > resolution where the user can specify auto conflict resolution info
> > for a subscription. Is it better to store all such information in
> > pg_subscription or have a separate catalog? It is possible that even
> > if we have a separate catalog for conflict info, we might not want to
> > store error info there.
>
> Just to be clear, we need to store only the conflict-related
> information that cannot be resolved without manual intervention,
> right? That is, conflicts cause an error, exiting the workers. In
> general, replication conflicts include also conflicts that don’t cause
> an error. I think that those conflicts don’t necessarily need to be
> stored in the catalog and don’t require manual intervention.
>

Yeah, I think we want to record the error cases, but which other
conflicts are you talking about here that don't lead to any sort of
error?

--
With Regards,
Amit Kapila.


