Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Skipping logical replication transactions on subscriber side
Msg-id CAA4eK1+7XD5iiFXfGmVrWjZi-HuCqEn_-XR_c9j+msQ-sr=QCw@mail.gmail.com
In response to Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side
List pgsql-hackers
On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 3.
> > + * Also, we don't skip receiving the changes in streaming cases, since we decide
> > + * whether or not to skip applying the changes when starting to apply changes.
> >
> > But why so? Can't we even skip streaming (and writing to file all such
> > messages)? If we can do this then we can avoid even collecting all
> > messages in a file.
>
> IIUC in streaming cases, a transaction can be sent to the subscriber
> split into multiple chunks of changes. In the meantime, skip_xid can
> be changed. If the user changed or cleared skip_xid after the
> subscriber skipped some streamed changes, the subscriber won't be able
> to have the complete changes of the transaction.
>

Yeah, I think if we want we can handle this by writing into the stream
xid file whether the changes need to be skipped, and then the
consecutive streams can check that in the file. Or maybe we can, in
some way, not allow skip_xid to be changed in the worker if it is
already skipping some xact. If we don't want to do anything for this,
then it is better to at least reflect this reasoning in the comments.
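
A minimal sketch of the first option, only to illustrate the idea and
not the patch's code: the skip decision is taken once, when the first
streamed chunk of the transaction arrives, and is recorded alongside
the spooled changes, so that later chunks consult the recorded flag
instead of the current skip_xid. StreamXidFile and stream_chunk() are
hypothetical names, and the struct is an in-memory stand-in for the
stream xid file rather than its real on-disk layout.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical in-memory stand-in for the per-xid stream file. */
typedef struct StreamXidFile
{
	TransactionId xid;			/* remote transaction being streamed */
	bool		header_written;	/* has the first chunk been processed? */
	bool		skip;			/* decision recorded with the first chunk */
	int			nchanges;		/* number of changes actually spooled */
} StreamXidFile;

/*
 * Process one streamed chunk. Only the first chunk looks at skip_xid;
 * every later chunk follows the decision stored in the file, so a
 * concurrent change of skip_xid cannot leave the transaction half-spooled.
 */
static void
stream_chunk(StreamXidFile *file, TransactionId skip_xid, int nchanges_in_chunk)
{
	if (!file->header_written)
	{
		file->skip = (skip_xid != InvalidTransactionId &&
					  skip_xid == file->xid);
		file->header_written = true;
	}

	/* Skipped transactions spool nothing; others spool every chunk. */
	if (!file->skip)
		file->nchanges += nchanges_in_chunk;
}
```

With this shape, clearing skip_xid after the first chunk has no effect
on a transaction that is already being skipped, and setting it
mid-stream does not drop the remaining chunks of a transaction whose
earlier chunks were already written.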

> >
> > 4.
> > + * Also, one might think that we can skip preparing the skipped transaction.
> > + * But if we do that, PREPARE WAL record won’t be sent to its physical
> > + * standbys, resulting in that users won’t be able to find the prepared
> > + * transaction entry after a fail-over.
> > + *
> > ..
> > + */
> > + if (skipping_changes)
> > + stop_skipping_changes(false);
> >
> > Why do we need such a Prepare's entry either at current subscriber or
> > on its physical standby? I think it is to allow Commit-prepared. If
> > so, how about if we skip even commit prepared as well? Even on
> > physical standby, we would be having the value of skip_xid which can
> > help us to skip there as well after failover.
>
> It's true that skip_xid would also be set on the physical standby. When it
> comes to preparing the skipped transaction on the current subscriber,
> if we want to skip commit-prepared I think we need protocol changes in
> order for subscribers to know prepare_lsn and prepare_timestamp so
> that they can look up the prepared transaction when doing
> commit-prepared. I proposed this idea before. This change would be
> beneficial as of now since the publisher sends even empty transactions.
> But considering the proposed patch[1] that makes the publisher not
> send empty transactions, this protocol change would be an optimization
> only for this feature.
>

I was thinking of comparing the xid received as part of the
commit_prepared message with the value of skip_xid to skip the
commit_prepared, but I guess the user could change it between prepare
and commit prepared and then we won't be able to detect it, right? I
think we can handle this and the streaming case if we disallow users
from changing the value of skip_xid when we are already skipping
changes, or don't let the new skip_xid take effect in the apply worker
if we are already skipping some other transaction. What do you think?
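
To make the two alternatives concrete, here is a rough sketch under my
own assumptions; none of these names exist in the patch. ApplySkipState
keeps the value set by the user separate from the xid the worker has
already started skipping. set_skip_xid_strict() models disallowing the
change, and set_skip_xid_deferred() models accepting it without letting
it take effect until the current skip is finished.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical per-worker state; not a structure from the patch. */
typedef struct ApplySkipState
{
	TransactionId configured_skip_xid;	/* latest value set by the user */
	TransactionId active_skip_xid;		/* xid currently being skipped */
} ApplySkipState;

static bool
is_skipping(const ApplySkipState *s)
{
	return s->active_skip_xid != InvalidTransactionId;
}

/* Alternative 1: refuse to change skip_xid while a skip is in progress. */
static bool
set_skip_xid_strict(ApplySkipState *s, TransactionId new_xid)
{
	if (is_skipping(s))
		return false;
	s->configured_skip_xid = new_xid;
	return true;
}

/* Alternative 2: accept the new value, but the worker ignores it for now. */
static void
set_skip_xid_deferred(ApplySkipState *s, TransactionId new_xid)
{
	s->configured_skip_xid = new_xid;
}

/* Called when a transaction starts (or is prepared): latch the skip. */
static void
begin_remote_xact(ApplySkipState *s, TransactionId xid)
{
	if (!is_skipping(s) && xid != InvalidTransactionId &&
		xid == s->configured_skip_xid)
		s->active_skip_xid = xid;
}

/* Commit prepared is matched against the latched xid, not the setting. */
static bool
should_skip_commit_prepared(const ApplySkipState *s, TransactionId xid)
{
	return xid != InvalidTransactionId && s->active_skip_xid == xid;
}

/* Called once the skipped transaction is fully finished. */
static void
finish_skipping(ApplySkipState *s)
{
	/* Clear the setting only if the user has not replaced it meanwhile. */
	if (s->configured_skip_xid == s->active_skip_xid)
		s->configured_skip_xid = InvalidTransactionId;
	s->active_skip_xid = InvalidTransactionId;
}
```

The sketch keeps the latched xid in memory only, so it says nothing
about what happens if the apply worker restarts between prepare and
commit prepared; that part would still need to be worked out.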

--
With Regards,
Amit Kapila.


