Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoAmJW_G094TUO=cWVPL01V==bWTgGvokO0G700bP=ifsg@mail.gmail.com
Whole thread Raw
In response to Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Skipping logical replication transactions on subscriber side  ("osumi.takamichi@fujitsu.com" <osumi.takamichi@fujitsu.com>)
List pgsql-hackers
On Thu, Mar 17, 2022 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
> > > <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached an updated version patch.
> > > >
> > >
> > > The patch LGTM. I have made minor changes in comments and docs in the
> > > attached patch. Kindly let me know what you think of the attached?
> > Hi, thank you for the patch. Few minor comments.
> >
> >
> > (1) comment of maybe_start_skipping_changes
> >
> >
> > +       /*
> > +        * Quick return if it's not requested to skip this transaction. This
> > +        * function is called for every remote transaction and we assume that
> > +        * skipping the transaction is not used often.
> > +        */
> >
> > I feel this comment should explain more about our intention and
> > what it confirms. In a case when user requests skip,
> > but it doesn't match the condition, we don't start
> > skipping changes, strictly speaking.
> >
> > From:
> > Quick return if it's not requested to skip this transaction.
> >
> > To:
> > Quick return if we can't ensure possible skiplsn is set
> > and it equals to the finish LSN of this transaction.
> >
>
> Hmm, the current comment seems more appropriate. What you are
> suggesting is almost writing the code in sentence form.
>
> >
> > (2) 029_on_error.pl
> >
> > +       my $contents = slurp_file($node_subscriber->logfile, $offset);
> > +       $contents =~
> > +         qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target
relation"public.tbl" in transaction \d+ finishe$ 
> > +         or die "could not get error-LSN";
> >
> > I think we shouldn't use a lot of new words.
> >
> > How about a change below  ?
> >
> > From:
> > could not get error-LSN
> > To:
> > failed to find expected error message that contains finish LSN for SKIP option
> >
> >
> > (3) apply_handle_commit_internal
> >
> ...
> >
> > I feel if we move those two functions at the end
> > of the apply_handle_commit and apply_handle_stream_commit,
> > then we will have more aligned codes and improve readability.
> >

I think we cannot just move them to the end of apply_handle_commit()
and apply_handle_stream_commit(). Because if we do that, we end up
missing updating replication_session_origin_lsn/timestamp when
clearing the subskiplsn if we're skipping a non-stream transaction.

Basically, the apply worker differently handles 2pc transactions and
non-2pc transactions; we always prepare even empty transactions
whereas we don't commit empty non-2pc transactions. So I think we
don’t have to handle both in the same way.

> I think the intention is to avoid duplicate code as we have a common
> function that gets called from both of those.

Yes.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: ICU for global collation
Next
From: Peter Eisentraut
Date:
Subject: Re: Proposal: Support custom authentication methods using hooks