Re: Logical replication timeout problem - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Logical replication timeout problem
Msg-id CAD21AoCLaC-Dj=dcz5hQqcxpzi7h_eDsV5uc2156LkrKa6mLQw@mail.gmail.com
In response to RE: Logical replication timeout problem  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
Responses RE: Logical replication timeout problem  ("wangw.fnst@fujitsu.com" <wangw.fnst@fujitsu.com>)
List pgsql-hackers
On Mon, Apr 18, 2022 at 3:16 PM wangw.fnst@fujitsu.com
<wangw.fnst@fujitsu.com> wrote:
>
> On Mon, Apr 18, 2022 at 00:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Mon, Apr 18, 2022 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Apr 14, 2022 at 5:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 13, 2022 at 7:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Apr 11, 2022 at 12:09 PM wangw.fnst@fujitsu.com
> > > > > <wangw.fnst@fujitsu.com> wrote:
> > > > > >
> > > > > > So I skip tracking lag during a transaction just like the current HEAD.
> > > > > > Attach the new patch.
> > > > > >
> > > > >
> > > > > Thanks, please find the updated patch where I have slightly
> > > > > modified the comments.
> > > > >
> > > > > Sawada-San, Euler, do you have any opinion on this approach? I
> > > > > personally still prefer the approach implemented in v10 [1]
> > > > > especially due to the latest finding by Wang-San that we can't
> > > > > update the lag-tracker apart from when it is invoked at the transaction end.
> > > > > However, I am fine if we like this approach more.
> > > >
> > > > Thank you for updating the patch.
> > > >
> > > > The current patch looks much better than v10 which requires to call to
> > > > update_progress() every path.
> > > >
> > > > Regarding v15 patch, I'm concerned a bit that the new function name,
> > > > update_progress(), is too generic. How about
> > > > update_replation_progress() or something more specific name?
> > > >
> > >
> > > Do you intend to say update_replication_progress()? The word
> > > 'replation' doesn't make sense to me. I am fine with this suggestion.
> >
> > Yeah, that was a typo. I meant update_replication_progress().
> Thanks for your comments.
>
> > > > Regarding v15 patch, I'm concerned a bit that the new function name,
> > > > update_progress(), is too generic. How about
> > > > update_replation_progress() or something more specific name?
> Improve as suggested. Change the name from update_progress to
> update_replication_progress.
>
> > > > ---
> > > > +        if (end_xact)
> > > > +        {
> > > > +                /* Update progress tracking at xact end. */
> > > > +                OutputPluginUpdateProgress(ctx, skipped_xact, end_xact);
> > > > +                changes_count = 0;
> > > > +                return;
> > > > +        }
> > > > +
> > > > +        /*
> > > > +         * After continuously processing CHANGES_THRESHOLD changes, we try to send
> > > > +         * a keepalive message if required.
> > > > +         *
> > > > +         * We don't want to try sending a keepalive message after processing each
> > > > +         * change as that can have overhead. Testing reveals that there is no
> > > > +         * noticeable overhead in doing it after continuously processing 100 or so
> > > > +         * changes.
> > > > +         */
> > > > +#define CHANGES_THRESHOLD 100
> > > > +        if (++changes_count >= CHANGES_THRESHOLD)
> > > > +        {
> > > > +                OutputPluginUpdateProgress(ctx, skipped_xact, end_xact);
> > > > +                changes_count = 0;
> > > > +        }
> > > >
> > > > Can we merge two if branches since we do the same things? Or did you
> > > > separate them for better readability?
> Improve as suggested. Merge two if-branches.
>
> Attach the new patch.
> 1. Rename the new function(update_progress) to update_replication_progress. [suggestion by Sawada-San]
> 2. Merge two if-branches in new function update_replication_progress. [suggestion by Sawada-San.]
> 3. Improve comments to make them clear. [suggestions by Euler-San.]

Thank you for updating the patch.

+ * For a large transaction, if we don't send any change to the downstream for a
+ * long time(exceeds the wal_receiver_timeout of standby) then it can timeout.
+ * This can happen when all or most of the changes are either not published or
+ * got filtered out.

+ */
+ if(end_xact || ++changes_count >= CHANGES_THRESHOLD)
+ {

We need a space before '(' in the two places above: "long time(exceeds" in the
comment and "if(end_xact" in the code. The rest looks good to me.
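
For illustration only, here is a standalone sketch of how the merged branch
could look with the spacing fixed. It is not the patch itself: the real
function takes a LogicalDecodingContext and calls OutputPluginUpdateProgress(),
whereas this sketch drops the ctx argument and uses a hypothetical stub,
stub_update_progress(), that only counts calls so the control flow can be
checked outside the server.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for OutputPluginUpdateProgress(); it only counts
 * how often progress would have been reported.
 */
static int progress_calls = 0;

static void
stub_update_progress(bool skipped_xact, bool end_xact)
{
    (void) skipped_xact;
    (void) end_xact;
    progress_calls++;
}

#define CHANGES_THRESHOLD 100

/*
 * Sketch of the merged branch: report progress at transaction end, or
 * after CHANGES_THRESHOLD consecutive changes. The ctx parameter of the
 * real function is omitted here (assumption made for this sketch).
 */
static void
update_replication_progress(bool skipped_xact, bool end_xact)
{
    static int changes_count = 0;

    if (end_xact || ++changes_count >= CHANGES_THRESHOLD)
    {
        stub_update_progress(skipped_xact, end_xact);
        changes_count = 0;
    }
}
```

Because of short-circuit evaluation, changes_count is not incremented when
end_xact is true, but it is reset to zero in either case, which matches what
the two separate branches did.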

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


