Re: Exit walsender before confirming remote flush in logical replication - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Exit walsender before confirming remote flush in logical replication
Date
Msg-id CAA4eK1L+JVwRiy9Z=7DK5B_FGb8hJcsH=nz8BvPFVg82s7zH5g@mail.gmail.com
Whole thread Raw
In response to Re: Exit walsender before confirming remote flush in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Exit walsender before confirming remote flush in logical replication
List pgsql-hackers
On Tue, Dec 27, 2022 at 2:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Thanks for checking my proposal!
> >
> > > - * Note that if we determine that there's still more data to send, this
> > > - * function will return control to the caller.
> > > + * Note that if we determine that there's still more data to send or we are in
> > > + * the physical replication more, this function will return control to the
> > > + * caller.
> > >
> > > I think in this comment you meant to say
> > >
> > > 1. "or we are in physical replication mode and all WALs are not yet replicated"
> > > 2. Typo /replication more/replication mode
> >
> > Firstly I considered 2, but I thought 1 seemed to be better.
> > PSA the updated patch.
> >
>
> I think even for logical replication we should check whether there is
> any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what
> is the point to send the done message? Also, the caller of
> WalSndDone() already has that check which is another reason why I
> can't see why you didn't have the same check in function WalSndDone().
>
> BTW, even after fixing this, I think logical replication will behave
> differently when due to some reason (like time-delayed replication)
> send buffer gets full and walsender is not able to send data. I think
> this will be less of an issue with physical replication because there
> is a separate walreceiver process to flush the WAL which doesn't wait
> but the same is not true for logical replication. Do you have any
> thoughts on this matter?
>

In logical replication, it can happen today as well without
time-delayed replication. Basically, say apply worker is waiting to
acquire some lock that is already acquired by some backend then it
will have the same behavior. I have not verified this, so you may want
to check it once.

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Exit walsender before confirming remote flush in logical replication
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Time delayed LR (WAS Re: logical replication restrictions)