On Tue, Dec 13, 2022 at 7:35 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Mon, 12 Dec 2022 18:10:00 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
> > On Mon, Dec 12, 2022 at 1:04 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > > once and apply later. Our basic design is as follows:
> > >
> > > * All transactions are serialized to files if min_apply_delay is set to non-zero.
> > > * After receiving the commit message and spending time, the worker reads and
> > > applies spooled messages.
> > >
> >
> > I think this may be more work than required because in some cases
> > doing I/O just to delay xacts will later lead to more work. Can't we
> > send some ping to walsender to communicate that walreceiver is alive?
> > We already seem to be sending a ping in LogicalRepApplyLoop if we
> > haven't heard anything from the server for more than
> > wal_receiver_timeout / 2. Now, it is possible that the walsender is
> > terminated due to some other reason and we need to see if we can
> > detect that or if it will only be detected once the walreceiver's
> > delay time is over.
>
> FWIW, I thought the same thing as Amit.
>
> What we should do here is logrep workers notifying to walsender that
> it's living and the communication in-between is fine, and maybe the
> worker's status. Spontaneous send_feedback() calls while delaying will
> be sufficient for this purpose. We might need to suppress extra forced
> feedbacks instead. In contrast the worker doesn't need to bother to
> know whether the peer is living until it receives the next data. But
> we might need to adjust the wait_time in LogicalRepApplyLoop().
>
> But, I'm not sure what will happen when walsender is blocked by
> buffer-full for a long time.
>
Yeah, I think ideally it will time out, but if we have a solution where
we keep sending ping messages from time to time during the delay, it
should work fine. However, that needs to be verified. Do you see any
reason why that won't work?
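
To make the idea a bit more concrete, here is a rough standalone sketch
of what I have in mind. It is not apply-worker code: delay_with_pings()
and send_ping_stub() are made-up names, the clock is simulated instead
of waiting on a latch, and the timeout / 2 interval is simply borrowed
from what LogicalRepApplyLoop already does for wal_receiver_timeout.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the worker's feedback call; it only logs. */
static void
send_ping_stub(long now_ms)
{
    printf("ping at %ld ms\n", now_ms);
}

/*
 * Simulate delaying the apply for delay_ms while waking up every
 * timeout_ms / 2 to ping the sender. Time is simulated, so nothing
 * actually sleeps. Returns the number of pings sent during the delay.
 */
static int
delay_with_pings(long delay_ms, long timeout_ms)
{
    long now_ms = 0;
    long ping_interval_ms = timeout_ms / 2;
    int  sent = 0;

    assert(delay_ms >= 0 && timeout_ms >= 0);

    /* Assumption: a zero interval means the timeout is disabled, so no pings. */
    if (ping_interval_ms == 0)
        ping_interval_ms = delay_ms;

    while (now_ms < delay_ms)
    {
        long remaining = delay_ms - now_ms;
        long nap = (remaining < ping_interval_ms) ? remaining : ping_interval_ms;

        now_ms += nap;          /* real code would wait on a latch here */

        /* Still delaying: tell the sender we are alive. */
        if (now_ms < delay_ms)
        {
            send_ping_stub(now_ms);
            sent++;
        }
    }

    return sent;
}
```

With a 100 s delay and a 60 s timeout, this pings at 30 s, 60 s and
90 s and then proceeds to apply. Whether that is enough when the
walsender is blocked on a full buffer is the part that still needs to
be verified.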
--
With Regards,
Amit Kapila.