Re: Exit walsender before confirming remote flush in logical replication - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Exit walsender before confirming remote flush in logical replication
Date
Msg-id 20221223.112154.220544589754432382.horikyota.ntt@gmail.com
In response to Re: Exit walsender before confirming remote flush in logical replication  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Exit walsender before confirming remote flush in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
RE: Exit walsender before confirming remote flush in logical replication  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
At Thu, 22 Dec 2022 17:29:34 +0530, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote in 
> On Thu, Dec 22, 2022 at 11:16 AM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> > In case of logical replication, however, we cannot support the use-case that
> > switches the role publisher <-> subscriber. Suppose same case as above, additional
..
> > Therefore, I think that we can ignore the condition for shutting down the
> > walsender in logical replication.
...
> > This change may be useful for time-delayed logical replication. The walsender
> > waits the shutdown until all changes are applied on subscriber, even if it is
> > delayed. This causes that publisher cannot be stopped if large delay-time is
> > specified.
> 
> I think the current behaviour is an artifact of using the same WAL
> sender code for both logical and physical replication.

Yeah, I don't think we do that for the sake of switchover. On the
other hand, I think the behavior was carried over intentionally because
it was considered sensible in its own right. And I'm afraid that many
people already rely on that behavior.

> I agree with you that the logical WAL sender need not wait for all the
> WAL to be replayed downstream.

Thus I feel that it might be a bit outrageous to get rid of that
behavior altogether because a new feature stumbles on it.  I'm
fine with doing that only in the apply_delay case, though.  However, I
have another concern: we would be introducing a second exception for
XLogSendLogical in the common path.

I doubt that anyone wants to use synchronous logical replication with
apply_delay, since the sending transaction is inevitably held back
by that delay.

So how about having the logrep worker send a kind of crafted feedback
before entering an apply_delay, one that reports commit_data.end_lsn as
flushpos?  A little tweak is needed in send_feedback(), but it seems to
work.
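
To illustrate the idea, here is a minimal standalone sketch, not a
patch against the tree. FeedbackMsg, build_feedback() and
sender_can_exit() are hypothetical names made up for this mail, the
LSN type is a plain stand-in, and the shutdown condition is a heavy
simplification of what the walsender really checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for an LSN; an assumption for this sketch only. */
typedef uint64_t XLogRecPtr;

/* Hypothetical model of the positions the worker reports upstream. */
typedef struct FeedbackMsg
{
	XLogRecPtr	write_lsn;
	XLogRecPtr	flush_lsn;
	XLogRecPtr	apply_lsn;
} FeedbackMsg;

/*
 * Build a feedback message.  When the worker is about to sleep for an
 * apply delay, report the end LSN of the pending commit as the flush
 * position, even though the transaction has not been applied yet.
 */
FeedbackMsg
build_feedback(XLogRecPtr received, XLogRecPtr flushed, XLogRecPtr applied,
			   bool entering_delay, XLogRecPtr commit_end_lsn)
{
	FeedbackMsg msg = {received, flushed, applied};

	if (entering_delay && commit_end_lsn > msg.flush_lsn)
		msg.flush_lsn = commit_end_lsn;

	/* Keep the write position from falling behind the flush position. */
	if (msg.write_lsn < msg.flush_lsn)
		msg.write_lsn = msg.flush_lsn;

	return msg;
}

/*
 * Simplified shutdown condition on the sender side: everything sent so
 * far has been reported as flushed by the subscriber.
 */
bool
sender_can_exit(XLogRecPtr sent_lsn, const FeedbackMsg *last)
{
	return last->flush_lsn >= sent_lsn;
}
```

With ordinary feedback, a subscriber that has flushed only up to 80
while the sender has sent up to 100 keeps the sender waiting.  With the
crafted feedback sent before the delay, the reported flush position is
100 and the sender is free to exit while the subscriber is still
sleeping.  The apply position is deliberately left untouched.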

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
