Re: Add WALRCV_CONNECTING state to walreceiver - Mailing list pgsql-hackers
| From | Xuneng Zhou |
|---|---|
| Subject | Re: Add WALRCV_CONNECTING state to walreceiver |
| Msg-id | CABPTF7Wqct32kMbY5s8=BK64WeFhO7KTVSsyU5PStRRGunz7Pg@mail.gmail.com |
| In response to | Re: Add WALRCV_CONNECTING state to walreceiver (Xuneng Zhou <xunengzhou@gmail.com>) |
| Responses | Re: Add WALRCV_CONNECTING state to walreceiver |
| List | pgsql-hackers |
Hi,
On Sun, Dec 14, 2025 at 4:55 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Sun, Dec 14, 2025 at 1:14 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Sun, Dec 14, 2025 at 12:45:46PM +0800, Xuneng Zhou wrote:
> > > On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > > On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > > > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > Waiting for applyPtr to advance
> > > > > > would avoid the short-lived STREAMING. What's the feasibility of that?
> > > > >
> > > > > I think this could work, but with complications. If replay latency is
> > > > > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > > > > would stay in the CONNECTING state longer than expected. Whether this
> > > > > is ok depends on the definition of the 'connecting' state. For the
> > > > > implementation, deciding where and when to check applyPtr against LSNs
> > > > > like receiveStart is more difficult—the WalReceiver doesn't know when
> > > > > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > > > > memory, it isn't automatically notified when that pointer advances.
> > > > > This leads to latency between replay and the check if it is done on
> > > > > the WalReceiver side, unless we let the startup process set the state,
> > > > > which would couple the two components. Am I missing something here?
> > > >
> > > > After some thoughts, a potential approach could be to expose a new
> > > > function in the WAL receiver that transitions the state from
> > > > CONNECTING to STREAMING. This function can then be invoked directly
> > > > from WaitForWALToBecomeAvailable in the startup process, ensuring the
> > > > state change aligns with the actual acceptance of the WAL stream.
> > >
> > > V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> > > the first valid WAL record is processed by the startup process. A new
> > > function WalRcvSetStreaming is introduced to enable the transition.
> >
> > The original patch set STREAMING in XLogWalRcvFlush(). XLogWalRcvFlush()
> > callee XLogWalRcvSendReply() already fetches applyPtr to send a status
> > message. So I would try the following before involving the startup process
> > like v2 does:
> >
> > 1. store the applyPtr when we enter CONNECTING
> > 2. force a status message as long as we remain in CONNECTING
> > 3. become STREAMING when applyPtr differs from the one stored at (1)
>
> Thanks for the suggestion. Using XLogWalRcvSendReply() for the
> transition could make sense. My earlier concern was about latency in a
> rare case: if the first flush completes but applyPtr hasn't advanced
> yet at the time of the check, and the flush then stalls, we might wait
> up to wal_receiver_status_interval (default 10s) before the next
> check, or indefinitely if wal_receiver_status_interval <= 0. This
> could be mitigated by shortening the wakeup interval while in
> CONNECTING (step 2), which reduces the worst-case latency to ~1 second.
> Given that monitoring typically doesn't require sub-second precision,
> this approach could be feasible.
>
> case WALRCV_WAKEUP_REPLY:
>     if (WalRcv->walRcvState == WALRCV_CONNECTING)
>     {
>         /* Poll frequently while CONNECTING to avoid long latency */
>         wakeup[reason] = TimestampTzPlusMilliseconds(now, 1000);
>     }
>
> > A possible issue with all patch versions: when the primary is writing no WAL
> > and the standby was caught up before this walreceiver started, CONNECTING
> > could persist for an unbounded amount of time. Only actual primary WAL
> > generation would move the walreceiver to STREAMING. This relates to your
> > above point about high latency. If that's a concern, perhaps this change
> > deserves a total of two new states, CONNECTING and a state that represents
> > "connection exists, no WAL yet applied"?
>
> Yes, this could be an issue. Using two states would help address it.
> That said, when the primary is idle in this case, we might end up
> repeatedly polling the apply status in the pre-streaming state if we
> implement the 1s short-interval checking like above, which could be
> costly. However, if we do not implement it &&
> wal_receiver_status_interval <= 0 && the flush stalls, the
> walreceiver could stay in the pre-streaming state indefinitely even if
> streaming did occur, which violates the semantics. Do you think this
> is a valid concern or just an artificial edge case?
After looking more closely, I found that truly indefinite waiting
requires ALL of:
- wal_receiver_status_interval <= 0 (disables status updates)
- wal_receiver_timeout <= 0
- the primary sends no keepalives
- no more WAL arrives after the first failed-check flush
- the startup process never sets force_reply
which is quite improbable and artificial; sorry for the noise here.
The worst-case state-transition latency in the scenario described
above would be max(primary keepalive interval, REPLY timeout, PING
timeout), which might be acceptable without the short-interval
mitigation, given this case is pretty rare. I plan to implement the
following approach with two new states, as you suggested, in v3:
1. enter CONNECTING
2. transition to CONNECTED/IDLE when START_REPLICATION succeeds, and
store the applyPtr
3. force a status message in XLogWalRcvFlush as long as we remain in
CONNECTED/IDLE
4. become STREAMING when applyPtr differs from the one stored at (2)
--
Best,
Xuneng