Re: Exit walsender before confirming remote flush in logical replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Exit walsender before confirming remote flush in logical replication
Msg-id CAHGQGwHnSSg=2Kue-bBqQyER+GYkuADNN4OHUfp9XStOfv1LEw@mail.gmail.com
In response to Re: Exit walsender before confirming remote flush in logical replication  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers
On Wed, Apr 8, 2026 at 5:39 PM Chao Li <li.evan.chao@gmail.com> wrote:
> I don’t have a FreeBSD box to verify that directly. But the document you
> pointed out seems to state explicitly that, on Unix-domain sockets, writes go
> directly to the peer’s receive buffer. If so, the assumption that “the kernel
> send buffer isn’t full” no longer really holds on FreeBSD. From this
> perspective, changing to non-blocking pq_flush_if_writable() makes sense to me.

I was thinking about the effect of replacing that pq_flush() with
pq_flush_if_writable().

Under normal conditions, there should be no behavioral difference. In either
case, the end-of-streaming message is sent to the standby or subscriber.

The difference only appears if walsender cannot complete the send() call for
that message immediately (that is, it cannot append the message to the kernel
send buffer). This should be rare, because that pq_flush() call happens under
the condition "WalSndCaughtUp && sentPtr == replicatedPtr &&
!pq_is_send_pending()", where the send buffer would normally not be full.
However, as observed,
this can apparently happen on FreeBSD when using Unix-domain sockets.
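
To make the blocking versus non-blocking distinction concrete, here is a minimal socket-level sketch. It is not PostgreSQL code: try_send_once() and final_message_sent() are hypothetical names, and the 5-byte "done!" payload is only a stand-in for the end-of-streaming message. It assumes a Unix-domain socketpair whose peer never reads, so that one more send() cannot complete immediately.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a "flush if writable" step: make one send()
 * attempt on a non-blocking socket.  Returns true only if the kernel
 * accepted the whole message; returns false (without waiting) otherwise,
 * including when the call would have blocked.
 */
static bool
try_send_once(int fd, const char *msg, size_t len)
{
    ssize_t n;

    assert(len > 0);
    n = send(fd, msg, len, 0);
    return n == (ssize_t) len;
}

/*
 * Returns 1 if the final message was sent, 0 if it was not, and -1 on a
 * setup failure.  With fill_first, the socket is filled beforehand and the
 * peer never drains it.
 */
static int
final_message_sent(bool fill_first)
{
    int  sv[2];
    int  flags;
    bool sent;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0)
    {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (fill_first)
    {
        /* The peer (sv[1]) never reads, so this loop ends with EAGAIN. */
        for (;;)
        {
            if (send(sv[0], "x", 1, 0) == 1)
                continue;
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;
            close(sv[0]);
            close(sv[1]);
            return -1;          /* unexpected error */
        }
    }

    sent = try_send_once(sv[0], "done!", 5);
    close(sv[0]);
    close(sv[1]);
    return sent ? 1 : 0;
}
```

With an empty buffer the final message goes out, so final_message_sent(false) is expected to return 1. Once the buffer is full, the single non-blocking attempt gives up and final_message_sent(true) returns 0, whereas a blocking send at the same point would wait for the peer to drain data.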

Even if pq_flush_if_writable() fails to send the end-of-streaming message,
walreceiver and the logical apply worker seem to behave almost the same
in practice whether they receive it or not. The main differences are how they
detect closure of the replication connection and the resulting log messages.
If that analysis is correct, replacing pq_flush() with pq_flush_if_writable()
seems acceptable to me.

If the end-of-streaming message is received, walreceiver and the apply worker
log messages like the following, then try to send a reply, which fails because
the connection has already been closed:

    [walreceiver] LOG: replication terminated by primary server
    [walreceiver] DETAIL: End of WAL reached ...
    [walreceiver] FATAL: could not send end-of-streaming message to primary: server closed the connection unexpectedly

    [apply worker] LOG: data stream from publisher has ended
    [apply worker] ERROR: could not send end-of-streaming message to primary: server closed the connection unexpectedly

If the message is not received, they simply detect the closed connection
while waiting for the next message:

    [walreceiver] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly

    [apply worker] ERROR: could not receive data from WAL stream: server closed the connection unexpectedly
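
The two receiver-side paths can be mimicked at the socket level with the sketch below. It is only an analogy under stated assumptions, not walreceiver or libpq code: Outcome and receiver_outcome() are hypothetical names, the one-byte "c" marker is a stand-in for the end-of-streaming message, and a Unix-domain socketpair replaces the replication connection. Over TCP the failure could surface at a different point.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

typedef enum
{
    SAW_END_THEN_REPLY_FAILED,  /* got the marker, reply could not be sent */
    SAW_END_THEN_REPLY_OK,      /* got the marker, reply was sent */
    SAW_EOF_ONLY,               /* no marker, only a closed connection */
    DEMO_ERROR
} Outcome;

/*
 * The "sender" (sv[0]) optionally writes an end marker and then closes
 * right away.  The "receiver" (sv[1]) reads once and, if it saw the
 * marker, tries to send its own end marker back.
 */
static Outcome
receiver_outcome(bool sender_sends_end_marker)
{
    int     sv[2];
    char    buf[16];
    ssize_t n;
    Outcome result;

    /* Report a write to a closed peer as an error return, not a signal. */
    signal(SIGPIPE, SIG_IGN);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return DEMO_ERROR;

    if (sender_sends_end_marker && send(sv[0], "c", 1, 0) != 1)
    {
        close(sv[0]);
        close(sv[1]);
        return DEMO_ERROR;
    }
    close(sv[0]);               /* sender exits without waiting for a reply */

    n = recv(sv[1], buf, sizeof buf, 0);
    assert(n <= (ssize_t) sizeof buf);

    if (n == 0)
        result = SAW_EOF_ONLY;  /* "could not receive data" path */
    else if (n > 0)
    {
        /* "could not send end-of-streaming message" path if this fails */
        if (send(sv[1], "c", 1, 0) == 1)
            result = SAW_END_THEN_REPLY_OK;
        else
            result = SAW_END_THEN_REPLY_FAILED;
    }
    else
        result = DEMO_ERROR;

    close(sv[1]);
    return result;
}
```

In this setup receiver_outcome(true) is expected to yield SAW_END_THEN_REPLY_FAILED and receiver_outcome(false) to yield SAW_EOF_ONLY. In both cases the receiver ends up noticing the closed connection; only the point at which it notices differs.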

Therefore, since replacing pq_flush() with pq_flush_if_writable() seems to
change behavior only in a limited and acceptable way, I'm planning to create
a patch that makes that replacement.

BTW, though this is not directly related to this topic, I'm also wondering
whether walreceiver can ever successfully send an end-of-streaming message
back to walsender. It appears to attempt that only after the replication
connection has already been closed, which would seem to make it fail every time.

Regards,

--
Fujii Masao


