On Fri, Jan 30, 2026 at 1:19 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Fujii-san,
>
> > This approach doesn't seem helpful on platforms that don't support
> > TCP_USER_TIMEOUT, i.e., tcp_user_timeout is not available. Right?
> > If I remember correctly, Windows is one of those platforms.
>
> Good point, documentation said it's not usable for Windows.
> The easiest fix I can come up with is to determine a timeout for checkpoint wait;
> ConditionVariableTimedSleep() can be used in InvalidatePossiblyObsoleteSlot(),
> we can put some LOG and skip invalidating for a while. Not sure how long we
> should wait but at least we can use a fixed value. This might be similar
> to your second option.
> Regarding the first option, it can solve the root cause, but I'm afraid we may
> have to modify very common code.

Yeah, but I'd like to try the first option. Attached is a very WIP patch that
attempts to implement it.

With this patch, when a walsender exits with >= FATAL,
send_message_to_frontend() attempts to send the error message to the standby
in non-blocking mode. If that fails, the walsender gives up on sending
the message and exits immediately.
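
To illustrate the idea outside of PostgreSQL, here is a rough standalone
sketch. It is not code from the patch: try_send_nonblocking() is a made-up
name, and it uses plain POSIX sockets rather than the backend's own
communication layer. It only shows the "try without blocking, otherwise
give up" behavior described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* MSG_NOSIGNAL is not available everywhere; fall back to no flags. */
#ifdef MSG_NOSIGNAL
#define SEND_FLAGS MSG_NOSIGNAL
#else
#define SEND_FLAGS 0
#endif

/*
 * Hypothetical helper (not from the patch): try to hand the whole message
 * to the kernel without ever blocking.  Returns true only if every byte
 * was accepted; returns false if the socket would block or on any other
 * error, in which case the caller is expected to give up and exit.
 */
static bool
try_send_nonblocking(int sock, const char *msg, size_t len)
{
    int         flags = fcntl(sock, F_GETFL, 0);

    /* Switch the socket to non-blocking mode. */
    if (flags < 0 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0)
        return false;

    while (len > 0)
    {
        ssize_t     n = send(sock, msg, len, SEND_FLAGS);

        if (n > 0)
        {
            /* Partial or full write: advance and keep going. */
            msg += n;
            len -= (size_t) n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted, retry */

        /* EAGAIN/EWOULDBLOCK or a hard error: do not wait, give up. */
        return false;
    }
    return true;
}
```

A caller on the exit path would do something like
"if (!try_send_nonblocking(sock, msg, len)) exit immediately", so a peer
that has stopped reading can no longer keep the process stuck in send().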

I'm not yet sure whether treating walsender exit as a special case is
acceptable, but I wanted to share this WIP patch to get feedback.

Regards,
--
Fujii Masao