Hi,

On 2019-04-27 20:56:51 -0400, Tom Lane wrote:
> Even if that isn't the proximate cause of the current reports, it's
> clearly trouble waiting to happen, and we should get rid of it.
> Accordingly, see attached proposed patch. This just flushes the
> "immediate interrupt" stuff in favor of making sure that
> libpqwalreceiver.c will take care of any signals received while
> waiting for input.

Good plan.

> The existing code does not use PQsetnonblocking, which means that it's
> theoretically at risk of blocking while pushing out data to the remote
> server. In practice I think that risk is negligible because (IIUC) we
> don't send very large amounts of data at one time. So I didn't bother to
> change that. Note that for the most part, if that happened, the existing
> code was at risk of slow response to SIGTERM anyway since it didn't have
> Enable/DisableWalRcvImmediateExit around the places that send data.

Hm, I'm not convinced that's OK. What if there's a network hiccup? We'll
wait until there's an OS TCP timeout, no? It's bad enough that there
were cases of this before. Adding more cases where we want to shut down
walreceiver (e.g. because we would rather switch to recovery_command, or
just shut down the server) but instead get stuck waiting for an hour for
a TCP timeout doesn't seem OK.

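To illustrate the concern, here is a minimal sketch of a send loop that
stays responsive to a shutdown request even when the peer stops reading.
It uses plain POSIX sockets rather than libpq, so it is not the
libpqwalreceiver.c code; with libpq the rough equivalents would be
PQsetnonblocking() and a PQflush() loop. The names shutdown_requested,
set_nonblocking and send_interruptibly, and the 100 ms poll timeout, are
assumptions made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical flag; a SIGTERM handler would set it to 1. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Send len bytes on a non-blocking socket without ever blocking
 * indefinitely.  Returns 0 when everything was sent, 1 when a shutdown
 * request interrupted the send, -1 on a socket error.  *sent reports
 * how many bytes were written before returning.
 */
static int
send_interruptibly(int fd, const char *buf, size_t len, size_t *sent)
{
    assert(buf != NULL && sent != NULL);
    *sent = 0;

    while (*sent < len)
    {
        ssize_t n;
        struct pollfd pfd;

        /* Check for shutdown before every attempt to write. */
        if (shutdown_requested)
            return 1;

        n = send(fd, buf + *sent, len - *sent, 0);
        if (n >= 0)
        {
            *sent += (size_t) n;
            continue;
        }
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;

        /*
         * Kernel send buffer is full.  Wait a bounded time for it to
         * drain (arbitrary 100 ms), then loop back and recheck the flag
         * instead of sitting in send() until a TCP timeout.
         */
        pfd.fd = fd;
        pfd.events = POLLOUT;
        pfd.revents = 0;
        if (poll(&pfd, 1, 100) < 0 && errno != EINTR)
            return -1;
    }
    return 0;
}
```

With a blocking socket the send() call itself could stall until the OS
gives up on the connection; here the worst-case delay before noticing a
shutdown request is one poll interval.
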
Greetings,
Andres Freund