Hi,
On 2021-12-11 17:41:34 +1300, Thomas Munro wrote:
> --- a/src/backend/libpq/pqcomm.c
> +++ b/src/backend/libpq/pqcomm.c
> @@ -1959,22 +1959,36 @@ pq_settcpusertimeout(int timeout, Port *port)
> bool
> pq_check_connection(void)
> {
> -#if defined(POLLRDHUP)
> - /*
> - * POLLRDHUP is a Linux extension to poll(2) to detect sockets closed by
> - * the other end. We don't have a portable way to do that without
> - * actually trying to read or write data on other systems. We don't want
> - * to read because that would be confused by pipelined queries and COPY
> - * data. Perhaps in future we'll try to write a heartbeat message instead.
> - */
> - struct pollfd pollfd;
> + WaitEvent event;
> int rc;
>
> - pollfd.fd = MyProcPort->sock;
> - pollfd.events = POLLOUT | POLLIN | POLLRDHUP;
> - pollfd.revents = 0;
> + /*
> + * Temporarily ignore the latch, so that we can poll for just the one
> + * event we care about.
> + */
> + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetLatchPos, WL_LATCH_SET, NULL);
>
> - rc = poll(&pollfd, 1, 0);
> + PG_TRY();
> + {
> + /*
> + * It's OK to clobber the socket event to report only the event we're
> + * interested in without restoring the previous state afterwards,
> + * because every FeBeWaitSet wait site does the same.
> + */
> + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetSocketPos, WL_SOCKET_CLOSED,
> + NULL);
> + rc = WaitEventSetWait(FeBeWaitSet, 0, &event, 1, 0);
> + }
> + PG_FINALLY();
> + {
> + /*
> + * Restore the latch, so we can't leave FeBeWaitSet in a broken state
> + * that ignores latches.
> + */
> + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetLatchPos, WL_LATCH_SET,
> + MyLatch);
> + }
> + PG_END_TRY();
Yuck. Is there really no better way to deal with this? What kind of errors is
this trying to handle transparently? Afaics this still changes when we'd
e.g. detect postmaster death.
Am I misunderstanding code or comment, or is the comment saying that it's ok
to clobber the wes, but then we actually unclobber it?
Greetings,
Andres Freund