Re: Why is src/test/modules/committs/t/002_standby.pl flaky? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
Msg-id 20220114204726.jhu7erh26g7yup3l@alap3.anarazel.de
In response to Re: Why is src/test/modules/committs/t/002_standby.pl flaky?  (Andres Freund <andres@anarazel.de>)
Responses Re: Why is src/test/modules/committs/t/002_standby.pl flaky?
List pgsql-hackers
Hi,

On 2022-01-14 12:28:48 -0800, Andres Freund wrote:
> But once all the data is read, walsender.c will do another
> WaitLatchOrSocket(), which does WSAEventSelect(), clearing the "internal event
> record" and losing the FD_CLOSE.

Walreceiver only started using WES in
2016-03-29 [314cbfc5d] Add new replication mode synchronous_commit = 'remote_ap

With that came the following comment:

                /*
                 * Ideally we would reuse a WaitEventSet object repeatedly
                 * here to avoid the overheads of WaitLatchOrSocket on epoll
                 * systems, but we can't be sure that libpq (or any other
                 * walreceiver implementation) has the same socket (even if
                 * the fd is the same number, it may have been closed and
                 * reopened since the last time).  In future, if there is a
                 * function for removing sockets from WaitEventSet, then we
                 * could add and remove just the socket each time, potentially
                 * avoiding some system calls.
                 */
                Assert(wait_fd != PGINVALID_SOCKET);
                rc = WaitLatchOrSocket(MyLatch,
                                       WL_EXIT_ON_PM_DEATH | WL_SOCKET_READABLE |
                                       WL_TIMEOUT | WL_LATCH_SET,
                                       wait_fd,
                                       NAPTIME_PER_CYCLE,
                                       WAIT_EVENT_WAL_RECEIVER_MAIN);
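
The hazard that comment worries about (a descriptor number surviving while the
socket behind it has been replaced) can be sketched outside PostgreSQL. The
following is only an illustration under stated assumptions: it targets a POSIX
system, it does not model the Windows WSAEventSelect() behaviour discussed
upthread, and the helper name fd_number_reused_after_reopen() is made up for
this example rather than taken from the tree.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: open a socket, close it, open a
 * second one, and report whether the second socket received the same
 * descriptor number as the first.
 *
 * Returns 1 if the number was reused, 0 if not, -1 on error.
 */
int
fd_number_reused_after_reopen(void)
{
    int first;
    int second;
    int reused;

    /* AF_UNIX is used so that the sketch needs no network access. */
    first = socket(AF_UNIX, SOCK_STREAM, 0);
    if (first < 0)
        return -1;

    /* Closing the socket frees the descriptor number for reuse. */
    if (close(first) != 0)
        return -1;

    /*
     * POSIX hands out the lowest available descriptor number, so in a
     * single-threaded process this new, unrelated socket gets the number
     * that was just freed.
     */
    second = socket(AF_UNIX, SOCK_STREAM, 0);
    if (second < 0)
        return -1;

    reused = (second == first);
    close(second);
    return reused;
}
```

Anything that cached state keyed only on the integer value would be unable to
tell the two sockets apart, which appears to be the scenario the comment is
guarding against.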

I don't really see how libpq could have changed the socket underneath us, as
long as we first fetch it after the connection has been successfully
established.  I mean, there's a running command whose result we're
processing.  Nor do I understand what "any other walreceiver implementation"
refers to.

Thomas, I think you wrote that?


Greetings,

Andres Freund


