On Thu Nov 23, 2023 at 3:19 AM CST, Heikki Linnakangas wrote:
> On 22/11/2023 23:29, Tristan Partin wrote:
> > Ha, you're right. I had this working yesterday, but convinced myself it
> > didn't. I had a do while loop wrapping the blocking call. Here is a v4,
> > which seems to pass the tests that were pointed out to be failing
> > earlier.
>
> Thanks! This suffers from a classic race condition:
>
> > + if (cancel_pressed)
> > + break;
> > +
> > + sock = PQsocket(n_conn);
> > + if (sock == -1)
> > + break;
> > +
> > + rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
> > + Assert(rc != 0); /* Timeout is impossible. */
> > + if (rc == -1)
> > + {
> > + success = false;
> > + break;
> > + }
>
> If a signal arrives between the "if (cancel_pressed)" check and
> pqSocketPoll(), pgSocketPoll will "miss" the signal and block
> indefinitely. There are three solutions to this:
>
> 1. Use a timeout, so that you wake up periodically to check for any
> missed signals. Easy solution, the downsides are that you will not react
> as quickly if the signal is missed, and you waste some cycles by waking
> up unnecessarily.
>
> 2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use
> that in src/backend/storage/ipc/latch.c. It's portable, but somewhat
> complicated.
>
> 3. Use pselect() or ppoll(). They were created specifically to address
> this problem. Not fully portable, I think it's missing on Windows at least.
>
> Maybe use pselect() if it's available, and select() with a timeout if
> it's not.
First I have ever heard of this race condition, and now I will commit it
to durable storage :). I went ahead and did option #3 that you proposed.
On Windows, I have a 1 second timeout for the select/poll. That seemed
like a good balance of user experience and spinning the CPU. But I am
open to other values. I don't have a Windows VM, but maybe I should set
one up...
I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.
For what it's worth, I did try #2, but since psql installs its own
SIGINT handler, the code became really hairy.
--
Tristan Partin
Neon (https://neon.tech)