Re: psql not responding to SIGINT upon db reconnection - Mailing list pgsql-hackers

From Tristan Partin
Subject Re: psql not responding to SIGINT upon db reconnection
Date
Msg-id CXBHH2O58HVA.JUP07AES8JVK@neon.tech
Whole thread Raw
In response to Re: psql not responding to SIGINT upon db reconnection  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: psql not responding to SIGINT upon db reconnection
List pgsql-hackers
On Thu Nov 23, 2023 at 3:19 AM CST, Heikki Linnakangas wrote:
> On 22/11/2023 23:29, Tristan Partin wrote:
> > Ha, you're right. I had this working yesterday, but convinced myself it
> > didn't. I had a do while loop wrapping the blocking call. Here is a v4,
> > which seems to pass the tests that were pointed out to be failing
> > earlier.
>
> Thanks! This suffers from a classic race condition:
>
> > +            if (cancel_pressed)
> > +                break;
> > +
> > +            sock = PQsocket(n_conn);
> > +            if (sock == -1)
> > +                break;
> > +
> > +            rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
> > +            Assert(rc != 0); /* Timeout is impossible. */
> > +            if (rc == -1)
> > +            {
> > +                success = false;
> > +                break;
> > +            }
>
> If a signal arrives between the "if (cancel_pressed)" check and
> pqSocketPoll(), pgSocketPoll will "miss" the signal and block
> indefinitely. There are three solutions to this:
>
> 1. Use a timeout, so that you wake up periodically to check for any
> missed signals. Easy solution, the downsides are that you will not react
> as quickly if the signal is missed, and you waste some cycles by waking
> up unnecessarily.
>
> 2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use
> that in src/backend/storage/ipc/latch.c. It's portable, but somewhat
> complicated.
>
> 3. Use pselect() or ppoll(). They were created specifically to address
> this problem. Not fully portable, I think it's missing on Windows at least.
>
> Maybe use pselect() if it's available, and select() with a timeout if
> it's not.

First I have ever heard of this race condition, and now I will commit it
to durable storage :). I went ahead and did option #3 that you proposed.
On Windows, I have a 1 second timeout for the select/poll. That seemed
like a good balance of user experience and spinning the CPU. But I am
open to other values. I don't have a Windows VM, but maybe I should set
one up...

I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.

For what it's worth, I did try #2, but since psql installs its own
SIGINT handler, the code became really hairy.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: remaining sql/json patches
Next
From: Nathan Bossart
Date:
Subject: Re: Fix assertion in autovacuum worker