Thread: psql not responding to SIGINT upon db reconnection

psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
Neon provides a quick start mechanism for psql using the following
workflow:

    $ psql -h pg.neon.tech
    NOTICE:  Welcome to Neon!
    Authenticate by visiting:
        https://console.neon.tech/psql_session/xxx

Upon navigating to the link, the user selects their database to work
with. The psql process is then connected to a Postgres database at Neon.

psql provides a way to reconnect to the database from within itself: \c.
If, after connecting to a database such as the process previously
mentioned, the user starts a reconnection to the database from within
psql, the process will refuse to interrupt the reconnection attempt
after sending SIGINT to it.

    tristan=> \c
    NOTICE:  Welcome to Neon!
    Authenticate by visiting:
        https://console.neon.tech/psql_session/yyy

    ^C
    ^C^C
    ^C^C

I am not really sure if this was a problem on Windows machines, but the
attached patch should support it if it does. If this was never a problem
on Windows, it might be best to check for any regressions.

This was originally discussed in the a GitHub issue[0] at Neon.

[0]: https://github.com/neondatabase/neon/issues/1449

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Gurjeet Singh
Date:
On Mon, Jul 24, 2023 at 9:26 AM Tristan Partin <tristan@neon.tech> wrote:

> attached patch

+        /*
+         * Restore the default SIGINT behavior while within libpq.
Otherwise, we
+         * can never exit from polling for the database connection. Failure to
+         * restore is non-fatal.
+         */
+        newact.sa_handler = SIG_DFL;
+        rc = sigaction(SIGINT, &newact, &oldact);

    There's no action taken if rc != 0. It doesn't seem right to
continue as if everything's fine when the handler registration fails.
At least a warning is warranted, so that the user reports such
failures to the community.

Best regards,
Gurjeet
http://Gurje.et



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Mon Jul 24, 2023 at 12:00 PM CDT, Gurjeet Singh wrote:
> On Mon, Jul 24, 2023 at 9:26 AM Tristan Partin <tristan@neon.tech> wrote:
>
> > attached patch
>
> +        /*
> +         * Restore the default SIGINT behavior while within libpq.
> Otherwise, we
> +         * can never exit from polling for the database connection. Failure to
> +         * restore is non-fatal.
> +         */
> +        newact.sa_handler = SIG_DFL;
> +        rc = sigaction(SIGINT, &newact, &oldact);
>
>     There's no action taken if rc != 0. It doesn't seem right to
> continue as if everything's fine when the handler registration fails.
> At least a warning is warranted, so that the user reports such
> failures to the community.

If we fail to go back to the default handler, then we just get the
behavior we currently have. I am not sure logging a message like "Failed
to restore default SIGINT handler" is that useful to the user. It isn't
actionable, and at the end of the day isn't going to affect them for the
most part. They also aren't even aware that default handler was ever
overridden in the first place. I'm more than happy to add a debug log if
it is the blocker to getting this patch accepted however.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
v2 is attached which fixes a grammatical issue in a comment.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
v3 is attached which fixes up some code comments I added which I hadn't
attached to the commit already, sigh.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Tom Lane
Date:
"Tristan Partin" <tristan@neon.tech> writes:
> v3 is attached which fixes up some code comments I added which I hadn't 
> attached to the commit already, sigh.

I don't care for this patch at all.  You're bypassing the pqsignal
abstraction layer that the rest of psql goes through, and the behavior
you're implementing isn't very nice.  People do not expect ^C to
kill psql - it should just stop the \c attempt and leave you as you
were.

Admittedly, getting PQconnectdbParams to return control on SIGINT
isn't too practical.  But you could probably replace that with a loop
around PQconnectPoll and test for CancelRequested in the loop.

            regards, tom lane



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Mon Jul 24, 2023 at 12:43 PM CDT, Tom Lane wrote:
> "Tristan Partin" <tristan@neon.tech> writes:
> > v3 is attached which fixes up some code comments I added which I hadn't
> > attached to the commit already, sigh.
>
> I don't care for this patch at all.  You're bypassing the pqsignal
> abstraction layer that the rest of psql goes through, and the behavior
> you're implementing isn't very nice.  People do not expect ^C to
> kill psql - it should just stop the \c attempt and leave you as you
> were.
>
> Admittedly, getting PQconnectdbParams to return control on SIGINT
> isn't too practical.  But you could probably replace that with a loop
> around PQconnectPoll and test for CancelRequested in the loop.

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

    $ ./build/src/bin/psql/psql -h pg.neon.tech
    NOTICE:  Welcome to Neon!
    Authenticate by visiting:
        https://console.neon.tech/psql_session/xxx


    NOTICE:  Connecting to database.
    psql (17devel, server 15.3)
    SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
    Type "help" for help.

    tristan957=> \c
    NOTICE:  Welcome to Neon!
    Authenticate by visiting:
        https://console.neon.tech/psql_session/yyy


    ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed:
    Previous connection kept
    tristan957=>

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Shlok Kyal
Date:
Hi,

> That sounds like a much better solution. Attached you will find a v4
> that implements your suggestion. Please let me know if there is
> something that I missed. I can confirm that the patch works.
>
>         $ ./build/src/bin/psql/psql -h pg.neon.tech
>         NOTICE:  Welcome to Neon!
>         Authenticate by visiting:
>             https://console.neon.tech/psql_session/xxx
>
>
>         NOTICE:  Connecting to database.
>         psql (17devel, server 15.3)
>         SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
>         Type "help" for help.
>
>         tristan957=> \c
>         NOTICE:  Welcome to Neon!
>         Authenticate by visiting:
>             https://console.neon.tech/psql_session/yyy
>
>
>         ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed:
>         Previous connection kept
>         tristan957=>

I went through CFbot and found out that some tests are timing out.
Links:
https://cirrus-ci.com/task/6735437444677632
https://cirrus-ci.com/task/4536414189125632
https://cirrus-ci.com/task/5046587584413696
https://cirrus-ci.com/task/6172487491256320
https://cirrus-ci.com/task/5609537537835008

Some tests which are timing out are as follows:
[00:48:49.243] Summary of Failures:
[00:48:49.243]
[00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s
[00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade
TIMEOUT 1000.02s
[00:48:49.243] 34/270 postgresql:recovery /
recovery/027_stream_regress TIMEOUT 1000.02s
[00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s
[00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s
[00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s
[00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress
TIMEOUT 1000.01s
[00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress
TIMEOUT 1000.02s
[00:48:49.243] 110/270 postgresql:test_extensions /
test_extensions/regress TIMEOUT 1000.03s
[00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress
TIMEOUT 1000.02s
[00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr
TIMEOUT 1000.03s

Just want to make sure you are aware of the issue.

Thanks
Shlok Kumar Kyal



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Thu Nov 2, 2023 at 4:03 AM CDT, Shlok Kyal wrote:
> Hi,
>
> > That sounds like a much better solution. Attached you will find a v4
> > that implements your suggestion. Please let me know if there is
> > something that I missed. I can confirm that the patch works.
> >
> >         $ ./build/src/bin/psql/psql -h pg.neon.tech
> >         NOTICE:  Welcome to Neon!
> >         Authenticate by visiting:
> >             https://console.neon.tech/psql_session/xxx
> >
> >
> >         NOTICE:  Connecting to database.
> >         psql (17devel, server 15.3)
> >         SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
> >         Type "help" for help.
> >
> >         tristan957=> \c
> >         NOTICE:  Welcome to Neon!
> >         Authenticate by visiting:
> >             https://console.neon.tech/psql_session/yyy
> >
> >
> >         ^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed:
> >         Previous connection kept
> >         tristan957=>
>
> I went through CFbot and found out that some tests are timing out.
> Links:
> https://cirrus-ci.com/task/6735437444677632
> https://cirrus-ci.com/task/4536414189125632
> https://cirrus-ci.com/task/5046587584413696
> https://cirrus-ci.com/task/6172487491256320
> https://cirrus-ci.com/task/5609537537835008
>
> Some tests which are timing out are as follows:
> [00:48:49.243] Summary of Failures:
> [00:48:49.243]
> [00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s
> [00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade
> TIMEOUT 1000.02s
> [00:48:49.243] 34/270 postgresql:recovery /
> recovery/027_stream_regress TIMEOUT 1000.02s
> [00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s
> [00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s
> [00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s
> [00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress
> TIMEOUT 1000.01s
> [00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress
> TIMEOUT 1000.02s
> [00:48:49.243] 110/270 postgresql:test_extensions /
> test_extensions/regress TIMEOUT 1000.03s
> [00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress
> TIMEOUT 1000.02s
> [00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr
> TIMEOUT 1000.03s
>
> Just want to make sure you are aware of the issue.

Hi Shlok!

Thanks for taking a look. I will check these failures out to see if
I can reproduce.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Heikki Linnakangas
Date:
On 06/11/2023 19:16, Tristan Partin wrote:
>>> That sounds like a much better solution. Attached you will find a v4
>>> that implements your suggestion. Please let me know if there is
>>> something that I missed. I can confirm that the patch works.

This patch is missing a select(). It will busy loop until the connection 
is established or cancelled.

Shouldn't we also clear CancelRequested after we have cancelled the 
attempt? Otherwise, any subsequent attempts will immediately fail too.

Should we use 'cancel_pressed' here rather than CancelRequested? To be 
honest, I don't understand the difference, so that's a genuine question. 
There was an attempt at unifying them in the past but it was reverted in 
commit 5d43c3c54d.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:
> On 06/11/2023 19:16, Tristan Partin wrote:
> >>> That sounds like a much better solution. Attached you will find a v4
> >>> that implements your suggestion. Please let me know if there is
> >>> something that I missed. I can confirm that the patch works.
>
> This patch is missing a select(). It will busy loop until the connection
> is established or cancelled.

If I add a wait (select, poll, etc.), then I can't control-C during the
blocking call, so it doesn't really solve the problem. On Linux, we have
signalfds which seem like the perfect solution to this problem, "wait on
the pq socket or SIGINT." But that doesn't translate well to other
operating systems :(.

> tristan957=> \c
> NOTICE:  Welcome to Neon!
> Authenticate by visiting:
>     https://console.neon.tech/psql_session/XXXX
>
>
> ^CTerminated

You can see here that I can't terminate the command. Where you see
"Terminated" is me running `kill $(pgrep psql)` in another terminal.

> Shouldn't we also clear CancelRequested after we have cancelled the
> attempt? Otherwise, any subsequent attempts will immediately fail too.

After switching to cancel_pressed, I don't think so. See this bit of
code in the psql main loop:

> /* main loop to get queries and execute them */
> while (successResult == EXIT_SUCCESS)
> {
>     /*
>      * Clean up after a previous Control-C
>      */
>     if (cancel_pressed)
>     {
>         if (!pset.cur_cmd_interactive)
>         {
>             /*
>              * You get here if you stopped a script with Ctrl-C.
>              */
>             successResult = EXIT_USER;
>             break;
>         }
>
>         cancel_pressed = false;
>     }

> Should we use 'cancel_pressed' here rather than CancelRequested? To be
> honest, I don't understand the difference, so that's a genuine question.
> There was an attempt at unifying them in the past but it was reverted in
> commit 5d43c3c54d.

The more I look at this, the more I don't understand... I need to look
at this harder, but wanted to get this email out. Switched to
cancel_pressed though. Thanks for pointing it out. Going to wait to send
a v2 while more discussion occurs.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Heikki Linnakangas
Date:
On 22/11/2023 19:29, Tristan Partin wrote:
> On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:
>> On 06/11/2023 19:16, Tristan Partin wrote:
>>>>> That sounds like a much better solution. Attached you will find a v4
>>>>> that implements your suggestion. Please let me know if there is
>>>>> something that I missed. I can confirm that the patch works.
>>
>> This patch is missing a select(). It will busy loop until the connection
>> is established or cancelled.
> 
> If I add a wait (select, poll, etc.), then I can't control-C during the
> blocking call, so it doesn't really solve the problem.

Hmm, they should return with EINTR on signal. At least on Linux; I'm not 
sure how portable that is. See signal(7) man page, section "Interruption 
of system calls and library functions by signal handlers". You could 
also use a timeout like 5 s to ensure that you wake up and notice that 
the signal was received eventually, even if it doesn't interrupt the 
blocking call.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Wed Nov 22, 2023 at 3:00 PM CST, Heikki Linnakangas wrote:
> On 22/11/2023 19:29, Tristan Partin wrote:
> > On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:
> >> On 06/11/2023 19:16, Tristan Partin wrote:
> >>>>> That sounds like a much better solution. Attached you will find a v4
> >>>>> that implements your suggestion. Please let me know if there is
> >>>>> something that I missed. I can confirm that the patch works.
> >>
> >> This patch is missing a select(). It will busy loop until the connection
> >> is established or cancelled.
> >
> > If I add a wait (select, poll, etc.), then I can't control-C during the
> > blocking call, so it doesn't really solve the problem.
>
> Hmm, they should return with EINTR on signal. At least on Linux; I'm not
> sure how portable that is. See signal(7) man page, section "Interruption
> of system calls and library functions by signal handlers". You could
> also use a timeout like 5 s to ensure that you wake up and notice that
> the signal was received eventually, even if it doesn't interrupt the
> blocking call.

Ha, you're right. I had this working yesterday, but convinced myself it
didn't. I had a do while loop wrapping the blocking call. Here is a v4,
which seems to pass the tests that were pointed out to be failing
earlier.

Noticed that I copy-pasted pqSocketPoll() into the psql code. I think
this may be controversial. Not sure what the best solution to the issue
is. I will pay attention to the buildfarm animals when they pick this
up.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Heikki Linnakangas
Date:
On 22/11/2023 23:29, Tristan Partin wrote:
> Ha, you're right. I had this working yesterday, but convinced myself it
> didn't. I had a do while loop wrapping the blocking call. Here is a v4,
> which seems to pass the tests that were pointed out to be failing
> earlier.

Thanks! This suffers from a classic race condition:

> +            if (cancel_pressed)
> +                break;
> +
> +            sock = PQsocket(n_conn);
> +            if (sock == -1)
> +                break;
> +
> +            rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
> +            Assert(rc != 0); /* Timeout is impossible. */
> +            if (rc == -1)
> +            {
> +                success = false;
> +                break;
> +            }

If a signal arrives between the "if (cancel_pressed)" check and 
pqSocketPoll(), pgSocketPoll will "miss" the signal and block 
indefinitely. There are three solutions to this:

1. Use a timeout, so that you wake up periodically to check for any 
missed signals. Easy solution, the downsides are that you will not react 
as quickly if the signal is missed, and you waste some cycles by waking 
up unnecessarily.

2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use 
that in src/backend/storage/ipc/latch.c. It's portable, but somewhat 
complicated.

3. Use pselect() or ppoll(). They were created specifically to address 
this problem. Not fully portable, I think it's missing on Windows at least.

Maybe use pselect() if it's available, and select() with a timeout if 
it's not.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Thu Nov 23, 2023 at 3:19 AM CST, Heikki Linnakangas wrote:
> On 22/11/2023 23:29, Tristan Partin wrote:
> > Ha, you're right. I had this working yesterday, but convinced myself it
> > didn't. I had a do while loop wrapping the blocking call. Here is a v4,
> > which seems to pass the tests that were pointed out to be failing
> > earlier.
>
> Thanks! This suffers from a classic race condition:
>
> > +            if (cancel_pressed)
> > +                break;
> > +
> > +            sock = PQsocket(n_conn);
> > +            if (sock == -1)
> > +                break;
> > +
> > +            rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
> > +            Assert(rc != 0); /* Timeout is impossible. */
> > +            if (rc == -1)
> > +            {
> > +                success = false;
> > +                break;
> > +            }
>
> If a signal arrives between the "if (cancel_pressed)" check and
> pqSocketPoll(), pgSocketPoll will "miss" the signal and block
> indefinitely. There are three solutions to this:
>
> 1. Use a timeout, so that you wake up periodically to check for any
> missed signals. Easy solution, the downsides are that you will not react
> as quickly if the signal is missed, and you waste some cycles by waking
> up unnecessarily.
>
> 2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use
> that in src/backend/storage/ipc/latch.c. It's portable, but somewhat
> complicated.
>
> 3. Use pselect() or ppoll(). They were created specifically to address
> this problem. Not fully portable, I think it's missing on Windows at least.
>
> Maybe use pselect() if it's available, and select() with a timeout if
> it's not.

First I have ever heard of this race condition, and now I will commit it
to durable storage :). I went ahead and did option #3 that you proposed.
On Windows, I have a 1 second timeout for the select/poll. That seemed
like a good balance of user experience and spinning the CPU. But I am
open to other values. I don't have a Windows VM, but maybe I should set
one up...

I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.

For what it's worth, I did try #2, but since psql installs its own
SIGINT handler, the code became really hairy.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:
> I am not completely in love with the code I have written. Lots of
> conditional compilation which makes it hard to read. Looking forward to
> another round of review to see what y'all think.

Ok. Here is a patch which just uses select(2) with a timeout of 1s or
pselect(2) if it is available. I also moved the state machine processing
into its own function.

Thanks for your comments thus far.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Tue, Dec 5, 2023 at 1:35 PM Tristan Partin <tristan@neon.tech> wrote:
> On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:
> > I am not completely in love with the code I have written. Lots of
> > conditional compilation which makes it hard to read. Looking forward to
> > another round of review to see what y'all think.
>
> Ok. Here is a patch which just uses select(2) with a timeout of 1s or
> pselect(2) if it is available. I also moved the state machine processing
> into its own function.

Hmm, this adds a function called pqSocketPoll to psql/command.c. But
there already is such a function in libpq/fe-misc.c. It's not quite
the same, though. Having the same function in two different modules
with subtly different definitions seems like it's probably not the
right idea.

Also, this seems awfully complicated for something that's supposed to
(checks notes) wait for a file descriptor to become ready for I/O for
up to 1 second. It's 160 lines of code in pqSocketPoll and another 50
in the caller. If this is really the simplest way to do this, we
really need to rethink our libpq APIs or, uh, something. I wonder if
we could make this simpler by, say:

- always use select
- always use a 1-second timeout
- if it returns faster because the race condition doesn't happen, cool
- if it doesn't, well then you get to wait for a second, oh well

I don't feel strongly that that's the right way to go and if Heikki or
some other committer wants to go with this more complex conditional
approach, that's fine. But to me it seems like a lot.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Fri Jan 5, 2024 at 12:24 PM CST, Robert Haas wrote:
> On Tue, Dec 5, 2023 at 1:35 PM Tristan Partin <tristan@neon.tech> wrote:
> > On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:
> > > I am not completely in love with the code I have written. Lots of
> > > conditional compilation which makes it hard to read. Looking forward to
> > > another round of review to see what y'all think.
> >
> > Ok. Here is a patch which just uses select(2) with a timeout of 1s or
> > pselect(2) if it is available. I also moved the state machine processing
> > into its own function.
>
> Hmm, this adds a function called pqSocketPoll to psql/command.c. But
> there already is such a function in libpq/fe-misc.c. It's not quite
> the same, though. Having the same function in two different modules
> with subtly different definitions seems like it's probably not the
> right idea.

Yep, not tied to the function name. Happy to rename as anyone suggests.

> Also, this seems awfully complicated for something that's supposed to
> (checks notes) wait for a file descriptor to become ready for I/O for
> up to 1 second. It's 160 lines of code in pqSocketPoll and another 50
> in the caller. If this is really the simplest way to do this, we
> really need to rethink our libpq APIs or, uh, something. I wonder if
> we could make this simpler by, say:
>
> - always use select
> - always use a 1-second timeout
> - if it returns faster because the race condition doesn't happen, cool
> - if it doesn't, well then you get to wait for a second, oh well
>
> I don't feel strongly that that's the right way to go and if Heikki or
> some other committer wants to go with this more complex conditional
> approach, that's fine. But to me it seems like a lot.

I think the way to go is to expose some variation of libpq's
pqSocketPoll(), which I would be happy to put together a patch for.
Making frontends, psql in this case, have to reimplement the polling
logic doesn't strike me as fruitful, which is essentially what I have
done.

Thanks for your input!

But also wait a second. In my last email, I said:

> Ok. Here is a patch which just uses select(2) with a timeout of 1s or
> pselect(2) if it is available. I also moved the state machine processing
> into its own function.

This is definitely not the patch I meant to send. What the? Here is what
I meant to send, but I stand by my comment that we should just expose
a variation of pqSocketPoll().

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin <tristan@neon.tech> wrote:
> I think the way to go is to expose some variation of libpq's
> pqSocketPoll(), which I would be happy to put together a patch for.
> Making frontends, psql in this case, have to reimplement the polling
> logic doesn't strike me as fruitful, which is essentially what I have
> done.

I encourage further exploration of this line of attack. I fear that if
I were to commit something like what you've posted up until now,
people would complain that that code was too ugly to live, and I'd
have a hard time telling them that they're wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:
> On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin <tristan@neon.tech> wrote:
> > I think the way to go is to expose some variation of libpq's
> > pqSocketPoll(), which I would be happy to put together a patch for.
> > Making frontends, psql in this case, have to reimplement the polling
> > logic doesn't strike me as fruitful, which is essentially what I have
> > done.
>
> I encourage further exploration of this line of attack. I fear that if
> I were to commit something like what you've posted up until now,
> people would complain that that code was too ugly to live, and I'd
> have a hard time telling them that they're wrong.

Completely agree. Let me look into this. Perhaps I can get something up
next week or the week after.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Fri Jan 12, 2024 at 11:13 AM CST, Tristan Partin wrote:
> On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:
> > On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin <tristan@neon.tech> wrote:
> > > I think the way to go is to expose some variation of libpq's
> > > pqSocketPoll(), which I would be happy to put together a patch for.
> > > Making frontends, psql in this case, have to reimplement the polling
> > > logic doesn't strike me as fruitful, which is essentially what I have
> > > done.
> >
> > I encourage further exploration of this line of attack. I fear that if
> > I were to commit something like what you've posted up until now,
> > people would complain that that code was too ugly to live, and I'd
> > have a hard time telling them that they're wrong.
>
> Completely agree. Let me look into this. Perhaps I can get something up
> next week or the week after.

Not next week, but here is a respin. I've exposed pqSocketPoll as
PQsocketPoll and am just using that. You can see the diff is so much
smaller, which is great!

In order to fight the race condition, I am just using a 1 second timeout
instead of trying to integrate pselect or ppoll. We could add
a PQsocketPPoll() to support those use cases, but I am not sure how
available pselect and ppoll are. I guess on Windows we don't have
pselect. I don't think using the pipe trick that Heikki mentioned
earlier is suitable to expose via an API in libpq, but someone else
might have a different opinion.

Maybe this is good enough until someone complains? Most people would
probably just chalk any latency between keypress and cancellation as
network latency and not a hardcoded 1 second.

Thanks for your feedback Robert!

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Jelte Fennema-Nio
Date:
On Tue, 30 Jan 2024 at 23:20, Tristan Partin <tristan@neon.tech> wrote:
> Not next week, but here is a respin. I've exposed pqSocketPoll as
> PQsocketPoll and am just using that. You can see the diff is so much
> smaller, which is great!

The exports.txt change should be made part of patch 0001, also docs
are missing for the newly exposed function. PQsocketPoll does look
like quite a nice API to expose from libpq.



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Tue Jan 30, 2024 at 4:42 PM CST, Jelte Fennema-Nio wrote:
> On Tue, 30 Jan 2024 at 23:20, Tristan Partin <tristan@neon.tech> wrote:
> > Not next week, but here is a respin. I've exposed pqSocketPoll as
> > PQsocketPoll and am just using that. You can see the diff is so much
> > smaller, which is great!
>
> The exports.txt change should be made part of patch 0001, also docs
> are missing for the newly exposed function. PQsocketPoll does look
> like quite a nice API to expose from libpq.

I was looking for documentation of PQsocket(), but didn't find any
standalone (unless I completely missed it). So I just copied how
PQsocket() is documented in PQconnectPoll(). I am happy to document it
separately if you think it would be useful.

My bad on the exports.txt change being in the wrong commit. Git
things... I will fix it on the next re-spin after resolving the previous
paragraph.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Jelte Fennema-Nio
Date:
On Wed, 31 Jan 2024 at 19:07, Tristan Partin <tristan@neon.tech> wrote:
> I was looking for documentation of PQsocket(), but didn't find any
> standalone (unless I completely missed it). So I just copied how
> PQsocket() is documented in PQconnectPoll(). I am happy to document it
> separately if you think it would be useful.

PQsocket its documentation is here:
https://www.postgresql.org/docs/16/libpq-status.html#LIBPQ-PQSOCKET

I think PQsocketPoll should probably have its API documentation
(describing arguments and return value at minimum) in this section of
the docs: https://www.postgresql.org/docs/16/libpq-async.html
And I think the example in the PQconnnectPoll API docs could benefit
from having PQsocketPoll used in that example.



Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin <tristan@neon.tech> wrote:
> I was looking for documentation of PQsocket(), but didn't find any
> standalone (unless I completely missed it). So I just copied how
> PQsocket() is documented in PQconnectPoll(). I am happy to document it
> separately if you think it would be useful.

As Jelte said back at the end of January, I think you just completely
missed it. The relevant part of libpq.sgml starts like this:

    <varlistentry id="libpq-PQsocket">
     <term><function>PQsocket</function><indexterm><primary>PQsocket</primary></indexterm></term>

As far as I know, we document all of the exported libpq functions in
that SGML file.

I think there's no real reason why we couldn't get at least 0001 and
maybe also 0002 into this release, but only if you move quickly on
this. Feature freeze is approaching rapidly.

Modulo the documentation changes, I think 0001 is pretty much ready to
go. 0002 needs comments. I'm also not so sure about the name
process_connection_state_machine(); it seems a little verbose. How
about something like wait_until_connected()? And maybe put it below
do_connect() instead of above.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Fri Mar 22, 2024 at 9:59 AM CDT, Robert Haas wrote:
> On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin <tristan@neon.tech> wrote:
> > I was looking for documentation of PQsocket(), but didn't find any
> > standalone (unless I completely missed it). So I just copied how
> > PQsocket() is documented in PQconnectPoll(). I am happy to document it
> > separately if you think it would be useful.
>
> As Jelte said back at the end of January, I think you just completely
> missed it. The relevant part of libpq.sgml starts like this:
>
>     <varlistentry id="libpq-PQsocket">
>      <term><function>PQsocket</function><indexterm><primary>PQsocket</primary></indexterm></term>
>
> As far as I know, we document all of the exported libpq functions in
> that SGML file.
>
> I think there's no real reason why we couldn't get at least 0001 and
> maybe also 0002 into this release, but only if you move quickly on
> this. Feature freeze is approaching rapidly.
>
> Modulo the documentation changes, I think 0001 is pretty much ready to
> go. 0002 needs comments. I'm also not so sure about the name
> process_connection_state_machine(); it seems a little verbose. How
> about something like wait_until_connected()? And maybe put it below
> do_connect() instead of above.

Robert, Jelte:

Sorry for taking a while to get back to y'all. I have taken your
feedback into consideration for v9. This is my first time writing
Postgres docs, so I'm ready for another round of editing :).

Robert, could you point out some places where comments would be useful
in 0002? I did rename the function and moved it as suggested, thanks! In
turn, I also realized I forgot a prototype, so also added it.

This patchset has also been rebased on master, so the libpq export
number for PQsocketPoll was bumped.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin <tristan@neon.tech> wrote:
> Sorry for taking a while to get back to y'all. I have taken your
> feedback into consideration for v9. This is my first time writing
> Postgres docs, so I'm ready for another round of editing :).

Yeah, that looks like it needs some wordsmithing yet. I can take a
crack at that at some point, but here are a few notes:

- "takes care of" and "working through the state machine" seem quite
vague to me.
- the meanings of forRead and forWrite don't seem to be documented.
- "retuns" is a typo.

> Robert, could you point out some places where comments would be useful
> in 0002? I did rename the function and moved it as suggested, thanks! In
> turn, I also realized I forgot a prototype, so also added it.

Well, for starters, I'd give the function a header comment.

Telling me that a 1 second timeout is a 1 second timeout is less
useful than telling me why we've picked a 1 second timeout. Presumably
the answer is "so we can respond to cancel interrupts in a timely
fashion", but I'd make that explicit.

It might be worth a comment saying that PQsocket() shouldn't be
hoisted out of the loop because it could be a multi-host connection
string and the socket might change under us. Unless that's not true,
in which case we should hoist the PQsocket() call out of the loop.

I think it would also be worth a comment saying that we don't need to
report errors here, as the caller will do that; we just need to wait
until the connection is known to have either succeeded or failed, or
until the user presses cancel.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Fri Mar 22, 2024 at 12:17 PM CDT, Robert Haas wrote:
> On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin <tristan@neon.tech> wrote:
> > Sorry for taking a while to get back to y'all. I have taken your
> > feedback into consideration for v9. This is my first time writing
> > Postgres docs, so I'm ready for another round of editing :).
>
> Yeah, that looks like it needs some wordsmithing yet. I can take a
> crack at that at some point, but here are a few notes:
>
> - "takes care of" and "working through the state machine" seem quite
> vague to me.
> - the meanings of forRead and forWrite don't seem to be documented.
> - "retuns" is a typo.
>
> > Robert, could you point out some places where comments would be useful
> > in 0002? I did rename the function and moved it as suggested, thanks! In
> > turn, I also realized I forgot a prototype, so also added it.
>
> Well, for starters, I'd give the function a header comment.
>
> Telling me that a 1 second timeout is a 1 second timeout is less
> useful than telling me why we've picked a 1 second timeout. Presumably
> the answer is "so we can respond to cancel interrupts in a timely
> fashion", but I'd make that explicit.
>
> It might be worth a comment saying that PQsocket() shouldn't be
> hoisted out of the loop because it could be a multi-host connection
> string and the socket might change under us. Unless that's not true,
> in which case we should hoist the PQsocket() call out of the loop.
>
> I think it would also be worth a comment saying that we don't need to
> report errors here, as the caller will do that; we just need to wait
> until the connection is known to have either succeeded or failed, or
> until the user presses cancel.

This is good feedback, thanks. I have added comments where you
suggested. I reworded the PQsocketPoll docs to hopefully meet your
expectations. I am using the term "connection sequence" which is from
the PQconnectStartParams docs, so hopefully people will be able to make
that connection. I wrote documentation for "forRead" and "forWrite" as
well.

I had a question about parameter naming. Right now I have a mix of
camel-case and snake-case in the function signature since that is what
I inherited. Should I change that to be consistent? If so, which case
would you like?

Thanks for your continued reviews.

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin <tristan@neon.tech> wrote:
> I had a question about parameter naming. Right now I have a mix of
> camel-case and snake-case in the function signature since that is what
> I inherited. Should I change that to be consistent? If so, which case
> would you like?

Uh... PostgreSQL is kind of the wild west in that regard. The thing to
do is look for nearby precedents, but that doesn't help much here
because in the very same file, libpq-fe.h, we have:

extern int      PQsetResultAttrs(PGresult *res, int numAttributes,
PGresAttDesc *attDescs);
extern int      PQsetvalue(PGresult *res, int tup_num, int field_num,
char *value, int len);

Since the existing naming is consistent with one of those two styles,
I'd probably just leave it be.

+       The function returns a value greater than <literal>0</literal>
if the specified condition
+       is met, <literal>0</literal> if a timeout occurred, or
<literal>-1</literal> if an error
+       or interrupt occurred. In the event <literal>forRead</literal> and

We either need to tell people how to find out which error it was, or
if that's not possible and we can't reasonably make it possible, we
need to tell them why they shouldn't care. Because there's nothing
more delightful than someone who shows up and says "hey, I tried to do
XYZ, and I got an error," as if that were sufficient information for
me to do something useful.

+       <literal>end_time</literal> is the time in the future in
seconds starting from the UNIX
+       epoch in which you would like the function to return if the
condition is not met.

This sentence seems a bit contorted to me, like maybe Yoda wrote it. I
was about to try to rephrase it and maybe split it in two when I
wondered why we need to document how time_t works at all. Can't we
just say something like "If end_time is not -1, it specifies the time
at which this function should stop waiting for the condition to be
met" -- and maybe move it to the end of the first paragraph, so it's
before where we list the meanings of the return values?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Mon Mar 25, 2024 at 1:44 PM CDT, Robert Haas wrote:
> On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin <tristan@neon.tech> wrote:
> > I had a question about parameter naming. Right now I have a mix of
> > camel-case and snake-case in the function signature since that is what
> > I inherited. Should I change that to be consistent? If so, which case
> > would you like?
>
> Uh... PostgreSQL is kind of the wild west in that regard. The thing to
> do is look for nearby precedents, but that doesn't help much here
> because in the very same file, libpq-fe.h, we have:
>
> extern int      PQsetResultAttrs(PGresult *res, int numAttributes,
> PGresAttDesc *attDescs);
> extern int      PQsetvalue(PGresult *res, int tup_num, int field_num,
> char *value, int len);
>
> Since the existing naming is consistent with one of those two styles,
> I'd probably just leave it be.
>
> +       The function returns a value greater than <literal>0</literal>
> if the specified condition
> +       is met, <literal>0</literal> if a timeout occurred, or
> <literal>-1</literal> if an error
> +       or interrupt occurred. In the event <literal>forRead</literal> and
>
> We either need to tell people how to find out which error it was, or
> if that's not possible and we can't reasonably make it possible, we
> need to tell them why they shouldn't care. Because there's nothing
> more delightful than someone who shows up and says "hey, I tried to do
> XYZ, and I got an error," as if that were sufficient information for
> me to do something useful.
>
> +       <literal>end_time</literal> is the time in the future in
> seconds starting from the UNIX
> +       epoch in which you would like the function to return if the
> condition is not met.
>
> This sentence seems a bit contorted to me, like maybe Yoda wrote it. I
> was about to try to rephrase it and maybe split it in two when I
> wondered why we need to document how time_t works at all. Can't we
> just say something like "If end_time is not -1, it specifies the time
> at which this function should stop waiting for the condition to be
> met" -- and maybe move it to the end of the first paragraph, so it's
> before where we list the meanings of the return values?

Incorporated feedback, I have :).

--
Tristan Partin
Neon (https://neon.tech)

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin <tristan@neon.tech> wrote:
> > This sentence seems a bit contorted to me, like maybe Yoda wrote it.
>
> Incorporated feedback, I have :).

Committed it, I did. My thanks for working on this issue, I extend.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Tue Apr 2, 2024 at 9:32 AM CDT, Robert Haas wrote:
> On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin <tristan@neon.tech> wrote:
> > > This sentence seems a bit contorted to me, like maybe Yoda wrote it.
> >
> > Incorporated feedback, I have :).
>
> Committed it, I did. My thanks for working on this issue, I extend.

Thanks to everyone for their reviews! Patch is in a much better place
than when it started.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Jelte Fennema-Nio
Date:
On Tue, 2 Apr 2024 at 16:33, Robert Haas <robertmhaas@gmail.com> wrote:
> Committed it, I did. My thanks for working on this issue, I extend.

Looking at the committed version of this patch, the pg_unreachable
calls seemed weird to me. 1 is actually incorrect, thus possibly
resulting in undefined behaviour. And for the other call an imho
better fix would be to remove the now 21 year unused enum variant,
instead of introducing its only reference in the whole codebase.

Attached are two trivial patches, feel free to remove both of the
pg_unreachable calls.

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Wed Apr 3, 2024 at 8:32 AM CDT, Jelte Fennema-Nio wrote:
> On Tue, 2 Apr 2024 at 16:33, Robert Haas <robertmhaas@gmail.com> wrote:
> > Committed it, I did. My thanks for working on this issue, I extend.
>
> Looking at the committed version of this patch, the pg_unreachable
> calls seemed weird to me. 1 is actually incorrect, thus possibly
> resulting in undefined behaviour. And for the other call an imho
> better fix would be to remove the now 21 year unused enum variant,
> instead of introducing its only reference in the whole codebase.
>
> Attached are two trivial patches, feel free to remove both of the
> pg_unreachable calls.

Patches look good. Sorry about causing you to do some work.

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Tom Lane
Date:
Jelte Fennema-Nio <postgres@jeltef.nl> writes:
> Looking at the committed version of this patch, the pg_unreachable
> calls seemed weird to me. 1 is actually incorrect, thus possibly
> resulting in undefined behaviour. And for the other call an imho
> better fix would be to remove the now 21 year unused enum variant,
> instead of introducing its only reference in the whole codebase.

If we do the latter, we will almost certainly get pushback from
distros who check for library ABI breaks.  I fear the comment
suggesting that we could remove it someday is too optimistic.

            regards, tom lane



Re: psql not responding to SIGINT upon db reconnection

From
Jelte Fennema-Nio
Date:
On Wed, 3 Apr 2024 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If we do the latter, we will almost certainly get pushback from
> distros who check for library ABI breaks.  I fear the comment
> suggesting that we could remove it someday is too optimistic.

Alright, changed patch 0002 to keep the variant. But remove it from
the recently added switch/case. And also updated the comment to remove
the "for awhile".

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Wed Apr 3, 2024 at 9:50 AM CDT, Jelte Fennema-Nio wrote:
> On Wed, 3 Apr 2024 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > If we do the latter, we will almost certainly get pushback from
> > distros who check for library ABI breaks.  I fear the comment
> > suggesting that we could remove it someday is too optimistic.
>
> Alright, changed patch 0002 to keep the variant. But remove it from
> the recently added switch/case. And also updated the comment to remove
> the "for awhile".

Removing from the switch statement causes a warning:

> [1/2] Compiling C object src/bin/psql/psql.p/command.c.o
> ../src/bin/psql/command.c: In function ‘wait_until_connected’:
> ../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
>  3803 |                 switch (PQconnectPoll(conn))
>       |                 ^~~~~~

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Jelte Fennema-Nio
Date:
On Wed, 3 Apr 2024 at 16:55, Tristan Partin <tristan@neon.tech> wrote:
> Removing from the switch statement causes a warning:
>
> > [1/2] Compiling C object src/bin/psql/psql.p/command.c.o
> > ../src/bin/psql/command.c: In function ‘wait_until_connected’:
> > ../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch
[-Wswitch]
> >  3803 |                 switch (PQconnectPoll(conn))
> >       |                 ^~~~~~


Ofcourse... fixed now

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
On Wed Apr 3, 2024 at 10:05 AM CDT, Jelte Fennema-Nio wrote:
> On Wed, 3 Apr 2024 at 16:55, Tristan Partin <tristan@neon.tech> wrote:
> > Removing from the switch statement causes a warning:
> >
> > > [1/2] Compiling C object src/bin/psql/psql.p/command.c.o
> > > ../src/bin/psql/command.c: In function ‘wait_until_connected’:
> > > ../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch
[-Wswitch]
> > >  3803 |                 switch (PQconnectPoll(conn))
> > >       |                 ^~~~~~
>
>
> Ofcourse... fixed now

I think patch 2 makes it worse. The value in -Wswitch is that when new
enum variants are added, the developer knows the locations to update.
Adding a default case makes -Wswitch pointless.

Patch 1 is still good. The comment change in patch 2 is good too!

--
Tristan Partin
Neon (https://neon.tech)



Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Wed, Apr 3, 2024 at 11:17 AM Tristan Partin <tristan@neon.tech> wrote:
> I think patch 2 makes it worse. The value in -Wswitch is that when new
> enum variants are added, the developer knows the locations to update.
> Adding a default case makes -Wswitch pointless.
>
> Patch 1 is still good. The comment change in patch 2 is good too!

It seems to me that 0001 should either remove the pg_unreachable()
call or change the break to a return, but not both. The commit message
tries to justify doing both by saying that the pg_unreachable() call
doesn't have much value, but there's not much value in omitting
pg_unreachable() from unreachable places, either, so I'm not
convinced.

I agree with Tristan's analysis of 0002.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
Jelte Fennema-Nio
Date:
On Wed, 3 Apr 2024 at 21:49, Robert Haas <robertmhaas@gmail.com> wrote:
> It seems to me that 0001 should either remove the pg_unreachable()
> call or change the break to a return, but not both.

Updated the patch to only remove the pg_unreachable call (and keep the breaks).

> but there's not much value in omitting
> pg_unreachable() from unreachable places, either, so I'm not
> convinced.

But I don't agree with this. Having pg_unreachable in places where it
brings no perf benefit has two main downsides:

1. It's extra lines of code
2. If a programmer/reviewer is not careful about maintaining this
unreachable invariant (now or in the future), undefined behavior can
be introduced quite easily.

Also, I'd expect any optimizing compiler to know that the code after
the loop is already unreachable if there are no break/goto statements
in its body.

> I agree with Tristan's analysis of 0002.

Updated 0002, to only change the comment. But honestly I don't think
using "default" really introduces many chances for future bugs in this
case, since it seems very unlikely we'll ever add new variants to the
PostgresPollingStatusType enum.

Attachment

Re: psql not responding to SIGINT upon db reconnection

From
Robert Haas
Date:
On Thu, Apr 4, 2024 at 6:43 AM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
> But I don't agree with this. Having pg_unreachable in places where it
> brings no perf benefit has two main downsides:
>
> 1. It's extra lines of code
> 2. If a programmer/reviewer is not careful about maintaining this
> unreachable invariant (now or in the future), undefined behavior can
> be introduced quite easily.
>
> Also, I'd expect any optimizing compiler to know that the code after
> the loop is already unreachable if there are no break/goto statements
> in its body.

I'm not super-well-informed about the effects of pg_unreachable(), but
I feel like it's a little strange to complain that it's an extra line
of code. Presumably, it translates into no executable instructions,
and might even reduce the size of the generated code. Now, it could
still be a cognitive burden on human readers, but hopefully it should
do the exact opposite: it should make the code easier to understand by
documenting that the author thought that a certain point in the code
could not be reached. In that sense, it's like a code comment.

Likewise, it certainly is true that one must be careful to put
pg_unreachable() only in places that aren't reachable, and to add or
remove it as appropriate if the set of unreachable places changes. But
it's also true that one must be careful to put any other thing that
looks like a function call only in places where that thing is
appropriate.

> > I agree with Tristan's analysis of 0002.
>
> Updated 0002, to only change the comment. But honestly I don't think
> using "default" really introduces many chances for future bugs in this
> case, since it seems very unlikely we'll ever add new variants to the
> PostgresPollingStatusType enum.

That might be true, but we've avoided using default: for switches over
lots of other enum types in lots of other places in PostgreSQL, and
overall it's saved us from lots of bugs. I'm not a stickler about
this; if there's some reason not to do it in some particular case,
that's fine with me. But where it's possible to do it without any real
disadvantage, I think it's a healthy practice.

I have committed these versions of the patches.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: psql not responding to SIGINT upon db reconnection

From
"Tristan Partin"
Date:
Thanks Jelte and Robert for the extra effort to correct my mistake.
I apologize. Bad copy-paste from a previous revision of the patch at
some point.

--
Tristan Partin
Neon (https://neon.tech)