Thread: [PATCH] Prevent hanging on unreachable hosts on startup

[PATCH] Prevent hanging on unreachable hosts on startup

From
"Ryan P. Kelly"
Date:
The signal handler installed by setup_cancel_handler() will ignore
attempts to exit psql should a host be unreachable. Since the
functionality it provides won't be used until later, it doesn't make
sense to set it up so early. Therefore, move the signal handler closer
to where it is first needed.
---
 src/bin/psql/startup.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 8b1864c..e53d84c 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -111,8 +111,6 @@ main(int argc, char *argv[])
     setvbuf(stderr, NULL, _IONBF, 0);
 #endif

-    setup_cancel_handler();
-
     pset.progname = get_progname(argv[0]);

     pset.db = NULL;
@@ -245,8 +243,9 @@ main(int argc, char *argv[])
     }

     /*
-     * Now find something to do
+     * Now find something to do (and handle cancellation, if applicable)
      */
+    setup_cancel_handler();

     /*
      * process file given by -f
--
1.7.7.4

Re: [PATCH] Prevent hanging on unreachable hosts on startup

From
Tom Lane
Date:
"Ryan P. Kelly" <rpkelly22@gmail.com> writes:
> The signal handler installed by setup_cancel_handler() will ignore
> attempts to exit psql should a host be unreachable.

Hm.  That may be worth doing something about, but the proposed patch
seems more like a quick-and-dirty hack than a solution.  The main
thing I don't like about it is that if we care about this scenario
during initial startup, we should also care about it during a \c
command, but just delaying the installation of the signal handler
won't fix that case.

More generally, what if the server becomes unreachable mid-session?
I'm not sure how much there is to be done about that case, since
there's probably no good way to distinguish it from a query that
takes a really long time.  But if we're going to think about this,
we might as well think about all the cases.

            regards, tom lane

Re: [PATCH] Prevent hanging on unreachable hosts on startup

From
Ryan Kelly
Date:
On Wed, Jan 04, 2012 at 09:36:57PM -0500, Tom Lane wrote:
> "Ryan P. Kelly" <rpkelly22@gmail.com> writes:
> > The signal handler installed by setup_cancel_handler() will ignore
> > attempts to exit psql should a host be unreachable.
>
> Hm.  That may be worth doing something about, but the proposed patch
> seems more like a quick-and-dirty hack than a solution.  The main
> thing I don't like about it is that if we care about this scenario
> during initial startup, we should also care about it during a \c
> command, but just delaying the installation of the signal handler
> won't fix that case.
Sure, but do you still think moving the signal handler down is correct?
I've attached another patch that handles interrupting \c. I also noticed
there appears to be what looks like duplication of code in startup.c and
command.c in main() and do_connect() respectively. I'm wondering if I
they should be made to share the same code which (with my patch) would
then mean the signal handler could be left where it was.

> More generally, what if the server becomes unreachable mid-session?
> I'm not sure how much there is to be done about that case, since
> there's probably no good way to distinguish it from a query that
> takes a really long time.  But if we're going to think about this,
> we might as well think about all the cases.
Well in this case we can still interrupt the query, no? This will drop
us back into the prompt, where we can safely \q.

>
>             regards, tom lane

-Ryan Kelly


Re: [PATCH] Prevent hanging on unreachable hosts on startup

From
Bruce Momjian
Date:
On Thu, Jan  5, 2012 at 09:23:55AM -0500, Ryan Kelly wrote:
> On Wed, Jan 04, 2012 at 09:36:57PM -0500, Tom Lane wrote:
> > "Ryan P. Kelly" <rpkelly22@gmail.com> writes:
> > > The signal handler installed by setup_cancel_handler() will ignore
> > > attempts to exit psql should a host be unreachable.
> >
> > Hm.  That may be worth doing something about, but the proposed patch
> > seems more like a quick-and-dirty hack than a solution.  The main
> > thing I don't like about it is that if we care about this scenario
> > during initial startup, we should also care about it during a \c
> > command, but just delaying the installation of the signal handler
> > won't fix that case.
> Sure, but do you still think moving the signal handler down is correct?
> I've attached another patch that handles interrupting \c. I also noticed
> there appears to be what looks like duplication of code in startup.c and
> command.c in main() and do_connect() respectively. I'm wondering if I
> they should be made to share the same code which (with my patch) would
> then mean the signal handler could be left where it was.
>
> > More generally, what if the server becomes unreachable mid-session?
> > I'm not sure how much there is to be done about that case, since
> > there's probably no good way to distinguish it from a query that
> > takes a really long time.  But if we're going to think about this,
> > we might as well think about all the cases.
> Well in this case we can still interrupt the query, no? This will drop
> us back into the prompt, where we can safely \q.

Tom, can you comment on this patch because you commented on the previous
version?  Thanks.

---------------------------------------------------------------------------


> >From 8f9d0b5088021d944aceac65a96d7bd0c24aa0c6 Mon Sep 17 00:00:00 2001
> From: "Ryan P. Kelly" <rpkelly22@gmail.com>
> Date: Thu, 5 Jan 2012 09:13:38 -0500
> Subject: [PATCH] Allow interrupting hanging \c connection attempts
>
> ---
>  src/bin/psql/command.c |   17 +++++++++++++----
>  1 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
> index 69fac83..845705d 100644
> --- a/src/bin/psql/command.c
> +++ b/src/bin/psql/command.c
> @@ -1516,7 +1516,7 @@ static bool
>  do_connect(char *dbname, char *user, char *host, char *port)
>  {
>      PGconn       *o_conn = pset.db,
> -               *n_conn;
> +               *n_conn = NULL;
>      char       *password = NULL;
>
>      if (!dbname)
> @@ -1570,7 +1570,13 @@ do_connect(char *dbname, char *user, char *host, char *port)
>          keywords[7] = NULL;
>          values[7] = NULL;
>
> -        n_conn = PQconnectdbParams(keywords, values, true);
> +        if (sigsetjmp(sigint_interrupt_jmp, 1) != 0) {
> +            /* got here with longjmp */
> +        } else {
> +            sigint_interrupt_enabled = true;
> +            n_conn = PQconnectdbParams(keywords, values, true);
> +            sigint_interrupt_enabled = false;
> +        }
>
>          free(keywords);
>          free(values);
> @@ -1600,7 +1606,8 @@ do_connect(char *dbname, char *user, char *host, char *port)
>           */
>          if (pset.cur_cmd_interactive)
>          {
> -            psql_error("%s", PQerrorMessage(n_conn));
> +            if (n_conn)
> +                psql_error("%s", PQerrorMessage(n_conn));
>
>              /* pset.db is left unmodified */
>              if (o_conn)
> @@ -1608,7 +1615,9 @@ do_connect(char *dbname, char *user, char *host, char *port)
>          }
>          else
>          {
> -            psql_error("\\connect: %s", PQerrorMessage(n_conn));
> +            if (n_conn)
> +                psql_error("\\connect: %s", PQerrorMessage(n_conn));
> +
>              if (o_conn)
>              {
>                  PQfinish(o_conn);
> --
> 1.7.7.4
>

>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs


--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Re: [PATCH] Prevent hanging on unreachable hosts on startup

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom, can you comment on this patch because you commented on the previous
> version?  Thanks.

Doesn't that provoke a pile of compiler warnings about local variables
potentially being altered during longjmp?  It sure looks pretty unsafe
from here.  It also fails to print any error message if a connection
attempt is canceled.

A bigger-picture question is just how safe it is to longjmp out of libpq
partway through a connection attempt.  At the very least, that's likely
to leak an open socket and some memory.  Maybe that doesn't matter too
much for psql, but I'm not convinced.  It seems fairly likely that
internal state for libc (malloc), openssl, etc, could be left in a
corrupt state if the interrupt happens at just the wrong time.  (This
didn't matter so much for the original proposal, where we'd just have
been exiting psql anyway on failure.  But if it's in \c then we want
psql to still be operational after a failure.)

We could perhaps dodge the longjmp safety question by using a
PQconnectStartParams/PQconnectPoll loop.  I'm not sure that provides
100% the same functionality --- in particular, I think it wouldn't allow
interrupting out of a DNS name resolution delay.  But it'd be less
likely to result in strange crashes.

            regards, tom lane

Re: [PATCH] Prevent hanging on unreachable hosts on startup

From
Ryan Kelly
Date:
On Mon, Aug 27, 2012 at 09:34:48PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom, can you comment on this patch because you commented on the previous
> > version?  Thanks.
>
> Doesn't that provoke a pile of compiler warnings about local variables
> potentially being altered during longjmp?  It sure looks pretty unsafe
> from here.  It also fails to print any error message if a connection
> attempt is canceled.
>
> A bigger-picture question is just how safe it is to longjmp out of libpq
> partway through a connection attempt.  At the very least, that's likely
> to leak an open socket and some memory.  Maybe that doesn't matter too
> much for psql, but I'm not convinced.  It seems fairly likely that
> internal state for libc (malloc), openssl, etc, could be left in a
> corrupt state if the interrupt happens at just the wrong time.  (This
> didn't matter so much for the original proposal, where we'd just have
> been exiting psql anyway on failure.  But if it's in \c then we want
> psql to still be operational after a failure.)
>
> We could perhaps dodge the longjmp safety question by using a
> PQconnectStartParams/PQconnectPoll loop.  I'm not sure that provides
> 100% the same functionality --- in particular, I think it wouldn't allow
> interrupting out of a DNS name resolution delay.  But it'd be less
> likely to result in strange crashes.
>
>             regards, tom lane

The latest on this patch can be found here:
http://archives.postgresql.org/message-id/500151C4.5010509@enterprisedb.com

-Ryan Kelly