Thread: Crash on attempt to connect to nonstarted server

Crash on attempt to connect to nonstarted server

From
Magnus Hagander
Date:
I get a crash on win32 when connecting to a server that's not started.
In fe-connect.c, we have:
    display_host_addr = (conn->pghostaddr == NULL) &&        (strcmp(conn->pghost, host_addr) != 0);

In my case, conn->pghost is NULL at this point, as is
conn->pghostaddr. Thus, it crashes in strcmp().

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Crash on attempt to connect to nonstarted server

From
Bruce Momjian
Date:
Magnus Hagander wrote:
> I get a crash on win32 when connecting to a server that's not started.
> In fe-connect.c, we have:
>
>         display_host_addr = (conn->pghostaddr == NULL) &&
>             (strcmp(conn->pghost, host_addr) != 0);
>
> In my case, conn->pghost is NULL at this point, as is
> conn->pghostaddr. Thus, it crashes in strcmp().

I have researched this with Magnus, and was able to reproduce the
failure.  It happens only on Win32 because that is missing unix-domain
sockets so "" maps to localhost, which is an IP address.  I have applied
the attached patch.  The new output is:

    $ psql test
    psql: could not connect to server: Connection refused
        Is the server running on host "???" and accepting
        TCP/IP connections on port 5432?

Note the "???".  This happens because the mapping of "" to localhost
happens below the libpq library variable level.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index b1523a6..8d9400b 100644
*** /tmp/pgrevert.7311/PXMjec_fe-connect.c    Thu Dec 16 08:36:11 2010
--- src/interfaces/libpq/fe-connect.c    Thu Dec 16 08:31:51 2010
*************** connectFailureMessage(PGconn *conn, int
*** 1031,1037 ****
              strcpy(host_addr, "???");

          display_host_addr = (conn->pghostaddr == NULL) &&
!             (strcmp(conn->pghost, host_addr) != 0);

          appendPQExpBuffer(&conn->errorMessage,
                            libpq_gettext("could not connect to server: %s\n"
--- 1031,1038 ----
              strcpy(host_addr, "???");

          display_host_addr = (conn->pghostaddr == NULL) &&
!                             (conn->pghost != NULL) &&
!                             (strcmp(conn->pghost, host_addr) != 0);

          appendPQExpBuffer(&conn->errorMessage,
                            libpq_gettext("could not connect to server: %s\n"

Re: Crash on attempt to connect to nonstarted server

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> I get a crash on win32 when connecting to a server that's not started.
> In fe-connect.c, we have:

>         display_host_addr = (conn->pghostaddr == NULL) &&
>             (strcmp(conn->pghost, host_addr) != 0);

> In my case, conn->pghost is NULL at this point, as is
> conn->pghostaddr. Thus, it crashes in strcmp().

[ scratches head... ]  I seem to remember having decided that patch was
OK because what was there before already assumed conn->pghost would be
set.  Under exactly what conditions could we get this far with neither
field being set?
        regards, tom lane


Re: Crash on attempt to connect to nonstarted server

From
Bruce Momjian
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > I get a crash on win32 when connecting to a server that's not started.
> > In fe-connect.c, we have:
> 
> >         display_host_addr = (conn->pghostaddr == NULL) &&
> >             (strcmp(conn->pghost, host_addr) != 0);
> 
> > In my case, conn->pghost is NULL at this point, as is
> > conn->pghostaddr. Thus, it crashes in strcmp().
> 
> [ scratches head... ]  I seem to remember having decided that patch was
> OK because what was there before already assumed conn->pghost would be
> set.  Under exactly what conditions could we get this far with neither
> field being set?

OK, sure, I can explain.  What happens in libpq is that when no host
name is supplied, you get a default.  On Unix, that is unix-domain
sockets, but on Win32, that is localhost, meaning IP.

The problem is that the mapping of "" maps to localhost in
connectDBStart(), specificially here:
#ifdef HAVE_UNIX_SOCKETS        /* pghostaddr and pghost are NULL, so use Unix domain socket */        node = NULL;
  hint.ai_family = AF_UNIX;        UNIXSOCK_PATH(portstr, portnum, conn->pgunixsocket);#else        /* Without Unix
sockets,default to localhost instead */        node = "localhost";        hint.ai_family = AF_UNSPEC;#endif   /*
HAVE_UNIX_SOCKETS*/
 

The problem is that this is setting up the pg_getaddrinfo_all() call,
and is _not_ setting any of the libpq variables that we actually test in
the error message section that had the bug.

The 9.0 code has a convoluted test in the appendPQExpBuffer statement:
       appendPQExpBuffer(&conn->errorMessage,                         libpq_gettext("could not connect to server: %s\n"
                  "\tIs the server running on host \"%s\" and accepting\n"
"\tTCP/IPconnections on port %s?\n"),                         SOCK_STRERROR(errorno, sebuf, sizeof(sebuf)),
           conn->pghostaddr                         ? conn->pghostaddr                         : (conn->pghost
                 ? conn->pghost                            : "???"),                         conn->pgport);
 

but it clearly expects either or both could be NULL.  That code is
actually still in appendPQExpBuffer() in git master.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +