Thread: Crash on attempt to connect to nonstarted server
I get a crash on win32 when connecting to a server that's not started.
In fe-connect.c, we have:

    display_host_addr = (conn->pghostaddr == NULL) &&
        (strcmp(conn->pghost, host_addr) != 0);

In my case, conn->pghost is NULL at this point, as is
conn->pghostaddr. Thus, it crashes in strcmp().

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Magnus Hagander wrote:
> I get a crash on win32 when connecting to a server that's not started.
> In fe-connect.c, we have:
>
> display_host_addr = (conn->pghostaddr == NULL) &&
> (strcmp(conn->pghost, host_addr) != 0);
>
> In my case, conn->pghost is NULL at this point, as is
> conn->pghostaddr. Thus, it crashes in strcmp().

I have researched this with Magnus, and was able to reproduce the
failure. It happens only on Win32 because that is missing unix-domain
sockets so "" maps to localhost, which is an IP address.

I have applied the attached patch. The new output is:

    $ psql test
    psql: could not connect to server: Connection refused
            Is the server running on host "???" and accepting
            TCP/IP connections on port 5432?

Note the "???". This happens because the mapping of "" to localhost
happens below the libpq library variable level.

-- 
  Bruce Momjian <bruce@momjian.us> http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index b1523a6..8d9400b 100644
*** /tmp/pgrevert.7311/PXMjec_fe-connect.c Thu Dec 16 08:36:11 2010
--- src/interfaces/libpq/fe-connect.c Thu Dec 16 08:31:51 2010
*************** connectFailureMessage(PGconn *conn, int
*** 1031,1037 ****
              strcpy(host_addr, "???");

          display_host_addr = (conn->pghostaddr == NULL) &&
!             (strcmp(conn->pghost, host_addr) != 0);

          appendPQExpBuffer(&conn->errorMessage,
              libpq_gettext("could not connect to server: %s\n"
--- 1031,1038 ----
              strcpy(host_addr, "???");

          display_host_addr = (conn->pghostaddr == NULL) &&
!             (conn->pghost != NULL) &&
!             (strcmp(conn->pghost, host_addr) != 0);

          appendPQExpBuffer(&conn->errorMessage,
              libpq_gettext("could not connect to server: %s\n"
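For readers following along, the guard added by the patch can be sketched in isolation. This is only an illustration: struct conn_sketch and should_display_host_addr() are hypothetical names invented here, standing in for the two PGconn fields and the expression in connectFailureMessage(); they are not part of libpq.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the two PGconn fields discussed in this
 * thread.  This is NOT libpq's real struct, only a sketch.
 */
struct conn_sketch
{
    const char *pghost;     /* host name given by the user, or NULL */
    const char *pghostaddr; /* numeric address given by the user, or NULL */
};

/*
 * Mirrors the patched expression: the resolved address is shown only when
 * the user gave no hostaddr, did give a host name, and that name differs
 * from the resolved address.  Because && short-circuits, strcmp() is
 * never reached with a NULL pghost.
 */
static bool
should_display_host_addr(const struct conn_sketch *conn, const char *host_addr)
{
    return conn->pghostaddr == NULL &&
           conn->pghost != NULL &&
           strcmp(conn->pghost, host_addr) != 0;
}
```

With both fields NULL, the case from the original report, the function returns false without calling strcmp(). Passing NULL to strcmp() is undefined behavior in C, which is why the unpatched expression crashed.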
Magnus Hagander <magnus@hagander.net> writes:
> I get a crash on win32 when connecting to a server that's not started.
> In fe-connect.c, we have:

> display_host_addr = (conn->pghostaddr == NULL) &&
> (strcmp(conn->pghost, host_addr) != 0);

> In my case, conn->pghost is NULL at this point, as is
> conn->pghostaddr. Thus, it crashes in strcmp().

[ scratches head... ] I seem to remember having decided that patch was
OK because what was there before already assumed conn->pghost would be
set. Under exactly what conditions could we get this far with neither
field being set?

			regards, tom lane
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > I get a crash on win32 when connecting to a server that's not started.
> > In fe-connect.c, we have:
>
> > display_host_addr = (conn->pghostaddr == NULL) &&
> > (strcmp(conn->pghost, host_addr) != 0);
>
> > In my case, conn->pghost is NULL at this point, as is
> > conn->pghostaddr. Thus, it crashes in strcmp().
>
> [ scratches head... ] I seem to remember having decided that patch was
> OK because what was there before already assumed conn->pghost would be
> set. Under exactly what conditions could we get this far with neither
> field being set?

OK, sure, I can explain. What happens in libpq is that when no host
name is supplied, you get a default. On Unix, that is unix-domain
sockets, but on Win32, that is localhost, meaning IP.

The problem is that "" is mapped to localhost in connectDBStart(),
specifically here:

    #ifdef HAVE_UNIX_SOCKETS
            /* pghostaddr and pghost are NULL, so use Unix domain socket */
            node = NULL;
            hint.ai_family = AF_UNIX;
            UNIXSOCK_PATH(portstr, portnum, conn->pgunixsocket);
    #else
            /* Without Unix sockets, default to localhost instead */
            node = "localhost";
            hint.ai_family = AF_UNSPEC;
    #endif   /* HAVE_UNIX_SOCKETS */

The problem is that this is setting up the pg_getaddrinfo_all() call,
and is _not_ setting any of the libpq variables that we actually test in
the error message section that had the bug.

The 9.0 code has a convoluted test in the appendPQExpBuffer statement:

    appendPQExpBuffer(&conn->errorMessage,
        libpq_gettext("could not connect to server: %s\n"
            "\tIs the server running on host \"%s\" and accepting\n"
            "\tTCP/IP connections on port %s?\n"),
        SOCK_STRERROR(errorno, sebuf, sizeof(sebuf)),
        conn->pghostaddr
        ? conn->pghostaddr
        : (conn->pghost
           ? conn->pghost
           : "???"),
        conn->pgport);

but it clearly expects that either or both could be NULL. That code is
actually still in appendPQExpBuffer() in git master.

-- 
  Bruce Momjian <bruce@momjian.us> http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
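
The gap described in the last message, where the lookup gets a default node but the connection fields stay NULL, can be sketched as follows. This is a simplified illustration under stated assumptions: struct conn_sketch, lookup_node() and host_for_message() are hypothetical names invented here, not libpq functions, and the default node is passed in as a parameter instead of being selected by HAVE_UNIX_SOCKETS.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant PGconn fields; not libpq's struct. */
struct conn_sketch
{
    const char *pghost;     /* host name given by the user, or NULL */
    const char *pghostaddr; /* numeric address given by the user, or NULL */
};

/*
 * Name handed to the address lookup.  When the user supplied nothing, a
 * default is returned (for example "localhost" on a build without Unix
 * sockets), but it is only returned to the caller: conn->pghost and
 * conn->pghostaddr are left untouched.
 */
static const char *
lookup_node(const struct conn_sketch *conn, const char *default_node)
{
    if (conn->pghostaddr != NULL)
        return conn->pghostaddr;
    if (conn->pghost != NULL)
        return conn->pghost;
    return default_node;
}

/*
 * Same fallback chain as the 9.0 error message quoted above: prefer
 * pghostaddr, then pghost, and print "???" when both are NULL.
 */
static const char *
host_for_message(const struct conn_sketch *conn)
{
    return conn->pghostaddr ? conn->pghostaddr
         : (conn->pghost ? conn->pghost : "???");
}
```

With both fields NULL and "localhost" passed as the default, lookup_node() yields "localhost" while host_for_message() yields "???", which matches the psql output shown earlier in the thread.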