Thread: Fallout from PQhost() semantics changes

Fallout from PQhost() semantics changes

From
Tom Lane
Date:
Traditionally (prior to v10), PQhost() returned the "host" connection
parameter if that was nonempty, otherwise the default host name
(DEFAULT_PGSOCKET_DIR or "localhost" depending on platform).

That got whacked around to a state of brokenness in v10 (which I'll return
to in a bit), and then commit 1944cdc98 fixed it to return the active
host's connhost[].host string if nonempty, else the connhost[].hostaddr
string if nonempty, else an empty string.  Together with the fact that the
default host name gets inserted into connhost[].host if neither option is
supplied, that's compatible with the traditional behavior when host is
supplied or when both options are omitted.  It's not the same when only
hostaddr is supplied.  This change is generally a good thing: returning
the default host name is pretty misleading if hostaddr actually points at
some remote server.  However, it seems that insufficient attention was
paid to whether *every* call site is OK with it.

In particular, libpq has several internal calls to PQhost() to get the
host name to be compared to a server SSL certificate, or for comparable
usages in GSS and SSPI authentication.  These changes mean that sometimes
we will be comparing the server's numeric address, not its hostname,
to the server auth information.  I do not think that was the intention;
it's certainly in direct contradiction to our documentation, which clearly
says that the host name parameter and nothing else is used for this
purpose.  It's not clear to me if this could amount to a security problem,
but at the least it's wrongly documented.

What I think we should do about it is change those internal calls to
fetch connhost[].host directly instead of going through PQhost(), as
in the attached libpq-internal-PQhost-usage-1.patch.  This will restore
the semantics to what they were pre-v10, including erroring out when
hostaddr is supplied without host.

I also noted that psql's \conninfo code takes it upon itself to substitute
the value of the hostaddr parameter, if used, for the result of PQhost().
This is entirely wrong/unhelpful if multiple host targets were specified;
moreover, that patch failed to account for the very similar connection
info printout in do_connect().  Given the change in PQhost's behavior
I think it'd be fine to just drop that complexity and print PQhost's
result without any editorialization, as in the attached
psql-conninfo-PQhost-usage-1.patch.

I would also like to make the case for back-patching 1944cdc98 into v10.
I'm not sure why that wasn't done to begin with, because v10's PQhost()
is just completely broken for cases involving a hostaddr specification:

    if (!conn)
        return NULL;
    if (conn->connhost != NULL &&
        conn->connhost[conn->whichhost].type != CHT_HOST_ADDRESS)
        return conn->connhost[conn->whichhost].host;
    else if (conn->pghost != NULL && conn->pghost[0] != '\0')
        return conn->pghost;
    else
    {
#ifdef HAVE_UNIX_SOCKETS
        return DEFAULT_PGSOCKET_DIR;
#else
        return DefaultHost;
#endif
    }

In the CHT_HOST_ADDRESS case, it will either give back the raw host
parameter (again, wrong if multiple hosts are targeted) or give back
DEFAULT_PGSOCKET_DIR/DefaultHost if the host wasn't specified.
Ignoring the brokenness for multiple target hosts, you could argue
that that's compatible with pre-v10 behavior ... but it's still pretty
misleading to give back DefaultHost, much less DEFAULT_PGSOCKET_DIR,
for a remote connection.  (There's at least some chance that the
hostaddr is actually 127.0.0.1 or ::1.  There is no chance that
DEFAULT_PGSOCKET_DIR is an appropriate description.)

Given that we whacked around v10 libpq's behavior for some related corner
cases earlier this week, I think it'd be OK to change this in v10.
If we do, it'd make sense to back-patch psql-conninfo-PQhost-usage-1.patch
into v10 as well.  I think that libpq-internal-PQhost-usage-1.patch should
be back-patched to v10 in any case, since whether or not you want to live
with the existing behavior of PQhost() in v10, it's surely not appropriate
for comparing to server SSL certificates.

In fact, I think there's probably a good case for doing something
comparable to libpq-internal-PQhost-usage-1.patch all the way back.
In exactly what scenario is it sane to be comparing "/tmp" or
"localhost" to a server's SSL certificate?

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 3b2073a..09012c5 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -199,7 +199,7 @@ pg_GSS_startup(PGconn *conn, int payloadlen)
                 min_stat;
     int            maxlen;
     gss_buffer_desc temp_gbuf;
-    char       *host = PQhost(conn);
+    char       *host = conn->connhost[conn->whichhost].host;

     if (!(host && host[0] != '\0'))
     {
@@ -414,7 +414,7 @@ pg_SSPI_startup(PGconn *conn, int use_negotiate, int payloadlen)
 {
     SECURITY_STATUS r;
     TimeStamp    expire;
-    char       *host = PQhost(conn);
+    char       *host = conn->connhost[conn->whichhost].host;

     if (conn->sspictx)
     {
diff --git a/src/interfaces/libpq/fe-secure-common.c b/src/interfaces/libpq/fe-secure-common.c
index 40203f3..b3f580f 100644
--- a/src/interfaces/libpq/fe-secure-common.c
+++ b/src/interfaces/libpq/fe-secure-common.c
@@ -88,10 +88,17 @@ pq_verify_peer_name_matches_certificate_name(PGconn *conn,
 {
     char       *name;
     int            result;
-    char       *host = PQhost(conn);
+    char       *host = conn->connhost[conn->whichhost].host;

     *store_name = NULL;

+    if (!(host && host[0] != '\0'))
+    {
+        printfPQExpBuffer(&conn->errorMessage,
+                          libpq_gettext("host name must be specified\n"));
+        return -1;
+    }
+
     /*
      * There is no guarantee the string returned from the certificate is
      * NULL-terminated, so make a copy that is.
@@ -145,7 +152,7 @@ pq_verify_peer_name_matches_certificate_name(PGconn *conn,
 bool
 pq_verify_peer_name_matches_certificate(PGconn *conn)
 {
-    char       *host = PQhost(conn);
+    char       *host = conn->connhost[conn->whichhost].host;
     int            rc;
     int            names_examined = 0;
     char       *first_name = NULL;
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index f82f361..5b4d54a 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -595,25 +595,7 @@ exec_command_conninfo(PsqlScanState scan_state, bool active_branch)
             printf(_("You are currently not connected to a database.\n"));
         else
         {
-            char       *host;
-            PQconninfoOption *connOptions;
-            PQconninfoOption *option;
-
-            host = PQhost(pset.db);
-            /* A usable "hostaddr" overrides the basic sense of host. */
-            connOptions = PQconninfo(pset.db);
-            if (connOptions == NULL)
-            {
-                psql_error("out of memory\n");
-                exit(EXIT_FAILURE);
-            }
-            for (option = connOptions; option && option->keyword; option++)
-                if (strcmp(option->keyword, "hostaddr") == 0)
-                {
-                    if (option->val != NULL && option->val[0] != '\0')
-                        host = option->val;
-                    break;
-                }
+            char       *host = PQhost(pset.db);

             /* If the host is an absolute path, the connection is via socket */
             if (is_absolute_path(host))
@@ -623,8 +605,6 @@ exec_command_conninfo(PsqlScanState scan_state, bool active_branch)
                 printf(_("You are connected to database \"%s\" as user \"%s\" on host \"%s\" at port \"%s\".\n"),
                        db, PQuser(pset.db), host, PQport(pset.db));
             printSSLInfo();
-
-            PQconninfoFree(connOptions);
         }
     }


Re: Fallout from PQhost() semantics changes

From
Haribabu Kommi
Date:
On Fri, Aug 3, 2018 at 2:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Traditionally (prior to v10), PQhost() returned the "host" connection
parameter if that was nonempty, otherwise the default host name
(DEFAULT_PGSOCKET_DIR or "localhost" depending on platform).

That got whacked around to a state of brokenness in v10 (which I'll return
to in a bit), and then commit 1944cdc98 fixed it to return the active
host's connhost[].host string if nonempty, else the connhost[].hostaddr
string if nonempty, else an empty string.  Together with the fact that the
default host name gets inserted into connhost[].host if neither option is
supplied, that's compatible with the traditional behavior when host is
supplied or when both options are omitted.  It's not the same when only
hostaddr is supplied.  This change is generally a good thing: returning
the default host name is pretty misleading if hostaddr actually points at
some remote server.  However, it seems that insufficient attention was
paid to whether *every* call site is OK with it.

Thanks for finding out the problem. I didn't give close attention to the callers
of the PQhost() function if it returns a hostaddress.
 
In particular, libpq has several internal calls to PQhost() to get the
host name to be compared to a server SSL certificate, or for comparable
usages in GSS and SSPI authentication.  These changes mean that sometimes
we will be comparing the server's numeric address, not its hostname,
to the server auth information.  I do not think that was the intention;
it's certainly in direct contradiction to our documentation, which clearly
says that the host name parameter and nothing else is used for this
purpose.  It's not clear to me if this could amount to a security problem,
but at the least it's wrongly documented.

What I think we should do about it is change those internal calls to
fetch connhost[].host directly instead of going through PQhost(), as
in the attached libpq-internal-PQhost-usage-1.patch.  This will restore
the semantics to what they were pre-v10, including erroring out when
hostaddr is supplied without host.

The Attached patch is good and I also verified that it is not missed anymore
places that needs only a host.

I also noted that psql's \conninfo code takes it upon itself to substitute
the value of the hostaddr parameter, if used, for the result of PQhost().
This is entirely wrong/unhelpful if multiple host targets were specified;
moreover, that patch failed to account for the very similar connection
info printout in do_connect().  Given the change in PQhost's behavior
I think it'd be fine to just drop that complexity and print PQhost's
result without any editorialization, as in the attached
psql-conninfo-PQhost-usage-1.patch.

I applied and tested this patch and it works fine.
 
I would also like to make the case for back-patching 1944cdc98 into v10.
I'm not sure why that wasn't done to begin with, because v10's PQhost()
is just completely broken for cases involving a hostaddr specification:

    if (!conn)
        return NULL;
    if (conn->connhost != NULL &&
        conn->connhost[conn->whichhost].type != CHT_HOST_ADDRESS)
        return conn->connhost[conn->whichhost].host;
    else if (conn->pghost != NULL && conn->pghost[0] != '\0')
        return conn->pghost;
    else
    {
#ifdef HAVE_UNIX_SOCKETS
        return DEFAULT_PGSOCKET_DIR;
#else
        return DefaultHost;
#endif
    }

In the CHT_HOST_ADDRESS case, it will either give back the raw host
parameter (again, wrong if multiple hosts are targeted) or give back
DEFAULT_PGSOCKET_DIR/DefaultHost if the host wasn't specified.
Ignoring the brokenness for multiple target hosts, you could argue
that that's compatible with pre-v10 behavior ... but it's still pretty
misleading to give back DefaultHost, much less DEFAULT_PGSOCKET_DIR,
for a remote connection.  (There's at least some chance that the
hostaddr is actually 127.0.0.1 or ::1.  There is no chance that
DEFAULT_PGSOCKET_DIR is an appropriate description.)

Given that we whacked around v10 libpq's behavior for some related corner
cases earlier this week, I think it'd be OK to change this in v10.
If we do, it'd make sense to back-patch psql-conninfo-PQhost-usage-1.patch
into v10 as well.  I think that libpq-internal-PQhost-usage-1.patch should
be back-patched to v10 in any case, since whether or not you want to live
with the existing behavior of PQhost() in v10, it's surely not appropriate
for comparing to server SSL certificates.

I agree to back-patching the commit 1944cdc98 into v10, because the problems
of libpq-internal-PQhost-usage-1.patch fix are present in v10 when the
connected host is of CHT_HOST_ADDRESS.

In fact, I think there's probably a good case for doing something
comparable to libpq-internal-PQhost-usage-1.patch all the way back.
In exactly what scenario is it sane to be comparing "/tmp" or
"localhost" to a server's SSL certificate?

Yes, I agree that this problem present from a long, but may be till now
everyone using along with host only?

Regards,
Haribabu Kommi
Fujitsu Australia

Re: Fallout from PQhost() semantics changes

From
Tom Lane
Date:
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> On Fri, Aug 3, 2018 at 2:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I think we should do about it is change those internal calls to
>> fetch connhost[].host directly instead of going through PQhost(), as
>> in the attached libpq-internal-PQhost-usage-1.patch.  This will restore
>> the semantics to what they were pre-v10, including erroring out when
>> hostaddr is supplied without host.

> The Attached patch is good and I also verified that it is not missed anymore
> places that needs only a host.

Thanks for reviewing!

>> In fact, I think there's probably a good case for doing something
>> comparable to libpq-internal-PQhost-usage-1.patch all the way back.
>> In exactly what scenario is it sane to be comparing "/tmp" or
>> "localhost" to a server's SSL certificate?

> Yes, I agree that this problem present from a long, but may be till now
> everyone using along with host only?

Yeah, after thinking some more I'm not excited about changing this
pre-v10.  There haven't been any field complaints, and there's also
a technical problem: the older versions didn't insert the default host
value (DEFAULT_PGSOCKET_DIR/DefaultHost) into the data structure but
just substituted it on-the-fly.  So these call sites in fe-auth.c etc
would have to do that too, which seems messy and error-prone.

You could argue perhaps that we shouldn't be making that substitution
at all for fe-auth.c's purposes, but I think it's all right when both
host and hostaddr have been omitted.  The default value then accurately
describes where we're connecting, and it's the same as the behavior
you'd get if you'd written out "host=/tmp" or "host=localhost" explicitly.
So I don't want to touch that behavior --- and indeed, I imagine that the
reason these call sites are using PQhost() is exactly that we wanted that
substitution to happen for these purposes.

Now that the default does get injected into the data structure, but only
when hostaddr isn't supplied, looking directly at the .host field does
exactly what we need in these places.  But we'd need a bit of logic to get
comparable behavior pre-v10, and I don't think it's worth messing with.

            regards, tom lane