Thread: PQHost() undefined behavior if connecting string contains both hostand hostaddr types


While working on [1], we find out the inconsistency in PQHost() behavior
if the connecting string that is passed to connect to the server contains
multiple hosts with both host and hostaddr types. For example,

host=host1,host2 hostaddr=127.0.0.1,127.0.0.1 port=5434,5432

As the hostaddr is given preference when both host and hostaddr is 
specified, so the connection type for both addresses of the above
conninfo is CHT_HOST_ADDRESS. So the PQhost() returns the
conn->pghost value i.e "host1,host2" instead of the actual host that
is connected.

Instead of checking the connection type while returning the host
details, it should check whether the host is NULL or not? with this
change it returns the expected value for all the connection types.

Attachment
On Sun, Jan 14, 2018 at 02:19:26PM +1100, Haribabu Kommi wrote:
> While working on [1], we find out the inconsistency in PQHost() behavior
> if the connecting string that is passed to connect to the server contains
> multiple hosts with both host and hostaddr types. For example,
>
> host=host1,host2 hostaddr=127.0.0.1,127.0.0.1 port=5434,5432
>
> As the hostaddr is given preference when both host and hostaddr is
> specified, so the connection type for both addresses of the above
> conninfo is CHT_HOST_ADDRESS. So the PQhost() returns the
> conn->pghost value i.e "host1,host2" instead of the actual host that
> is connected.

During the discussion of adding the client-side connection parameters to
pg_stat_wal_receiver, which is useful when the client specifies multiple
hosts and ports, it has been discussed that introducing a new API in
libpq to get effective host, hostaddr and port values would be necessary
in order to get all the useful information wanted, however this has a
larger impact than initially thought as any user showing the host
information in psql's PROMPT would be equally confused. Any caller of
PQhost have the same problem.

> Instead of checking the connection type while returning the host
> details, it should check whether the host is NULL or not? with this
> change it returns the expected value for all the connection types.

I mentioned that on the other thread, but this seems like an improvement
to me, which leads to less confusion. See here for more details
regarding what we get today on HEAD:
https://www.postgresql.org/message-id/20180109011547.GE76418%40paquier.xyz
--
Michael

Attachment


On Sun, Jan 14, 2018 at 9:44 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Sun, Jan 14, 2018 at 02:19:26PM +1100, Haribabu Kommi wrote:
> While working on [1], we find out the inconsistency in PQHost() behavior
> if the connecting string that is passed to connect to the server contains
> multiple hosts with both host and hostaddr types. For example,
>
> host=host1,host2 hostaddr=127.0.0.1,127.0.0.1 port=5434,5432
>
> As the hostaddr is given preference when both host and hostaddr is
> specified, so the connection type for both addresses of the above
> conninfo is CHT_HOST_ADDRESS. So the PQhost() returns the
> conn->pghost value i.e "host1,host2" instead of the actual host that
> is connected.

During the discussion of adding the client-side connection parameters to
pg_stat_wal_receiver, which is useful when the client specifies multiple
hosts and ports, it has been discussed that introducing a new API in
libpq to get effective host, hostaddr and port values would be necessary
in order to get all the useful information wanted, however this has a
larger impact than initially thought as any user showing the host
information in psql's PROMPT would be equally confused. Any caller of
PQhost have the same problem.

> Instead of checking the connection type while returning the host
> details, it should check whether the host is NULL or not? with this
> change it returns the expected value for all the connection types.

I mentioned that on the other thread, but this seems like an improvement
to me, which leads to less confusion. See here for more details
regarding what we get today on HEAD:
https://www.postgresql.org/message-id/20180109011547.GE76418%40paquier.xyz


In the other thread of enhancing pg_stat_wal_receiver to display the remote
host, the details of why the hostaddr was added and reverted for and how it can be
changed now the PQhost() function to return the host and not the hostaddr is provided
in mail [1]. It will be useful to take some decision with the PQhost() function.

Added to the next open commitfest.



Regards,
Hari Babu
Fujitsu Australia
On 1/13/18 22:19, Haribabu Kommi wrote:
> While working on [1], we find out the inconsistency in PQHost() behavior
> if the connecting string that is passed to connect to the server contains
> multiple hosts with both host and hostaddr types. For example,
> 
> host=host1,host2 hostaddr=127.0.0.1,127.0.0.1 port=5434,5432
> 
> As the hostaddr is given preference when both host and hostaddr is 
> specified, so the connection type for both addresses of the above
> conninfo is CHT_HOST_ADDRESS. So the PQhost() returns the
> conn->pghost value i.e "host1,host2" instead of the actual host that
> is connected.
> 
> Instead of checking the connection type while returning the host
> details, it should check whether the host is NULL or not? with this
> change it returns the expected value for all the connection types.

I agree that something is wrong here.

It seems, however, that PGhost() has always been broken for hostaddr
use.  In 9.6 (before the multiple-hosts stuff was introduced), when
connecting to "hostaddr=127.0.0.1", PGhost() returns "/tmp".  Urgh.

I think we should decide what PGhost() is supposed to mean when hostaddr
is in use, and then make a fix for that consistently across all versions.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Fri, Mar 09, 2018 at 04:42:30PM -0500, Peter Eisentraut wrote:
> It seems, however, that PGhost() has always been broken for hostaddr
> use.  In 9.6 (before the multiple-hosts stuff was introduced), when
> connecting to "hostaddr=127.0.0.1", PGhost() returns "/tmp".  Urgh.
>
> I think we should decide what PGhost() is supposed to mean when hostaddr
> is in use, and then make a fix for that consistently across all versions.

Hm.  The only inconsistency I can see here is when "host" is not defined
but "hostaddr" is, so why not make PQhost return the value of "hostaddr"
in this case?
--
Michael

Attachment
I drifted to come here..

At Wed, 14 Mar 2018 11:17:35 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180314021735.GI1150@paquier.xyz>
> On Fri, Mar 09, 2018 at 04:42:30PM -0500, Peter Eisentraut wrote:
> > It seems, however, that PGhost() has always been broken for hostaddr
> > use.  In 9.6 (before the multiple-hosts stuff was introduced), when
> > connecting to "hostaddr=127.0.0.1", PGhost() returns "/tmp".  Urgh.
> > 
> > I think we should decide what PGhost() is supposed to mean when hostaddr
> > is in use, and then make a fix for that consistently across all versions.
> 
> Hm.  The only inconsistency I can see here is when "host" is not defined
> but "hostaddr" is, so why not make PQhost return the value of "hostaddr"
> in this case?

https://www.postgresql.org/message-id/19297.1493660213%40sss.pgh.pa.us
> And I believe we already considered and rejected the idea of having it
> return the hostaddr string, back in some of the older discussions.
> (We could revisit that decision, no doubt, but let's go back and see
> what the reasoning was first.)

But I haven't found where the decision made. I seems to be
related to commits commits 11003eb55 and 40cb21f70 or in older
discussion. I'm going to search for that for a while.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



At Fri, 16 Mar 2018 09:50:41 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20180316.095041.241173653.horiguchi.kyotaro@lab.ntt.co.jp>
> I drifted to come here..
> 
> At Wed, 14 Mar 2018 11:17:35 +0900, Michael Paquier <michael@paquier.xyz> wrote in
<20180314021735.GI1150@paquier.xyz>
> > On Fri, Mar 09, 2018 at 04:42:30PM -0500, Peter Eisentraut wrote:
> > > It seems, however, that PGhost() has always been broken for hostaddr
> > > use.  In 9.6 (before the multiple-hosts stuff was introduced), when
> > > connecting to "hostaddr=127.0.0.1", PGhost() returns "/tmp".  Urgh.
> > > 
> > > I think we should decide what PGhost() is supposed to mean when hostaddr
> > > is in use, and then make a fix for that consistently across all versions.
> > 
> > Hm.  The only inconsistency I can see here is when "host" is not defined
> > but "hostaddr" is, so why not make PQhost return the value of "hostaddr"
> > in this case?

I proposed that before and I strongly agree to that. I suppose
that the return value of PQhost is to used for the following
purpose.

- It is for authentication subject.
- It provides a string for a human to identify the connection peer.

So it is incovenient that PQhost returns pghost before trying
hostaddr. Attached patch does that based on Haribabu's patch at
the beginning of this thread.

================
> https://www.postgresql.org/message-id/19297.1493660213%40sss.pgh.pa.us
> > And I believe we already considered and rejected the idea of having it
> > return the hostaddr string, back in some of the older discussions.
> > (We could revisit that decision, no doubt, but let's go back and see
> > what the reasoning was first.)
> 
> But I'havent find where the decision made. I'm going to search
> for that for a while.

After all, I didn't find much..

I found this thread back to 2014.

https://www.postgresql.org/message-id/CAHGQGwHsnMxh97UdXH5uif8eitD0WXDK_PSb3tipaGzJJVsCMw@mail.gmail.com

It preserved the behavior that PQhost() returns '/tmp' for the
case under discussion intentionally.

After that, 274bb2b385 (mistakenly?) put priority on hostaddr to
host in connectOptions2() so 11003eb556 reverted PQhost to return
pghost if connhost[] is of CHT_HOST_ADDRESS.

The last commit also reverted it to return /tmp if pghost is not
specified.

https://www.postgresql.org/message-id/CA+Tgmob1Y1KyK_Twi9oF3tj4p3J4c5YoNtAUh7DiajpeWynuLg@mail.gmail.com

I agree to the conclusion that PQhost() shouldn't return hostaddr
"if it has any host name to return". But I still haven't found
the reason for returning '/tmp' for IP connection.

The attached patch is revised version of that in the following thread.

https://www.postgresql.org/message-id/CA%2BTgmoZ%2B9h%3DmiD4%2BwYc9QztezgtLfeA59XtxVAL0NUjvfwKmaA%40mail.gmail.com
> Kyotaro Horiguchi argues that the current behavior is "not useful" and
> that's probably true, but I don't think it's the job of an API to try
> as hard as possible to do something useful in every case.  It's more
> important that the behavior is predictable and understandable.  In
> short, if we're going to change the behavior of PQhost() here, then
> there should be a documentation change to go with it, because the
> current documentation around the interaction between host and hostaddr
> does not make it at all clear that the current behavior is wrong, at
> least not as far as I can see.  To the contrary, it suggests that if
> you use hostaddr without host, you may find the results surprising or
> even unfortunate:

I believe the behavior is not surprising and not a hard thing to do.

https://www.postgresql.org/message-id/20170510.153403.28896042.horiguchi.kyotaro%40lab.ntt.co.jp
> However, it might be a problem that the documentation doesn't
> mention what the returned value from PQhost is.

If it is the string that was given as host parameter, returning
"" instead of NULL might be reasonable. If it is any string that
points to the connecting host, I think that returning IP address
is not surprising at all.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 77eebb0ba1..8e450b21fe 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -6018,8 +6018,12 @@ PQhost(const PGconn *conn)
     if (!conn)
         return NULL;
     if (conn->connhost != NULL &&
-        conn->connhost[conn->whichhost].type != CHT_HOST_ADDRESS)
+        conn->connhost[conn->whichhost].host != NULL &&
+        conn->connhost[conn->whichhost].host[0] != '\0')
         return conn->connhost[conn->whichhost].host;
+    else if (conn->connhost[conn->whichhost].hostaddr != NULL &&
+             conn->connhost[conn->whichhost].hostaddr[0] != '\0')
+        return conn->connhost[conn->whichhost].hostaddr;
     else if (conn->pghost != NULL && conn->pghost[0] != '\0')
         return conn->pghost;
     else

On 3/16/18 00:03, Kyotaro HORIGUCHI wrote:
> I agree to the conclusion that PQhost() shouldn't return hostaddr
> "if it has any host name to return". But I still haven't found
> the reason for returning '/tmp' for IP connection.
> 
> The attached patch is revised version of that in the following thread.

That patch looks good to me.

Moreover, I wonder whether we shouldn't remove the branch where
conn->connhost is NULL.  When would that be the case?  The current
behavior is to sometimes return the actual host connected to, and
sometimes the host list.  That doesn't make sense.

I think we should also revert 64f86fb11e20b55fb742af72d55806f8bdd9cd2d,
in which psql's \conninfo digs out the raw hostaddr value to display,
which could contain a list of things.  I think \conninfo should just
display the value returned by PQhost(), and if we change PQhost() to
take hostaddr into account, then this should address the original complaint.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




On Wed, Mar 21, 2018 at 6:06 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 3/16/18 00:03, Kyotaro HORIGUCHI wrote:
> I agree to the conclusion that PQhost() shouldn't return hostaddr
> "if it has any host name to return". But I still haven't found
> the reason for returning '/tmp' for IP connection.
>
> The attached patch is revised version of that in the following thread.

That patch looks good to me.

Moreover, I wonder whether we shouldn't remove the branch where
conn->connhost is NULL.  When would that be the case?  The current
behavior is to sometimes return the actual host connected to, and
sometimes the host list.  That doesn't make sense.

Scenarios where the connection is not yet established, in that scenario
the PQhost() can return the provided connection host information.

Other than the above, it always returns the proper host details.
 
I think we should also revert 64f86fb11e20b55fb742af72d55806f8bdd9cd2d,
in which psql's \conninfo digs out the raw hostaddr value to display,
which could contain a list of things.  I think \conninfo should just
display the value returned by PQhost(), and if we change PQhost() to
take hostaddr into account, then this should address the original complaint.

As per my understanding of the commit 64f86fb11e20b55fb742af72d55806f8bdd9cd2d,
the hostaddr is given the preference while displaying instead of host.

Even with the above PQhost() patch, If user provides both host and hostaddr as options,
the PQhost() function returns host and not the hostaddr. In commit 7b02ba62, the support
of "Allow multiple hostaddrs to go with multiple hostnames".

If it is fine to display the host in combination of both host and hostaddr, then it is fine
remove the commit 64f86fb11e20b55fb742af72d55806f8bdd9cd2d.

Regards,
Hari Babu
Fujitsu Australia
On Wed, Mar 21, 2018 at 10:33:19AM +1100, Haribabu Kommi wrote:
> On Wed, Mar 21, 2018 at 6:06 AM, Peter Eisentraut <
> peter.eisentraut@2ndquadrant.com> wrote:
>
>> On 3/16/18 00:03, Kyotaro HORIGUCHI wrote:
>>> I agree to the conclusion that PQhost() shouldn't return hostaddr
>>> "if it has any host name to return". But I still haven't found
>>> the reason for returning '/tmp' for IP connection.
>>>
>>> The attached patch is revised version of that in the following thread.
>>
>> That patch looks good to me.

The case where we are complaining is this one.
"hostaddr=127.0.0.1,127.0.0.1 port=5432,5433"
This makes PQhost return "127.0.0.1" with the patch, and "local" on
HEAD.

>> Moreover, I wonder whether we shouldn't remove the branch where
>> conn->connhost is NULL.  When would that be the case?  The current
>> behavior is to sometimes return the actual host connected to, and
>> sometimes the host list.  That doesn't make sense.
>
> Scenarios where the connection is not yet established, in that scenario
> the PQhost() can return the provided connection host information.
>
> Other than the above, it always returns the proper host details.

That remark is from me upthread.  In the case of a non-established
connection, I think that we ought to return that.

> Even with the above PQhost() patch, If user provides both host and
> hostaddr as options, the PQhost() function returns host and not the
> hostaddr. In commit 7b02ba62, the support of "Allow multiple
> hostaddrs to go with multiple hostnames".

Sentence is unfinished here?

> If it is fine to display the host in combination of both host and
> hostaddr, then it is fine to remove the commit
> 64f86fb11e20b55fb742af72d55806f8bdd9cd2d.

Showing host when both host and hostaddr are present is more intuitive
in my opinion.
--
Michael

Attachment
On 3/21/18 03:40, Michael Paquier wrote:
>>> Moreover, I wonder whether we shouldn't remove the branch where
>>> conn->connhost is NULL.  When would that be the case?  The current
>>> behavior is to sometimes return the actual host connected to, and
>>> sometimes the host list.  That doesn't make sense.
>> Scenarios where the connection is not yet established, in that scenario
>> the PQhost() can return the provided connection host information.
>>
>> Other than the above, it always returns the proper host details.
> That remark is from me upthread.  In the case of a non-established
> connection, I think that we ought to return that.

So, if the connection object is NULL, PQhost() returns NULL.  While the
connection is being established (whatever that means), it returns
whatever was specified as host.  And when the connection is established,
it returns the host actually connected to.  That seems pretty crazy.  It
should do only one or the other.  Especially since there is, AFAICT, no
way to know at run time whether the value it returned just then is one
or the other.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Thu, Mar 22, 2018 at 12:28 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 3/21/18 03:40, Michael Paquier wrote:
>>> Moreover, I wonder whether we shouldn't remove the branch where
>>> conn->connhost is NULL.  When would that be the case?  The current
>>> behavior is to sometimes return the actual host connected to, and
>>> sometimes the host list.  That doesn't make sense.
>> Scenarios where the connection is not yet established, in that scenario
>> the PQhost() can return the provided connection host information.
>>
>> Other than the above, it always returns the proper host details.
> That remark is from me upthread.  In the case of a non-established
> connection, I think that we ought to return that.

So, if the connection object is NULL, PQhost() returns NULL.  While the
connection is being established (whatever that means), it returns
whatever was specified as host.  And when the connection is established,
it returns the host actually connected to.  That seems pretty crazy.  It
should do only one or the other.  Especially since there is, AFAICT, no
way to know at run time whether the value it returned just then is one
or the other.

OK.

Here I attached the updated patch that returns either the connected host/hostaddr
or NULL in case if the connection is not established.

I removed the returning default host details, because the default host
details are also available with the connhost member itself.

Regards,
Hari Babu
Fujitsu Australia
Attachment
On Sat, Mar 24, 2018 at 01:49:28AM +1100, Haribabu Kommi wrote:
> Here I attached the updated patch that returns either the connected
> host/hostaddr
> or NULL in case if the connection is not established.
>
> I removed the returning default host details, because the default host
> details are also available with the connhost member itself.

As shaped, PQhost generates complains from gcc with -Wreturn-type.  So I
would suggest to return NULL for the default code path.  As far as I can
see from the code, PGconn->connhost cannot be NULL and it should have
at least one "host" or "hostaddr" defined, so I think that we could
consider adding an assertion about that and comment out this cannot
normally be reached.

If the connection is bad, then whichhost points to 0, which would cause
PQhost to return the first host or hostaddr value.  I think that we
should document properly to not trust the value of PQhost if the status
is CONNECTION_BAD, or to return NULL if the connection is bad as this
would have no real value for multiple hosts.  I am a bit afraid of
potential breakages if we do that, so the first method may make the most
sense.  The same things apply to PQport as multiple ports can be
defined.  Thoughts?

I have quickly looked at the callers of PQhost in the core code and
those seem safe.  Something to keep in mind.

More details in the documentation would be nice.  Let's detail the
following:
- PQhost returns NULL if the connection is not established yet.
- PQhost may return an incorrect value when with CONNECTION_BAD as
status.
- If both hostaddr and host are precised in the conneciton string, then
host has the priority.
- If only hostaddr is precised, then this is the value returned.

I would really love to see this patch go into v11, and this could unlock
the other one you wrote for pg_stat_wal_receiver.

Thanks,
--
Michael

Attachment

On Sun, Mar 25, 2018 at 12:56 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Mar 24, 2018 at 01:49:28AM +1100, Haribabu Kommi wrote:
> Here I attached the updated patch that returns either the connected
> host/hostaddr
> or NULL in case if the connection is not established.
>
> I removed the returning default host details, because the default host
> details are also available with the connhost member itself.


Thanks for the review.
 
As shaped, PQhost generates complains from gcc with -Wreturn-type.  So I
would suggest to return NULL for the default code path.  As far as I can
see from the code, PGconn->connhost cannot be NULL and it should have
at least one "host" or "hostaddr" defined, so I think that we could
consider adding an assertion about that and comment out this cannot
normally be reached.

Ok. Added an assert with an explanation comment.
 
If the connection is bad, then whichhost points to 0, which would cause
PQhost to return the first host or hostaddr value.  I think that we
should document properly to not trust the value of PQhost if the status
is CONNECTION_BAD, or to return NULL if the connection is bad as this
would have no real value for multiple hosts.  I am a bit afraid of
potential breakages if we do that, so the first method may make the most
sense. 

The existing behavior is currently returning wrong value when the connection
status is CONNECTION_BAD. As we are changing the behavior of the function,
it may be better to handle the CONNECTION_BAD scenario also instead of
providing note in the manual?

 
The same things apply to PQport as multiple ports can be
defined.  Thoughts?

Yes, I changed PQport also to return the connected port or NULL,
Removed the returning of all the ports specified in the connection string.
 
I have quickly looked at the callers of PQhost in the core code and
those seem safe.  Something to keep in mind.

More details in the documentation would be nice.  Let's detail the
following:
- PQhost returns NULL if the connection is not established yet.
- PQhost may return an incorrect value when with CONNECTION_BAD as
status.
- If both hostaddr and host are precised in the conneciton string, then
host has the priority.
- If only hostaddr is precised, then this is the value returned.

Docs are updated with the new behavior of the functions.
 
Updated patch attached with behavior of returning NULL for connections of
CONNECTION_BAD status.
 
Regards,
Hari Babu
Fujitsu Australia
Attachment
At Sun, 25 Mar 2018 22:27:09 +1100, Haribabu Kommi <kommi.haribabu@gmail.com> wrote in
<CAJrrPGeaQxWsU3RZ0yxGa=WY+wnA+nvE_YPuKzBDaqRMUt9SyA@mail.gmail.com>
> On Sun, Mar 25, 2018 at 12:56 AM, Michael Paquier <michael@paquier.xyz>
> wrote:
> 
> > On Sat, Mar 24, 2018 at 01:49:28AM +1100, Haribabu Kommi wrote:
> > > Here I attached the updated patch that returns either the connected
> > > host/hostaddr
> > > or NULL in case if the connection is not established.
> > >
> > > I removed the returning default host details, because the default host
> > > details are also available with the connhost member itself.
> >
> >
> Thanks for the review.
> 
> 
> > As shaped, PQhost generates complains from gcc with -Wreturn-type.  So I
> > would suggest to return NULL for the default code path.  As far as I can
> > see from the code, PGconn->connhost cannot be NULL and it should have
> > at least one "host" or "hostaddr" defined, so I think that we could
> > consider adding an assertion about that and comment out this cannot
> > normally be reached.
> >
> 
> Ok. Added an assert with an explanation comment.

As the commit message of 50cb21f70, the function is intended not
to return NULL in order to prevent the user functions from a
crash in corner cases.

https://www.postgresql.org/message-id/19297.1493660213%40sss.pgh.pa.us

> > If the connection is bad, then whichhost points to 0, which would cause
> > PQhost to return the first host or hostaddr value.  I think that we
> > should document properly to not trust the value of PQhost if the status
> > is CONNECTION_BAD, or to return NULL if the connection is bad as this
> > would have no real value for multiple hosts.  I am a bit afraid of
> > potential breakages if we do that, so the first method may make the most
> > sense.
> 
> 
> The existing behavior is currently returning wrong value when the connection
> status is CONNECTION_BAD. As we are changing the behavior of the function,
> it may be better to handle the CONNECTION_BAD scenario also instead of
> providing note in the manual?

There's no reason to behave so only for the functions. PQdb() and
PQuser() returns names that is not actually connected. Since the
purpose of PGhost is not strictly defined, especially on
connection failure, returning the given host list can be another
candidate (and the same can be said for returning ""). I think
users shouldn't (and I believe no one does) rely on the values
from the functions when CONNECTION_BAD, anyway.

My opninon is to add a description that is saying like "these
functions return unreliable values for a failed connection".

> > The same things apply to PQport as multiple ports can be
> > defined.  Thoughts?
> >
> 
> Yes, I changed PQport also to return the connected port or NULL,
> Removed the returning of all the ports specified in the connection string.
> 
> 
> > I have quickly looked at the callers of PQhost in the core code and
> > those seem safe.  Something to keep in mind.
> >
> > More details in the documentation would be nice.  Let's detail the
> > following:
> > - PQhost returns NULL if the connection is not established yet.
> > - PQhost may return an incorrect value when with CONNECTION_BAD as
> > status.
> > - If both hostaddr and host are precised in the conneciton string, then
> > host has the priority.
> > - If only hostaddr is precised, then this is the value returned.
> >
> 
> Docs are updated with the new behavior of the functions.
> 
> Updated patch attached with behavior of returning NULL for connections of
> CONNECTION_BAD status.

The patch does Assert() in PQhost. I suppose that Assert() in
client library is usable only when no more (library's) operation
cannot continue. It would be better to return a fallback value in
this criteria.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Mon, Mar 26, 2018 at 11:28:41AM +0900, Kyotaro HORIGUCHI wrote:
> At Sun, 25 Mar 2018 22:27:09 +1100, Haribabu Kommi <kommi.haribabu@gmail.com> wrote in
<CAJrrPGeaQxWsU3RZ0yxGa=WY+wnA+nvE_YPuKzBDaqRMUt9SyA@mail.gmail.com>
>> On Sun, Mar 25, 2018 at 12:56 AM, Michael Paquier <michael@paquier.xyz>
>> wrote:
> As the commit message of 50cb21f70, the function is intended not
> to return NULL in order to prevent the user functions from a
> crash in corner cases.

The commit number is not correct here.  You mean 40cb21f.

> https://www.postgresql.org/message-id/19297.1493660213%40sss.pgh.pa.us

I quite like the idea of using an empty string value in those cases.
This could prevent crashes at leat for applications not doing NULL-ness
checks.

>> The existing behavior is currently returning wrong value when the connection
>> status is CONNECTION_BAD. As we are changing the behavior of the function,
>> it may be better to handle the CONNECTION_BAD scenario also instead of
>> providing note in the manual?
>
> There's no reason to behave so only for the functions. PQdb() and
> PQuser() returns names that is not actually connected.

For PQdb() and PQuser(), the interface is basicaly intuitive, because you
cannot specify multiple entries.  So if a client specifies a value, it
will be sure that the value returned is unique.

> Since the
> purpose of PGhost is not strictly defined, especially on
> connection failure, returning the given host list can be another
> candidate (and the same can be said for returning ""). I think
> users shouldn't (and I believe no one does) rely on the values
> from the functions when CONNECTION_BAD, anyway.

Yeah, this should really be documented and also should refer to the fact
that this happens when specifying multiple hosts.

> My opinion is to add a description that is saying like "these
> functions return unreliable values for a failed connection".

At the same time I don't think that this is sufficient either, because
for multiple hosts, PQhost() just returns the first one, which makes
absolutely no sense because the value is wrong.  So I think that using a
third, separate value has some advantages:
- If NULL, this just means that the initialization did not happen.
- If using an empty string, then the value cannot be evaluated.
- If this returns a host or hostaddr (if host has not been specified),
then that's the host which is actually used for the connection.
Having those three states has value for applications in my opinion.

The same can apply to PQport, or any other functions which for whatever
reason add support for multiple values like host, hostaddr or port.

>> Updated patch attached with behavior of returning NULL for connections of
>> CONNECTION_BAD status.
>
> The patch does Assert() in PQhost. I suppose that Assert() in
> client library is usable only when no more (library's) operation
> cannot continue. It would be better to return a fallback value in
> this criteria.

The patch has to return a value as well.  A shaped the patch causes
again compilation warnings because not all code paths have a return
value.  So my previous remark has not been fixed.  Hari, what do you use
as compiler, my gcc blows a warning and reading the patch that's
obviously incorrect.

+       PQHost returns NULL when the connection is not established or
In the docs, this is wrong for two reasons:
- There should be a <function> markup.
- The name of the function is PQhost, not PGHost.
--
Michael

Attachment


On Mon, Mar 26, 2018 at 4:17 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Mar 26, 2018 at 11:28:41AM +0900, Kyotaro HORIGUCHI wrote:
> At Sun, 25 Mar 2018 22:27:09 +1100, Haribabu Kommi <kommi.haribabu@gmail.com> wrote in <CAJrrPGeaQxWsU3RZ0yxGa=WY+wnA+nvE_YPuKzBDaqRMUt9SyA@mail.gmail.com>
>> On Sun, Mar 25, 2018 at 12:56 AM, Michael Paquier <michael@paquier.xyz>
>> wrote:

Thanks for the review.
 
> As the commit message of 50cb21f70, the function is intended not
> to return NULL in order to prevent the user functions from a
> crash in corner cases.

The commit number is not correct here.  You mean 40cb21f.

> https://www.postgresql.org/message-id/19297.1493660213%40sss.pgh.pa.us

I quite like the idea of using an empty string value in those cases.
This could prevent crashes at leat for applications not doing NULL-ness
checks.

I also agree to return an empty string. I did this only for the cases where the conn is
not NULL but the status is not proper or the connhost is NULL.
 
> Since the
> purpose of PGhost is not strictly defined, especially on
> connection failure, returning the given host list can be another
> candidate (and the same can be said for returning ""). I think
> users shouldn't (and I believe no one does) rely on the values
> from the functions when CONNECTION_BAD, anyway.

Yeah, this should really be documented and also should refer to the fact
that this happens when specifying multiple hosts.
 
Added.
 
> My opinion is to add a description that is saying like "these
> functions return unreliable values for a failed connection".

At the same time I don't think that this is sufficient either, because
for multiple hosts, PQhost() just returns the first one, which makes
absolutely no sense because the value is wrong.  So I think that using a
third, separate value has some advantages:
- If NULL, this just means that the initialization did not happen.
- If using an empty string, then the value cannot be evaluated.
- If this returns a host or hostaddr (if host has not been specified),
then that's the host which is actually used for the connection.
Having those three states has value for applications in my opinion.

The same can apply to PQport, or any other functions which for whatever
reason add support for multiple values like host, hostaddr or port.

I hope that I updated the documentation properly to explain all the above cases.
 
>> Updated patch attached with behavior of returning NULL for connections of
>> CONNECTION_BAD status.
>
> The patch does Assert() in PQhost. I suppose that Assert() in
> client library is usable only when no more (library's) operation
> cannot continue. It would be better to return a fallback value in
> this criteria.

The patch has to return a value as well.  A shaped the patch causes
again compilation warnings because not all code paths have a return
value.  So my previous remark has not been fixed.  Hari, what do you use
as compiler, my gcc blows a warning and reading the patch that's
obviously incorrect.

In my assert enabled build, I didn't get any warning. Yes that patch to fix the
warning is clearly wrong. I corrected in a different way of adding Assert checks
for the hostaddr, because definitely host or hostaddr must be present.
 
+       PQHost returns NULL when the connection is not established or
In the docs, this is wrong for two reasons:
- There should be a <function> markup.
- The name of the function is PQhost, not PGHost.

Corrected.

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment
Hello.

At Mon, 26 Mar 2018 17:49:22 +1100, Haribabu Kommi <kommi.haribabu@gmail.com> wrote in
<CAJrrPGdYQ92R1hNArCByu+gN_SGsFMjGvbErjH0N9W8Ry24CxA@mail.gmail.com>
> On Mon, Mar 26, 2018 at 4:17 PM, Michael Paquier <michael@paquier.xyz>
> wrote:
> 
> > On Mon, Mar 26, 2018 at 11:28:41AM +0900, Kyotaro HORIGUCHI wrote:
> > > At Sun, 25 Mar 2018 22:27:09 +1100, Haribabu Kommi <
> > kommi.haribabu@gmail.com> wrote in <CAJrrPGeaQxWsU3RZ0yxGa=WY+
> > wnA+nvE_YPuKzBDaqRMUt9SyA@mail.gmail.com>
> > >> On Sun, Mar 25, 2018 at 12:56 AM, Michael Paquier <michael@paquier.xyz>
> > >> wrote:
> >
> 
> Thanks for the review.
> 
> 
> > > As the commit message of 50cb21f70, the function is intended not
> > > to return NULL in order to prevent the user functions from a
> > > crash in corner cases.
> >
> > The commit number is not correct here.  You mean 40cb21f.
> >
> > > https://www.postgresql.org/message-id/19297.1493660213%40sss.pgh.pa.us
> >
> > I quite like the idea of using an empty string value in those cases.
> > This could prevent crashes at leat for applications not doing NULL-ness
> > checks.
> >
> 
> I also agree to return an empty string. I did this only for the cases where
> the conn is
> not NULL but the status is not proper or the connhost is NULL.
> 
> 
> > > Since the
> > > purpose of PGhost is not strictly defined, especially on
> > > connection failure, returning the given host list can be another
> > > candidate (and the same can be said for returning ""). I think
> > > users shouldn't (and I believe no one does) rely on the values
> > > from the functions when CONNECTION_BAD, anyway.
> >
> > Yeah, this should really be documented and also should refer to the fact
> > that this happens when specifying multiple hosts.
> >
> 
> Added.


+       The <function>PQhost</function> function returns NULL when the
+       connection is not established, or returns an empty string when status
+       of the connection is not <literal>CONNECTION_OK</literal>.

This may be wrong. NULL is only for the case conn == NULL and ""
for other connection failure. I'm not sure how to express the OOM
case in the documentation but we could reuturn "" for the
conn==NULL case if we don't want to distinguish the state in
PQhost and its family.

> > > My opinion is to add a description that is saying like "these
> > > functions return unreliable values for a failed connection".
> >
> > At the same time I don't think that this is sufficient either, because
> > for multiple hosts, PQhost() just returns the first one, which makes
> > absolutely no sense because the value is wrong.  So I think that using a
> > third, separate value has some advantages:
> > - If NULL, this just means that the initialization did not happen.
> > - If using an empty string, then the value cannot be evaluated.
> > - If this returns a host or hostaddr (if host has not been specified),
> > then that's the host which is actually used for the connection.
> > Having those three states has value for applications in my opinion.
> >
> > The same can apply to PQport, or any other functions which for whatever
> > reason add support for multiple values like host, hostaddr or port.
> >
> 
> I hope that I updated the documentation properly to explain all the above
> cases.

I'm not sure but I'm afraid that some authentication methods
requires that PQhost() returns a host name for the states other
than CONNECTION_OK, perhaps CONNECTION_MADE, AWAITING_RESPONSE
and so, which happens after a connection is established. Even
without considering that, we can return a sane value after raw
connection (not a PGconn) is established.

> > >> Updated patch attached with behavior of returning NULL for connections
> > of
> > >> CONNECTION_BAD status.
> > >
> > > The patch does Assert() in PQhost. I suppose that Assert() in
> > > client library is usable only when no more (library's) operation
> > > cannot continue. It would be better to return a fallback value in
> > > this criteria.
> >
> > The patch has to return a value as well.  A shaped the patch causes
> > again compilation warnings because not all code paths have a return
> > value.  So my previous remark has not been fixed.  Hari, what do you use
> > as compiler, my gcc blows a warning and reading the patch that's
> > obviously incorrect.
> >
> 
> In my assert enabled build, I didn't get any warning. Yes that patch to fix
> the
> warning is clearly wrong. I corrected in a different way of adding Assert
> checks
> for the hostaddr, because definitely host or hostaddr must be present.

As I wrote upthread, the assertion doesn't seem to be needed. I
think that a library should allow callers to decide how to handle
error cases if it is possible.

> > + PQHost returns NULL when the connection is not established
or
> > In the docs, this is wrong for two reasons:
> > - There should be a <function> markup.
> > - The name of the function is PQhost, not PGHost.
> >
> 
> Corrected.
> 
> Updated patch attached.

The documentation is written as the following.

-       Returns the server host name of the connection.
+       Returns the server host name or host address of the active connection.
        This can be a host name, an IP address, or a directory path if the

Is the replacement is required? The following line is stating the
same thing including the local-socket case.




regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





On Mon, Mar 26, 2018 at 6:34 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello.

At Mon, 26 Mar 2018 17:49:22 +1100, Haribabu Kommi <kommi.haribabu@gmail.com> wrote in <CAJrrPGdYQ92R1hNArCByu+gN_SGsFMjGvbErjH0N9W8Ry24CxA@mail.gmail.com>


Thanks for the review.
 
+       The <function>PQhost</function> function returns NULL when the
+       connection is not established, or returns an empty string when status
+       of the connection is not <literal>CONNECTION_OK</literal>.

This may be wrong. NULL is only for the case conn == NULL and ""
for other connection failure. I'm not sure how to express the OOM
case in the documentation but we could reuturn "" for the
conn==NULL case if we don't want to distinguish the state in
PQhost and its family.

All the other PQ* functions checks and return NULL when conn==NULL,
returning NULL here may not be good?

And if we are not going to change the above, then PQhost() function
returns 3 values,
- NULL when the conn==NULL
- Actual host or hostaddr of the active connection
- Empty string when the conn is not able to evaluate.

Is it fine to proceed like above?
 
> > > My opinion is to add a description that is saying like "these
> > > functions return unreliable values for a failed connection".
> >
> > At the same time I don't think that this is sufficient either, because
> > for multiple hosts, PQhost() just returns the first one, which makes
> > absolutely no sense because the value is wrong.  So I think that using a
> > third, separate value has some advantages:
> > - If NULL, this just means that the initialization did not happen.
> > - If using an empty string, then the value cannot be evaluated.
> > - If this returns a host or hostaddr (if host has not been specified),
> > then that's the host which is actually used for the connection.
> > Having those three states has value for applications in my opinion.
> >
> > The same can apply to PQport, or any other functions which for whatever
> > reason add support for multiple values like host, hostaddr or port.
> >
>
> I hope that I updated the documentation properly to explain all the above
> cases.

I'm not sure but I'm afraid that some authentication methods
requires that PQhost() returns a host name for the states other
than CONNECTION_OK, perhaps CONNECTION_MADE, AWAITING_RESPONSE
and so, which happens after a connection is established. Even
without considering that, we can return a sane value after raw
connection (not a PGconn) is established.

Yes, PQhost() function is used even when the connection is fully established,
so we cannot use the status to return the host name. So the logic of verifying the
status needs to be removed.

 
> > >> Updated patch attached with behavior of returning NULL for connections
> > of
> > >> CONNECTION_BAD status.
> > >
> > > The patch does Assert() in PQhost. I suppose that Assert() in
> > > client library is usable only when no more (library's) operation
> > > cannot continue. It would be better to return a fallback value in
> > > this criteria.
> >
> > The patch has to return a value as well.  A shaped the patch causes
> > again compilation warnings because not all code paths have a return
> > value.  So my previous remark has not been fixed.  Hari, what do you use
> > as compiler, my gcc blows a warning and reading the patch that's
> > obviously incorrect.
> >
>
> In my assert enabled build, I didn't get any warning. Yes that patch to fix
> the
> warning is clearly wrong. I corrected in a different way of adding Assert
> checks
> for the hostaddr, because definitely host or hostaddr must be present.

As I wrote upthread, the assertion doesn't seem to be needed. I
think that a library should allow callers to decide how to handle
error cases if it is possible.

OK. Will correct it.
 
> > + PQHost returns NULL when the connection is not established
or
> > In the docs, this is wrong for two reasons:
> > - There should be a <function> markup.
> > - The name of the function is PQhost, not PGHost.
> >
>
> Corrected.
>
> Updated patch attached.

The documentation is written as the following.

-       Returns the server host name of the connection.
+       Returns the server host name or host address of the active connection.
        This can be a host name, an IP address, or a directory path if the

Is the replacement is required? The following line is stating the
same thing including the local-socket case.

Ok, will update it to just include the active connection.

Regards,
Hari Babu
Fujitsu Australia
On Mon, Mar 26, 2018 at 11:39:51PM +1100, Haribabu Kommi wrote:
> And if we are not going to change the above, then PQhost() function
> returns 3 values,
> - NULL when the conn==NULL
> - Actual host or hostaddr of the active connection
> - Empty string when the conn is not able to evaluate.
>
> Is it fine to proceed like above?

Yeah, I would think that this is a sensible approach.  Keeping
consistent with NULL is good for the other function, and we still need a
way to identify the dont-know state.  Peter, what's your input on that?
--
Michael

Attachment

On Tue, Mar 27, 2018 at 12:23 AM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Mar 26, 2018 at 11:39:51PM +1100, Haribabu Kommi wrote:
> And if we are not going to change the above, then PQhost() function
> returns 3 values,
> - NULL when the conn==NULL
> - Actual host or hostaddr of the active connection
> - Empty string when the conn is not able to evaluate.
>
> Is it fine to proceed like above?

Yeah, I would think that this is a sensible approach.  Keeping
consistent with NULL is good for the other function, and we still need a
way to identify the dont-know state.  Peter, what's your input on that?

Patch attached with the above behavior along with other comments from upthread.


Regards,
Hari Babu
Fujitsu Australia
Attachment
On Tue, Mar 27, 2018 at 11:43:27AM +1100, Haribabu Kommi wrote:
> Patch attached with the above behavior along with other comments from
> upthread.

Thanks for the updated version.

The function changes look logically good to me.

+      <para>
+       The <function>PQhost</function> function returns NULL when the
+       input conn parameter is NULL or an empty string if conn cannot be evaluated.
+       Applications of this function must carefully evaluate the return value.
+      </para>
+
+      <note>
+       <para>
+        when both <literal>host</literal> and <literal>hostaddr</literal>
+        parameters are specified in the connection string, the connection
+        <literal>host</literal> parameter is returned.
+       </para>
Some nitpicking about the documentation changes:
- NULL needs to use the markup <symbol>.
- conn should use the markup <parameter>.  As the docs describe a value
of the input parameter, we cannot talk about "the connection" either in
those cases.
- "Applications of this function must carefully evaluate the return
value" is rather vague, so I would append to the end "depending on the
state of the connection involved."
The same applies to PQport() for consistency.

Perhaps the documentation should mention as well that making the
difference between those different values is particularly relevant when
using multiple-value strings?  I would rather add one paragraph on the
matter than nothing.  I really think that we have been lacking clarity
in the documentation for those APIs for too long, and people always
argue about what they should do.  If we have a base documented, then we
can more easily argue for the future as well, and things are clear to
the user.
--
Michael

Attachment
On Mon, Mar 26, 2018 at 8:24 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Mar 27, 2018 at 11:43:27AM +1100, Haribabu Kommi wrote:
> Patch attached with the above behavior along with other comments from
> upthread.

Thanks for the updated version.

The function changes look logically good to me.

+      <para>
+       The <function>PQhost</function> function returns NULL when the
+       input conn parameter is NULL or an empty string if conn cannot be evaluated.
+       Applications of this function must carefully evaluate the return value.
+      </para>

- "Applications of this function must carefully evaluate the return
value" is rather vague, so I would append to the end "depending on the
state of the connection involved."
The same applies to PQport() for consistency.

Perhaps the documentation should mention as well that making the
difference between those different values is particularly relevant when
using multiple-value strings?  I would rather add one paragraph on the
matter than nothing.  I really think that we have been lacking clarity
in the documentation for those APIs for too long, and people always
argue about what they should do.  If we have a base documented, then we
can more easily argue for the future as well, and things are clear to
the user.


"depending on the state of the connection" doesn't move the goal-posts that far though...and "Applications of this function" would be better written as "Callers of this function" if left in place.

In any case something like the following framework seems more useful to the reader who doesn't want to scan the source code for the PQhost/PQport functions.

The PQhost function returns NULL when the input conn parameter is NULL or an empty string if conn cannot be evaluated.  Otherwise, the return value depends on the state of the conn: specifically (translate code to documentation here).  Furthermore, if both host and hostaddr properties exist on conn the return value will contain only the host.

I'm undecided on the need for a <note> element but would lean against it in favor of the above, slightly longer, paragraph.

And yes, while I'm not sure right now what the multi-value condition logic results in it should be mentioned - at least if the goal of the docs is to be a sufficient resource for using these functions.  In particular what happens when the connection is inactive (unless that falls under "cannot be evaluated"...).

David J.

On Mon, Mar 26, 2018 at 09:03:17PM -0700, David G. Johnston wrote:
> And yes, while I'm not sure right now what the multi-value condition logic
> results in it should be mentioned - at least if the goal of the docs is to
> be a sufficient resource for using these functions.  In particular what
> happens when the connection is inactive (unless that falls under "cannot be
> evaluated"...).

Yes, the fast that it is not possible to rely on the return value for
multiple hosts as long as the connection is not sanely established needs
to be documented anyway.  For example if a client uses PQconnectStart()
for an asynchronous start then connhost is initialized with whichhost
set to 0, so PQhost won't return an empty string.
--
Michael

Attachment


On Tue, Mar 27, 2018 at 3:03 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Mar 26, 2018 at 8:24 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Mar 27, 2018 at 11:43:27AM +1100, Haribabu Kommi wrote:
> Patch attached with the above behavior along with other comments from
> upthread.

Thanks for the updated version.

The function changes look logically good to me.

+      <para>
+       The <function>PQhost</function> function returns NULL when the
+       input conn parameter is NULL or an empty string if conn cannot be evaluated.
+       Applications of this function must carefully evaluate the return value.
+      </para>

- "Applications of this function must carefully evaluate the return
value" is rather vague, so I would append to the end "depending on the
state of the connection involved."
The same applies to PQport() for consistency.

Perhaps the documentation should mention as well that making the
difference between those different values is particularly relevant when
using multiple-value strings?  I would rather add one paragraph on the
matter than nothing.  I really think that we have been lacking clarity
in the documentation for those APIs for too long, and people always
argue about what they should do.  If we have a base documented, then we
can more easily argue for the future as well, and things are clear to
the user.


"depending on the state of the connection" doesn't move the goal-posts that far though...and "Applications of this function" would be better written as "Callers of this function" if left in place.

In any case something like the following framework seems more useful to the reader who doesn't want to scan the source code for the PQhost/PQport functions.

The PQhost function returns NULL when the input conn parameter is NULL or an empty string if conn cannot be evaluated.  Otherwise, the return value depends on the state of the conn: specifically (translate code to documentation here).  Furthermore, if both host and hostaddr properties exist on conn the return value will contain only the host.

I'm undecided on the need for a <note> element but would lean against it in favor of the above, slightly longer, paragraph.

And yes, while I'm not sure right now what the multi-value condition logic results in it should be mentioned - at least if the goal of the docs is to be a sufficient resource for using these functions.  In particular what happens when the connection is inactive (unless that falls under "cannot be evaluated"...).

Thanks for the review. 

updated patch attached with additional doc updates as per the suggestion from the upthreads.


Regards,
Hari Babu
Fujitsu Australia
Attachment
On Tue, Mar 27, 2018 at 04:47:41PM +1100, Haribabu Kommi wrote:
> updated patch attached with additional doc updates as per the suggestion
> from the upthreads.

Thanks Hari for the quick update.  It looks to me that this is shaped as
suggested.  Any input from other folks?  I don't have more to say.
--
Michael

Attachment
On Mon, Mar 26, 2018 at 10:47 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
updated patch attached with additional doc updates as per the suggestion from the upthreads.
​---------------------------------------------------------
Some comments if the patch remains in-tact:
​Lower-case "i" in "It is not" in the second paragraphs

The ... function returns NULL when the input conn parameter is NULL or an empty string if conn cannot be evaluated.  Otherwise, the return value is the first non-NULL, non-empty, value of either the host or hostaddr property of conn.  <omit the entire last sentence, Callers of this function..., for PQport too>

Omit "properly" after established - if the connection is established one can assume it was done "properly".

Add "the" after calling:  "can be checked by calling the <function>PQstatus</function> function.

------------------------------------------------------
I'm having a bit of concern about the actual specification and wording...but don't have time right now to thoroughly read the thread history, docs, and/or code at this moment to know whether its just inexperience on my part or an actual confusion.  But here's where I'm at presently...

I'm not sure why there is confusion here in the first place...the docs[1] say:

"The following functions return parameter values established at connection."

At which point there is only one meaningful value for them - the value that pertains to the established connection.

As far as "or an empty string if conn cannot be evaluated" should just become "an empty string if conn is inactive".

----------------------------
Another odd piece is:

"The value for host is ignored unless the authentication method requires it, in which case it will be used as the host name."

There is no coverage of "1 host, many hostaddr; n host, n hostaddr; and n host, m hostaddr" combinations - namely which are valid and how the NxM combination would even work if it is even allowed.

Then, if we do connect using hostaddr, and authenticate with host, should we indicate that fact in the PQhost output or not?

Potentially PQhost would be defined to return an actual hostname if it can figure one out - even if the active connection was made using hostaddr.  My take is that PQhost is meant to be user-informative rather than technical, and should ever only return the single most appropriate value it can compute.  Just need to decide on the logic for determining "appropriate" then document and implement it.

-----------------------------
"The following functions return parameter values established at connection. These values are fixed for the life of the connection. If a multi-host connection string is used, the values of PQhost, PQport, and PQpass can change if a new connection is established using the same PGconn object. Other values are fixed for the lifetime of the PGconn object."

Maybe something like:

The following functions return parameters values established at connection and, except for the multi-valued fail-over accessors PQhost and PQport, as well as PQpass, cannot change during the lifetime of the PGconn object.

PQpass is a bit odd here given its not multi-valued...not sure what if anything to make of that.

David J.

I apologize in advance that I'm not proper for wordsmithing.

At Tue, 27 Mar 2018 00:24:07 -0700, "David G. Johnston" <david.g.johnston@gmail.com> wrote in
<CAKFQuwaats6dndVT73xrFqBu+RJ4tsBopCNON5XuV+xT8tGVkQ@mail.gmail.com>
> On Mon, Mar 26, 2018 at 10:47 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
> wrote:
> 
> > updated patch attached with additional doc updates as per the suggestion
> > from the upthreads.
> >
> ​---------------------------------------------------------
> Some comments if the patch remains in-tact:
> ​
> ​Lower-case "i" in "It is not" in the second paragraphs
> 
> The ... function returns NULL when the input conn parameter is NULL or an
> empty string if conn cannot be evaluated.  Otherwise, the return value is
> the first non-NULL, non-empty, value of either the host or hostaddr
> property of conn.  <omit the entire last sentence, Callers of this
> function..., for PQport too>

Mmm. Does the second sentense mean that "host precedes to
hostaddr if any"? The code is written as the above but what users
observe is a connection string. I suppose that that accuracy is
rather confusing. I agree that the sentense "Callers of this.."
is needless since anyway the case is illegal regardsless of this
function.

That is something like the follows.

| The PQhost function returns NULL when the input conn parameter
| is NULL or an empty string if conn cannot be evaluated.
| Otherwise, it returns host if any or otherwise hostaddr of
| connection property.

I personally don't think it is not necessary to mention the NULL
case since other similiar funcions don't have such description.


> Omit "properly" after established - if the connection is established one
> can assume it was done "properly".
> 
> Add "the" after calling:  "can be checked by calling the
> <function>PQstatus</function> function.
> 
> ------------------------------------------------------
> I'm having a bit of concern about the actual specification and
> wording...but don't have time right now to thoroughly read the thread
> history, docs, and/or code at this moment to know whether its just
> inexperience on my part or an actual confusion.  But here's where I'm at
> presently...
> 
> I'm not sure why there is confusion here in the first place...the docs[1]
> say:
> 
> https://www.postgresql.org/docs/10/static/libpq-status.html
> "The following functions return parameter values established at connection."
> 
> At which point there is only one meaningful value for them - the value that
> pertains to the established connection.

True.

> As far as "or an empty string if conn cannot be evaluated" should just
> become "an empty string if conn is inactive".

The description tries to clarify the behavior while the
connection is not established. I'm not sure how "evaluate" sounds
exactly but I understand it as "PQhost can return any pausible
value". An inactive (before establising) connection can return a
value, but it can be wrong. It means that we shouldn't ask
PQhost(conn) whether the connection is established or not. So
without deriberate wording we could say;

| The PQhost function returns host if any or otherwise hostaddr
| of connection property. It may return a wrong value for
| connections before establishing and may return NULL if NULL is
| given as the parameter.

The meaning of "establising a connection" is intentionally
obfuscated since it should mean a CONNECTION_OK PGconn for
ordinary users and "after establishing the underlying connection"
for advanced users who looks into the code.



> ----------------------------
> Another odd piece is:
> 
> https://www.postgresql.org/docs/10/static/libpq-connect.html#LIBPQ-MULTIPLE-HOSTS
> "The value for host is ignored unless the authentication method requires
> it, in which case it will be used as the host name."
> 
> There is no coverage of "1 host, many hostaddr; n host, n hostaddr; and n
> host, m hostaddr" combinations - namely which are valid and how the NxM
> combination would even work if it is even allowed.

In the multi-host case, host and hostaddr *must* have the same
number of elements. Otherwise you get an error.

> psql: could not match 2 host names to 1 hostaddr values

It is described in the same page as below.

| In the Keyword/Value format, the host, hostaddr, and port options
| accept a comma-separated list of values. The same number of
| elements must be given in each option, such that e.g. the first
| hostaddr corresponds to the first host name, the second hostaddr
| corresponds to the second host name, and so forth. As an
| exception, if only one port is specified, it applies to all the
| hosts.


> Then, if we do connect using hostaddr, and authenticate with host, should
> we indicate that fact in the PQhost output or not?

We don't need to identify the case. Authentication is always
performed using the return value of PQhost. Connection is made to
hostaddr if any, otherwise to host.

> Potentially PQhost would be defined to return an actual hostname if it can
> figure one out - even if the active connection was made using hostaddr.  My
> take is that PQhost is meant to be user-informative rather than technical,
> and should ever only return the single most appropriate value it can
> compute.  Just need to decide on the logic for determining "appropriate"
> then document and implement it.

hostaddr is defined that "it can be specified in order to get rid
of host name resolution". So generating host from hostaddr is out
of the intention. The description for hostaddr in the page is;

| Using hostaddr instead of host allows the application to avoid a
| host name look-up, which might be important in applications with
| time constraints.

And the value of host should be used for authentication
unmodified since it is the user's intention.

> -----------------------------
> Also at: https://www.postgresql.org/docs/10/static/libpq-status.html
> "The following functions return parameter values established at connection.
> These values are fixed for the life of the connection. If a multi-host
> connection string is used, the values of PQhost, PQport, and PQpass can
> change if a new connection is established using the same PGconn object.
> Other values are fixed for the lifetime of the PGconn object."
> 
> Maybe something like:
> 
> The following functions return parameters values established at connection
> and, except for the multi-valued fail-over accessors PQhost and PQport, as
> well as PQpass, cannot change during the lifetime of the PGconn object.

It is not directly related to fail-over (if the word menas
reconnection to the next available server after an involuntary
disconnection).

Putting that aside, I'm not sure it makes any difference..

> PQpass is a bit odd here given its not multi-valued...not sure what if
> anything to make of that.

password can vary with host and port using .pgpass.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

On 3/27/18 02:20, Michael Paquier wrote:
> On Tue, Mar 27, 2018 at 04:47:41PM +1100, Haribabu Kommi wrote:
>> updated patch attached with additional doc updates as per the suggestion
>> from the upthreads.
> 
> Thanks Hari for the quick update.  It looks to me that this is shaped as
> suggested.  Any input from other folks?  I don't have more to say.

Committed after fixing up the documentation a bit as suggested by others.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




On Wed, Mar 28, 2018 at 3:35 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 3/27/18 02:20, Michael Paquier wrote:
> On Tue, Mar 27, 2018 at 04:47:41PM +1100, Haribabu Kommi wrote:
>> updated patch attached with additional doc updates as per the suggestion
>> from the upthreads.
>
> Thanks Hari for the quick update.  It looks to me that this is shaped as
> suggested.  Any input from other folks?  I don't have more to say.

Committed after fixing up the documentation a bit as suggested by others.

Thanks.

Regards,
Hari Babu
Fujitsu Australia
On Wed, Mar 28, 2018 at 11:06:23AM +1100, Haribabu Kommi wrote:
> On Wed, Mar 28, 2018 at 3:35 AM, Peter Eisentraut <
> peter.eisentraut@2ndquadrant.com> wrote:
>>
>> Committed after fixing up the documentation a bit as suggested by others.
>
> Thanks.

+1.  Thanks for working on this Hari, Peter for the commit and all
others for your input!
--
Michael

Attachment
At Wed, 28 Mar 2018 10:34:49 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20180328013449.GC1105@paquier.xyz>
> On Wed, Mar 28, 2018 at 11:06:23AM +1100, Haribabu Kommi wrote:
> > On Wed, Mar 28, 2018 at 3:35 AM, Peter Eisentraut <
> > peter.eisentraut@2ndquadrant.com> wrote:
> >>
> >> Committed after fixing up the documentation a bit as suggested by others.
> > 
> > Thanks.
> 
> +1.  Thanks for working on this Hari, Peter for the commit and all
> others for your input!

Me too.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center