Thread: Adding Unix domain socket path and port topg_stat_get_wal_senders()

Adding Unix domain socket path and port topg_stat_get_wal_senders()

From
Tatsuo Ishii
Date:
Currently pg_stat_replication view does not tell useful information
regarding client connections if UNIX domain sockets are used for
communication between sender and receiver. So it is not possible to
tell which row corresponds to which standby server.

test=# select client_addr, client_hostname, client_port,sync_state client_port from pg_stat_replication;
 client_addr | client_hostname | client_port | client_port 
-------------+-----------------+-------------+-------------
             |                 |          -1 | async
             |                 |          -1 | async
(2 rows)

This is due to that pg_stat_replication is created from
pg_stat_get_activity view. pg_stat_get_activity view calls
pg_stat_get_activity() which returns always NULL, NULL, -1 for
client_add, client_hostname and client_port.

                else if (beentry->st_clientaddr.addr.ss_family == AF_UNIX)
                {
                    /*
                     * Unix sockets always reports NULL for host and -1 for
                     * port, so it's possible to tell the difference to
                     * connections we have no permissions to view, or with
                     * errors.
                     */

Changing this behavior would affect existing pg_stat_get_activity view
users and I hesitate to do so. I wonder if we could add receiver's
UNIX domain socket path to from pg_stat_get_wal_senders() (which is
called from pg_stat_replication view) so that the poor UNIX domain
socket users could make their own view or access
pg_stat_get_wal_senders() to get the UNIX domain socket path.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Adding Unix domain socket path and port to pg_stat_get_wal_senders()

From
Euler Taveira
Date:
Em qui, 11 de abr de 2019 às 21:16, Tatsuo Ishii <ishii@sraoss.co.jp> escreveu:
>
> Currently pg_stat_replication view does not tell useful information
> regarding client connections if UNIX domain sockets are used for
> communication between sender and receiver. So it is not possible to
> tell which row corresponds to which standby server.
>
application_name. I'm not sure if it solves your complain but Peter
committed a patch [1] for v12 that distinguishes replicas in the same
host via cluster_name.

> test=# select client_addr, client_hostname, client_port,sync_state client_port from pg_stat_replication;
>  client_addr | client_hostname | client_port | client_port
> -------------+-----------------+-------------+-------------
>              |                 |          -1 | async
>              |                 |          -1 | async
> (2 rows)
>
> This is due to that pg_stat_replication is created from
> pg_stat_get_activity view. pg_stat_get_activity view calls
> pg_stat_get_activity() which returns always NULL, NULL, -1 for
> client_add, client_hostname and client_port.
>
Socket has different semantic from TCP/UDP. We can't add socket
information into client_addr unless we are prepared to break this view
(client_addr has type inet and it would be necessary to change it to
text). It could break a lot of applications.


[1] https://www.postgresql.org/message-id/flat/1257eaee-4874-e791-e83a-46720c72cac7@2ndquadrant.com


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: Adding Unix domain socket path and port topg_stat_get_wal_senders()

From
Michael Paquier
Date:
On Thu, Apr 11, 2019 at 10:19:01PM -0300, Euler Taveira wrote:
> application_name. I'm not sure if it solves your complain but Peter
> committed a patch [1] for v12 that distinguishes replicas in the same
> host via cluster_name.

Let's be honest, this is just a workaround.

> Socket has different semantic from TCP/UDP. We can't add socket
> information into client_addr unless we are prepared to break this view
> (client_addr has type inet and it would be necessary to change it to
> text). It could break a lot of applications.

client_addr does not seem the right place to store this information,
and it is already documented for years that NULL is used when using a
Unix socket.  But I think that we could change *client_hostname* so as
the path name is reported instead of NULL when connecting through a
Unix domain socket, and there is no need to switch the field type for
that.

I agree with Ishii-san that it would be nice to close the gap here.
For pg_stat_wal_receiver, please note that sender_host reports
correctly the domain path when connecting locally.
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> On Thu, Apr 11, 2019 at 10:19:01PM -0300, Euler Taveira wrote:
>> Socket has different semantic from TCP/UDP. We can't add socket
>> information into client_addr unless we are prepared to break this view
>> (client_addr has type inet and it would be necessary to change it to
>> text). It could break a lot of applications.

Agreed.

> client_addr does not seem the right place to store this information,
> and it is already documented for years that NULL is used when using a
> Unix socket.  But I think that we could change *client_hostname* so as
> the path name is reported instead of NULL when connecting through a
> Unix domain socket, and there is no need to switch the field type for
> that.

That seems like a hack, and I think it could still break apps that
are expecting particular semantics for that field.  Why not add a
new column instead?

            regards, tom lane



Re: Adding Unix domain socket path and port topg_stat_get_wal_senders()

From
Tatsuo Ishii
Date:
>> client_addr does not seem the right place to store this information,
>> and it is already documented for years that NULL is used when using a
>> Unix socket.  But I think that we could change *client_hostname* so as
>> the path name is reported instead of NULL when connecting through a
>> Unix domain socket, and there is no need to switch the field type for
>> that.
> 
> That seems like a hack, and I think it could still break apps that
> are expecting particular semantics for that field.  Why not add a
> new column instead?

Actually I aready proposed to add new column to pg_stat_get_wal_senders():

> Changing this behavior would affect existing pg_stat_get_activity view
> users and I hesitate to do so. I wonder if we could add receiver's
> UNIX domain socket path to from pg_stat_get_wal_senders() (which is
> called from pg_stat_replication view) so that the poor UNIX domain
> socket users could make their own view or access
> pg_stat_get_wal_senders() to get the UNIX domain socket path.

If we were ok to add a new column to pg_stat_activity view or
pg_stat_replication view as well, that will be great.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Adding Unix domain socket path and port to pg_stat_get_wal_senders()

From
Euler Taveira
Date:
Em sex, 12 de abr de 2019 às 01:39, Michael Paquier
<michael@paquier.xyz> escreveu:
>
> On Thu, Apr 11, 2019 at 10:19:01PM -0300, Euler Taveira wrote:
> > application_name. I'm not sure if it solves your complain but Peter
> > committed a patch [1] for v12 that distinguishes replicas in the same
> > host via cluster_name.
>
> Let's be honest, this is just a workaround.
>
The question is: what is the problem we want to solve? Ishii-san asked
for a socket path. If we have already figured out the replica (via
application_name), use the replica PID to find the socket path. A new
column as suggested by Tom could show the desired info. Is it *really*
useful? I mean, how many setups have master and replica in the same
server? For a socket connection, directory is important and that
information I can get from unix_socket_directories parameter (I've
never seen a setup with multiple socket directories).


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Euler Taveira <euler@timbira.com.br> writes:
> The question is: what is the problem we want to solve? Ishii-san asked
> for a socket path. If we have already figured out the replica (via
> application_name), use the replica PID to find the socket path. A new
> column as suggested by Tom could show the desired info. Is it *really*
> useful? I mean, how many setups have master and replica in the same
> server?

Yeah, I think that argument is why we didn't cover the case in the
original view design.  This additional column would be useless on
Windows, too.  Still, since Ishii-san is concerned about this,
I suppose he has a plausible use-case in mind.

> For a socket connection, directory is important and that
> information I can get from unix_socket_directories parameter (I've
> never seen a setup with multiple socket directories).

Those are actually pretty common, for example if you use Red Hat's
packaging you will have both /var/run/postgresql and /tmp as socket
directories (since they consider use of /tmp deprecated, but getting
rid of all clients' use of it turns out to be really hard).  However,
it's definitely fair to question whether anyone *cares* which of
the server's socket directories a given connection used.  Aren't
they going to be pretty much all equivalent?

            regards, tom lane



Re: Adding Unix domain socket path and port topg_stat_get_wal_senders()

From
Tatsuo Ishii
Date:
> The question is: what is the problem we want to solve?

The client_hostname is useful for TCP/IP connections because it
indicates which row of the view is related to which standby server.  I
would like to have the same for UNIX domain socket case as well.

> Ishii-san asked
> for a socket path. If we have already figured out the replica (via
> application_name), use the replica PID to find the socket path.

Well, I would like to avoid to use application_name if possible.

> A new
> column as suggested by Tom could show the desired info. Is it *really*
> useful? I mean, how many setups have master and replica in the same
> server?

For developing/testing purpose I often create master and some replicas
in the same server. The same technique is used in a regression test
for Pgpool-II.

> For a socket connection, directory is important and that
> information I can get from unix_socket_directories parameter (I've
> never seen a setup with multiple socket directories).

Yes, it could be a way to get the same information that
sockaddr_un.sunpath used to provide. But now I realize that it's not
what I want. What I actually wanted was, which row of the view is
related to which standby server. So what I really need is the standby
server's accepting socket path, *not* primary server's. Currently it
seems it's not possible except using the application_name
hack. Probably cleaner way would be walreciver provides socket path
information in startup packet and walsender keeps the info in shared
memory so that pg_stat_replication view can use it later on.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Adding Unix domain socket path and port topg_stat_get_wal_senders()

From
Michael Paquier
Date:
On Fri, Apr 12, 2019 at 11:38:52PM +0900, Tatsuo Ishii wrote:
> If we were ok to add a new column to pg_stat_activity view or
> pg_stat_replication view as well, that will be great.

Okay, no objections with a separate, new, column if that's the
consensus.
--
Michael

Attachment

Re: Adding Unix domain socket path and port topg_stat_get_wal_senders()

From
Peter Eisentraut
Date:
On 2019-04-12 17:57, Tom Lane wrote:
>> For a socket connection, directory is important and that
>> information I can get from unix_socket_directories parameter (I've
>> never seen a setup with multiple socket directories).
> Those are actually pretty common, for example if you use Red Hat's
> packaging you will have both /var/run/postgresql and /tmp as socket
> directories (since they consider use of /tmp deprecated, but getting
> rid of all clients' use of it turns out to be really hard).  However,
> it's definitely fair to question whether anyone *cares* which of
> the server's socket directories a given connection used.  Aren't
> they going to be pretty much all equivalent?

So what is being asked here is really information about which end point
on the server is being connected to.  That is also information for the
TCP/IP case that we don't currently provide.  It's probably of marginal
use, as you also say.

I don't get what this has to do with walsenders specifically.  Do they
have each walsender connect to a different socket?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 2019-04-12 17:57, Tom Lane wrote:
>> Those are actually pretty common, for example if you use Red Hat's
>> packaging you will have both /var/run/postgresql and /tmp as socket
>> directories (since they consider use of /tmp deprecated, but getting
>> rid of all clients' use of it turns out to be really hard).  However,
>> it's definitely fair to question whether anyone *cares* which of
>> the server's socket directories a given connection used.  Aren't
>> they going to be pretty much all equivalent?

> So what is being asked here is really information about which end point
> on the server is being connected to.  That is also information for the
> TCP/IP case that we don't currently provide.

Good point.

> It's probably of marginal use, as you also say.

Yeah.  Per downthread discussion, what Tatsuo-san really wants to know
is not that at all, but which client (slave server) is connecting.
It's not very clear how to identify the client, but knowing which socket
it came through doesn't seem to help for that.

            regards, tom lane