Thread: Addition of authenticated ID to pg_stat_activity
Hi all, 9afffcb has added the concept of authenticated identity to the information provided in log_connections for audit purposes, with this data stored in each backend's port. One extra thing that can be really useful for monitoring is the capability to track this information directly in pg_stat_activity. Please find attached a patch to do that, with the following choices made: - Like query strings, authenticated IDs could be rather long, so we need a GUC to control the maximum size allocated for these in shared memory. The attached uses 128 bytes by default, that should be enough in most cases even for DNs, LDAP or krb5. - Multi-byte strings need to be truncated appropriately. As a matter of consistency with the query string code, I have made things so as the truncation is done each time a string is requested, with PgBackendStatus storing the raw information truncated depending on the maximum size allowed at the GUC level. - Some tests are added within the SSL and LDAP code paths. We could add more of that within the authentication and kerberos tests but that did not strike me as mandatory either as the backend logs are checked everywhere already. - The new field has been added at the end of pg_stat_end_activity() mainly as a matter of readability. I'd rather move that just after the application_name, now only pg_stat_activity does that. I am adding that to the next CF. Thanks, -- Michael
Attachment
On Mon, Apr 26, 2021 at 11:34:16AM +0900, Michael Paquier wrote: > +++ b/doc/src/sgml/config.sgml > @@ -7596,6 +7596,24 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; > </listitem> > </varlistentry> > > + <varlistentry id="guc-track-activity-authn-size" xreflabel="track_activity_authn_size"> > + <term><varname>track_activity_authn_size</varname> (<type>integer</type>) > + <indexterm> > + <primary><varname>track_activity_authn_size</varname> configuration parameter</primary> > + </indexterm> > + </term> > + <listitem> > + <para> > + Specifies the amount of memory reserved to store the text of the > + currently executing command for each active session, for the That part looks to be a copy+paste error. > + <structname>pg_stat_activity</structname>.<structfield>authenticated_id</structfield> field. > + If this value is specified without units, it is taken as bytes. > + The default value is 128 bytes. > + This parameter can only be set at server start. > + </para> > + </listitem> > + </varlistentry> I think many/most things in log/CSV should also go in PSA, and vice versa. It seems like there should be a comment about this - in both places - to avoid forgetting it in the future. -- Justin
On Sun, Apr 25, 2021 at 10:14:43PM -0500, Justin Pryzby wrote: > That part looks to be a copy+paste error. Sorry about that. I have fixed that on my own branch. >> + <structname>pg_stat_activity</structname>.<structfield>authenticated_id</structfield> field. >> + If this value is specified without units, it is taken as bytes. >> + The default value is 128 bytes. >> + This parameter can only be set at server start. >> + </para> >> + </listitem> >> + </varlistentry> > > I think many/most things in log/CSV should also go in PSA, and vice versa. > > It seems like there should be a comment about this - in both places - to avoid > forgetting it in the future. I am not sure what you mean here, neither do I see in what this is related to what is proposed on this thread. -- Michael
Attachment
Hi, On 2021-04-26 11:34:16 +0900, Michael Paquier wrote: > 9afffcb has added the concept of authenticated identity to the > information provided in log_connections for audit purposes, with this > data stored in each backend's port. One extra thing that can be > really useful for monitoring is the capability to track this > information directly in pg_stat_activity. I'm getting a bit worried about the incremental increase in pg_stat_activity width - it's probably by far the view that's most viewed interactively. I think we need to be careful not to add too niche things to it. This is especially true for columns that may be wider. It'd be bad for discoverability, but perhaps something like this, that's not that likely to be used interactively, would be better done as a separate function that would need to be used explicitly? A select * from pg_stat_activity on a plain installation, connecting over unix socket, with nothing running, is 411 chars wide for me... Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > On 2021-04-26 11:34:16 +0900, Michael Paquier wrote: > > 9afffcb has added the concept of authenticated identity to the > > information provided in log_connections for audit purposes, with this > > data stored in each backend's port. One extra thing that can be > > really useful for monitoring is the capability to track this > > information directly in pg_stat_activity. > > I'm getting a bit worried about the incremental increase in > pg_stat_activity width - it's probably by far the view that's most > viewed interactively. I think we need to be careful not to add too niche > things to it. This is especially true for columns that may be wider. > > It'd be bad for discoverability, but perhaps something like this, that's > not that likely to be used interactively, would be better done as a > separate function that would need to be used explicitly? I mean.. we already have separate functions and views for this, though they're auth-method-specific currently, but also provide more details, since it isn't actually a "one size fits all" kind of thing like this entire approach is imagining it to be. Thanks, Stephen
Attachment
On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote: > * Andres Freund (andres@anarazel.de) wrote: >> I'm getting a bit worried about the incremental increase in >> pg_stat_activity width - it's probably by far the view that's most >> viewed interactively. I think we need to be careful not to add too niche >> things to it. This is especially true for columns that may be wider. >> >> It'd be bad for discoverability, but perhaps something like this, that's >> not that likely to be used interactively, would be better done as a >> separate function that would need to be used explicitly? > > I mean.. we already have separate functions and views for this, though > they're auth-method-specific currently, but also provide more details, > since it isn't actually a "one size fits all" kind of thing like this > entire approach is imagining it to be. Referring to pg_stat_ssl and pg_stat_gssapi here, right? Yes, that would be very limited as this leads to no visibility for LDAP, all password-based authentications and more. I am wondering if we should take this as an occasion to move some data out of pg_stat_activity into a separate biew, dedicated to the data related to the connection that remains set to the same value for the duration of a backend's life, as of the following set: - the backend PID - client_addr - client_hostname - client_port - authenticated ID - application_name? (well, this one could change on reload, so I am lying). It would be tempting to move the database name and the username but these are popular fields with monitoring. Maybe we could name that pg_stat_connection_status, pg_stat_auth_status or just pg_stat_connection? -- Michael
Attachment
On Tue, Apr 27, 2021 at 09:59:18AM +0900, Michael Paquier wrote: > > I am wondering if we should take this as an occasion to move some data > out of pg_stat_activity into a separate biew, dedicated to the data > related to the connection that remains set to the same value for the > duration of a backend's life, as of the following set: > - the backend PID -1. It's already annoying enough to have to type "WHERE pid != pg_backend_pid()" to exclude my own backend, and I usually need it quite often. Unless we add some new view which integrate that, something like pg_stat_activity_except_me with a better name. I also don't see how we could join a new dedicated view with the old one without that information. > - application_name? (well, this one could change on reload, so I am > lying). No, it can change at any time. And the fact that it's not transactional makes it quite convenient for poor man progress reporting. For instance in powa I use that to report what the bgworker is currently working on, and this has already proven to be useful.
On Tue, Apr 27, 2021 at 09:26:11AM +0800, Julien Rouhaud wrote: > -1. It's already annoying enough to have to type "WHERE pid != > pg_backend_pid()" to exclude my own backend, and I usually need it quite often. > Unless we add some new view which integrate that, something like > pg_stat_activity_except_me with a better name. I also don't see how we could > join a new dedicated view with the old one without that information. Err, sorry for the confusion. What I meant here is to also keep the PID in pg_stat_activity, but also add it to this new view to be able to join things across the board. -- Michael
Attachment
Greetings, * Michael Paquier (michael@paquier.xyz) wrote: > On Mon, Apr 26, 2021 at 03:21:46PM -0400, Stephen Frost wrote: > > * Andres Freund (andres@anarazel.de) wrote: > >> I'm getting a bit worried about the incremental increase in > >> pg_stat_activity width - it's probably by far the view that's most > >> viewed interactively. I think we need to be careful not to add too niche > >> things to it. This is especially true for columns that may be wider. > >> > >> It'd be bad for discoverability, but perhaps something like this, that's > >> not that likely to be used interactively, would be better done as a > >> separate function that would need to be used explicitly? > > > > I mean.. we already have separate functions and views for this, though > > they're auth-method-specific currently, but also provide more details, > > since it isn't actually a "one size fits all" kind of thing like this > > entire approach is imagining it to be. > > Referring to pg_stat_ssl and pg_stat_gssapi here, right? Yes, that > would be very limited as this leads to no visibility for LDAP, all > password-based authentications and more. Yes, of course. The point being made was that we could do the same for the other auth methods rather than adding something to pg_stat_activity. > I am wondering if we should take this as an occasion to move some data > out of pg_stat_activity into a separate biew, dedicated to the data > related to the connection that remains set to the same value for the > duration of a backend's life, as of the following set: > - the backend PID > - client_addr > - client_hostname > - client_port > - authenticated ID > - application_name? (well, this one could change on reload, so I am > lying). application_name certainly changes, as pointed out elsewhere. > It would be tempting to move the database name and the username but > these are popular fields with monitoring. Maybe we could name that > pg_stat_connection_status, pg_stat_auth_status or just > pg_stat_connection? I don't know that there's really any of the existing fields that aren't "popular fields with monitoring".. The issue that Andres brought up wasn't about monitoring though- it was about users looking interactively. Monitoring systems can adjust their queries for the new major version to do whatever joins, et al, they need and that's a once-per-major-version to do. On the other hand, people doing: table pg_stat_activity; Would like to get the info they really want out of that and not anything else. If we're going to adjust the fields returned from that then that's really the lens we should use. So, what fields are people really looking at when querying pg_stat_activity interactively? User, database, pid, last query, transaction start, query start, state, wait event info, maybe backend xmin/xid? I doubt most people looking at pg_stat_activity interactively actually care about the non-user backends (autovacuum, et al). Thanks, Stephen
Attachment
Hi, On 2021-04-27 12:40:29 -0400, Stephen Frost wrote: > So, what fields are people really looking at when querying > pg_stat_activity interactively? User, database, pid, last query, > transaction start, query start, state, wait event info, maybe backend > xmin/xid? I doubt most people looking at pg_stat_activity interactively > actually care about the non-user backends (autovacuum, et al). Not representative, but I personally am about as often interested in one of the non-connection processes as the connection ones. E.g. investigating what is autovacuum's bottleneck, are checkpointer / wal writer / bgwriter io bound or keeping up, etc. Greetings, Andres Freund
On Tue, Apr 27, 2021 at 8:25 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2021-04-27 12:40:29 -0400, Stephen Frost wrote: > > So, what fields are people really looking at when querying > > pg_stat_activity interactively? User, database, pid, last query, > > transaction start, query start, state, wait event info, maybe backend > > xmin/xid? I doubt most people looking at pg_stat_activity interactively > > actually care about the non-user backends (autovacuum, et al). > > Not representative, but I personally am about as often interested in one > of the non-connection processes as the connection > ones. E.g. investigating what is autovacuum's bottleneck, are > checkpointer / wal writer / bgwriter io bound or keeping up, etc. I definitely use it all the time to monitor autovacuum all the time. The others as well regularly, but autovacuum continuously. I also see a lot of people doing things like "from pg_stat_activity where query like '%mytablename%'" where they'd want both any regular queries and any autovacuums currently processing the table. I'd say client address is also pretty common to identify which set of app servers connections are coming in from -- but client port and client hostname are a lot less interesting. But it'd be kind of weird to split those out. For *interactive use* I'd find pretty much all other columns interesting and commonly used. Probably not that interested in the oids of the database and user, but again they are the cheap ones. We could get rid of the joints if we only showed the oids, but in interactive use it's really the names that are interesting. But if we're just trying to save column count, I'd say get rid of datid and usesysid. I'd hold everything else as interesting. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On Thu, Apr 29, 2021 at 04:56:42PM +0200, Magnus Hagander wrote: > I definitely use it all the time to monitor autovacuum all the time. > The others as well regularly, but autovacuum continuously. I also see > a lot of people doing things like "from pg_stat_activity where query > like '%mytablename%'" where they'd want both any regular queries and > any autovacuums currently processing the table. When it comes to development work, I also look at things different than backend connections, checkpointer and WAL writer included. > I'd say client address is also pretty common to identify which set of > app servers connections are coming in from -- but client port and > client hostname are a lot less interesting. But it'd be kind of weird > to split those out. Yes, I agree that it would be confusing to split the client_* fields across multiple views. > For *interactive use* I'd find pretty much all other columns > interesting and commonly used. Probably not that interested in the > oids of the database and user, but again they are the cheap ones. We > could get rid of the joints if we only showed the oids, but in > interactive use it's really the names that are interesting. But if > we're just trying to save column count, I'd say get rid of datid and > usesysid. > > I'd hold everything else as interesting. Yes, you have an argument here about the removal of usesysid and datid. Now I find joins involving OIDs to be much more natural than the object names, because that's the base of what we use in the catalogs. Not sure if we would be able to agree on something here, but the barrier to what a session and a connection hold is thin when it comes to roles and application_name. Thinking more about that, I would be really tempted to get to do a more straight split with data that's associated to a session, to a transaction and to a connection, say: 1) pg_stat_session, data that may change. - PID - leader PID - the role name - role ID - application_name - wait_event_type - wait_event 2) pg_stat_connection, static data associated to a connection. - PID - database name - database OID - client_addr - client_hostname - client_port - backend_start - authn ID - backend_type 3) pg_stat_transaction, or pg_stat_activity, for the transactional activity. - PID - xact_start - query_start - backend_xid - state_change - query string - query ID - state Or I could just drop a new function that fetches the authn ID for a given PID, even if this makes things potentially less consistent when it comes to the lookup of PgBackendStatus, guarantee given now by pg_stat_get_activity(). -- Michael
Attachment
On Mon, May 17, 2021 at 6:35 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Apr 29, 2021 at 04:56:42PM +0200, Magnus Hagander wrote: > > I definitely use it all the time to monitor autovacuum all the time. > > The others as well regularly, but autovacuum continuously. I also see > > a lot of people doing things like "from pg_stat_activity where query > > like '%mytablename%'" where they'd want both any regular queries and > > any autovacuums currently processing the table. > > When it comes to development work, I also look at things different > than backend connections, checkpointer and WAL writer included. While I think we should optimize these primarily for users and not developers, I definitely do those things as well. In particular wait events for the background processes. > > I'd say client address is also pretty common to identify which set of > > app servers connections are coming in from -- but client port and > > client hostname are a lot less interesting. But it'd be kind of weird > > to split those out. > > Yes, I agree that it would be confusing to split the client_* fields > across multiple views. > > > For *interactive use* I'd find pretty much all other columns > > interesting and commonly used. Probably not that interested in the > > oids of the database and user, but again they are the cheap ones. We > > could get rid of the joints if we only showed the oids, but in > > interactive use it's really the names that are interesting. But if > > we're just trying to save column count, I'd say get rid of datid and > > usesysid. > > > > I'd hold everything else as interesting. > > Yes, you have an argument here about the removal of usesysid and > datid. Now I find joins involving OIDs to be much more natural than > the object names, because that's the base of what we use in the > catalogs. Agreed. And I'm not sure the actual gain is that big if we can just remove oid columns... > Not sure if we would be able to agree on something here, but the > barrier to what a session and a connection hold is thin when it comes > to roles and application_name. Thinking more about that, I would be > really tempted to get to do a more straight split with data that's > associated to a session, to a transaction and to a connection, say: > 1) pg_stat_session, data that may change. > - PID > - leader PID > - the role name > - role ID > - application_name > - wait_event_type > - wait_event > 2) pg_stat_connection, static data associated to a connection. > - PID > - database name > - database OID > - client_addr > - client_hostname > - client_port > - backend_start > - authn ID > - backend_type > 3) pg_stat_transaction, or pg_stat_activity, for the transactional > activity. > - PID > - xact_start > - query_start > - backend_xid > - state_change > - query string > - query ID > - state This seems extremely user-unfriendly to me. I mean. Timestamps are nso split out between different views so you can't track the process iwthout it. And surely wait_event info is *extremely* related to things like what query is running and what state the session is in. And putting backend_type off in a separate view means most queries are going to have to join that in anyway. Or include it in all views. And if we're forcing the majority of queries to join multiple views, what have we actually gained? Based on your list above, I'd definitely want at least (1) and (2) to be in the same one, but they'd have to also gain at least the database oid/name and backend_type, and maybe also backend_start. So basically, it would be moving out client_*, and authn_id. If we're doing that then as you say maybe pg_stat_connection is a good name and could then *also* gain the information that's currently in the ssl and gss views for a net simplification. tld;dr; I think we have to be really careful here or the cure is going to be way worse than the disease. > Or I could just drop a new function that fetches the authn ID for a > given PID, even if this makes things potentially less consistent when > it comes to the lookup of PgBackendStatus, guarantee given now by > pg_stat_get_activity(). Well, the authnid will never change so I'm not sure the consistency part is a big problem? Or maybe I'm misunderstanding the risk you're referring to? -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On Mon, May 17, 2021 at 10:28:49AM +0200, Magnus Hagander wrote: > On Mon, May 17, 2021 at 6:35 AM Michael Paquier <michael@paquier.xyz> wrote: >> Not sure if we would be able to agree on something here, but the >> barrier to what a session and a connection hold is thin when it comes >> to roles and application_name. Thinking more about that, I would be >> really tempted to get to do a more straight split with data that's >> associated to a session, to a transaction and to a connection, say: >> 1) pg_stat_session, data that may change. >> - PID >> - leader PID >> - the role name >> - role ID >> - application_name >> - wait_event_type >> - wait_event >> 2) pg_stat_connection, static data associated to a connection. >> - PID >> - database name >> - database OID >> - client_addr >> - client_hostname >> - client_port >> - backend_start >> - authn ID >> - backend_type >> 3) pg_stat_transaction, or pg_stat_activity, for the transactional >> activity. >> - PID >> - xact_start >> - query_start >> - backend_xid >> - state_change >> - query string >> - query ID >> - state > > This seems extremely user-unfriendly to me. > > I mean. Timestamps are nso split out between different views so you > can't track the process iwthout it. And surely wait_event info is > *extremely* related to things like what query is running and what > state the session is in. And putting backend_type off in a separate > view means most queries are going to have to join that in anyway. Or > include it in all views. And if we're forcing the majority of queries > to join multiple views, what have we actually gained? > > Based on your list above, I'd definitely want at least (1) and (2) to > be in the same one, but they'd have to also gain at least the database > oid/name and backend_type, and maybe also backend_start. Okay. > So basically, it would be moving out client_*, and authn_id. So that would mean the addition of one new catalog view, called pg_stat_connection, with the following fields: - PID - all three client_* - authn ID I can live with this split. Thoughts from others? > If we're > doing that then as you say maybe pg_stat_connection is a good name and > could then *also* gain the information that's currently in the ssl and > gss views for a net simplification. I am less enthutiastic about this addition. SSL and GSSAPI have no fields in common, so that would bloat the view for connection data with mostly NULL fields most of the time. > tld;dr; I think we have to be really careful here or the cure is going > to be way worse than the disease. Agreed. >> Or I could just drop a new function that fetches the authn ID for a >> given PID, even if this makes things potentially less consistent when >> it comes to the lookup of PgBackendStatus, guarantee given now by >> pg_stat_get_activity(). > > Well, the authnid will never change so I'm not sure the consistency > part is a big problem? Or maybe I'm misunderstanding the risk you're > referring to? I just mean to keep the consistency we have now with one single call of pg_stat_get_activity() for each catalog view, so as we still fetch once a consistent copy of all PgBackendStatus entries in this code path. -- Michael
Attachment
On Tue, May 18, 2021 at 11:20:49AM +0900, Michael Paquier wrote: > So that would mean the addition of one new catalog view, called > pg_stat_connection, with the following fields: > - PID > - all three client_* > - authn ID > I can live with this split. Thoughts from others? Just to make the discussion move on, attached is an updated version doing that. -- Michael
Attachment
Hi Michael, > Just to make the discussion move on, attached is an updated version > doing that. The code seems OK, but I have mixed feelings about the way that the VIEW currently works. Here is what I get when a single user is connected via a UNIX socket: 43204 (master) =# select * from pg_stat_connection; pid | authenticated_id | client_addr | client_hostname | client_port -------+------------------+-------------+-----------------+------------- 25806 | | | | 25808 | | | | 43204 | | | | -1 25804 | | | | 25803 | | | | 25805 | | | | (6 rows) I bet we could be more user-friendly than this. To begin with, the documentation says: + <para> + The <structname>pg_stat_connection</structname> view will have one row + per server process, showing information related to + the current connection of that process. + </para> [...] + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>client_addr</structfield> <type>inet</type> + </para> + <para> + IP address of the client connected to this backend. + If this field is null, it indicates either that the client is + connected via a Unix socket on the server machine or that this is an + internal process such as autovacuum. + </para></entry> + </row> Any reason why we shouldn't simply exclude internal processes from the view? Do they have a connection that the VIEW could show? Secondly, maybe instead of magic constants like -1, we could use an additional text column, e.g. connection_type: "unix"? Thirdly, not sure if client_hostname is really needed, isn't address:port pair enough to identify the client? Lastly, introducing a new GUC for truncating values in a single view, which can only be set at server start, doesn't strike me as a great idea. What is the worst-case scenario if instead we will always allocate `strlen(MyProcPort->authn_id)` and the user will truncate the result manually if needed? -- Best regards, Aleksander Alekseev
On Mon, Jul 19, 2021 at 04:56:24PM +0300, Aleksander Alekseev wrote: > Any reason why we shouldn't simply exclude internal processes from the > view? Do they have a connection that the VIEW could show? Yeah, they don't really have any useful information. Still, I kept that mainly as a matter of consistency with pg_stat_activity, and this can be useful to find out about internal processes without relying on an extra check based on pg_stat_activity.backend_type. Perhaps we could just add a check in system_views.sql based on the NULL-ness of client_port. > Secondly, maybe instead of magic constants like -1, we could use an > additional text column, e.g. connection_type: "unix"? I am not really incline to break that more, as told by pg_stat_get_activity() there are cases where this looks useful: /* * Unix sockets always reports NULL for host and -1 for * port, so it's possible to tell the difference to * connections we have no permissions to view, or with * errors. */ > Thirdly, not > sure if client_hostname is really needed, isn't address:port pair > enough to identify the client? It seems to me that this is still useful to know about Port->remote_hostname. > Lastly, introducing a new GUC for > truncating values in a single view, which can only be set at server > start, doesn't strike me as a great idea. What is the worst-case > scenario if instead we will always allocate > `strlen(MyProcPort->authn_id)` and the user will truncate the result > manually if needed? The authenticated ID could be a SSL DN longer than the default of 128kB that this patch is proposing. I think that it is a good idea to provide some way to the user to be able to control that without a recompilation. -- Michael
Attachment
On Wed, Jul 21, 2021 at 01:21:17PM +0900, Michael Paquier wrote: > The authenticated ID could be a SSL DN longer than the default of > 128kB that this patch is proposing. I think that it is a good idea to > provide some way to the user to be able to control that without a > recompilation. I got to think about this patch more for the last couple of days, and I'd still think that having a GUC to control how much shared memory we need for the authenticated ID in each BackendStatusArray. Now, the thread has been idle for two months now, and it does not seem to attract much attention. This also includes a split of pg_stat_activity for client_addr, client_hostname and client_port into a new catalog, which may be hard to justify for this feature. So I am dropping the patch. -- Michael