Thread: Added columns to pg_stat_activity
This patch ticks off the following TODO items: * Add session start time and last statement time to pg_stat_activity (we already had last-statement-time, provided command string stats were enabled) * Add the client IP address and port to pg_stat_activity I have tested on Linux (both IPV4 and Unix sockets for the address/port) and Win32 (naturally only IPV4). I don't have a machine around with IPV6 support, so I haven't tested that. The code to handle it is stolen from inet_client_addr() though, so it should hopefully be ok. //Magnus
Attachment
Magnus Hagander wrote: > * Add session start time and last statement time to pg_stat_activity > (we already had last-statement-time, provided command string stats were > enabled) > * Add the client IP address and port to pg_stat_activity Looks pretty good -- barring any objections I'll apply this to HEAD tomorrow or the day after (there's a room for a bit of cleanup to the docs, but I'll do that before committing). > I have tested on Linux (both IPV4 and Unix sockets for the address/port) > and Win32 (naturally only IPV4). Do you mean we only support IPv4 under Win32? (There are certainly versions of Win32 which support IPv6.) -Neil
Neil Conway <neilc@samurai.com> writes: > Looks pretty good -- barring any objections I'll apply this to HEAD > tomorrow or the day after (there's a room for a bit of cleanup to the > docs, but I'll do that before committing). Couple of minor gripes: 1. Can we reduce the amount of stuff #include'd by pgstat.h? I particularly dislike including builtins.h there; that surely should be unnecessary, since there's nothing but externs in builtins.h. 2. WRT the inet.h change: the project standard is that extern declarations are spelled with "extern". regards, tom lane
> > Looks pretty good -- barring any objections I'll apply this to HEAD > > tomorrow or the day after (there's a room for a bit of > cleanup to the > > docs, but I'll do that before committing). > > Couple of minor gripes: > > 1. Can we reduce the amount of stuff #include'd by pgstat.h? > I particularly dislike including builtins.h there; that > surely should be unnecessary, since there's nothing but > externs in builtins.h. Right. Only one needs to be there, the other two can be moved to pgstatfuncs.c. The last one is needed to get the SockAddr structure. > 2. WRT the inet.h change: the project standard is that > extern declarations are spelled with "extern". Fixed. Updated patch attached. //Magnus
Attachment
Magnus Hagander wrote: > Updated patch attached. A few more comments: - why did you add the client address to PgStat_MsgHdr, rather than PgStat_MsgBestart? It would be nice to avoid sending the client address for each and every stats message, as it shouldn't change over the life of the session. - I think the backend_client_addr and backend_client_port view columns should be named "client_addr" and "client_port", respectively. - Is there a reason you need to expose and then call the private network_in() function, rather than using inet_in(), which is already public? - Is the backend startup time sensitive information? Considering this information is available via ps(1), perhaps it would be better to allow any user to see any backend's startup time, rather than providing a false sense of security. - We return (NULL, -1) for the client address/port if connected via a unix domain socket, and (NULL, NULL) for the address/port if the selecting user isn't a superuser / the owner of that client connection. It seems a little ugly to return NULL for the address in both cases, since the same value is used for two completely different meanings. Not sure of a better convention, though -- just idly complaining :) You needn't respin the patch, as I can make the necessary changes myself -- I'm just looking for comments on the above before applying. -Neil
>> Updated patch attached. > >A few more comments: > >- why did you add the client address to PgStat_MsgHdr, rather than >PgStat_MsgBestart? It would be nice to avoid sending the >client address >for each and every stats message, as it shouldn't change over the life >of the session. I guess that's from not reading things carefully enough. I looked for where "backend pid" was transmitted, because I thought that would be a good place to put it. Which might bring up the question, why are things like the backend pid sent in msghdr and not in bestart? (seems bestart is nothing more than the header, which is why I missed it completely) >- I think the backend_client_addr and backend_client_port view columns >should be named "client_addr" and "client_port", respectively. Seems reasonable. >- Is there a reason you need to expose and then call the private >network_in() function, rather than using inet_in(), which is >already public? Not really, that part of the code was a copy from client_addr(). >- Is the backend startup time sensitive information? Considering this >information is available via ps(1), perhaps it would be better >to allow >any user to see any backend's startup time, rather than providing a >false sense of security. Well, ps(1) will show you the client address as well, so the argument can be taken there too. For ps(1) access, you need shell access to the machine. Something at least I would *never* give to people who just "do stuff in the database". That said, I'm not sure that the startup time specifically is sensitive, no. No real rason to hide that, but I'm not sure I buy the argument "considering since it's available via ps(1)" ;-) >- We return (NULL, -1) for the client address/port if connected via a >unix domain socket, and (NULL, NULL) for the address/port if the >selecting user isn't a superuser / the owner of that client >connection. >It seems a little ugly to return NULL for the address in both cases, >since the same value is used for two completely different >meanings. Not >sure of a better convention, though -- just idly complaining :) I first did it with NULL,NULL, before I realised I could not make just that distinction. I guess in theory we could return 0.0.0.0 or 255.255.255.255 (similar for IPv6 of course, don't know offhand what that would be), btu that seems at least as ugly. Another way would be to have yet another column that would be "client address type", but given the options I think the current way is probably the least ugly. >You needn't respin the patch, as I can make the necessary >changes myself >-- I'm just looking for comments on the above before applying. Ok. //Magnus
Neil Conway <neilc@samurai.com> writes: > - Is the backend startup time sensitive information? Considering this > information is available via ps(1), perhaps it would be better to allow > any user to see any backend's startup time, rather than providing a > false sense of security. Remote database users don't have access to ps(1). I think that hiding the startup time is reasonable. > - We return (NULL, -1) for the client address/port if connected via a > unix domain socket, and (NULL, NULL) for the address/port if the > selecting user isn't a superuser / the owner of that client connection. > It seems a little ugly to return NULL for the address in both cases, > since the same value is used for two completely different meanings. Not > sure of a better convention, though -- just idly complaining :) Perhaps 0.0.0.0/0 or some such for the address in the unix domain case? Not sure that this is an improvement over NULL though. Other comments seem on-target to me. regards, tom lane
"Magnus Hagander" <mha@sollentuna.net> writes: > I guess that's from not reading things carefully enough. I looked for > where "backend pid" was transmitted, because I thought that would be a > good place to put it. Which might bring up the question, why are things > like the backend pid sent in msghdr and not in bestart? I think the PID is used as the lookup key to associate messages from the same backend. You need some such key, though BackendId (sinval slot number) is another workable candidate. regards, tom lane
Magnus Hagander wrote: > I guess that's from not reading things carefully enough. I looked for > where "backend pid" was transmitted, because I thought that would be a > good place to put it. On adapting the code to just send the client address in the BE start message, I'm not actually sure this is the right way to go. The problem is that stats messages are unordered -- so we might receive, say, an activity message for a backend before we receive its BE startup message. That means pg_stat_get_backend_client_addr() might be invoked for a backend the stats collector knows about, but doesn't know its client address. We could invent a sentinel value for this case and return SQL NULL, but it seems quite ugly (since the backend entry struct stores SockAddr, not SockAddr *)-- simpler to just send the client address with every message, I guess. > Which might bring up the question, why are things like the backend > pid sent in msghdr and not in bestart? If not for the situation outlined above, we could move at least m_userid out of the message header. > That said, I'm not sure that the startup time specifically is sensitive, > no. No real rason to hide that, but I'm not sure I buy the argument > "considering since it's available via ps(1)" ;-) Yeah, that's true, the ps(1) argument is invalid. May as well limit access to it... > I first did it with NULL,NULL, before I realised I could not make just > that distinction. I guess in theory we could return 0.0.0.0 or > 255.255.255.255 (similar for IPv6 of course, don't know offhand what > that would be), btu that seems at least as ugly. > Another way would be to have yet another column that would be "client > address type", but given the options I think the current way is probably > the least ugly. Yeah, I agree -- I guess we should stick with the current convention. -Neil
Neil Conway <neilc@samurai.com> writes: > On adapting the code to just send the client address in the BE start > message, I'm not actually sure this is the right way to go. The problem > is that stats messages are unordered -- so we might receive, say, an > activity message for a backend before we receive its BE startup message. I think this argument is a red herring ... or at least it leads in a direction I find unacceptable. We are already transmitting three more fields than necessary in every MsgHdr --- PID, database OID, and user ID --- and it will only get worse if we go down this path. I would have to ask what is the point of the BESTART message at all, if the design requires retransmitting everything it could possibly carry in every subsequent message. It is true that the pgstat message mechanism is unreliable, and therefore you will sometimes not find out things you would like to find out. But adding more overhead onto every message is not the way to improve that situation; it will make it worse by increasing the bandwidth demand. I think we should just accept that we will sometimes have to return NULL for these items because we missed a message. (And if it wasn't clear, I'm in favor of stripping the msghdr down. The header should be about 8 bytes not 24, let alone 40+ which is what it would have to be to handle IPv6 client addresses the way you suggest.) regards, tom lane
Tom Lane wrote: > I think this argument is a red herring ... or at least it leads in a > direction I find unacceptable. I agree -- I was just pointing out the reason that, in the current design, there is cause to do things as Magnus' original patch did. > We are already transmitting three more fields than necessary in every > MsgHdr --- PID, database OID, and user ID --- and it will only get > worse if we go down this path. How about changing the statistics collector so that we only include a row in the statistics view when we receive the BESTART message? That would mean the BESTART message could include backend-start metadata (user ID, database ID, client address), and all other messages would only need enough header data to identify the backend process they are associated with (so perhaps backend id and process id). Using the existing dead-backend hash table, we should be able to distinguish between the cases of "seen a message for a new backend before receiving its BESTART" and "seen a message for a dead backend after we've seen its BETERM". -Neil
Neil Conway <neilc@samurai.com> writes: > How about changing the statistics collector so that we only include a > row in the statistics view when we receive the BESTART message? I'd vote against that. The mechanism is lossy by design and so we must design on the assumption that we will sometimes lose the BESTART message. I don't think it's acceptable to refuse to display information we do have (queries, access counts) just because we don't have every element of a rather-arbitrarily-chosen view row. > would mean the BESTART message could include backend-start metadata > (user ID, database ID, client address), and all other messages would > only need enough header data to identify the backend process they are > associated with (so perhaps backend id and process id). Looking at the code, backendid seems sufficient since that is the lookup key. regards, tom lane
Tom Lane wrote: > I'd vote against that. The mechanism is lossy by design Is it _lossy_, or merely unordered? While UDP doesn't guarantee message delivery, I wonder what kind of extreme circumstances would need to exist for you to lose UDP packets outright over the loopback interface. > I don't think it's acceptable to refuse to display information > we do have (queries, access counts) just because we don't have every > element of a rather-arbitrarily-chosen view row. Is there really any point in returning such incomplete statistics data? ISTM it would mostly serve to complicate the lives of people writing automated tools to query statistics data, for example. > Looking at the code, backendid seems sufficient since that is the > lookup key. Not if you want to distinguish between dead and live backends, it's not (see pgstat_add_backend). -Neil
Neil Conway <neilc@samurai.com> writes: > Tom Lane wrote: >> I'd vote against that. The mechanism is lossy by design > Is it _lossy_, or merely unordered? While UDP doesn't guarantee message > delivery, I wonder what kind of extreme circumstances would need to > exist for you to lose UDP packets outright over the loopback interface. If the stats collector gets sufficiently backed up, you will lose messages. The most obvious case is the "statistics buffer is full" error in pgstat_recvbuffer() (and you can find two or three reports of that message in just the last week or so of the mailing list archives). But the entire messaging mechanism is *specifically* designed to drop messages rather than block backends if the stats buffer process gets backed up to the point where it can't accept messages fast enough. For starters, we'd have used TCP not UDP if we didn't want it to do that. >> I don't think it's acceptable to refuse to display information >> we do have (queries, access counts) just because we don't have every >> element of a rather-arbitrarily-chosen view row. > Is there really any point in returning such incomplete statistics data? Absolutely. Given the lossiness of the mechanism, the total stats data is incomplete by definition, and so if we were insistent on returning only 100% guaranteed data, we might as well shut down the collector and return nothing. The entire facility is designed on the assumption that partial data is better than no data (and also better than taking the performance hit that would ensue if we tried to guarantee full data). Please do not try to make piecemeal changes in that basic design decision. >> Looking at the code, backendid seems sufficient since that is the >> lookup key. > Not if you want to distinguish between dead and live backends, it's not > (see pgstat_add_backend). Ah, good point. But still we need not send more than backend PID plus BackendId. (Actually, we could skinny it down to backend PID only ... but that would add a hashtable lookup to every message reception in the stats collector, instead of just indexing into a BackendId array, so it'd probably be a net loss.) regards, tom lane
Tom Lane wrote: > If the stats collector gets sufficiently backed up, you will lose > messages. [...] Please do not try to make piecemeal changes in that > basic design decision. Fair enough. > (Actually, we could skinny it down to backend PID only ... > but that would add a hashtable lookup to every message reception in > the stats collector, instead of just indexing into a BackendId array, > so it'd probably be a net loss.) Yeah, I agree -- backend ID and database ID isn't too bad. I've applied Magnus patch and bumped the catalog version. I'll take a look at reorganizing the statistics collector so that the message header is smaller shortly. -Neil
Is this what broke the regression tests on HEAD? cheers andrew Neil Conway wrote: > Tom Lane wrote: > >> If the stats collector gets sufficiently backed up, you will lose >> messages. [...] Please do not try to make piecemeal changes in that >> basic design decision. > > > Fair enough. > > > (Actually, we could skinny it down to backend PID only ... > >> but that would add a hashtable lookup to every message reception in >> the stats collector, instead of just indexing into a BackendId array, >> so it'd probably be a net loss.) > > > Yeah, I agree -- backend ID and database ID isn't too bad. > > I've applied Magnus patch and bumped the catalog version. I'll take a > look at reorganizing the statistics collector so that the message > header is smaller shortly. > > -
Certainly seems that way. Missed a .diff for the rules output there... Should be a simple replace, from what I can tell what's in the output on the buildfarm boxes I checked is correct, the expected file shuold changed. Oops. //Magnus > > Is this what broke the regression tests on HEAD? > > cheers > > andrew > > Neil Conway wrote: > > > Tom Lane wrote: > > > >> If the stats collector gets sufficiently backed up, you will lose > >> messages. [...] Please do not try to make piecemeal > changes in that > >> basic design decision. > > > > > > Fair enough. > > > > > (Actually, we could skinny it down to backend PID only ... > > > >> but that would add a hashtable lookup to every message > reception in > >> the stats collector, instead of just indexing into a > BackendId array, > >> so it'd probably be a net loss.) > > > > > > Yeah, I agree -- backend ID and database ID isn't too bad. > > > > I've applied Magnus patch and bumped the catalog version. > I'll take a > > look at reorganizing the statistics collector so that the message > > header is smaller shortly. > > > > - > >
Andrew Dunstan wrote: > Is this what broke the regression tests on HEAD? Yes, it is -- fix checked in to HEAD. My apologies for missing this. -Neil